[DOCS] Updates ML/anomaly detection terms in the Kibana guide (#41965)

István Zoltán Szabó 2019-07-30 09:59:56 +02:00
parent 1965dfa6d5
commit 7481c2cf50
4 changed files with 54 additions and 51 deletions


@ -1,3 +1,4 @@
[role="xpack"]
[[creating-df-kib]]
== Creating {dataframe-transforms}


@ -1,8 +1,8 @@
[role="xpack"]
[[ml-jobs]]
== Creating machine learning jobs
== Creating {anomaly-jobs}
Machine learning jobs contain the configuration information and metadata
{anomaly-jobs-cap} contain the configuration information and metadata
necessary to perform an analytics task.
{kib} provides the following wizards to make it easier to create jobs:
@ -33,7 +33,7 @@ appears:
[role="screenshot"]
image::ml/images/ml-data-recognizer-sample.jpg[A screenshot of the {kib} sample data web log job creation wizard]
TIP: Alternatively, after you load a sample data set on the {kib} home page, you can click *View data* > *ML jobs*. There are {ml} jobs for both the sample eCommerce orders data set and the sample web logs data set.
TIP: Alternatively, after you load a sample data set on the {kib} home page, you can click *View data* > *ML jobs*. There are {anomaly-jobs} for both the sample eCommerce orders data set and the sample web logs data set.
If you use {filebeat-ref}/index.html[{filebeat}]
to ship access logs from your
@ -57,17 +57,17 @@ wizards appear:
[role="screenshot"]
image::ml/images/ml-data-recognizer-metricbeat.jpg[A screenshot of the {metricbeat} job creation wizards]
These wizards create {ml} jobs, dashboards, searches, and visualizations that
are customized to help you analyze your {auditbeat}, {filebeat}, and
These wizards create {anomaly-jobs}, dashboards, searches, and visualizations
that are customized to help you analyze your {auditbeat}, {filebeat}, and
{metricbeat} data.
[NOTE]
===============================
If your data is located outside of {es}, you cannot use {kib} to create
your jobs and you cannot use {dfeeds} to retrieve your data in real time.
Machine learning analysis is still possible, however, by using APIs to
{anomaly-detect-cap} is still possible, however, by using APIs to
create and manage jobs and post data to them. For more information, see
{ref}/ml-apis.html[Machine Learning APIs].
{ref}/ml-apis.html[{ml-cap} {anomaly-detect} APIs].
===============================
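
As a rough illustration of the API route described in the note above, the following Python sketch assembles the requests involved in creating a job, opening it, and posting data to it. It only builds the request methods, paths, and bodies and sends nothing. The job ID `example-job`, the field names, and the helper `build_job_requests` are hypothetical, and the `_ml/anomaly_detectors` paths assume a 7.x cluster, so check the API reference for your version.

```python
import json


def build_job_requests(job_id, time_field, metric_field, bucket_span="15m"):
    """Return (method, path, body) tuples for creating, opening, and feeding a job.

    Nothing is sent here; hand the tuples to the HTTP client of your choice.
    """
    job_config = {
        "analysis_config": {
            "bucket_span": bucket_span,
            # One detector that models the mean of a single metric field.
            "detectors": [{"function": "mean", "field_name": metric_field}],
        },
        "data_description": {"time_field": time_field},
    }
    base = f"_ml/anomaly_detectors/{job_id}"
    return [
        ("PUT", base, job_config),        # create the job
        ("POST", f"{base}/_open", None),  # open it so that it can accept data
        ("POST", f"{base}/_data", None),  # body would be your own documents
    ]


# Hypothetical job ID and field names, for illustration only.
requests_to_send = build_job_requests("example-job", "timestamp", "responsetime")
for method, path, body in requests_to_send:
    print(method, path, json.dumps(body) if body else "")
```

Keeping the request construction separate from the transport makes the sketch easy to adapt to whichever HTTP client or {es} client library you already use.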
////


@ -1,35 +1,36 @@
[role="xpack"]
[[xpack-ml]]
= Machine Learning
= {ml-cap}
[partintro]
--
As datasets increase in size and complexity, the human effort required to
inspect dashboards or maintain rules for spotting infrastructure problems,
cyber attacks, or business issues becomes impractical. The Elastic {ml-features}
automatically model the normal behavior of your time series data — learning
trends, periodicity, and more — in real time to identify anomalies, streamline
root cause analysis, and reduce false positives.
cyber attacks, or business issues becomes impractical. The Elastic {ml}
{anomaly-detect} feature automatically models the normal behavior of your time
series data — learning trends, periodicity, and more — in real time to identify
anomalies, streamline root cause analysis, and reduce false positives.
The {ml-features} run in and scale with {es}, and include an
intuitive UI on the {kib} *Machine Learning* page for creating anomaly detection
jobs and understanding results.
{anomaly-detect-cap} runs in and scales with {es}, and includes an
intuitive UI on the {kib} *Machine Learning* page for creating {anomaly-jobs}
and understanding results.
If you have a basic license, you can use the *Data Visualizer* to learn more
about your data. In particular, if your data is stored in {es} and contains a
time field, you can use the *Data Visualizer* to identify possible fields for
{ml} analysis:
{anomaly-detect}:
[role="screenshot"]
image::ml/images/ml-data-visualizer-sample.jpg[Data Visualizer for sample flight data]
experimental[] You can also upload a CSV, NDJSON, or log file (up to 100 MB in size).
The {ml-features} identify the file format and field mappings. You can then
optionally import that data into an {es} index.
experimental[] You can also upload a CSV, NDJSON, or log file (up to 100 MB in
size). The *Data Visualizer* identifies the file format and field mappings. You
can then optionally import that data into an {es} index.
If you have a trial or platinum license, you can <<ml-jobs,create {ml} jobs>>
and manage jobs and {dfeeds} from the *Job Management* pane:
If you have a trial or platinum license, you can
<<ml-jobs,create {anomaly-jobs}>> and manage jobs and {dfeeds} from the *Job
Management* pane:
[role="screenshot"]
image::ml/images/ml-job-management.jpg[Job Management]
@ -42,7 +43,7 @@ You can use the *Settings* pane to create and edit
image::ml/images/ml-settings.jpg[Calendar Management]
The *Anomaly Explorer* and *Single Metric Viewer* display the results of your
{ml} jobs. For example:
{anomaly-jobs}. For example:
[role="screenshot"]
image::ml/images/ml-single-metric-viewer.jpg[Single Metric Viewer]
@ -56,17 +57,17 @@ occurring in your operational environment at that time:
image::ml/images/ml-annotations-list.jpg[Single Metric Viewer with annotations]
In some circumstances, annotations are also added automatically. For example, if
the {ml} analytics detect that there is missing data, it annotates the affected
the {anomaly-job} detects that there is missing data, it annotates the affected
time period. For more information, see
{stack-ov}/ml-delayed-data-detection.html[Handling delayed data].
The *Job Management* pane shows the full list of annotations for each job.
{stack-ov}/ml-delayed-data-detection.html[Handling delayed data]. The
*Job Management* pane shows the full list of annotations for each job.
NOTE: The {kib} {ml-features} use pop-ups. You must configure your
web browser so that it does not block pop-up windows or create an exception for
your {kib} URL.
NOTE: The {kib} {ml-features} use pop-ups. You must configure your web
browser so that it does not block pop-up windows or create an exception for your
{kib} URL.
For more information about {ml}, see
{stack-ov}/xpack-ml.html[Machine learning in the {stack}].
For more information about the {anomaly-detect} feature, see
{stack-ov}/xpack-ml.html[{ml-cap} {anomaly-detect}].
--


@ -5,16 +5,17 @@
<titleabbrev>Job tips</titleabbrev>
++++
When you are creating a job in {kib}, the job creation wizards can provide
advice based on the characteristics of your data. By heeding these suggestions,
you can create jobs that are more likely to produce insightful {ml} results.
When you create an {anomaly-job} in {kib}, the job creation wizards can
provide advice based on the characteristics of your data. By heeding these
suggestions, you can create jobs that are more likely to produce insightful {ml}
results.
[[bucket-span]]
==== Bucket span
The bucket span is the time interval that {ml} analytics use to summarize and
model data for your job. When you create a job in {kib}, you can choose to
estimate a bucket span value based on your data characteristics.
model data for your job. When you create an {anomaly-job} in {kib}, you can
choose to estimate a bucket span value based on your data characteristics.
NOTE: The bucket span must contain a valid time interval. For more information,
see {ref}/ml-job-resource.html#ml-analysisconfig[Analysis configuration objects].
@ -22,7 +23,7 @@ see {ref}/ml-job-resource.html#ml-analysisconfig[Analysis configuration objects]
If you choose a value that is larger than one day or is significantly different
than the estimated value, you receive an informational message. For more
information about choosing an appropriate bucket span, see
{xpack-ref}/ml-buckets.html[Buckets].
{stack-ov}/ml-buckets.html[Buckets].
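
The one-day check described above can be sketched in Python as follows. This is a minimal illustration, not the logic that {kib} uses: the helper names are hypothetical, only the `d`, `h`, `m`, and `s` units are handled, and the comparison against the estimated value is left out because the text does not define "significantly different".

```python
import re

# Simplified subset of time units (assumption: only d, h, m, and s are handled).
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def bucket_span_seconds(bucket_span):
    """Convert a value such as '15m' to seconds, or raise ValueError."""
    match = re.fullmatch(r"(\d+)([dhms])", bucket_span)
    if match is None:
        raise ValueError(f"not a valid time interval: {bucket_span!r}")
    seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    if seconds == 0:
        raise ValueError("bucket span must be greater than zero")
    return seconds


def exceeds_one_day(bucket_span):
    """Return True when the bucket span is larger than one day."""
    return bucket_span_seconds(bucket_span) > _UNIT_SECONDS["d"]
```

For example, `exceeds_one_day("2d")` is true, whereas `exceeds_one_day("24h")` is false because 24 hours is exactly one day.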
[[cardinality]]
==== Cardinality
@ -40,14 +41,14 @@ job uses more memory resources. In particular, if the cardinality of the
Likewise if you are performing population analysis and the cardinality of the
`over_field_name` is below 10, you are advised that this might not be a suitable
field to use. For more information, see
{xpack-ref}/ml-configuring-pop.html[Performing Population Analysis].
{stack-ov}/ml-configuring-pop.html[Performing Population Analysis].
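
To make the population analysis check concrete, here is a small Python sketch that counts distinct values of a candidate `over_field_name` in a list of in-memory records. The function name, the sample data, and the in-memory counting are assumptions for illustration only and do not show how {kib} computes cardinality.

```python
def over_field_advice(records, over_field_name, minimum=10):
    """Return an advisory string if the field has fewer than `minimum` distinct values."""
    distinct = {rec[over_field_name] for rec in records if over_field_name in rec}
    if len(distinct) < minimum:
        return (
            f"{over_field_name} has only {len(distinct)} distinct values; "
            "it might not be a suitable field for population analysis"
        )
    return None


# Hypothetical sample: 50 records that share just three client addresses.
sample = [{"clientip": f"10.0.0.{i % 3}"} for i in range(50)]
print(over_field_advice(sample, "clientip"))
```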
[[detectors]]
==== Detectors
Each job must have one or more _detectors_. A detector applies an analytical
function to specific fields in your data. If your job does not contain a
detector or the detector does not contain a
Each {anomaly-job} must have one or more _detectors_. A detector applies an
analytical function to specific fields in your data. If your job does not
contain a detector or the detector does not contain a
{stack-ov}/ml-functions.html[valid function], you receive an error.
If a job contains duplicate detectors, you also receive an error. Detectors are
@ -57,9 +58,9 @@ duplicates if they have the same `function`, `field_name`, `by_field_name`,
[[influencers]]
==== Influencers
When you create a job, you can specify _influencers_, which are also sometimes
referred to as _key fields_. Picking an influencer is strongly recommended for
the following reasons:
When you create an {anomaly-job}, you can specify _influencers_, which are also
sometimes referred to as _key fields_. Picking an influencer is strongly
recommended for the following reasons:
* It allows you to more easily assign blame for the anomaly
* It simplifies and aggregates the results
@ -78,11 +79,11 @@ The job creation wizards in {kib} can suggest which fields to use as influencers
[[model-memory-limits]]
==== Model memory limits
For each job, you can optionally specify a `model_memory_limit`, which is the
approximate maximum amount of memory resources that are required for analytical
processing. The default value is 1 GB. Once this limit is approached, data
pruning becomes more aggressive. Upon exceeding this limit, new entities are not
modeled.
For each {anomaly-job}, you can optionally specify a `model_memory_limit`, which
is the approximate maximum amount of memory resources that are required for
analytical processing. The default value is 1 GB. Once this limit is approached,
data pruning becomes more aggressive. Upon exceeding this limit, new entities
are not modeled.
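
As a minimal sketch, the limit can be expressed in a job configuration under `analysis_limits`. The helper `with_model_memory_limit`, the `512mb` value, and the field names in the example configuration are hypothetical and chosen only for illustration.

```python
import copy


def with_model_memory_limit(job_config, limit):
    """Return a copy of job_config with analysis_limits.model_memory_limit set."""
    updated = copy.deepcopy(job_config)
    updated.setdefault("analysis_limits", {})["model_memory_limit"] = limit
    return updated


# Hypothetical job configuration without an explicit limit.
job_config = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [{"function": "mean", "field_name": "responsetime"}],
    },
    "data_description": {"time_field": "timestamp"},
}
limited = with_model_memory_limit(job_config, "512mb")
```

The original dictionary is left untouched, so the same base configuration can be reused with different limits.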
You can also optionally specify the `xpack.ml.max_model_memory_limit` setting.
By default, it's not set, which means there is no upper bound on the acceptable
@ -92,9 +93,9 @@ TIP: If you set the `model_memory_limit` too high, it will be impossible to open
the job; jobs cannot be allocated to nodes that have insufficient memory to run
them.
If the estimated model memory limit for a job is greater than the model memory
limit for the job or the maximum model memory limit for the cluster, the job
creation wizards in {kib} generate a warning. If the estimated memory
If the estimated model memory limit for an {anomaly-job} is greater than the
model memory limit for the job or the maximum model memory limit for the cluster,
the job creation wizards in {kib} generate a warning. If the estimated memory
requirement is only a little higher than the `model_memory_limit`, the job will
probably produce useful results. Otherwise, the actions you take to address
these warnings vary depending on the resources available in your cluster: