[DOCS] Adds docs for Ingest Node Pipelines (#66822) (#67390)

* [DOCS] Adds docs for Ingest Node Pipelines

* [DOCS] Incorporates review comments

* [DOCS] Fixes formatting of log file
gchaps 2020-05-26 15:29:12 -07:00 committed by GitHub
parent 9678dedd65
commit 3a15a042d8
5 changed files with 153 additions and 1 deletion

3 binary files not shown: new screenshots (85 KiB, 55 KiB, and 192 KiB) added under management/ingest-pipelines/images/.
management/ingest-pipelines/ingest-pipelines.asciidoc (new file)

@ -0,0 +1,144 @@
[role="xpack"]
[[ingest-node-pipelines]]
== Ingest Node Pipelines
*Ingest Node Pipelines* enables you to create and manage {es}
pipelines that perform common transformations and
enrichments on your data. For example, you might remove a field,
rename an existing field, or set a new field.
You'll find *Ingest Node Pipelines* in *Management > Elasticsearch*. With this feature, you can:
* View a list of your pipelines and drill down into details.
* Create a pipeline that defines a series of tasks, known as processors.
* Test a pipeline before feeding it real data to ensure it works as expected.
* Delete a pipeline that is no longer needed.
[role="screenshot"]
image:management/ingest-pipelines/images/ingest-pipeline-list.png["Ingest node pipeline list"]
[float]
=== Required permissions
The minimum required permissions to access *Ingest Node Pipelines* are
the `manage_pipeline` and `cluster:monitor/nodes/info` cluster privileges.
You can add these privileges in *Management > Security > Roles*.
[role="screenshot"]
image:management/ingest-pipelines/images/ingest-pipeline-privileges.png["Privileges required for Ingest Node Pipelines"]
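If you manage roles through the {es} security API instead of the UI, a
minimal sketch of a role that grants just these privileges might look like
this (the role name `ingest_pipeline_manager` is illustrative):
[source,js]
----------------------------------
PUT _security/role/ingest_pipeline_manager
{
  "cluster": [ "manage_pipeline", "cluster:monitor/nodes/info" ]
}
----------------------------------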
[float]
[[ingest-node-pipelines-manage]]
=== Manage pipelines
From the list view, you can drill down into the details of a pipeline.
To edit, clone, or delete a pipeline, use the *Actions* menu.
If you don't have any pipelines, you can create one using the
*Create pipeline* form. You'll define processors to transform documents
in a specific way. To handle exceptions, you can optionally define
failure processors to execute immediately after a failed processor.
Before creating the pipeline, you can verify it provides the expected output.
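To make the failure processors concrete, here is a minimal sketch of a
pipeline definition that records the ingest error message with a `set`
processor when any processor fails (the `rename` processor and field
names are illustrative):
[source,js]
----------------------------------
{
  "description": "Rename a field, record any ingest failure",
  "processors": [
    {
      "rename": {
        "field": "provider",
        "target_field": "cloud.provider"
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    }
  ]
}
----------------------------------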
[float]
[[ingest-node-pipelines-example]]
==== Example: Create a pipeline
In this example, you'll create a pipeline to handle server logs in the
Common Log Format. The log looks similar to this:
[source,js]
----------------------------------
212.87.37.154 - - [05/May/2020:16:21:15 +0000] "GET /favicon.ico HTTP/1.1"
200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----------------------------------
The log contains an IP address, timestamp, and user agent. You want to
extract these three items into their own fields in {es} for fast search and
visualization. You also want to know where each request comes from.
. In *Ingest Node Pipelines*, click *Create a pipeline*.
. Provide a name and description for the pipeline. This example uses the name `access_logs`.
. Define the processors:
+
[source,js]
----------------------------------
[
  {
    "grok": {
      "field": "message",
      "patterns": ["%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}"]
    }
  },
  {
    "date": {
      "field": "timestamp",
      "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  },
  {
    "geoip": {
      "field": "clientip"
    }
  },
  {
    "user_agent": {
      "field": "agent"
    }
  }
]
----------------------------------
+
This code defines four {ref}/ingest-processors.html[processors] that run sequentially:
{ref}/grok-processor.html[grok], {ref}/date-processor.html[date],
{ref}/geoip-processor.html[geoip], and {ref}/user-agent-processor.html[user_agent].
Your form should look similar to this:
+
[role="screenshot"]
image:management/ingest-pipelines/images/ingest-pipeline-processor.png["Processors for Ingest Node Pipelines"]
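+
If you prefer the Console to the form, a sketch of the equivalent create
pipeline request, assuming the pipeline name `access_logs` used later in
this example:
+
[source,js]
----------------------------------
PUT _ingest/pipeline/access_logs
{
  "description": "Parses Common Log Format server logs",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}"]
      }
    },
    {
      "date": {
        "field": "timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "field": "clientip"
      }
    },
    {
      "user_agent": {
        "field": "agent"
      }
    }
  ]
}
----------------------------------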
. To verify that the pipeline gives the expected outcome, click *Test pipeline*.
. In the *Document* tab, provide the following sample document for testing:
+
[source,js]
----------------------------------
[
  {
    "_source": {
      "message": "212.87.37.154 - - [05/May/2020:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  }
]
----------------------------------
. Click *Run the pipeline* and check if the pipeline worked as expected.
+
You can also view the verbose output and refresh it from this view.
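+
Behind the scenes, this test corresponds to the {es} simulate pipeline API.
Once the pipeline is saved, a sketch of the same test run from the Console,
assuming the pipeline name `access_logs`:
+
[source,js]
----------------------------------
POST _ingest/pipeline/access_logs/_simulate?verbose=true
{
  "docs": [
    {
      "_source": {
        "message": "212.87.37.154 - - [05/May/2020:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
      }
    }
  ]
}
----------------------------------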
. If everything looks correct, close the panel, and then click *Create pipeline*.
+
At this point, you're ready to use the {es} index API to load
the log data.
. In the {kib} Console, index a document with the pipeline you created.
+
[source,js]
----------------------------------
PUT my-index/_doc/1?pipeline=access_logs
{
  "message": "212.87.37.154 - - [05/May/2020:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----------------------------------
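+
Passing `?pipeline=` on every request is optional: you can instead make the
pipeline the index default with the `index.default_pipeline` index setting.
A minimal sketch:
+
[source,js]
----------------------------------
PUT my-index/_settings
{
  "index.default_pipeline": "access_logs"
}
----------------------------------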
. To verify, run:
+
[source,js]
----------------------------------
GET my-index/_doc/1
----------------------------------
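+
An abbreviated sketch of the kind of `_source` to expect in the response.
The exact `geoip` and `user_agent` values depend on the GeoIP database and
agent parsing, and `...` marks omitted fields:
+
[source,js]
----------------------------------
{
  "_index": "my-index",
  "_id": "1",
  "found": true,
  "_source": {
    "@timestamp": "2020-05-05T16:21:15.000Z",
    "clientip": "212.87.37.154",
    "response": 200,
    "bytes": 3638,
    "geoip": { ... },
    "user_agent": { ... },
    ...
  }
}
----------------------------------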


@ -32,6 +32,12 @@ View index settings, mappings, and statistics and perform operations, such as refreshing,
flushing, and clearing the cache. Practicing good index management ensures
that your data is stored cost effectively.
a| <<ingest-node-pipelines, *Ingest Node Pipelines*>>
Create and manage {es}
pipelines that enable you to perform common transformations and
enrichments on your data.
| <<managing-licenses, *License Management*>>
View the status of your license, start a trial, or install a new license. For
@ -85,7 +91,7 @@ set the timespan for notification messages, and much more.
| <<managing-alerts-and-actions, *Alerts and Actions*>>
Centrally manage your alerts across {kib}. Create and manage reusable
connectors for triggering actions.
| <<managing-fields, *Index Patterns*>>
@ -140,6 +146,8 @@ include::{kib-repo-dir}/management/index-lifecycle-policies/example-index-lifecy
include::{kib-repo-dir}/management/managing-indices.asciidoc[]
include::{kib-repo-dir}/management/ingest-pipelines/ingest-pipelines.asciidoc[]
include::{kib-repo-dir}/management/managing-fields.asciidoc[]
include::{kib-repo-dir}/management/managing-licenses.asciidoc[]