docs: APM updates for 7.14 (#104232)

Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
Brandon Morelli 2021-07-12 15:12:56 -07:00 committed by GitHub
parent e88910a1c6
commit 9ab26cf089
GPG key ID: 4AEE18F83AFDEB23
21 changed files with 61 additions and 72 deletions

View file

@ -43,6 +43,7 @@ Supported configurations are also tagged with the image:./images/dynamic-config.
[horizontal]
Go Agent:: {apm-go-ref}/configuration.html[Configuration reference]
iOS agent:: _Not yet supported_
Java Agent:: {apm-java-ref}/configuration.html[Configuration reference]
.NET Agent:: {apm-dotnet-ref}/configuration.html[Configuration reference]
Node.js Agent:: {apm-node-ref}/configuration.html[Configuration reference]

View file

@ -1,69 +1,57 @@
[role="xpack"]
[[apm-alerts]]
=== Alerts
=== Alerts and rules
++++
<titleabbrev>Create an alert</titleabbrev>
++++
The APM app allows you to define **rules** to detect complex conditions within your APM data
and trigger built-in **actions** when those conditions are met.
The APM app integrates with Kibana's {kibana-ref}/alerting-getting-started.html[alerting and actions] feature.
It provides a set of built-in **actions** and APM specific threshold **alerts** for you to use
and enables central management of all alerts from <<management,Kibana Management>>.
The following **rules** are supported:
* Latency anomaly rule:
Alert when latency of a service is abnormal
* Transaction error rate threshold rule:
Alert when the service's transaction error rate is above the defined threshold
* Error count threshold rule:
Alert when the number of errors in a service exceeds a defined threshold
[role="screenshot"]
image::apm/images/apm-alert.png[Create an alert in the APM app]
For a walkthrough of the alert flyout panel, including detailed information on each configurable property,
see Kibana's <<create-edit-rules,defining alerts>>.
For a complete walkthrough of the **Create rule** flyout panel, including detailed information on each configurable property,
see Kibana's <<create-edit-rules,create and edit rules>>.
The APM app supports four different types of alerts:
* Transaction duration anomaly:
alerts when the service's transaction duration reaches a certain anomaly score
* Transaction duration threshold:
alerts when the service's transaction duration exceeds a given time limit over a given time frame
* Transaction error rate threshold:
alerts when the service's transaction error rate is above the selected rate over a given time frame
* Error count threshold:
alerts when service exceeds a selected number of errors over a given time frame
Below, we'll walk through the creation of two of these alerts.
Below, we'll walk through the creation of two APM rules.
[float]
[[apm-create-transaction-alert]]
=== Example: create a transaction duration alert
=== Example: create a latency anomaly rule
Transaction duration alerts trigger when the duration of a specific transaction type in a service exceeds a defined threshold.
This guide will create an alert for the `opbeans-java` service based on the following criteria:
Latency anomaly rules trigger when the latency of a service is abnormal.
This guide will create an alert for all services based on the following criteria:
* Environment: Production
* Transaction type: `transaction.type:request`
* Average request is above `1500ms` for the last 5 minutes
* Check every 10 minutes, and repeat the alert every 30 minutes
* Send the alert via Slack
* Environment: production
* Severity level: critical
* Run every five minutes
* Send an alert to a Slack channel only when the rule status changes
From the APM app, navigate to the `opbeans-java` service and select
**Alerts** > **Create threshold alert** > **Transaction duration**.
From any page in the APM app, select **Alerts and rules** > **Latency** > **Create anomaly rule**.
Change the name of the alert, but do not edit the tags.
`Transaction duration | opbeans-java` is automatically set as the name of the alert,
and `apm` and `service.name:opbeans-java` are added as tags.
It's fine to change the name of the alert, but do not edit the tags.
Based on the criteria above, define the following rule details:
Based on the alert criteria, define the following alert details:
* **Check every** - `5 minutes`
* **Notify** - "Only on status change"
* **Environment** - `all`
* **Has anomaly with severity** - `critical`
* **Check every** - `10 minutes`
* **Notify every** - `30 minutes`
* **TYPE** - `request`
* **WHEN** - `avg`
* **IS ABOVE** - `1500ms`
* **FOR THE LAST** - `5 minutes`
Select an action type.
Multiple action types can be selected, but in this example, we want to post to a Slack channel.
Next, add a connector. Multiple connectors can be selected, but in this example we're interested in Slack.
Select **Slack** > **Create a connector**.
Enter a name for the connector,
and paste the webhook URL.
and paste your Slack webhook URL.
See Slack's webhook documentation if you need to create one.
A default message is provided as a starting point for your alert.
@ -72,35 +60,32 @@ to pass additional alert values at the time a condition is detected to an action
A list of available variables can be accessed by selecting the
**add variable** button image:apm/images/add-variable.png[add variable button].
Select **Save**. The alert has been created and is now active!
Click **Save**. The rule has been created and is now active!
[float]
[[apm-create-error-alert]]
=== Example: create an error rate alert
=== Example: create an error count threshold alert
Error rate alerts trigger when the number of errors in a service exceeds a defined threshold.
This guide creates an alert for the `opbeans-python` service based on the following criteria:
The error count threshold alert triggers when the number of errors in a service exceeds a defined threshold.
This guide will create an alert for all services based on the following criteria:
* Environment: Production
* All environments
* Error rate is above 25 for the last minute
* Check every 1 minute, and repeat the alert every 10 minutes
* Send the alert via email to the `opbeans-python` team
* Check every 1 minute, and alert every time the rule is active
* Send the alert via email to the site reliability team
From the APM app, navigate to the `opbeans-python` service and select
**Alerts** > **Create threshold alert** > **Error rate**.
From any page in the APM app, select **Alerts and rules** > **Error count** > **Create threshold rule**.
Change the name of the alert, but do not edit the tags.
`Error rate | opbeans-python` is automatically set as the name of the alert,
and `apm` and `service.name:opbeans-python` are added as tags.
It's fine to change the name of the alert, but do not edit the tags.
Based on the alert criteria, define the following alert details:
Based on the criteria above, define the following rule details:
* **Check every** - `1 minute`
* **Notify every** - `10 minutes`
* **IS ABOVE** - `25 errors`
* **FOR THE LAST** - `1 minute`
* **Notify** - "Every time alert is active"
* **Environment** - `all`
* **Is above** - `25 errors`
* **For the last** - `1 minute`
Select the **Email** action type and click **Create a connector**.
Select the **Email** connector and click **Create a connector**.
Fill out the required details: sender, host, port, etc., and click **save**.
A default message is provided as a starting point for your alert.
@ -109,14 +94,14 @@ to pass additional alert values at the time a condition is detected to an action
A list of available variables can be accessed by selecting the
**add variable** button image:apm/images/add-variable.png[add variable button].
Select **Save**. The alert has been created and is now active!
Click **Save**. The alert has been created and is now active!
[float]
[[apm-alert-manage]]
=== Manage alerts and actions
=== Manage alerts and rules
From the APM app, select **Alerts** > **View active alerts** to be taken to the Kibana alerts and actions management page.
From this page, you can create, edit, disable, mute, and delete alerts, and create, edit, and disable connectors.
From the APM app, select **Alerts and rules** > **Manage rules** to be taken to the Kibana **Rules and Connectors** page.
From this page, you can disable, mute, and delete APM alerts.
[float]
[[apm-alert-more-info]]
@ -126,4 +111,4 @@ See {kibana-ref}/alerting-getting-started.html[alerting and actions] for more in
NOTE: If you are using an **on-premise** Elastic Stack deployment with security,
communication between Elasticsearch and Kibana must have TLS configured.
More information is in the alerting {kibana-ref}/alerting-setup.html#alerting-prerequisites[prerequisites].
More information is in the alerting {kibana-ref}/alerting-setup.html#alerting-prerequisites[prerequisites].

View file

@ -36,6 +36,7 @@ It's vital to be consistent when naming environments in your agents.
To learn how to configure service environments, see the specific agent documentation:
* *Go:* {apm-go-ref}/configuration.html#config-environment[`ELASTIC_APM_ENVIRONMENT`]
* *iOS agent:* _Not yet supported_
* *Java:* {apm-java-ref}/config-core.html#config-environment[`environment`]
* *.NET:* {apm-dotnet-ref}/config-core.html#config-environment[`Environment`]
* *Node.js:* {apm-node-ref}/configuration.html#environment[`environment`]

Binary file not shown. Before: 268 KiB. After: 257 KiB.
Binary file not shown. Before: 575 KiB. After: 413 KiB.
Binary file not shown. Before: 301 KiB. After: 327 KiB.
Binary file not shown. Before: 429 KiB. After: 545 KiB.
Binary file not shown. Before: 401 KiB. After: 281 KiB.
Binary file not shown. Before: 202 KiB. After: 222 KiB.
Binary file not shown. Before: 168 KiB. After: 191 KiB.
Binary file not shown. Before: 187 KiB. After: 253 KiB.
Binary file not shown. Before: 59 KiB. After: 60 KiB.
Binary file not shown. Before: 725 KiB. After: 460 KiB.
Binary file not shown. Before: 250 KiB. After: 307 KiB.
Binary file not shown. Before: 564 KiB. After: 531 KiB.
Binary file not shown. Before: 558 KiB. After: 407 KiB.
Binary file not shown. Before: 475 KiB. After: 307 KiB.

View file

@ -108,6 +108,7 @@ Service maps are supported for the following Agent versions:
[horizontal]
Go agent:: ≥ v1.7.0
iOS agent:: _Not yet supported_
Java agent:: ≥ v1.13.0
.NET agent:: ≥ v1.3.0
Node.js agent:: ≥ v3.6.0

View file

@ -100,22 +100,22 @@ the selected transaction group.
image::apm/images/apm-transaction-response-dist.png[Example view of response time distribution]
[[transaction-duration-distribution]]
==== Transactions duration distribution
==== Latency distribution
This chart plots all transaction durations for the given time period.
A plot of all transaction durations for the given time period.
The screenshot below shows a typical distribution,
and indicates most of our requests were served quickly -- awesome!
It's the requests on the right, the ones taking longer than average, that we probably want to focus on.
It's the requests on the right, the ones taking longer than average, that we probably need to focus on.
[role="screenshot"]
image::apm/images/apm-transaction-duration-dist.png[Example view of transactions duration distribution graph]
image::apm/images/apm-transaction-duration-dist.png[Example view of latency distribution graph]
Select a transaction duration _bucket_ to display up to ten trace samples.
Select a latency duration _bucket_ to display up to ten trace samples.
[[transaction-trace-sample]]
==== Trace sample
Trace samples are based on the _bucket_ selection in the *Transactions duration distribution* chart;
Trace samples are based on the _bucket_ selection in the *Latency distribution* chart;
update the samples by selecting a new _bucket_.
The number of requests per bucket is displayed when hovering over the graph,
and the selected bucket is highlighted to stand out.

View file

@ -15,6 +15,7 @@ don't forget to check our other troubleshooting guides or discussion forum:
* {apm-server-ref}/troubleshooting.html[APM Server troubleshooting]
* {apm-dotnet-ref}/troubleshooting.html[.NET agent troubleshooting]
* {apm-go-ref}/troubleshooting.html[Go agent troubleshooting]
* {apm-ios-ref}/troubleshooting.html[iOS agent troubleshooting]
* {apm-java-ref}/trouble-shooting.html[Java agent troubleshooting]
* {apm-node-ref}/troubleshooting.html[Node.js agent troubleshooting]
* {apm-php-ref}/troubleshooting.html[PHP agent troubleshooting]

View file

@ -18,7 +18,7 @@ It is enabled by default.
// Any changes made in this file will be seen there as well.
// tag::apm-indices-settings[]
Index defaults can be changed in Kibana. Open the main menu, then click *APM > Settings > Indices*.
Index defaults can be changed in the APM app. Select **Settings** > **Indices**.
Index settings in the APM app take precedence over those set in `kibana.yml`.
[role="screenshot"]