kibana/docs/apm/correlations.asciidoc
2021-03-17 13:02:49 -07:00

124 lines
5.5 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

[role="xpack"]
[[correlations]]
=== Find latency and error correlations
**Correlations** surface attributes of your data that are potentially correlated with high-latency or erroneous transactions.
By default, a number of attributes commonly known to cause performance issues, like version,
infrastructure, and location, are included, but all are completely customizable to your APM data.
Find something interesting? A quick click of a button will auto-query your data as you work to resolve the underlying issue.
For example, a site reliability engineer, who is responsible for keeping production systems up and running,
notices an increase in latency in certain transactions.
Analyzing metadata or tags that exist in high-latency transactions but not in lower-latency transactions
can potentially point towards the root cause.
They may find that a particular piece of hardware, like a host or pod, has failed, increasing latency.
Or, perhaps set of users, based on IP address or region, is facing increased latency due to local data center issues.
[discrete]
[[view-correlations]]
=== View correlations
With a service selected, click **View correlations**:
[role="screenshot"]
image::apm/images/correlations.png[Correlations]
Queries within the APM app apply to the correlations shown in the correlations fly-out.
If a correlated field seems noteworthy, use the **Filter** quick links:
* `+` creates a new query in the APM app for filtering transactions containing the selected value.
* `-` creates a new query in the APM app to filter out transactions containing the selected value.
[discrete]
[[correlations-latency]]
==== Find high-latency correlations
Correlations help you discover which fields are contributing to increased service latency.
A latency distribution chart visualizes the overall latency of the selected service's transactions.
Correlated attributes are sorted by _Impact_a visual representation of the
{ref}/search-aggregations-bucket-significantterms-aggregation.html[significant terms aggregation]
score that powers correlations.
Attributes with a high impact, or attributes present in a large percentage of slow transactions,
may contribute to increased latency.
To find high-latency correlations, hover over each potentially correlated attribute to
compare the latency distribution of transactions with and without the selected attribute.
For example, in the screenshot below, the field `user_agent.name` and value `HeadlessChrome`
exists primarily in higher-latency transactions between 3.7 and 8.7 seconds.
[role="screenshot"]
image::apm/images/correlations-hover.png[Correlations hover effect]
Select the `+` filter to create a new query in the APM app for transactions with
`user_agent.name: HeadlessChrome`. With the "noise" now filtered out,
you can begin viewing sample traces to continue your investigation.
As you sift through high-latency transactions, you'll likely notice other interesting attributes.
Return to the correlations fly-out and select *Customize fields* to search on these new attributes.
You may need to do this a few timeseach time filtering out more and more noise and bringing you
closer to a diagnosis.
[discrete]
[[correlations-error-rate]]
==== Find error rate correlations
Correlations help you discover which fields are contributing to failed transactions.
The Error rate over time chart visualizes the change in error rate over the selected time frame.
Correlated attributes are sorted by _Impact_a visual representation of the
{ref}/search-aggregations-bucket-significantterms-aggregation.html[significant terms aggregation]
score that powers correlations.
Attributes with a high impact, or attributes present in a large percentage of failed transactions,
may contribute to increased error rates.
To find error rate correlations, hover over each potentially correlated attribute to
compare the error rate distribution of transactions with and without the selected attribute.
For example, in the screenshot below, the field `url.original` and value `http://localhost:3100...`
existed in 100% of failed transactions between 6:00 and 10:30.
[role="screenshot"]
image::apm/images/error-rate-hover.png[Correlations errors hover effect]
Select the `+` filter to create a new query in the APM app for transactions with
`url.original: http://localhost:3100...`. With the "noise" now filtered out,
you can begin viewing sample traces to continue your investigation.
As you sift through erroneous transactions, you'll likely notice other interesting attributes.
Return to the correlations fly-out and select *Customize fields* to search on these new attributes.
You may need to do this a few timeseach time filtering out more and more noise and bringing you
closer to a diagnosis.
[discrete]
[[correlations-customize-fields]]
==== Customize fields
Correlations are only as good as the data they're searching for.
By default, a handful of attributes commonly known to cause performance issues are included.
During the course of an investigation however, you may to need to add and remove fields from
this list multiple times as you narrow in on a diagnosis.
Add and remove fields under the **Customize fields** dropdown.
The following fields are selected by default.
To keep the default list manageable, only the first six matching fields with wildcards are used.
**Frontend (RUM) agent:**
* `labels.*`
* `user.*`
* `user_agent.name`
* `user_agent.os.name`
* `url.original`
**Backend agents:**
* `labels.*`
* `host.ip`
* `service.node.name`
* `service.version`
TIP: Want to start over? Select **reset** to clear your customizations.