[DOCS] APM failed transaction correlations (#111219)

This commit is contained in:
Lisa Cawley 2021-09-09 08:28:05 -07:00 committed by GitHub
parent c6c4f52fc3
commit fdc7aac4aa
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
3 changed files with 24 additions and 58 deletions

@ -58,68 +58,34 @@ out, you can begin viewing sample traces to continue your investigation.
[[correlations-error-rate]]
==== Find failed transaction correlations
The correlations on the *Error rate* tab help you discover which fields are
contributing to failed transactions.
beta::[]
By default, a number of attributes commonly known to cause performance issues,
like version, infrastructure, and location, are included, but all are completely
customizable to your APM data. Find something interesting? A quick click of a
button will auto-query your data as you work to resolve the underlying issue.
The correlations on the *Failed transaction correlations* tab help you discover
which attributes are most influential in distinguishing between transaction
failures and successes. In this context, the success or failure of a transaction
is determined by its {ecs-ref}/ecs-event.html#field-event-outcome[event.outcome]
value. For example, APM agents set the `event.outcome` to `failure` when an HTTP
transaction returns a `5xx` status code.
The error rate over time chart visualizes the change in error rate over the selected time frame.
Correlated attributes are sorted by _Impact_, a visual representation of the
{ref}/search-aggregations-bucket-significantterms-aggregation.html[significant terms aggregation]
score that powers correlations.
Attributes with a high impact, or attributes present in a large percentage of failed transactions,
may contribute to increased error rates.
// The chart highlights the failed transactions in the overall latency distribution for the transaction group.
If there are attributes that have a statistically significant correlation with
failed transactions, they are listed in a table. The table is sorted by scores,
which are mapped to high, medium, or low impact levels. Attributes with high
impact levels are more likely to contribute to failed transactions.
// By default, the attribute with the highest score is added to the chart. To see a different attribute in the chart, hover over its row in the table.
To find error rate correlations, hover over each potentially correlated attribute to
compare the error rate distribution of transactions with and without the selected attribute.
For example, in the screenshot below, the field `url.original` and value `http://localhost:3100...`
existed in 100% of failed transactions between 6:00 and 10:30.
For example, in the screenshot below, the field
`kubernetes.pod.name` with the value `frontend-node-59dff47885-fl5lb` has a
medium impact level and exists in 19% of the failed transactions.
[role="screenshot"]
image::apm/images/error-rate-hover.png[Correlations errors hover effect]
image::apm/images/correlations-failed-transactions.png[Failed transaction correlations]
Select the `+` filter to create a new query in the {apm-app} for transactions with
`url.original: http://localhost:3100...`. With the "noise" now filtered out,
you can begin viewing sample traces to continue your investigation.
TIP: Some details, such as the failure and success percentages, are available
only when the
<<observability-enable-inspect-es-queries,observability:enableInspectEsQueries>>
advanced setting is enabled.
As you sift through erroneous transactions, you'll likely notice other interesting attributes.
Return to the correlations fly-out and select *Customize fields* to search on these new attributes.
You may need to do this a few times, each time filtering out more and more noise and bringing you
closer to a diagnosis.
[discrete]
[[correlations-customize-fields]]
===== Customize fields
By default, a handful of attributes commonly known to cause performance issues
are included in the analysis on the *Error rate* tab. You can add and remove
fields under the **Customize fields** dropdown.
The following fields are selected by default. To keep the default list
manageable, only the first six matching fields with wildcards are used.
**Frontend (RUM) agent:**
* `labels.*`
* `user.*`
* `user_agent.name`
* `user_agent.os.name`
* `url.original`
**Backend agents:**
* `labels.*`
* `host.ip`
* `service.node.name`
* `service.version`
[TIP]
====
* Want to start over? Select **reset** to clear your customizations.
* The *Latency* tab does not have a **Customize fields** dropdown, since it
automatically considers all relevant fields in the transactions.
====
Select the `+` filter to create a new query in the {apm-app} for transactions
with this attribute. You might do this for multiple attributes--each time
filtering out more and more noise and bringing you closer to a diagnosis.
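
To make the idea of scoring attributes against failed transactions more concrete, the following is a minimal sketch of an {es} search body that uses the significant terms aggregation. It is an illustration only, not necessarily the query the {apm-app} runs: the helper name `failed_transaction_terms_query`, the aggregation name `correlated_values`, and the use of `processor.event` to select transaction documents are assumptions made for this example.

```python
import json


def failed_transaction_terms_query(field, size=10):
    """Build a search body (hypothetical helper, not part of any APM API).

    The foreground set is transactions whose event.outcome is "failure";
    the background set is all transactions. The significant terms
    aggregation then scores values of `field` that are unusually common
    in the foreground compared with the background.
    """
    # Assumption: transaction documents are identified by processor.event.
    all_transactions = {"term": {"processor.event": "transaction"}}
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    all_transactions,
                    {"term": {"event.outcome": "failure"}},
                ]
            }
        },
        "aggs": {
            "correlated_values": {
                "significant_terms": {
                    "field": field,
                    "size": size,
                    "background_filter": all_transactions,
                }
            }
        },
    }


body = failed_transaction_terms_query("kubernetes.pod.name")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against the indices that hold your APM transaction data. Higher scores in the response indicate values that distinguish failed transactions more strongly.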
