kibana/docs/apm/service-overview.asciidoc
2021-09-23 11:52:10 -07:00


[role="xpack"]
[[service-overview]]
=== Service overview
Selecting a <<services,*service*>> brings you to the *Service overview*.
The *Service overview* contains a wide variety of charts and tables that provide
high-level visibility into how a service is performing across your infrastructure:
* Service details like service version, runtime version, framework, and agent name and version
* Container and orchestration information
* Cloud provider, machine type, and availability zone
* Latency, throughput, and errors over time
* Service dependencies
[discrete]
[[service-time-comparison]]
=== Time series comparison
Comparing how a service performs relative to a previous time frame can offer additional insight into
the health of your services. For example, has latency been slowly increasing over time, or did the service
experience a sudden spike? Enabling a time series comparison can provide the answer.
[role="screenshot"]
image::apm/images/time-series-comparison.png[Time series comparison]
Select the *Comparison* box to enable or disable time series comparison.
The time comparison options are based on the selected time filter range:
[options="header"]
|====
|Time filter |Time comparison options
|≤ 24 hours |One day or one week
|> 24 hours and ≤ 7 days |One week
|> 7 days |An identical amount of time immediately before the selected time range
|====
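
The mapping in the table above can be sketched as a small lookup. This is an illustrative Python sketch, not Kibana's implementation; the function name `comparison_options` and the returned labels are assumptions made for the example.

```python
from datetime import timedelta


def comparison_options(time_range: timedelta) -> list[str]:
    """Return the comparison options for a selected time filter range.

    Hypothetical helper that mirrors the table above; the labels are
    illustrative and are not taken from the Kibana source code.
    """
    if time_range <= timedelta(hours=24):
        return ["One day", "One week"]
    if time_range <= timedelta(days=7):
        return ["One week"]
    # Longer ranges compare against an equally long period immediately before.
    return ["Previous period of identical length"]
```

For example, a three-day time filter falls in the middle row, so `comparison_options(timedelta(days=3))` offers only the one-week comparison.
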
[discrete]
[[service-latency]]
=== Latency
The *Latency* chart displays response times for the service. You can filter the chart to display the average,
95th percentile, or 99th percentile latency for the service.
[role="screenshot"]
image::apm/images/latency.png[Service latency]
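
To see why the average and the percentiles can tell different stories, consider the following sketch. It uses a simple nearest-rank percentile on hypothetical durations; this is an assumption for illustration, and the aggregation that backs the chart may use a different, approximate method.

```python
import math
from statistics import mean


def percentile(durations_ms: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not durations_ms:
        raise ValueError("no samples")
    ordered = sorted(durations_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical transaction durations in milliseconds, with one slow outlier.
durations = [12.0, 15.0, 11.0, 14.0, 13.0, 250.0, 16.0, 12.0, 13.0, 14.0]

average = mean(durations)         # pulled upward by the single slow request
p50 = percentile(durations, 50)   # typical request
p95 = percentile(durations, 95)   # dominated by the outlier
```

Here a single slow request raises the average well above the typical duration, while the 95th percentile surfaces the outlier directly.
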
[discrete]
[[service-throughput-transactions]]
=== Throughput and transactions
The *Throughput* chart visualizes the average number of transactions per minute for the selected service.
The *Transactions* table displays a list of _transaction groups_ for the
selected service and includes the latency, traffic, error rate, and the impact for each transaction.
Transactions that share the same name are grouped, and only one entry is displayed for each group.
By default, transaction groups are sorted by _Impact_ to show the most used and slowest endpoints in your
service. If there is a particular endpoint you are interested in, click *View transactions* to view a
list of similar transactions on the <<transactions, transactions overview>> page.
[role="screenshot"]
image::apm/images/traffic-transactions.png[Traffic and transactions]
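
The grouping described above can be sketched as follows. The `Transaction` type and the `summarize_groups` function are hypothetical, and the impact score used here (total time spent in the group) is an assumption chosen so that frequently used and slow groups rank highest; it is not taken from the Kibana source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Transaction:
    name: str
    duration_ms: float
    failed: bool


def summarize_groups(transactions: list[Transaction], window_minutes: float) -> list[dict]:
    """Group transactions by name and compute per-group statistics."""
    groups: dict[str, list[Transaction]] = defaultdict(list)
    for tx in transactions:
        groups[tx.name].append(tx)

    rows = []
    for name, txs in groups.items():
        total_ms = sum(tx.duration_ms for tx in txs)
        rows.append({
            "name": name,
            "avg_latency_ms": total_ms / len(txs),
            "throughput_tpm": len(txs) / window_minutes,
            "failed_rate": sum(tx.failed for tx in txs) / len(txs),
            # Assumption: impact is proportional to the total time spent in the group.
            "impact": total_ms,
        })
    # Highest impact first, matching the default sort order described above.
    return sorted(rows, key=lambda row: row["impact"], reverse=True)
```

Under this assumption, a group that is called often and is slow accumulates the most total time and therefore appears at the top of the list.
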
[discrete]
[[service-error-rates]]
=== Failed transaction rate and errors
The failed transaction rate represents the percentage of failed transactions from the perspective of the selected service.
It's useful for visualizing unexpected increases, decreases, or irregular patterns in a service's transactions.
[TIP]
====
HTTP **transactions** from the HTTP server perspective do not consider a `4xx` status code (client error) as a failure
because the failure was caused by the caller, not the HTTP server. Thus, `event.outcome=success` and there will be no increase in failed transaction rate.
HTTP **spans** from the client perspective, however, are considered failures if the HTTP status code is ≥ 400.
These spans will set `event.outcome=failure` and increase the failed transaction rate.
If there is no HTTP status, both transactions and spans are considered successful unless an error is reported.
====
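
The rules in the tip above can be sketched as two small functions. The function names are hypothetical, and treating status codes ≥ 500 as the failure threshold for transactions is an assumption inferred from the statement that `4xx` codes are not failures.

```python
from typing import Optional


def transaction_outcome(status_code: Optional[int], error_reported: bool = False) -> str:
    """Outcome of an HTTP transaction, seen from the HTTP server."""
    if status_code is None:
        # No HTTP status: successful unless an error was reported.
        return "failure" if error_reported else "success"
    # Assumption: only server errors (5xx) count as failures; 4xx is caused by the caller.
    return "failure" if status_code >= 500 else "success"


def span_outcome(status_code: Optional[int], error_reported: bool = False) -> str:
    """Outcome of an outgoing HTTP span, seen from the HTTP client."""
    if status_code is None:
        # No HTTP status: successful unless an error was reported.
        return "failure" if error_reported else "success"
    return "failure" if status_code >= 400 else "success"
```

A `404` response therefore yields `success` for the server-side transaction but `failure` for the client-side span that made the request.
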
The *Errors* table provides a high-level view of each error message, including when it first and last occurred
and the total number of occurrences. This makes it easy to see which errors affect
your services and to take action to rectify them. To view more details, click *View errors*.
[role="screenshot"]
image::apm/images/error-rate.png[Failed transaction rate and errors]
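
The kind of summary shown in the *Errors* table can be sketched as follows. The `ErrorSummary` type, the `summarize_errors` function, and the ordering by occurrence count are assumptions made for illustration; the input is a hypothetical list of `(message, timestamp)` pairs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ErrorSummary:
    message: str
    first_seen: datetime
    last_seen: datetime
    occurrences: int


def summarize_errors(events: list[tuple[str, datetime]]) -> list[ErrorSummary]:
    """Collapse error events into one row per message."""
    summaries: dict[str, ErrorSummary] = {}
    for message, timestamp in events:
        entry = summaries.get(message)
        if entry is None:
            summaries[message] = ErrorSummary(message, timestamp, timestamp, 1)
        else:
            entry.first_seen = min(entry.first_seen, timestamp)
            entry.last_seen = max(entry.last_seen, timestamp)
            entry.occurrences += 1
    # Most frequent errors first (an assumed ordering for this sketch).
    return sorted(summaries.values(), key=lambda s: s.occurrences, reverse=True)
```
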
[discrete]
[[service-span-duration]]
=== Span types average duration and dependencies
The *Time spent by span type* chart visualizes each span type's average duration and helps you determine
which spans could be slowing down transactions. The "app" label displayed under the
chart indicates time spent within the application itself. This could mean that the
agent does not auto-instrument the work done during that time, or that the time was spent in
application code rather than in database or external requests.
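
One way to picture the "app" label is as the part of a transaction that no span accounts for. The sketch below illustrates this for a single transaction; the function name is hypothetical, and it assumes spans do not overlap in time, which real traces do not guarantee.

```python
def time_spent_by_type(transaction_ms: float, spans: list[tuple[str, float]]) -> dict[str, float]:
    """Split a transaction's duration into per-span-type totals plus an "app" remainder.

    Assumes spans do not overlap in time, which real traces do not guarantee.
    """
    totals: dict[str, float] = {}
    for span_type, duration_ms in spans:
        totals[span_type] = totals.get(span_type, 0.0) + duration_ms
    # Whatever no span accounts for is attributed to the application itself.
    totals["app"] = max(0.0, transaction_ms - sum(totals.values()))
    return totals
```

For a 100 ms transaction with 40 ms of database spans and 20 ms of external requests, the remaining 40 ms would be attributed to "app" under these assumptions.
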
The *Dependencies* table displays a list of downstream services or external connections relevant
to the service within the selected time range. The table displays latency, throughput, failed transaction rate, and the impact of
each dependency. By default, dependencies are sorted by _Impact_ to show the most used and slowest dependencies.
If there is a particular dependency you are interested in, click *<<dependencies,View dependencies>>* to learn more about it.
NOTE: Displaying dependencies for services instrumented with the Real User Monitoring (RUM) agent
requires an agent version ≥ v5.6.3.
[role="screenshot"]
image::apm/images/spans-dependencies.png[Span type duration and dependencies]
[discrete]
[[service-instances]]
=== Instances
The *Instances* table displays a list of all the available service instances within the selected time range.
Depending on how the service runs, the instance could be a host or a container. The table displays latency, throughput,
failed transaction rate, CPU usage, and memory usage for each instance. By default, instances are sorted by _Throughput_.
[role="screenshot"]
image::apm/images/all-instances.png[All instances]
[discrete]
[[service-metadata]]
=== Service metadata
To view metadata relating to the service agent and, if relevant, the container and cloud provider,
click the icons located at the top of the page beside the service name.
[role="screenshot"]
image::apm/images/metadata-icons.png[Service metadata]
*Service information*
* Service version
* Runtime name and version
* Framework name
* Agent name and version
*Container information*
* Operating system
* Containerized (yes or no)
* Total number of instances
* Orchestration
*Cloud provider information*
* Cloud provider
* Availability zones
* Machine types
* Project ID
*Alerts*
* Recently fired alerts