0ec5148c73
Pretty sure the rules only get created in the current kibana space, but the message is plural. Switching to singular. Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
133 lines
5.8 KiB
Text
133 lines
5.8 KiB
Text
[role="xpack"]
|
|
[[kibana-alerts]]
|
|
= {kib} alerts
|
|
|
|
The {stack} {monitor-features} provide
|
|
<<alerting-getting-started,{kib} alerting rules>> out-of-the box to notify you
|
|
of potential issues in the {stack}. These rules are preconfigured based on the
|
|
best practices recommended by Elastic. However, you can tailor them to meet your
|
|
specific needs.
|
|
|
|
[role="screenshot"]
|
|
image::user/monitoring/images/monitoring-kibana-alerting-notification.png["{kib} alerting notifications in {stack-monitor-app}"]
|
|
|
|
When you open *{stack-monitor-app}* for the first time, you will be asked to acknowledge the creation of these default rules. They are initially configured to detect and notify on various
|
|
conditions across your monitored clusters. You can view notifications for: *Cluster health*, *Resource utilization*, and *Errors and exceptions* for {es}
|
|
in real time.
|
|
|
|
NOTE: The default {watcher} based "cluster alerts" for {stack-monitor-app} have
|
|
been recreated as rules in {kib} {alert-features}. For this reason, the existing
|
|
{watcher} email action
|
|
`monitoring.cluster_alerts.email_notifications.email_address` no longer works.
|
|
The default action for all {stack-monitor-app} rules is to write to {kib} logs
|
|
and display a notification in the UI.
|
|
|
|
To review and modify existing *{stack-monitor-app}* rules, click *Enter setup mode* on the *Cluster overview* page.
|
|
Alternatively, to manage all rules, including create and delete functionality go to *Stack Management > Rules and Connectors*.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-cpu-threshold]]
|
|
== CPU usage threshold
|
|
|
|
This rule checks for {es} nodes that run a consistently high CPU load. By
|
|
default, the condition is set at 85% or more averaged over the last 5 minutes.
|
|
The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-disk-usage-threshold]]
|
|
== Disk usage threshold
|
|
|
|
This rule checks for {es} nodes that are nearly at disk capacity. By default,
|
|
the condition is set at 80% or more averaged over the last 5 minutes. The default rule
|
|
checks on a schedule time of 1 minute with a re-notify interval of 1 day.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-jvm-memory-threshold]]
|
|
== JVM memory threshold
|
|
|
|
This rule checks for {es} nodes that use a high amount of JVM memory. By
|
|
default, the condition is set at 85% or more averaged over the last 5 minutes.
|
|
The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-missing-monitoring-data]]
|
|
== Missing monitoring data
|
|
|
|
This rule checks for {es} nodes that stop sending monitoring data. By default,
|
|
the condition is set to missing for 15 minutes looking back 1 day. The default rule checks on a schedule
|
|
time of 1 minute with a re-notify interval of 6 hours.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-thread-pool-rejections]]
|
|
== Thread pool rejections (search/write)
|
|
|
|
This rule checks for {es} nodes that experience thread pool rejections. By
|
|
default, the condition is set at 300 or more over the last 5 minutes. The default rule
|
|
checks on a schedule time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
|
|
independently for `search` and `write` type rejections.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-ccr-read-exceptions]]
|
|
== CCR read exceptions
|
|
|
|
This rule checks for read exceptions on any of the replicated {es} clusters. The
|
|
condition is met if 1 or more read exceptions are detected in the last hour. The
|
|
default rule checks on a schedule time of 1 minute with a re-notify interval of 6 hours.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-large-shard-size]]
|
|
== Large shard size
|
|
|
|
This rule checks for a large average shard size (across associated primaries) on
|
|
any of the specified index patterns in an {es} cluster. The condition is met if
|
|
an index's average shard size is 55gb or higher in the last 5 minutes. The default rule
|
|
matches the pattern of `-.*` by running checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
|
|
|
|
[discrete]
|
|
[[kibana-alerts-cluster-alerts]]
|
|
== Cluster alerting
|
|
|
|
These rules check the current status of your {stack}. You can drill down into
|
|
the metrics to view more information about your cluster and specific nodes, instances, and indices.
|
|
|
|
An action is triggered if any of the following conditions are met within the
|
|
last minute:
|
|
|
|
* {es} cluster health status is yellow (missing at least one replica)
|
|
or red (missing at least one primary).
|
|
* {es} version mismatch. You have {es} nodes with
|
|
different versions in the same cluster.
|
|
* {kib} version mismatch. You have {kib} instances with different
|
|
versions running against the same {es} cluster.
|
|
* Logstash version mismatch. You have Logstash nodes with different
|
|
versions reporting stats to the same monitoring cluster.
|
|
* {es} nodes changed. You have {es} nodes that were recently added or removed.
|
|
* {es} license expiration. The cluster's license is about to expire.
|
|
+
|
|
--
|
|
If you do not preserve the data directory when upgrading a {kib} or
|
|
Logstash node, the instance is assigned a new persistent UUID and shows up
|
|
as a new instance.
|
|
--
|
|
* Subscription license expiration. When the expiration date
|
|
approaches, you will get notifications with a severity level relative to how
|
|
soon the expiration date is:
|
|
** 60 days: Informational alert
|
|
** 30 days: Low-level alert
|
|
** 15 days: Medium-level alert
|
|
** 7 days: Severe-level alert
|
|
+
|
|
The 60-day and 30-day thresholds are skipped for Trial licenses, which are only
|
|
valid for 30 days.
|
|
|
|
[discrete]
|
|
== Alerts and rules
|
|
[discrete]
|
|
=== Create default rules
|
|
This option can be used to create default rules in this kibana space. This is
|
|
useful for scenarios when you didn't choose to create these default rules initially
|
|
or anytime later if the rules were accidentally deleted.
|
|
|
|
NOTE: Some action types are subscription features, while others are free.
|
|
For a comparison of the Elastic subscription levels, see the alerting section of
|
|
the {subscriptions}[Subscriptions page].
|