kibana/x-pack/plugins/task_manager
Gidi Meir Morris 5308cc7100
[Task Manager] Monitors the Task Manager Poller and automatically recovers from failure (#75420)
Introduces a monitor around the Task Manager poller which pipes through all values emitted by the poller and recovers from poller failures or stalls.
This monitor does the following:
1. Catches errors thrown by the poller and recovers by proxying each error to a handler while continuing to listen to the poller.
2. Reacts to the poller's `error` (caused by uncaught errors) and `completion` events by starting a new poller and piping its events through to any previous subscribers (in our case, Task Manager itself).
3. Tracks how often the poller emits events (these can be both work events and `No Task` events, so polling and finding no work still counts as an emitted event), times out when the gap between events grows too long (suggesting the poller has hung), and replaces the poller with a new one.
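
The three behaviours above can be sketched roughly as follows. This is a minimal illustration only, not the code added in this commit: `PollerMonitor`, `Poller`, `Listener`, `checkForStall` and the injected `now` clock are all hypothetical names, and the sketch uses plain callbacks rather than the observable-based poller that Task Manager uses. Behaviour 1 is modelled here as forwarding errors thrown while a value is being handled.

```typescript
// Hypothetical types for illustration; not Task Manager's actual API.
type Listener<T> = {
  next: (value: T) => void;
  error: (err: Error) => void;
  complete: () => void;
};

// A poller is anything that can be started with a listener and stopped.
type Poller<T> = { start: (listener: Listener<T>) => void; stop: () => void };

class PollerMonitor<T> {
  private current?: Poller<T>;
  private lastEventAt = 0;
  public restarts = 0;

  constructor(
    private readonly createPoller: () => Poller<T>,
    private readonly onValue: (value: T) => void,
    private readonly onError: (err: Error) => void,
    private readonly inactivityTimeoutMs: number,
    // Injected clock (an assumption of this sketch) so stalls can be tested.
    private readonly now: () => number = () => Date.now()
  ) {}

  start(): void {
    this.lastEventAt = this.now();
    const poller = this.createPoller();
    this.current = poller;
    poller.start({
      next: (value) => {
        if (this.current !== poller) return; // ignore replaced pollers
        this.lastEventAt = this.now(); // any event counts, including "no task"
        try {
          this.onValue(value);
        } catch (err) {
          // 1. Proxy the error to the handler and keep listening.
          this.onError(err as Error);
        }
      },
      error: (err) => {
        if (this.current !== poller) return;
        // 2. The poller failed: report it and start a new one.
        this.onError(err);
        this.restart();
      },
      complete: () => {
        if (this.current !== poller) return;
        // 2. The poller completed: start a new one.
        this.restart();
      },
    });
  }

  // 3. Call periodically; replaces the poller if it has been silent too long.
  checkForStall(): boolean {
    if (this.now() - this.lastEventAt <= this.inactivityTimeoutMs) return false;
    this.restart();
    return true;
  }

  private restart(): void {
    this.current?.stop();
    this.restarts += 1;
    this.start();
  }
}
```

Subscribers only ever see `onValue` and `onError`, so replacing the underlying poller is invisible to them. In a real deployment `checkForStall` would be driven by a timer; it is left as an explicit call here to keep the sketch deterministic.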

We're not aware of any clear cases where Task Manager should actually get restarted by the monitor. A restart always indicates an error case, and we have addressed all known ones.
The monitor is intended as an insurance policy in case an unexpected error breaks the poller in a long-running production environment.
2020-08-20 21:26:56 +01:00
..
server [Task Manager] Monitors the Task Manager Poller and automatically recovers from failure (#75420) 2020-08-20 21:26:56 +01:00
kibana.json