5.9 KiB
How to monitor Synapse metrics using Prometheus
Install prometheus:
Follow instructions at http://prometheus.io/docs/introduction/install/
Enable synapse metrics:
Simply setting a (local) port number will enable it. Pick a port. prometheus itself defaults to 9090, so starting just above that for locally monitored services seems reasonable. E.g. 9092:
Add to homeserver.yaml:
metrics_port: 9092
Also ensure that
enable_metrics
is set toTrue
.Restart synapse.
Add a prometheus target for synapse.
It needs to set the
metrics_path
to a non-default value (underscrape_configs
):- job_name: "synapse" metrics_path: "/_synapse/metrics" static_configs: - targets: ["my.server.here:9092"]
If your prometheus is older than 1.5.2, you will need to replace
static_configs
in the above withtarget_groups
.Restart prometheus.
Deprecated metrics removed in 0.28.0
Synapse 0.28.0 removes all of the metrics deprecated by 0.27.0, which are those listed under "Old name" below. This has been done to reduce the bandwidth used by gathering metrics and the storage requirements for the Prometheus server, as well as reducing CPU overhead for both Synapse and Prometheus.
Administrators should update any alerts or monitoring dashboards to use the "New name" listed below.
Block and response metrics renamed for 0.27.0
Synapse 0.27.0 begins the process of rationalising the duplicate
*:count
metrics reported for the resource tracking for code
blocks and HTTP requests.
At the same time, the corresponding *:total
metrics are
being renamed, as the :total
suffix no longer makes sense
in the absence of a corresponding :count
metric.
To enable a graceful migration path, this release just adds new names for the metrics being renamed. A future release will remove the old ones.
The following table shows the new metrics, and the old metrics which they are replacing.
New name | Old name |
---|---|
synapse_util_metrics_block_count | synapse_util_metrics_block_timer:count |
synapse_util_metrics_block_count | synapse_util_metrics_block_ru_utime:count |
synapse_util_metrics_block_count | synapse_util_metrics_block_ru_stime:count |
synapse_util_metrics_block_count | synapse_util_metrics_block_db_txn_count:count |
synapse_util_metrics_block_count |
synapse_util_metrics_block_db_txn_duration:count |
synapse_util_metrics_block_time_seconds | synapse_util_metrics_block_timer:total |
synapse_util_metrics_block_ru_utime_seconds | synapse_util_metrics_block_ru_utime:total |
synapse_util_metrics_block_ru_stime_seconds | synapse_util_metrics_block_ru_stime:total |
synapse_util_metrics_block_db_txn_count | synapse_util_metrics_block_db_txn_count:total |
synapse_util_metrics_block_db_txn_duration_seconds |
synapse_util_metrics_block_db_txn_duration:total |
synapse_http_server_response_count | synapse_http_server_requests |
synapse_http_server_response_count | synapse_http_server_response_time:count |
synapse_http_server_response_count | synapse_http_server_response_ru_utime:count |
synapse_http_server_response_count | synapse_http_server_response_ru_stime:count |
synapse_http_server_response_count | synapse_http_server_response_db_txn_count:count |
synapse_http_server_response_count |
synapse_http_server_response_db_txn_duration:count |
synapse_http_server_response_time_seconds | synapse_http_server_response_time:total |
synapse_http_server_response_ru_utime_seconds | synapse_http_server_response_ru_utime:total |
synapse_http_server_response_ru_stime_seconds | synapse_http_server_response_ru_stime:total |
synapse_http_server_response_db_txn_count | synapse_http_server_response_db_txn_count:total |
synapse_http_server_response_db_txn_duration_seconds | synapse_http_server_response_db_txn_duration:total |
Standard Metric Names
As of synapse version 0.18.2, the format of the process-wide metrics has been changed to fit prometheus standard naming conventions. Additionally the units have been changed to seconds, from miliseconds.
New name | Old name |
---|---|
process_cpu_user_seconds_total | process_resource_utime / 1000 |
process_cpu_system_seconds_total | process_resource_stime / 1000 |
process_open_fds (no 'type' label) | process_fds |
The python-specific counts of garbage collector performance have been renamed.
New name | Old name |
---|---|
python_gc_time | reactor_gc_time |
python_gc_unreachable_total | reactor_gc_unreachable |
python_gc_counts | reactor_gc_counts |
The twisted-specific reactor metrics have been renamed.
New name | Old name |
---|---|
python_twisted_reactor_pending_calls | reactor_pending_calls |
python_twisted_reactor_tick_time | reactor_tick_time |