How to monitor Synapse metrics using Prometheus
===============================================
1. Install Prometheus:

   Follow the instructions at http://prometheus.io/docs/introduction/install/

2. Enable Synapse metrics:

   Pick a (local) port number for Synapse to serve its metrics on.
   Prometheus itself defaults to 9090, so starting just above that for
   locally monitored services seems reasonable, e.g. 9092.

   Add to homeserver.yaml::

     metrics_port: 9092

   Also ensure that ``enable_metrics`` is set to ``True``.

   Restart Synapse.

3. Add a Prometheus target for Synapse.

   The target needs to set ``metrics_path`` to a non-default value (under ``scrape_configs``)::

     - job_name: "synapse"
       metrics_path: "/_synapse/metrics"
       static_configs:
         - targets: ["my.server.here:9092"]

   If your Prometheus is older than 1.5.2, you will need to replace
   ``static_configs`` in the above with ``target_groups``.

   Restart Prometheus.

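
After both services have been restarted, it can be useful to check that Synapse
is actually serving metrics before looking for them in Prometheus. The following
is a minimal sketch rather than an official tool. It assumes the port (9092) and
path (``/_synapse/metrics``) used in the steps above and that the endpoint
returns the Prometheus text format. The helper names and the sample text are
made up for illustration and are not real Synapse output.

```python
from urllib.request import urlopen


def metric_names(exposition_text):
    """Collect the metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or the first space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metric_names(url="http://localhost:9092/_synapse/metrics"):
    """Fetch the metrics page (URL is an assumption) and list its metric names."""
    with urlopen(url, timeout=10) as resp:
        return metric_names(resp.read().decode("utf-8"))


# Illustrative sample only; a real homeserver reports many more metrics.
sample = """\
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 42
python_gc_time{gen="0"} 0.5
"""
print(sorted(metric_names(sample)))
```

Calling ``fetch_metric_names()`` on the machine running Synapse should return a
non-empty set if the metrics listener is enabled.
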
Block and response metrics renamed for 0.27.0
---------------------------------------------

Synapse 0.27.0 begins the process of rationalising the duplicate ``*:count``
metrics reported for the resource tracking for code blocks and HTTP requests.

At the same time, the corresponding ``*:total`` metrics are being renamed, as
the ``:total`` suffix no longer makes sense in the absence of a corresponding
``:count`` metric.

To enable a graceful migration path, this release just adds new names for the
metrics being renamed. A future release will remove the old ones.

The following table shows the new metrics, and the old metrics which they are
replacing.

==================================================== ===================================================
New name                                             Old name
==================================================== ===================================================
synapse_util_metrics_block_count                     synapse_util_metrics_block_timer:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_ru_utime:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_ru_stime:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_db_txn_count:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_db_txn_duration:count
synapse_util_metrics_block_time_seconds              synapse_util_metrics_block_timer:total
synapse_util_metrics_block_ru_utime_seconds          synapse_util_metrics_block_ru_utime:total
synapse_util_metrics_block_ru_stime_seconds          synapse_util_metrics_block_ru_stime:total
synapse_util_metrics_block_db_txn_count              synapse_util_metrics_block_db_txn_count:total
synapse_util_metrics_block_db_txn_duration_seconds   synapse_util_metrics_block_db_txn_duration:total
synapse_http_server_response_count                   synapse_http_server_requests
synapse_http_server_response_count                   synapse_http_server_response_time:count
synapse_http_server_response_count                   synapse_http_server_response_ru_utime:count
synapse_http_server_response_count                   synapse_http_server_response_ru_stime:count
synapse_http_server_response_count                   synapse_http_server_response_db_txn_count:count
synapse_http_server_response_count                   synapse_http_server_response_db_txn_duration:count
synapse_http_server_response_time_seconds            synapse_http_server_response_time:total
synapse_http_server_response_ru_utime_seconds        synapse_http_server_response_ru_utime:total
synapse_http_server_response_ru_stime_seconds        synapse_http_server_response_ru_stime:total
synapse_http_server_response_db_txn_count            synapse_http_server_response_db_txn_count:total
synapse_http_server_response_db_txn_duration_seconds synapse_http_server_response_db_txn_duration:total
==================================================== ===================================================

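
Dashboards and alert rules that still use the old names will need updating
before the old metrics are removed. As a rough sketch of how that could be
automated, the snippet below does a plain token substitution over a query
string using the mapping from the table above. The function name
``rewrite_expr`` is hypothetical, and the approach is an assumption about what
is convenient rather than a tool shipped with Synapse. It does not parse the
query language, so a label value that happens to equal an old metric name
would also be rewritten, and the result should be reviewed by hand.

```python
import re

# Old name -> new name, copied from the 0.27.0 rename table.
_BLOCK = "synapse_util_metrics_block_"
_RESP = "synapse_http_server_response_"
RENAMES = {
    _BLOCK + "timer:count": _BLOCK + "count",
    _BLOCK + "ru_utime:count": _BLOCK + "count",
    _BLOCK + "ru_stime:count": _BLOCK + "count",
    _BLOCK + "db_txn_count:count": _BLOCK + "count",
    _BLOCK + "db_txn_duration:count": _BLOCK + "count",
    _BLOCK + "timer:total": _BLOCK + "time_seconds",
    _BLOCK + "ru_utime:total": _BLOCK + "ru_utime_seconds",
    _BLOCK + "ru_stime:total": _BLOCK + "ru_stime_seconds",
    _BLOCK + "db_txn_count:total": _BLOCK + "db_txn_count",
    _BLOCK + "db_txn_duration:total": _BLOCK + "db_txn_duration_seconds",
    "synapse_http_server_requests": _RESP + "count",
    _RESP + "time:count": _RESP + "count",
    _RESP + "ru_utime:count": _RESP + "count",
    _RESP + "ru_stime:count": _RESP + "count",
    _RESP + "db_txn_count:count": _RESP + "count",
    _RESP + "db_txn_duration:count": _RESP + "count",
    _RESP + "time:total": _RESP + "time_seconds",
    _RESP + "ru_utime:total": _RESP + "ru_utime_seconds",
    _RESP + "ru_stime:total": _RESP + "ru_stime_seconds",
    _RESP + "db_txn_count:total": _RESP + "db_txn_count",
    _RESP + "db_txn_duration:total": _RESP + "db_txn_duration_seconds",
}

# Metric names may contain letters, digits, underscores and colons, so whole
# tokens are matched to avoid rewriting part of a longer name.
_TOKEN = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def rewrite_expr(expr):
    """Replace every old metric name in a query string with its new name."""
    return _TOKEN.sub(lambda m: RENAMES.get(m.group(0), m.group(0)), expr)


old = 'rate(synapse_http_server_response_time:total{job="synapse"}[5m])'
print(rewrite_expr(old))
```

Tokens that are not in the mapping, such as function names and label names,
are left unchanged.
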
Standard Metric Names
---------------------

As of Synapse version 0.18.2, the process-wide metrics have been renamed to fit
the standard Prometheus naming conventions. Additionally, their units have been
changed from milliseconds to seconds.

================================== =============================
New name                           Old name
================================== =============================
process_cpu_user_seconds_total     process_resource_utime / 1000
process_cpu_system_seconds_total   process_resource_stime / 1000
process_open_fds (no 'type' label) process_fds
================================== =============================

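
The ``/ 1000`` in the table reflects the change of units: a value recorded
under an old name in milliseconds corresponds to the same value divided by 1000
under the new name. The snippet below is a small illustration of that
conversion, for example when carrying an old alert threshold over to the new
metrics. The helper name ``convert_sample`` and the example value are
hypothetical.

```python
# Old process metrics (milliseconds) mapped to their new names (seconds),
# as listed in the table above.
MS_TO_SECONDS_RENAMES = {
    "process_resource_utime": "process_cpu_user_seconds_total",
    "process_resource_stime": "process_cpu_system_seconds_total",
}


def convert_sample(old_name, old_value_ms):
    """Return (new_name, value_in_seconds) for an old millisecond value."""
    return MS_TO_SECONDS_RENAMES[old_name], old_value_ms / 1000


# An old threshold of 2500 ms of user CPU time becomes 2.5 seconds.
print(convert_sample("process_resource_utime", 2500))
```
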
The python-specific counts of garbage collector performance have been renamed.

=========================== ======================
New name                    Old name
=========================== ======================
python_gc_time              reactor_gc_time
python_gc_unreachable_total reactor_gc_unreachable
python_gc_counts            reactor_gc_counts
=========================== ======================

The twisted-specific reactor metrics have been renamed.

==================================== =====================
New name                             Old name
==================================== =====================
python_twisted_reactor_pending_calls reactor_pending_calls
python_twisted_reactor_tick_time     reactor_tick_time
==================================== =====================