From 9279a2c4e4c1c63e87d43fc9f6ad2c495bf47e67 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Fri, 3 Jan 2020 13:43:55 +0100 Subject: [PATCH] Add a complete documentation of the message retention policies support --- changelog.d/6623.doc | 1 + docs/message_retention_policies.md | 191 +++++++++++++++++++++++++++++ 2 files changed, 192 insertions(+) create mode 100644 changelog.d/6623.doc create mode 100644 docs/message_retention_policies.md diff --git a/changelog.d/6623.doc b/changelog.d/6623.doc new file mode 100644 index 000000000..c8aade097 --- /dev/null +++ b/changelog.d/6623.doc @@ -0,0 +1 @@ +Add a complete documentation of the message retention policies support. diff --git a/docs/message_retention_policies.md b/docs/message_retention_policies.md new file mode 100644 index 000000000..78055b2f6 --- /dev/null +++ b/docs/message_retention_policies.md @@ -0,0 +1,191 @@ +# Message retention policies + +Synapse admins can enable support for message retention policies on +their homeserver. Message retention policies exist at a room level, +follow the semantics described in +[MSC1763](https://github.com/matrix-org/matrix-doc/blob/matthew/msc1763/proposals/1763-configurable-retention-periods.md), +and allow server and room admins to configure how long messages should +be kept in a homeserver's database before being purged from it. + +A message retention policy is mainly defined by its `max_lifetime` +parameter, which defines how long a message can be kept around after +it's been sent in the room. If a room doesn't have a message retention +policy, and there's no default one for a given server, then no message +sent in that room is ever purged on that server. + +MSC1763 also specifies semantics for a `min_lifetime` parameter which +defines the amount of time after which an event _can_ get purged (after +it's been sent to the room), but Synapse doesn't currently support it +beyond registering it. + +Both `max_lifetime` and `min_lifetime` are optional parameters. + +Note that message retention policies don't apply to state events. + +Once an event reaches its expiry date (defined as the time it was sent +plus the value for `max_lifetime` in the room), two things happen: + +* Synapse stops serving the event to clients via any endpoint. +* The message gets picked up by the next purge job (see the "Purge jobs" + section) and is removed from Synapse's database. + +Since purge jobs don't run continuously, this means that an event might +stay in a server's database for longer than the value for `max_lifetime` +in the room would allow, though hidden from clients. + +Similarly, if a server (with support for message retention policies +enabled) receives from another server an event that should have been +purged according to its room's policy, then the receiving server will +process and store that event until it's picked up by the next purge job, +though it will always hide it from clients. + + +## Room configuration + +To configure a room's message retention policy, a room's admin or +moderator needs to send a state event in that room with the type +`m.room.retention` and the following content: + +```json +{ + "max_lifetime": ... +} +``` + +In this event's content, the `max_lifetime` parameter has the same +meaning as previously described, and needs to be expressed in +milliseconds. The event's content can also include a `min_lifetime` +parameter, which has the same meaning and limited support as previously +described. + +Note that over every server in the room, only the ones with support for +message retention policies will actually remove expired events. While +we plan to eventually enable this support by default in Synapse, this +isn't currently the case. + + +## Server configuration + +Support for this feature can be enabled and configured in the +`retention` section of the Synapse configuration file (see the +[sample file](https://github.com/matrix-org/synapse/blob/v1.7.3/docs/sample_config.yaml#L332-L393)). + +To enable support for message retentions policies, set the setting +`enabled` in this section to `true`. + + +### Default policy + +A default message retention policy is a policy defined in Synapse's +configuration that is used by Synapse for every room that doesn't have a +message retention policy configured in its state. This allows server +admins to ensure that messages are never kept indefinitely in a server's +database. + +A default policy can be defined as such, in the `retention` section of +the configuration file: + +```yaml + default_policy: + min_lifetime: 1d + max_lifetime: 1y +``` + +Here, `min_lifetime` and `max_lifetime` have the same meaning and level +of support as previously described. They can be expressed either as a +duration (using the units `s` (seconds), `m` (minutes), `h` (hours), +`d` (days), `w` (weeks) and `y` (years)) or as a number of milliseconds. + + +### Purge jobs + +Purge jobs are the jobs that Synapse run in the background to purge +expired events from the database. They are only run if support for +message retention policies is enabled in the server's configuration. If +no configuration for purge jobs is configured by the server admin, +Synapse will run one daily that will handle every room with a message +retention policy (or, if the server has a default policy configured, +every room it knows), which should be enough in most cases. + +Some server admins might want a finer control on when events are removed +depending on an event's room's policy. This can be done by setting the +`purge_jobs` sub-section in the `retention` section of the configuration +file. An example of such configuration could be: + +```yaml + purge_jobs: + - longest_max_lifetime: 3d + interval: 12h + - shortest_max_lifetime: 3d + longest_max_lifetime: 1w + interval: 1d + - shortest_max_lifetime: 1w + interval: 2d +``` + +In this example, we define two jobs: + +* one that runs twice a day (every 12 hours) and purges events in rooms + which policy's `max_lifetime` is lower or equal to 3 days. +* one that runs once a day and purges events in rooms which policy's + `max_lifetime` is between 3 days and a week. +* one that runs once every 2 days and purges events in rooms which + policy's `max_lifetime` is greater than a week. + +Note that this example is tailored to show different configurations and +features slightly more jobs than it's probably necessary (in practice, a +server admin would probably consider it better to replace the two last +jobs with one that runs once a day and handles rooms which which +policy's `max_lifetime` is greater than 3 days). + +Keep in mind, when configuring these jobs, that a purge job can become +quite heavy on the server if it targets many rooms, therefore prefer +having jobs with a low interval that target a limited set of rooms. Also +make sure to include a job with no minimum and one with no maximum to +make sure your configuration handles every policy. + +As previously mentioned in this documentation, while a purge job that +runs e.g. every day means that an expired event might stay in the +database for up to a day after its expiry, Synapse hides expired events +from clients as soon as they expire, so the event is not visible to +local users between its expiry date and the moment it gets purged from +the server's database. + + +### Lifetime limits + +**Note: this feature is mainly useful within a closed federation or on +servers that don't federate, because there currently is no way to +enforce these limits in an open federation.** + +Server admins can restrict the values their local users are allowed to +use for both `min_lifetime` and `max_lifetime`. These limits can be +defined as such in the `retention` section of the configuration file: + +```yaml + allowed_lifetime_min: 1d + allowed_lifetime_max: 1y +``` + +Here, `allowed_lifetime_min` is the lowest value a local user can set +for both `min_lifetime` and `max_lifetime`, and `allowed_lifetime_max` +is the highest value. Both parameters are optional (e.g. setting +`allowed_lifetime_min` but not `allowed_lifetime_max` only enforces a +minimum and no maximum). + +Like other settings in this section, these parameters can be expressed +either as a duration or as a number of milliseconds. + + +## Note on reclaiming disk space + +While purge jobs actually delete data from the database, the disk space +used by the database might not decrease immediately on the database's +host. However, even though the database engine won't free up the disk +space, it will start writing new data into where the purged data was. + +If you want to reclaim the freed disk space anyway and return it to the +operating system, the server admin needs to run `VACUUM FULL;` on the +database (see the related +[PostgreSQL documentation](https://www.postgresql.org/docs/current/sql-vacuum.html)). +