This is a first step in dealing with #7721.
The idea is basically that, rather than calculating up front the full set of users a device list update needs to be sent to, we instead simply record the rooms the user was in at the time of the change (a rough sketch follows the list below). This will allow a few things:
1. we can defer calculating the set of remote servers that need to be poked about the change; and
2. during `/sync` and `/keys/changes` we can also avoid calculating the users who share rooms with other users, and instead just look at the rooms that have changed.
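A minimal sketch of the recording step, assuming a table shaped like `device_lists_changes_in_room(user_id, device_id, room_id, stream_id)` (the columns and helper name here are illustrative, not the actual schema):

```python
def record_device_change_in_rooms(txn, user_id, device_id, room_ids, stream_id):
    # On a device list change, store one row per room the user is in;
    # the fan-out to remote servers is deferred to a later step.
    txn.execute_batch(
        """
        INSERT INTO device_lists_changes_in_room
            (user_id, device_id, room_id, stream_id)
        VALUES (?, ?, ?, ?)
        """,
        [(user_id, device_id, room_id, stream_id) for room_id in room_ids],
    )
```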
However, care needs to be taken to correctly handle server downgrades. As such, this PR writes to both the `device_lists_changes_in_room` and `device_lists_outbound_pokes` tables synchronously. In a future release we can then bump the database schema compat version to `69`, at which point we can assume that the new `device_lists_changes_in_room` table exists and is handled.
There is a temporary option to disable writing to `device_lists_outbound_pokes` synchronously, allowing us to test that the new code path works (and, by implication, that upgrading to a future release and then downgrading to this one will work correctly).
Note: Ideally we'd do the calculation of rooms to servers on a worker (e.g. the background worker), but currently only master can write to the `device_lists_outbound_pokes` table.
There are a bunch of places where we call `get_success` on an immediate value, which is unnecessary. Let's rip them out, and remove the redundant functionality in `get_success` and friends.
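For illustration (the store method here is invented), the redundant pattern looks something like:

```python
# Before: get_success is pointed at a plain value rather than a
# Deferred/awaitable, so the helper adds nothing.
token = self.get_success(self.store.get_some_plain_value())

# After: just use the value directly.
token = self.store.get_some_plain_value()
```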
Switching to a sequence means there's no need to track `last_txn` on the
AS state table to generate new TXN IDs. This also means that there is
no longer contention between the AS scheduler and AS handler on updates
to the `application_services_state` table, which will prevent serialization
errors in the database transaction that completes an AS txn.
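A rough sketch of the sequence-based allocation (the sequence name is illustrative):

```python
def _allocate_as_txn_id(txn):
    # Postgres hands out sequence values atomically, so no last_txn
    # column needs to be read and bumped under a row lock.
    txn.execute("SELECT nextval('application_services_txn_id_seq')")
    (txn_id,) = txn.fetchone()
    return txn_id
```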
It seems like calling `_get_state_group_for_events` for an event where the
state is unknown is an error. Accordingly, let's raise an exception rather than
silently returning an empty result.
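Something like the following (exception type and names illustrative):

```python
groups = await self._get_state_group_for_events(event_ids)
missing = set(event_ids) - groups.keys()
if missing:
    # Previously we silently returned an empty result here, which
    # masked bugs in callers that assumed the state was known.
    raise RuntimeError(f"No state group for events: {missing}")
```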
If we're missing most of the events in the room state, then we may as well call the `/state` endpoint, instead of individually requesting each and every event.
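The heuristic is roughly as follows (the threshold and client method names are illustrative, not the actual implementation):

```python
if len(missing_event_ids) >= len(expected_state_ids) // 2:
    # Missing most of the state: one /state call fetches it all.
    state = await federation_client.get_room_state(
        destination, room_id, event_id=event_id, room_version=room_version
    )
else:
    # Missing only a handful: fetch those events individually.
    for missing_id in missing_event_ids:
        await federation_client.get_pdu([destination], missing_id, room_version)
```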
The intention here is to avoid doing state lookups for outliers in
`/_matrix/federation/v1/event`. Unfortunately that's expanded into something of
a rewrite of `filter_events_for_server`, which ended up trying to do that
operation in a couple of places.
The PaginationChunk class attempted to bundle some properties
together, but really just caused callers to jump through hoops and
hid implementation details.
The `/aggregations` endpoint was removed from MSC2675 before the MSC was approved.
It is currently unspecified (even in any MSCs) and therefore subject to
removal. It is not implemented by any known clients.
This also changes the bundled aggregation format for `m.annotation`,
which previously included pagination tokens for the `/aggregations`
endpoint, which are no longer useful.
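Roughly, the `m.annotation` bundle now looks like this (values invented): the grouped reaction counts remain, but the pagination tokens that pointed at the removed `/aggregations` endpoint are gone.

```python
unsigned = {
    "m.relations": {
        "m.annotation": {
            # Grouped counts only; no pagination tokens any more.
            "chunk": [
                {"type": "m.reaction", "key": "👍", "count": 3},
            ],
        },
    },
}
```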
Document the behaviour of `LoggingTransaction.call_after` and
`LoggingTransaction.call_on_exception` when transactions are retried.
Signed-off-by: Sean Quah <seanq@element.io>
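For context, a typical (illustrative) use of `call_after` inside a transaction function:

```python
class RoomStore:
    def _delete_room_txn(self, txn, room_id):
        txn.execute("DELETE FROM rooms WHERE room_id = ?", (room_id,))
        # call_after callbacks run only once the transaction has
        # actually committed; the documentation added here spells out
        # what happens to them when the transaction is retried.
        txn.call_after(self.get_room.invalidate, (room_id,))
```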
Follow-up to https://github.com/matrix-org/synapse/pull/12083
Since we are now using the new `state_event_ids` parameter to do all of the heavy lifting,
we can remove the spots where we plumbed `auth_event_ids` through just for MSC2716 in
https://github.com/matrix-org/synapse/pull/9247/files.
Removing `auth_event_ids` from the following functions:
- `create_and_send_nonmember_event`
- `_local_membership_update`
- `update_membership`
- `update_membership_locked`
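A hypothetical simplified call site (signature abbreviated), with the MSC2716-only `auth_event_ids` plumbing gone:

```python
event, stream_id = await self.event_creation_handler.create_and_send_nonmember_event(
    requester,
    event_dict,
    # Historical (MSC2716) events pass their state explicitly, so no
    # auth_event_ids override is needed.
    state_event_ids=state_event_ids,
)
```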
When we are processing a `/backfill` request from a remote server, exclude any
outliers from consideration early on. We can't return outliers anyway (since we
don't know the state at the outlier), and filtering them out earlier means that
we won't attempt to calculate the state for them.
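Conceptually, the early filter is just (a sketch):

```python
# Outliers have no known state, so they can never appear in a
# /backfill response; drop them before any state calculation.
events = [e for e in events if not e.internal_metadata.is_outlier()]
```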
This should speed up push rule calculations for rooms with large numbers of local users when the main push rule cache fails.
Co-authored-by: reivilibre <oliverw@matrix.org>