This supersedes #17503, given the per-connection state is being heavily
rewritten it felt easier to recreate the PR on top of that work.
This correctly handles the case of timeline limits going up and down.
This does not handle changes in `required_state`, but that can be done
as a separate PR.
Based on #17575.
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
This is some prep work ahead of correctly tracking receipts, where we
will also want to track the room status in terms of last receipt we had
sent down.
Essentially, we add two classes `PerConnectionState` and a mutable
version, and then operate on those.
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
Results in:
```
AssertionError: null
File "synapse/http/server.py", line 332, in _async_render_wrapper
callback_return = await self._async_render(request)
File "synapse/http/server.py", line 544, in _async_render
callback_return = await raw_callback_return
File "synapse/federation/transport/server/_base.py", line 369, in new_func
response = await func(
File "synapse/federation/transport/server/federation.py", line 826, in on_GET
await self.media_repo.get_local_media(
File "synapse/media/media_repository.py", line 473, in get_local_media
await respond_with_multipart_responder(
File "synapse/media/_base.py", line 353, in respond_with_multipart_responder
assert content_length is not None
```
So that clients can check for support. Note that if the feature is only
enabled for some users, the `/versions` request must be authenticated to
pick up that SSS is enabled for the user
`old_verify_keys` isn't marked as required in
https://spec.matrix.org/v1.11/server-server-api/#get_matrixkeyv2server
and there's no functional difference between an empty object and
omitting the object, so I don't think there's any reason synapse should
explode when the field is omitted.
Follow on from #17558
Basically, we want to reduce the number of threads we want to use at a
time, i.e. reduce the number of threads that are paused/blocked. We do
this by returning from the thread when the consumer pauses the producer,
rather than pausing in the thread.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Previously, we just had very basic partial room exclusion based on
whether we were lazy-loading room members. Now with this PR, we added
`must_await_full_state(...)` with rules to check if we have a we're only
requesting `required_state` which is completely satisfied even with
partial state.
Partially-stated rooms should have all state events except for remote
membership events so if we require a remote membership event anywhere,
then we need to return `True`.
We do this by reusing the code from sync v2.
Reviewable commit-by-commit. The function `get_user_ids_changed` has
been rewritten entirely, so I would recommend not looking at the diff.
This is in response to issue #17473.
Not all the necessary handlers to deal with media requests are started
now when configuring synapse to use a media worker as per the [example
config](https://element-hq.github.io/synapse/latest/workers.html#synapseappmedia_repository).
The new media endpoints introduced with authenticated media fall under
the `client` & `federation` handlers in synapse.
This PR starts up handlers for the new media endpoints if a worker has
been configured with only the `media` resource type.
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [X] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Spawning from looking at a couple traces and wanting a little more info.
Follow-up to github.com/element-hq/synapse/pull/17501
The changes in this PR allow you to find slow Sliding Sync traces ignoring the
`wait_for_events` time. In Jaeger, you can now filter for the `current_sync_for_user`
operation with `RESULT.result=true` indicating that it actually returned non-empty results.
If you want to find traces for your own user, you can use
`RESULT.result=true ARG.sync_config.user="@madlittlemods:matrix.org"`
This triggers the client to start a new sliding sync connection. If we
don't do this and the client asks for the full range of rooms, we end up
sending down all rooms and their state from scratch (which can be very
slow)
This causes things like
https://github.com/element-hq/element-x-ios/issues/3115 after we restart
the server
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
Update `filters.is_encrypted` and `filters.types`/`filters.not_types` to
be robust when dealing with remote invite rooms in Sliding Sync.
Part of
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync
Follow-up to https://github.com/element-hq/synapse/pull/17434
We now take into account current state, fallback to stripped state
for invite/knock rooms, then historical state. If we can't determine
the info needed to filter a room (either from state or stripped state),
it is filtered out.
Rather than always including all rooms in range.
Also adds a pre-filter to rooms that checks the stream change cache to
see if anything might have happened.
Based on #17447
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
The basic idea is that we introduce a new token for a sliding sync
connection, which stores the mapping of room to room "status" (i.e. have
we sent the room down?). This token allows us to handle duplicate
requests properly. In future it can be used to store more
"per-connection" information safely.
In future this should be migrated into the DB, so its important that we
try to reduce the number of syncs where we need to update the
per-connection information. In this PoC this only happens when we: a)
send down a set of room for the first time, or b) we have previously
sent down a room and there are updates but we are not sending the room
down the sync (due to not falling in a list range)
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
Backfill events have a negative stream ordering, and so its not useful
to use to compare with other (positive) stream orderings.
Plus, the Rust SDK currently assumes `bump_stamp` is positive.
Introduced in: #17215
This caused us a minor bit of grief as the volume of logs produced was
much higher than normal
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
This is to address an issue in which `m.presence` results on initial
sync are not returning entries of users who are currently offline.
The original behaviour was from
https://github.com/element-hq/synapse/issues/1535
This change is useful for applications that use the
presence system for tracking user profile information/updates (e.g.
https://github.com/element-hq/synapse/pull/16992 or for profile status
messages).
This is gated behind a new configuration option to avoid performance
impact for applications that don't need this, as a pragmatic solution
for now.
As it gets used in sliding sync.
We basically invalidate it in all the same places as
`get_rooms_for_user`. Most of the changes are due to needing the
arguments you pass in to be hashable (which lists aren't)
Prior to this PR, remote downloads which did not provide a
`content-length` were decremented from the remote download ratelimiter
at the max allowable size, leading to excessive ratelimiting - see
https://github.com/element-hq/synapse/issues/17394.
This PR adds a linearizer to limit concurrent remote downloads to 6 per
IP address, and decrements remote downloads without a `content-length`
from the ratelimiter *after* the download is complete and the response
length is known.
Also adds logic to ensure that responses with a known length respect the
`max_download_size`.
This removes the `enable_media_repo` attribute on the server config in
favour of always using the `can_load_media_repo` in the media config.
This should avoid issues like in #17420 in the future
Add room subscriptions to Sliding Sync `/sync`
Based on
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync
Currently, you can only subscribe to rooms you have had *any* membership
in before.
In the future, we will allow `world_readable` rooms to be subscribed to
without joining.
We can only fetch room types for rooms the server is in, so we need to
only filter rooms that we're joined to.
Also includes a perf fix to bulk fetch room types.
This looks like a copy/paste error: the function doesn't reject
anything, but instead allows the action count to go through regardless.
The remainder of the function's documentation appears correct.
Added RHEL/Rocky install instructions (PyPI). Instructions cover
versions 8 and 9 which are the only supported ones - except for RHEL7
which is now on extended life cycle support phase.
Large part of the guide is for installing Python 3.11 or 3.12. RHEL8
ships with Python 3.6 and RHEL9 ships with 3.9. Newer Python versions
can be installed easily as they don't interfere with OS software that
still relies on the default Python version.
I was first planning to add prerequisites part to the prerequisites
section and then install instructions on the top of the page but that
section is for pre-built packages so it just didn't sound right. So I
just dumped everything to the PyPI section of the page. But suggestions
to change are welcome.
I also didn't combine these with Fedora section. I haven't tested those
packages on RHEL and Fedora ships with Python 3.12 out-of-box.
We don't necessarily have `instance_name` for old events (before we
support multiple event persisters). We treat those as if the
`instance_name` was "master".
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
`bump_stamp` corresponds to the `stream_ordering` of the latest `DEFAULT_BUMP_EVENT_TYPES` in the room. This helps clients sort more readily without them needing to pull in a bunch of the timeline to determine the last activity. `bump_event_types` is a thing because for example, we don't want display name changes to mark the room as unread and bump it to the top. For encrypted rooms, we just have to consider any activity as a bump because we can't see the content and the client has to figure it out for themselves.
Outside of Synapse, `bump_stamp` is just a free-form counter so other implementations could use `received_ts`or `origin_server_ts` (see the [*Security considerations* section in MSC3575 about the potential pitfalls of using `origin_server_ts`](https://github.com/matrix-org/matrix-spec-proposals/blob/kegan/sync-v3/proposals/3575-sync.md#security-considerations)). It doesn't have any guarantee about always going up. In the Synapse case, it could go down if an event was redacted/removed (or purged in cases of retention policies).
In the future, we could add `bump_event_types` as [MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575) mentions if people need to customize the event types.
---
In the Sliding Sync proxy, a similar [`timestamp` field was added](https://github.com/matrix-org/sliding-sync/pull/247) for the same purpose but the name is not obvious what it pertains to or what it's for.
The `timestamp` field was also added to Ruma in https://github.com/ruma/ruma/pull/1622
Follows on from @H-Shay's great work at
https://github.com/matrix-org/synapse/pull/15344 and MSC4026.
Also enables its use for MSC3881, mainly as an easy but concrete example
of how to use it.
This can help ensure that the rooms are eventually purged if the other
local users also forget them. Synapse already clears some of the room
information as part of the `_background_remove_left_rooms` background
task, but this doesn't catch `events`, `event_json`, etc.
Fixes https://github.com/element-hq/synapse/issues/17274, hopefully.
Basically, old versions of Synapse could advance streams without
persisting anything in the DB (fixed in #17229). On restart those
updates would get lost, and so the position of the stream would revert
to an older position. If this happened across an upgrade to a later
Synapse version which included #17215, then sync could get blocked
indefinitely (until the stream advanced to the position in the token).
We fix this by bounding the stream positions we'll wait for to the
maximum position of the underlying stream ID generator.
Fixes https://github.com/element-hq/synapse/issues/17274, hopefully.
Basically, old versions of Synapse could advance streams without
persisting anything in the DB (fixed in #17229). On restart those
updates would get lost, and so the position of the stream would revert
to an older position. If this happened across an upgrade to a later
Synapse version which included #17215, then sync could get blocked
indefinitely (until the stream advanced to the position in the token).
We fix this by bounding the stream positions we'll wait for to the
maximum position of the underlying stream ID generator.
If we leave the `.so` in place it causes the tests to fail, as it gets
picked up (instead of the newly built .so) and so fails with mismatched
GLIBC errors.
Sid now defaults to python3.12, and our pinned version of cffi (1.5.1)
does not have wheels for 3.12. This installing cffi to fail as we did
not have the correct libs installed to build from source.
Fix bug where we don't get new to-device from remote if they resent a
message we've already persisted and have recorded in the DB twice.
`device_federation_inbox` table doesn't have a unique index, and so we
can race and store an entry in there twice. If we do so then
`simple_select_one_txn` will throw an error due to the query returning
more than one row. We should add an unique index, but it doesn't really
matter so lets just handle the case of multiple rows correctly for now.
Fixes up #17333, where we failed to actually send less data (the
`DISTINCT` didn't work due to `stream_id` being different).
We fix this by making it so that every device list outbound poke for a
given user ID has the same stream ID. We can't change the query to only
return e.g. max stream ID as the receivers look up the destinations to
send to by doing `SELECT WHERE stream_id = ?`
A simple change to update the docs where default values were missing.
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [X] Pull request is based on the develop branch
* [X] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
---------
Co-authored-by: Kim Brose <2803622+HarHarLinks@users.noreply.github.com>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
This is #17291 (which got reverted), with some added fixups, and change
so that tests actually pick up the error.
The problem was that we were not calculating any new chain IDs due to a
missing `not` in a condition.
Reduce the replication traffic of device lists, by not sending every
destination that needs to be sent the device list update over
replication. Instead a "hosts to send to have been calculated"
notification over replication, and then federation senders read the
destinations from the DB.
For non federation senders this should heavily reduce the impact of a
user in many large rooms changing a device.
The parse_integer function was previously made to reject negative values by
default in https://github.com/element-hq/synapse/pull/16920, but the
documentation stated otherwise. This fixes the documentation and also:
- Removes explicit negative=False parameters from call sites.
- Brings the negative default of parse_integer_from_args in alignment with
parse_integer.
This reverts commit bdf82efea5 (#17291)
This seems to have stopped persisting auth chains for new events, and so
is causing state res to fall back to the slow methods
We calculate the auth chain links outside of the main persist event
transaction to ensure that we do not block other event sending during
the calculation.
Sort is no longer configurable and we always sort rooms by the `stream_ordering` of the last event in the room or the point where the user can see up to in cases of leave/ban/invite/knock.
Add `event.internal_metadata.instance_name` (the worker instance that persisted the event) to go alongside the existing `event.internal_metadata.stream_ordering`.
`instance_name` is useful to properly compare and query for events with a token since you need to compare both the `stream_ordering` and `instance_name` against the vector clock/`instance_map` in the `RoomStreamToken`.
This is pre-requisite work and may be used in https://github.com/element-hq/synapse/pull/17293
Adding `event.internal_metadata.instance_name` was first mentioned in the initial Sliding Sync PR while pairing with @erikjohnston, see 09609cb0db (diff-5cd773fb307aa754bd3948871ba118b1ef0303f4d72d42a2d21e38242bf4e096R405-R410)
PR where this was introduced: https://github.com/matrix-org/synapse/pull/14817
### What does this affect?
`get_last_event_in_room_before_stream_ordering(...)` is used in Sync v2 in a lot of different state calculations.
`get_last_event_in_room_before_stream_ordering(...)` is also used in `/rooms/{roomId}/members`