It can be useful to always return the fallback key when attempting to
claim keys. This adds an unstable endpoint for `/keys/claim` which
always returns fallback keys in addition to one-time-keys.
The fallback key(s) are not marked as "used" unless there are no
corresponding OTKs.
This is currently defined in MSC3983 (although likely to be split out
to a separate MSC). The endpoint shape may change or be requested
differently (i.e. a keyword parameter on the current endpoint), but the
core logic should be reasonable.
* Revert "Fix registering a device on an account with lots of devices (#15348)"
This reverts commit f0d8f66eaa.
* Revert "Delete stale non-e2e devices for users, take 3 (#15183)"
This reverts commit 78cdb72cd6.
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for devices in addition
to any information stored in the Synapse database.
This should help reduce the number of devices e.g. simple bots the repeatedly login rack up.
We only delete non-e2e devices as they should be safe to delete, whereas if we delete e2e devices for a user we may accidentally break their ability to receive e2e keys for a message.
Experimental support for MSC3983 is behind a configuration flag.
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for one-time keys *if*
there are none uploaded to Synapse.
* Scaffolding for background process to refresh profiles
* Add scaffolding for background process to refresh profiles for a given server
* Implement the code to select servers to refresh from
* Ensure we don't build up multiple looping calls
* Make `get_profile` able to respect backoffs
* Add logic for refreshing users
* When backing off, schedule a refresh when the backoff is over
* Wake up the background processes when we receive an interesting state event
* Add tests
* Newsfile
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
* Add comment about 1<<62
---------
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
* Update mypy and mypy-zope
* Remove unused ignores
These used to suppress
```
synapse/storage/engines/__init__.py:28: error: "__new__" must return a
class instance (got "NoReturn") [misc]
```
and
```
synapse/http/matrixfederationclient.py:1270: error: "BaseException" has no attribute "reasons" [attr-defined]
```
(note that we check `hasattr(e, "reasons")` above)
* Avoid empty body warnings, sometimes by marking methods as abstract
E.g.
```
tests/handlers/test_register.py:58: error: Missing return statement [empty-body]
tests/handlers/test_register.py:108: error: Missing return statement [empty-body]
```
* Suppress false positive about `JaegerConfig`
Complaint was
```
synapse/logging/opentracing.py:450: error: Function "Type[Config]" could always be true in boolean context [truthy-function]
```
* Fix not calling `is_state()`
Oops!
```
tests/rest/client/test_third_party_rules.py:428: error: Function "Callable[[], bool]" could always be true in boolean context [truthy-function]
```
* Suppress false positives from ParamSpecs
````
synapse/logging/opentracing.py:971: error: Argument 2 to "_custom_sync_async_decorator" has incompatible type "Callable[[Arg(Callable[P, R], 'func'), **P], _GeneratorContextManager[None]]"; expected "Callable[[Callable[P, R], **P], _GeneratorContextManager[None]]" [arg-type]
synapse/logging/opentracing.py:1017: error: Argument 2 to "_custom_sync_async_decorator" has incompatible type "Callable[[Arg(Callable[P, R], 'func'), **P], _GeneratorContextManager[None]]"; expected "Callable[[Callable[P, R], **P], _GeneratorContextManager[None]]" [arg-type]
````
* Drive-by improvement to `wrapping_logic` annotation
* Workaround false "unreachable" positives
See https://github.com/Shoobx/mypy-zope/issues/91
```
tests/http/test_proxyagent.py:626: error: Statement is unreachable [unreachable]
tests/http/test_proxyagent.py:762: error: Statement is unreachable [unreachable]
tests/http/test_proxyagent.py:826: error: Statement is unreachable [unreachable]
tests/http/test_proxyagent.py:838: error: Statement is unreachable [unreachable]
tests/http/test_proxyagent.py:845: error: Statement is unreachable [unreachable]
tests/http/federation/test_matrix_federation_agent.py:151: error: Statement is unreachable [unreachable]
tests/http/federation/test_matrix_federation_agent.py:452: error: Statement is unreachable [unreachable]
tests/logging/test_remote_handler.py:60: error: Statement is unreachable [unreachable]
tests/logging/test_remote_handler.py:93: error: Statement is unreachable [unreachable]
tests/logging/test_remote_handler.py:127: error: Statement is unreachable [unreachable]
tests/logging/test_remote_handler.py:152: error: Statement is unreachable [unreachable]
```
* Changelog
* Tweak DBAPI2 Protocol to be accepted by mypy 1.0
Some extra context in:
- https://github.com/matrix-org/python-canonicaljson/pull/57
- https://github.com/python/mypy/issues/6002
- https://mypy.readthedocs.io/en/latest/common_issues.html#covariant-subtyping-of-mutable-protocol-members-is-rejected
* Pull in updated canonicaljson lib
so the protocol check just works
* Improve comments in opentracing
I tried to workaround the ignores but found it too much trouble.
I think the corresponding issue is
https://github.com/python/mypy/issues/12909. The mypy repo has a PR
claiming to fix this (https://github.com/python/mypy/pull/14677) which
might mean this gets resolved soon?
* Better annotation for INTERACTIVE_AUTH_CHECKERS
* Drive-by AUTH_TYPE annotation, to remove an ignore
Fixes#12801.
Complement tests are at
https://github.com/matrix-org/complement/pull/567.
Avoid blocking on full state when handling a subsequent join into a
partial state room.
Also always perform a remote join into partial state rooms, since we do
not know whether the joining user has been banned and want to avoid
leaking history to banned users.
Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
Ensure that the list of servers in a partial state room always contains
the server we joined off.
Also refactor `get_partial_state_servers_at_join` to return `None` when
the given room is no longer partial stated, to explicitly indicate when
the room has partial state. Otherwise it's not clear whether an empty
list means that the room has full state, or the room is partial stated,
but the server we joined off told us that there are no servers in the
room.
Signed-off-by: Sean Quah <seanq@matrix.org>
Currently, we will try to start a new partial state sync every time we
perform a remote join, which is undesirable if there is already one
running for a given room.
We intend to perform remote joins whenever additional local users wish
to join a partial state room, so let's ensure that we do not start more
than one concurrent partial state sync for any given room.
------------------------------------------------------------------------
There is a race condition where the homeserver leaves a room and later
rejoins while the partial state sync from the previous membership is
still running. There is no guarantee that the previous partial state
sync will process the latest join, so we restart it if needed.
Signed-off-by: Sean Quah <seanq@matrix.org>
PKCE can protect against certain attacks and is enabled by default. Support
can be controlled manually by setting the pkce_method of each oidc_providers
entry to 'auto' (default), 'always', or 'never'.
This is required by Twitter OAuth 2.0 support.
This should help reduce the number of devices e.g. simple bots the repeatedly login rack up.
We only delete non-e2e devices as they should be safe to delete, whereas if we delete e2e devices for a user we may accidentally break their ability to receive e2e keys for a message.
A batch of changes intended to make it easier to trace to-device messages through the system.
The intention here is that a client can set a property org.matrix.msgid in any to-device message it sends. That ID is then included in any tracing or logging related to the message. (Suggestions as to where this field should be documented welcome. I'm not enthusiastic about speccing it - it's very much an optional extra to help with debugging.)
I've also generally improved the data we send to opentracing for these messages.
This should help reduce the number of devices e.g. simple bots the repeatedly login rack up.
We only delete non-e2e devices as they should be safe to delete, whereas if we delete e2e devices for a user we may accidentally break their ability to receive e2e keys for a message.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Use `device_one_time_keys_count` to match MSC3202
Rename the `device_one_time_key_counts` key in responses to
`device_one_time_keys_count` to match the name specified by MSC3202.
Also change related variable/class names for consistency.
Signed-off-by: Andrew Ferrazzutti <andrewf@element.io>
* Update changelog.d/14565.misc
* Revert name change for `one_time_key_counts` key
as this is a different key altogether from `device_one_time_keys_count`,
which is used for `/sync` instead of appservice transactions.
Signed-off-by: Andrew Ferrazzutti <andrewf@element.io>
This commit adds support for handling a provided avatar picture URL
when logging in via SSO.
Signed-off-by: Ashish Kumar <ashfame@users.noreply.github.com>
Fixes#9357.
This was the last untyped handler from the HomeServer object. Since
it was being treated as Any (and thus unchecked) it was being used
incorrectly in a few places.
Fixes check_avatar_size_and_mime_type() to successfully update avatars on homeservers running on non-default ports which it would mistakenly treat as remote homeserver while validating the avatar's size and mime type.
Signed-off-by: Ashish Kumar ashfame@users.noreply.github.com
This implements a fake OIDC server, which intercepts calls to the HTTP client.
Improves accuracy of tests by covering more internal methods.
One particular example was the ID token validation, which previously mocked.
This uncovered an incorrect dependency: Synapse actually requires at least
authlib 0.15.1, not 0.14.0.
While https://github.com/matrix-org/synapse/pull/13635 stops us from doing the slow thing after we've already done it once, this PR stops us from doing one of the slow things in the first place.
Related to
- https://github.com/matrix-org/synapse/issues/13622
- https://github.com/matrix-org/synapse/pull/13635
- https://github.com/matrix-org/synapse/issues/13676
Part of https://github.com/matrix-org/synapse/issues/13356
Follow-up to https://github.com/matrix-org/synapse/pull/13815 which tracks event signature failures.
With this PR, we avoid the call to the costly `_get_state_ids_after_missing_prev_event` because the signature failure will count as an attempt before and we filter events based on the backoff before calling `_get_state_ids_after_missing_prev_event` now.
For example, this will save us 156s out of the 185s total that this `matrix.org` `/messages` request. If you want to see the full Jaeger trace of this, you can drag and drop this `trace.json` into your own Jaeger, https://gist.github.com/MadLittleMods/4b12d0d0afe88c2f65ffcc907306b761
To explain this exact scenario around `/messages` -> backfill, we call `/backfill` and first check the signatures of the 100 events. We see bad signature for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` and `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` (both member events). Then we process the 98 events remaining that have valid signatures but one of the events references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event`. So we have to do the whole `_get_state_ids_after_missing_prev_event` rigmarole which pulls in those same events which fail again because the signatures are still invalid.
- `backfill`
- `outgoing-federation-request` `/backfill`
- `_check_sigs_and_hash_and_fetch`
- `_check_sigs_and_hash_and_fetch_one` for each event received over backfill
- ❗ `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)`
- ❗ `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)`
- `_process_pulled_events`
- `_process_pulled_event` for each validated event
- ❗ Event `$Q0iMdqtz3IJYfZQU2Xk2WjB5NDF8Gg8cFSYYyKQgKJ0` references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event` which is missing so we try to get it
- `_get_state_ids_after_missing_prev_event`
- `outgoing-federation-request` `/state_ids`
- ❗ `get_pdu` for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` which fails the signature check again
- ❗ `get_pdu` for `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` which fails the signature check
This is also using the partial state approximation if needed so we do
not block here during a fast join.
Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
When a remote user leaves the last room shared with the homeserver, we
have to mark their device list as unsubscribed, otherwise we would hold
on to a stale device list in our cache. Crucially, the device list would
remain cached even after the remote user rejoined the room, which could
lead to E2EE failures until the next change to the remote user's device
list.
Fixes#13651.
Signed-off-by: Sean Quah <seanq@matrix.org>
Part of #13019
This changes all the permission-related methods to rely on the Requester instead of the UserID. This is a first step towards enabling scoped access tokens at some point, since I expect the Requester to have scope-related informations in it.
It also changes methods which figure out the user/device/appservice out of the access token to return a Requester instead of something else. This avoids having store-related objects in the methods signatures.
This improves load times for push rules:
| Version | Time per user | Time for 1k users |
| -------------------- | ------------- | ----------------- |
| Before | 138 µs | 138ms |
| Now (with custom) | 2.11 µs | 2.11ms |
| Now (without custom) | 49.7 ns | 0.05 ms |
This therefore has a large impact on send times for rooms
with large numbers of local users in the room.
This adds support for the stable identifiers of MSC2285 while
continuing to support the unstable identifiers behind the configuration
flag. These will be removed in a future version.
Previously, `_resolve_state_at_missing_prevs` returned the resolved
state before an event and a partial state flag. These were unwieldy to
carry around would only ever be used to build an event context. Build
the event context directly instead.
Signed-off-by: Sean Quah <seanq@matrix.org>
Avoid blocking on full state in `_resolve_state_at_missing_prevs` and
return a new flag indicating whether the resolved state is partial.
Thread that flag around so that it makes it into the event context.
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Previously, TLS could only be used with STARTTLS.
Add a new option `force_tls`, where TLS is used from the start.
Implicit TLS is recommended over STARTLS,
see https://datatracker.ietf.org/doc/html/rfc8314Fixes#8046.
Signed-off-by: Jan Schär <jan@jschaer.ch>
Some experimental prep work to enable external event caching based on #9379 & #12955. Doesn't actually move the cache at all, just lays the groundwork for async implemented caches.
Signed off by Nick @ Beeper (@Fizzadar)
* Replace `get_new_events_for_appservice` with `get_all_new_events_stream`
The functions were near identical and this brings the AS worker closer
to the way federation senders work which can allow for multiple workers
to handle AS traffic.
* Pull received TS alongside events when processing the stream
This avoids an extra query -per event- when both federation sender
and appservice pusher process events.
There is a corner in `_check_event_auth` (long known as "the weird corner") where, if we get an event with auth_events which don't match those we were expecting, we attempt to resolve the diffence between our state and the remote's with a state resolution.
This isn't specced, and there's general agreement we shouldn't be doing it.
However, it turns out that the faster-joins code was relying on it, so we need to introduce something similar (but rather simpler) for that.
This simplifies the access token verification logic by removing the `rights`
parameter which was only ever used for the unsubscribe link in email
notifications. The latter has been moved under the `/_synapse` namespace,
since it is not a standard API.
This also makes the email verification link more secure, by embedding the
app_id and pushkey in the macaroon and verifying it. This prevents the user
from tampering the query parameters of that unsubscribe link.
Macaroon generation is refactored:
- Centralised all macaroon generation and verification logic to the
`MacaroonGenerator`
- Moved to `synapse.utils`
- Changed the constructor to require only a `Clock`, hostname, and a secret key
(instead of a full `Homeserver`).
- Added tests for all methods.
The `room_id` field was removed from MSC2946 before
it was accepted. It was initially kept for backwards compatibility
and should be removed now that the stable form of the API
is used.
This change only stops Synapse from validating that it is returned,
a future PR will remove returning it as part of the response.
By always using delete_devices and sometimes passing a list
with a single device ID.
Previously these methods had gotten out of sync with each
other and it seems there's little benefit to the single-device
variant.
* Update worker docs to remove group endpoints.
* Removes an unused parameter to `ApplicationService`.
* Break dependency between media repo and groups.
* Avoid copying `m.room.related_groups` state events during room upgrades.
A minor optimization to avoid unnecessary copying/building
identical dictionaries when filtering private read receipts.
Also clarifies comments and cleans-up some tests.
Refactor how the `EventContext` class works, with the intention of reducing the amount of state we fetch from the DB during event processing.
The idea here is to get rid of the cached `current_state_ids` and `prev_state_ids` that live in the `EventContext`, and instead defer straight to the database (and its caching).
One change that may have a noticeable effect is that we now no longer prefill the `get_current_state_ids` cache on a state change. However, that query is relatively light, since its just a case of reading a table from the DB (unlike fetching state at an event which is more heavyweight). For deployments with workers this cache isn't even used.
Part of #12684
getClientIP was deprecated in Twisted 18.4.0, which also added
getClientAddress. The Synapse minimum version for Twisted is
currently 18.9.0, so all supported versions have the new API.
* Changes hidden read receipts to be a separate receipt type
(instead of a field on `m.read`).
* Updates the `/receipts` endpoint to accept `m.fully_read`.