tileradotorg/synapse

Author	SHA1	Message	Date
Sean Quah	6f80fe1e1b	Tweak changelog formatting	2022-08-31 12:51:57 +01:00
Sean Quah	838d722eba	Move notice from 1.66.0rc1 to 1.66.0 section in changelog	2022-08-31 12:40:14 +01:00
Sean Quah	c01f21d31d	Tweak changelog wording	2022-08-31 12:35:25 +01:00
Sean Quah	d1fb46fbc9	Improve clarity on deprecation of TCP replication Borrows some text from https://github.com/matrix-org/synapse/pull/13647 for the changelog.	2022-08-31 12:19:40 +01:00
Nick Mills-Barrett	42b11d5565	Remove cached wrap on `_get_joined_users_from_context` method (#13569 ) The method doesn't actually do any data fetching and the method that does, `_get_joined_profile_from_event_id`, has its own cache. Signed off by Nick @ Beeper (@Fizzadar).	2022-08-31 12:19:39 +01:00
reivilibre	7bc110a19e	Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. (#13662 )	2022-08-31 11:16:05 +00:00
Sean Quah	90c99fb3aa	Fix dead link in 1.18.0 upgrade notes	2022-08-31 11:53:30 +01:00
David Robertson	a160406d24	Fix admin List Room API return type on sqlite (#13509 )	2022-08-31 10:38:16 +00:00
Sean Quah	5634267d33	Update changelog to link to the Synapse docs instead of markdown	2022-08-31 11:37:15 +01:00
Sean Quah	ef88bc0775	1.66.0	2022-08-31 11:21:09 +01:00
Sean Quah	d48b70fd37	Update changelog for v1.62.0	2022-08-31 11:18:56 +01:00
Jörg Behrmann	b9924df264	Change dpkg-statoverride to use --force-statoverride-add (#13638 ) The --force flag of dpkg-statoverride has been deprecated (apparently starting with the dpkg version in Debian buster). It offers --force-all as a quick fix, but the usage in the Debian postinst script is probably covered by --force-statoverride-add. Fixes: #8391 Signed-off-by: Jörg Behrmann <behrmann@physik.fu-berlin.de>	2022-08-31 11:15:28 +01:00
Patrick Cloke	61b37ddd37	Remind people that direct TCP replication is disabled. (#13674 )	2022-08-31 10:43:00 +01:00
Eric Eastwood	92c5817e34	Give the correct next event when the message timestamps are the same - MSC3030 (#13658 ) Discovered while working on https://github.com/matrix-org/synapse/pull/13589 and I had all the messages at the same timestamp in the tests. Part of https://github.com/matrix-org/matrix-spec-proposals/pull/3030 Complement tests: https://github.com/matrix-org/complement/pull/457	2022-08-30 14:50:06 -05:00
Shay	20c76cecb9	Drop unused column `application_services_state.last_txn` (#13627 )	2022-08-30 10:29:16 -07:00
Richard van der Hoff	372136d3a8	Remove documentation of legacy `frontend_proxy` worker app (#13645 ) This has been the same as a generic_worker since #6964, so let's get rid of it. Fixes #3717	2022-08-30 18:01:51 +01:00
David Robertson	4249082eed	Merge branch 'release-v1.66' into develop	2022-08-30 15:31:51 +01:00
David Robertson	31f2a3fbc3	Update changes	2022-08-30 14:19:52 +01:00
Patrick Cloke	e761e8b475	Clarify documentation about replication traffic. (#13656 ) It can be authenticated with the worker_replication_secret setting, but is always unencrypted.	2022-08-30 12:21:19 +00:00
David Robertson	8f6aa015a8	1.66.0rc2	2022-08-30 12:25:44 +01:00
Erik Johnston	1c26acd815	Fix bug where we wedge media plugins if clients disconnect early (#13660 ) We incorrectly didn't use the returned `Responder` if the client had disconnected, which meant that the resource used by the Responder wasn't correctly released. In particular, this exhausted the thread pools so that all requests timed out.	2022-08-30 12:17:48 +01:00
Patrick Cloke	303b40b988	Do not wait for background updates to complete to expire URL cache. (#13657 ) Media downloaded as part of a URL preview is normally deleted after two days. However, while a background database migration is running, the process is stopped. A long-running database migration can therefore cause the media store to fill up with old preview files. This logic was added in #2697 to make sure that we didn't try to run the expiry without an index on `local_media_repository.created_ts`; the original logic that needs that index was added in #2478 (in `get_url_cache_media_before`, as amended by `93247a424a`), and is still present. Given that the background update was added before Synapse v1.0.0, just drop this check and assume the index exists.	2022-08-30 07:15:54 -04:00
Patrick Cloke	20df96a7a7	Speed up inserting `event_push_actions_staging`. (#13634 ) By using `execute_values` instead of `execute_batch`.	2022-08-30 07:12:48 -04:00
Eric Eastwood	1eea73b413	Fix rate limit metrics registering twice and misreporting (#13649 ) * Fix rate limit metrics registering twice and misreporting Fix https://github.com/matrix-org/synapse/issues/13641 * Fix lints * Add changelog * Document `metrics_name=None`.	2022-08-30 12:08:29 +01:00
Dirk Klimpel	682dfcfc0d	Fix that user cannot `/forget` rooms after the last member has left (#13546 )	2022-08-30 09:58:38 +00:00
Eric Eastwood	51d732db3b	Optimize how we calculate `likely_domains` during backfill (#13575 ) Optimize how we calculate `likely_domains` during backfill because I've seen this take 17s in production just to `get_current_state` which is used to `get_domains_from_state` (see case [2. Loading tons of events in the `/messages` investigation issue](https://github.com/matrix-org/synapse/issues/13356)). There are 3 ways we currently calculate hosts that are in the room: 1. `get_current_state` -> `get_domains_from_state` - Used in `backfill` to calculate `likely_domains` and `/timestamp_to_event` because it was cargo-culted from `backfill` - This one is being eliminated in favor of `get_current_hosts_in_room` in this PR 🕳 1. `get_current_hosts_in_room` - Used for other federation things like sending read receipts and typing indicators 1. `get_hosts_in_room_at_events` - Used when pushing out events over federation to other servers in the `_process_event_queue_loop` Fix https://github.com/matrix-org/synapse/issues/13626 Part of https://github.com/matrix-org/synapse/issues/13356 Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh) ### Query performance #### Before The query from `get_current_state` sucks just because we have to get all 80k events. And we see almost the exact same performance locally trying to get all of these events (16s vs 17s): ``` synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; Time: 16035.612 ms (00:16.036) synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; Time: 4243.237 ms (00:04.243) ``` But what about `get_current_hosts_in_room`: When there are 8M rows in the `current_state_events` table, the previous query in `get_current_hosts_in_room` took 13s from complete freshness (when the events were first added).
But takes 930ms after a Postgres restart or 390ms if running back to back to back. ```sh $ psql synapse synapse=# \timing on synapse=# SELECT COUNT(DISTINCT substring(state_key FROM '@[^:]*:(.*)$')) FROM current_state_events WHERE type = 'm.room.member' AND membership = 'join' AND room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; count ------- 4130 (1 row) Time: 13181.598 ms (00:13.182) synapse=# SELECT COUNT(*) from current_state_events where room_id = '!OGEhHVWSdvArJzumhm:matrix.org'; count ------- 80814 synapse=# SELECT COUNT(*) from current_state_events; count --------- 8162847 synapse=# SELECT pg_size_pretty( pg_total_relation_size('current_state_events') ); pg_size_pretty ---------------- 4702 MB ``` #### After I'm not sure how long it takes from complete freshness as I only really get that opportunity once (maybe restarting computer but that's cumbersome) and it's not really relevant to normal operating times. Maybe you get closer to the fresh times the more access variability there is so that Postgres caches aren't as exact. Update: The longest I've seen this run for is 6.4s and 4.5s after a computer restart. After a Postgres restart, it takes 330ms and running back to back takes 260ms. ```sh $ psql synapse synapse=# \timing on Timing is on. synapse=# SELECT substring(c.state_key FROM '@[^:]*:(.*)$') as host FROM current_state_events c /* Get the depth of the event from the events table */ INNER JOIN events AS e USING (event_id) WHERE c.type = 'm.room.member' AND c.membership = 'join' AND c.room_id = '!OGEhHVWSdvArJzumhm:matrix.org' GROUP BY host ORDER BY min(e.depth) ASC; Time: 333.800 ms ``` #### Going further To improve things further we could add a `limit` parameter to `get_current_hosts_in_room`. Realistically, we don't need 4k domains to choose from because there is no way we're going to query that many before we a) probably get an answer or b) we give up.
Another thing we can do is optimize the query to use an index skip scan: - https://wiki.postgresql.org/wiki/Loose_indexscan - Index Skip Scan, https://commitfest.postgresql.org/37/1741/ - https://www.timescale.com/blog/how-we-made-distinct-queries-up-to-8000x-faster-on-postgresql/	2022-08-30 01:38:14 -05:00
Richard van der Hoff	4f6de33f41	Print complement failure results last (#13639 ) Since github always scrolls to the bottom of any test output, let's put the failed tests last and hide any successful packages.	2022-08-28 20:05:30 +00:00
Richard van der Hoff	c4e29b6908	Improve documentation around user registration (#13640 ) Update a bunch of the documentation for user registration, add some cross links, etc.	2022-08-26 13:29:31 +00:00
Richard van der Hoff	5e5c8150d7	Generate missing configuration files at startup (#13615 ) If things like the signing key file are missing, let's just try to generate them on startup. Again, this is useful for k8s-like deployments where we just want to generate keys on the first run.	2022-08-26 11:26:06 +00:00
Jörg Behrmann	998e211836	Update debhelper (#13594 ) * Update debian packaging to debhelper version 12 Don't call dh_installinit anymore, because it has been deprecated, and use dh_installsystemd instead of dh_systemd_enable for the same reason. Signed-off-by: Jörg Behrmann <behrmann@physik.fu-berlin.de> * Drop preinst script It was used for reasons of interactions of dh_systemd_start and dh_installinit, which have both been deprecated Signed-off-by: Jörg Behrmann <behrmann@physik.fu-berlin.de> * Drop /etc/default file It was no longer being installed. * Remove debian/compat file This is managed by the control file nowadays	2022-08-26 08:10:54 +00:00
Brad Murray	967d7bad6c	Move the execution of the retention purge_jobs to the main worker (#13632 ) Fixes #9927 Signed-off-by: Brad Murray brad@beeper.com	2022-08-26 08:38:10 +01:00
Jörg Behrmann	978666a088	Debian packaging: explicitly allocate a group for the system user (#13593 ) Otherwise the files of the synapse user are readable by the nobody user, which is unsafe. Signed-off-by: Jörg Behrmann <behrmann@physik.fu-berlin.de>	2022-08-25 16:56:55 +00:00
Richard van der Hoff	d092e6f32a	Support `registration_shared_secret` in a file (#13614 ) A new `registration_shared_secret_path` option. This is kinda handy for k8s deployments and things.	2022-08-25 16:27:46 +00:00
Richard van der Hoff	a2ce614447	register_new_matrix_user: read server url from config (#13616 ) Fixes https://github.com/matrix-org/synapse/issues/3672: `https://localhost:8448` is virtually never right.	2022-08-25 15:29:08 +01:00
Kat Gerasimova	a282446502	Update automation for incoming issues (#13629 ) GitHub appears to be deprecating addProjectNextItem by not allowing it to be used alongside projectV2 to get the project ID, so switching to using addProjectV2ItemById instead.	2022-08-25 12:09:23 +01:00
Eric Eastwood	0bf180cbb4	Comment about a better future where we can get the state diff between two events (#13586 ) Split off from https://github.com/matrix-org/synapse/pull/13561 Part of https://github.com/matrix-org/synapse/issues/13356 Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh)	2022-08-24 18:59:27 -05:00
David Robertson	c406d50d2d	Rename `event_map` to `unpersisted_events` (#13603 )	2022-08-24 21:06:31 +01:00
Eric Eastwood	1a209efdb2	Update `get_users_in_room` mis-use to get hosts with dedicated `get_current_hosts_in_room` (#13605 ) See https://github.com/matrix-org/synapse/pull/13575#discussion_r953023755	2022-08-24 14:15:37 -05:00
Eric Eastwood	d58615c82c	Directly lookup local membership instead of getting all members in a room first (`get_users_in_room` mis-use) (#13608 ) See https://github.com/matrix-org/synapse/pull/13575#discussion_r953023755	2022-08-24 14:13:12 -05:00
Eric Eastwood	b93bd95e8a	When loading current ids, sort by `stream_id` to avoid incorrect overwrite and avoid errors caused by sorting alphabetical instance name which can be `null` (#13585 ) When loading current ids, sort by stream ID so that we don't overwrite the `current_position` of an instance to a lower stream ID than we're actually at ([discussion](https://github.com/matrix-org/synapse/pull/13585#discussion_r951795379)). Previously, it sorted alphabetically by instance name which can be `null` and throw errors but more importantly, accomplishes nothing. Fixes the following startup error which is why I started looking into this area: ``` $ poetry run synapse_homeserver --config-path homeserver.yaml ************************************************************** Error during initialisation: '<' not supported between instances of 'NoneType' and 'str' There may be more information in the logs. ************************************************************** ``` Somehow my database ended up looking like the following, notice the `instance_name` is `null` in the db, and we can't sort `NoneType` things. Another question is why do we see the `instance_name` as `null` sometimes instead of `master` in monolith mode? ``` $ psql synapse synapse=# SELECT * FROM stream_positions; stream_name | instance_name | stream_id -----------------+---------------+----------- account_data | master | 1242 events | master | 1787 to_device | master | 58 presence_stream | master | 485638 receipts | master | 341 backfill | master | -139106 (6 rows) synapse=# SELECT instance_name, stream_id FROM receipts_linearized; instance_name | stream_id ---------------+----------- | 211 | 3 | 4 | 212 | 213 | 224 | 228 | 164 | 313 | 253 | 38 | 321 | 324 | 189 | 192 | 193 | 194 | 195 | 197 | 198 | 275 | 79 | 339 | 340 | 82 | 341 | 84 | 85 | 91 | 119 ```	2022-08-24 12:53:46 -05:00
Eric Eastwood	c807b814ae	Use dedicated `get_local_users_in_room` to find local users when calculating `join_authorised_via_users_server` of a `/make_join` request (#13606 ) Use dedicated `get_local_users_in_room` to find local users when calculating `join_authorised_via_users_server` ("the authorising user for joining a restricted room") of a `/make_join` request. Found while working on https://github.com/matrix-org/synapse/pull/13575#discussion_r953023755 but it's not related.	2022-08-24 11:14:28 -05:00
Andy Balaam	371db86a86	First draft of triage_labelled action (#13612 )	2022-08-24 13:59:33 +01:00
reivilibre	be4250c7a8	Add experimental configuration option to allow disabling legacy Prometheus metric names. (#13540 ) Co-authored-by: David Robertson <davidr@element.io>	2022-08-24 11:35:54 +00:00
Kat Gerasimova	2e2040c93e	Add GitHub automation for new issues (#13610 ) Set up automation to move newly opened issues in GitHub to the issue triage board.	2022-08-24 12:10:32 +01:00
Nick Mills-Barrett	b687010f89	Rewrite get push actions queries (#13597 )	2022-08-24 10:12:51 +01:00
reivilibre	ba882c0357	Faster Room Joins: fix `/make_knock` blocking indefinitely when the room in question is a partial-stated room. (#13583 ) Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2022-08-24 09:09:59 +00:00
Eric Eastwood	7af07f9716	Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls (#13588 ) Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls because I've seen `_check_sigs_and_hash_and_fetch` take [10.41s to process 100 events](https://github.com/matrix-org/synapse/issues/13587) Fix https://github.com/matrix-org/synapse/issues/13587 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-23 21:53:37 -05:00
David Robertson	a25a37002c	Write about the chain cover a little. (#13602 ) Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2022-08-23 17:41:55 +00:00
Erik Johnston	f7ddfe17a3	Speed up `@cachedList` (#13591 ) This speeds things up by ~2x. The vast majority of the time is now spent in `LruCache` moving things around the linked lists. We do this via two things: 1. Don't create a deferred per-key during bulk set operations in `DeferredCache`. Instead, only create them if a subsequent caller asks for the key. 2. Add a bulk lookup API to `DeferredCache` rather than use a loop.	2022-08-23 14:53:27 +00:00
Erik Johnston	05c9c7363b	Fix regression caused by #13573 (#13600 ) Broke in #13573.	2022-08-23 14:14:05 +00:00

21479 commits