I recently observed that dendrite spends a lot of time (~390 s of total execution time) in the `selectRoomIDsWithAnyMembershipSQL` query:
```
dendrite_syncapi=# select total_exec_time, left(query,100) from pg_stat_statements order by total_exec_time desc limit 5 ;
total_exec_time | left
--------------------+------------------------------------------------------------------------------------------------------
747826.5800519128 | SELECT event_id, id, headered_event_json, session_id, exclude_from_sync, transaction_id, history_vis
389130.5490339942 | SELECT DISTINCT room_id, membership FROM syncapi_current_room_state WHERE type = $2 AND state_key =
376104.17514700035 | SELECT psd.datname, xact_commit, xact_rollback, blks_read, blks_hit, tup_returned, tup_fetched, tup_
363644.164092031 | SELECT event_type_nid, event_state_key_nid, event_nid FROM roomserver_events WHERE event_nid = ANY($
58570.48104699995 | SELECT event_id, headered_event_json FROM syncapi_current_room_state WHERE room_id = $1 AND ( $2::te
(5 rows)
```
`EXPLAIN ANALYZE` showed that the query uses the `syncapi_room_state_unique`
index:
```
dendrite_syncapi=#
explain analyze SELECT distinct room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=2749.38..2749.56 rows=24 width=52) (actual time=2.933..2.956 rows=65 loops=1)
-> Sort (cost=2749.38..2749.44 rows=24 width=52) (actual time=2.932..2.937 rows=65 loops=1)
Sort Key: room_id, membership
Sort Method: quicksort Memory: 34kB
-> Index Scan using syncapi_room_state_unique on syncapi_current_room_state (cost=0.41..2748.83 rows=24 width=52) (actual time=0.030..2.890 rows=65 loops=1)
Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Planning Time: 0.140 ms
Execution Time: 2.990 ms
(8 rows)
```
Multi-column indexes in Postgres are expected to perform well for queries
on their leftmost columns, but this query filters only on `type` and
`state_key`, which are not the leading columns of
`syncapi_room_state_unique`. I therefore gave a dedicated index a try and
created `syncapi_current_room_state_type_state_key_idx`. The improvement
was significant: execution time dropped from 2.9 ms to 0.24 ms:
```
explain analyze SELECT distinct room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=96.46..96.64 rows=24 width=52) (actual time=0.199..0.218 rows=65 loops=1)
-> Sort (cost=96.46..96.52 rows=24 width=52) (actual time=0.199..0.202 rows=65 loops=1)
Sort Key: room_id, membership
Sort Method: quicksort Memory: 34kB
-> Bitmap Heap Scan on syncapi_current_room_state (cost=4.53..95.91 rows=24 width=52) (actual time=0.048..0.139 rows=65 loops=1)
Recheck Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Heap Blocks: exact=59
-> Bitmap Index Scan on syncapi_current_room_state_type_state_key_idx (cost=0.00..4.53 rows=24 width=0) (actual time=0.037..0.037 rows=65 loops=1)
Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Planning Time: 0.236 ms
Execution Time: 0.242 ms
(11 rows)
```
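For reference, the new index covers exactly the two filtered columns. A minimal sketch of the DDL, written as a schema constant in the style dendrite uses for its Postgres tables (the `IF NOT EXISTS` clause and exact formatting are my assumption, not necessarily the actual migration):
```go
// Sketch of the added index; it matches the Index Cond in the plan above.
// The exact DDL in the real migration may differ.
const createTypeStateKeyIndexSQL = `
CREATE INDEX IF NOT EXISTS syncapi_current_room_state_type_state_key_idx
    ON syncapi_current_room_state (type, state_key);
`
```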
The next improvement is to skip `DISTINCT` and rely on map assignment in
`SelectRoomIDsWithAnyMembership` instead. Execution time drops by almost half:
```
explain analyze SELECT room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on syncapi_current_room_state (cost=4.53..95.91 rows=24 width=52) (actual time=0.032..0.113 rows=65 loops=1)
Recheck Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Heap Blocks: exact=59
-> Bitmap Index Scan on syncapi_current_room_state_type_state_key_idx (cost=0.00..4.53 rows=24 width=0) (actual time=0.021..0.021 rows=65 loops=1)
Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Planning Time: 0.087 ms
Execution Time: 0.136 ms
(7 rows)
```
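A minimal sketch of the map-based deduplication in Go, assuming the query returns `(room_id, membership)` pairs; the package and helper names are illustrative, not the exact dendrite code:
```go
package postgres

import "database/sql"

// scanRoomMemberships replaces SQL DISTINCT with deduplication in Go:
// assigning into the map keeps a single membership per room ID even if
// the query returns duplicate rows.
func scanRoomMemberships(rows *sql.Rows) (map[string]string, error) {
	result := map[string]string{} // room ID -> membership
	for rows.Next() {
		var roomID, membership string
		if err := rows.Scan(&roomID, &membership); err != nil {
			return nil, err
		}
		result[roomID] = membership
	}
	return result, rows.Err()
}
```
Since the caller indexes the result by room ID anyway, the duplicate elimination comes for free, and the `Unique` and `Sort` nodes disappear from the plan.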
In our environment we spend only ~1 s in total on inserts into this table,
so the write penalty of the additional index should be small.
```
dendrite_syncapi=# select total_exec_time, left(query,100) from pg_stat_statements where query like '%INSERT%syncapi_current_room_state%' order by total_exec_time desc;
total_exec_time | left
--------------------+------------------------------------------------------------------------------------------------------
1139.9057619999971 | INSERT INTO syncapi_current_room_state (room_id, event_id, type, sender, contains_url, state_key, he
(1 row)
```
This PR does not require test modifications.
### Pull Request Checklist
* [x] I have added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x] Pull request includes a [sign
off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)
Signed-off-by: `Piotr Kozimor <p1996k@gmail.com>`
The introduced index improves select query performance: for example, the execution time of the `selectSendToDeviceMessagesSQL` query dropped from 80 ms to 15 ms. No sytest modifications are required.
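For context, a sketch of the kind of index this describes, assuming `selectSendToDeviceMessagesSQL` filters `syncapi_send_to_device` by user and device; the index name and column list here are assumptions, not necessarily the exact migration:
```go
// Hypothetical DDL for the described index; the actual migration may
// use a different name or column set.
const createSendToDeviceIndexSQL = `
CREATE INDEX IF NOT EXISTS syncapi_send_to_device_user_id_device_id_idx
    ON syncapi_send_to_device (user_id, device_id);
`
```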
### Pull Request Checklist
* [x] I have added tests for PR _or_ I have justified why this PR doesn't need tests.
* [x] Pull request includes a [sign off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)
Signed-off-by: `Piotr Kozimor <p1996k@gmail.com>`
* Add possibility to set history_visibility and user AccountType
* Add new DB queries
* Add actual history_visibility changes for /messages
* Add passing tests
* Extract check function
* Cleanup
* Cleanup
* Fix build on 386
* Move ApplyHistoryVisibilityFilter to internal
* Move queries to topology table
* Add filtering to /sync and /context
Some cleanup
* Add passing tests; Remove failing tests :(
* Re-add passing tests
* Move filtering to own function to avoid duplication
* Re-add passing test
* Use newly added GMSL HistoryVisibility
* Update gomatrixserverlib
* Set the visibility when creating events
* Default to shared history visibility
* Remove unused query
* Update history visibility checks to use gmsl
Update tests
* Remove unused statement
* Update migrations to set "correct" history visibility
* Add method to fetch the membership at a given event
* Tweaks and logging
* Use actual internal rsAPI, default to shared visibility in tests
* Revert "Move queries to topology table"
This reverts commit 4f0d41be9c.
* Remove noise/unneeded code
* More cleanup
* Try to optimize database requests
* Fix imports
* PR review fixes/changes
* Move setting history visibility to own migration, be more restrictive
* Fix unit tests
* Lint
* Fix missing entries
* Tweaks for incremental syncs
* Adapt generic changes
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
* Fix query issue, only add "changed" users if we actually share a room
* Avoid log spam if context is done
* Undo changes to filterSharedUsers
* Add logging again..
* Fix SQLite shared users query
* Change query to include invited users
* Add new db migration
* Update migrations
Remove goose
* Add possibility to test direct upgrades
* Try to fix WASM test
* Add checks for specific migrations
* Remove AddMigration
Use WithTransaction
Add Dendrite version to table
* Fix linter issues
* Update tests
* Update comments, outdent if
* Namespace migrations
* Add direct upgrade tests, skipping over one version
* Split migrations
* Update go version in CI
* Fix copy&paste mistake
* Use contexts in migrations
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add function to the sync API storage package for filtering shared users
* Use the database instead of asking the RS API
* Fix unit tests
* Fix map handling in `filterSharedUsers`
* Only load members of newly joined rooms
* Comment that the query is prepared at runtime
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Use filter and limit presence count
* More limiting
* More limiting
* Fix unit test
* Also limit presence by last_active_ts
* Update query, use "from" as the initial lastPos
* Get 1000 presence events, they are filtered later
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Don't create fictitious presence entries for users that don't have any
* Update whitelist, since that test probably shouldn't be passing
* Fix panics
Squashed commit of the following:
commit 0ec8de57261d573a5f88577aa9d7a1174d3999b9
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Apr 26 16:56:30 2022 +0100
Select filter onto provided target filter
commit da40b6fffbf5737864b223f49900048f557941f9
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Apr 26 16:48:00 2022 +0100
Specify other field too
commit ffc0b0801f63bb4d3061b6813e3ce5f3b4c8fbcb
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Apr 26 16:45:44 2022 +0100
Send as much account data as possible during complete sync
* syncapi: add more tests; fix more bugs
bugfixes:
- The postgres impl of TopologyTable.SelectEventIDsInRange did not use the provided txn
- The postgres impl of EventsTable.SelectEvents did not preserve the ordering of the input event IDs in the output events slice
- The sqlite impl of EventsTable.SelectEvents did not use a bulk `IN ($1)` query.
Added tests:
- `TestGetEventsInRangeWithTopologyToken`
- `TestOutputRoomEventsTable`
- `TestTopologyTable`
* -p 1 for now
* Add ignore users
* Ignore users in pushrules
Add passing tests
* Update sytest lists
* Store ignore knowledge in the sync API
* Fix copyrights
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Include joined and invite member counts in room summary
This should fix #2314 and also fix the problem where some clients, like Element Android, FluffyChat etc., would display the wrong member count for a given room.
* Improve SQLite query precision
* Check existence of state key for membership events
* Move receipt sending to own JetStream producer
* Move SendToDevice to producer
* Remove most parts of the EDU server
* Fix SendToDevice & copyrights
* Move structs, cleanup EDU Server traces
* Use HeadersOnly subscription
* Missing file
* Fix linter issues
* Move consumers to own files
* Rename durable consumer; Consumer cleanup
* Docs/config cleanup
* Convert stream positions into topological positions for both `from` and `to` in `/messages`
* Hopefully it works now
* Remove unnecessary logging
* Return sane values if `StreamToTopologicalPosition` can't work out the right thing to do
* Revert logging change
* tweaks
* Fix `selectEventIDsInRangeASCSQL`
* Test `Getting messages going forward is limited for a departed room (SPEC-216)` was passing incorrectly so un-whitelist it
* Add Pushserver component with Pushers API
Co-authored-by: Tommie Gannert <tommie@gannert.se>
Co-authored-by: Dan Peleg <dan@globekeeper.com>
* Wire Pushserver component
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add PushGatewayClient.
The full event format is required for Sytest.
* Add a pushrules module.
* Change user API account creation to use the new pushrules module's defaults.
Introduces "scope" as required by client API, and some small field
tweaks to make some 61push Sytests pass.
* Add push rules query/put API in Pushserver.
This manipulates account data over User API, and fires sync messages
for changes. Those sync messages should, according to an existing TODO
in clientapi, be moved to userapi.
Forks clientapi/producers/syncapi.go to pushserver/ for later extension.
* Add clientapi routes for push rules to Pushserver.
A cleanup would be to move more of the name-splitting logic into
pushrules.go, to depollute routing.go.
* Output rooms.join.unread_notifications in /sync.
This is the read-side. Pushserver will be the write-side.
* Implement pushserver/storage for notifications.
* Use PushGatewayClient and the pushrules module in Pushserver's room consumer.
* Use one goroutine per user to avoid locking up the entire server for
one bad push gateway.
* Split pushing by format.
* Send one device per push. Sytest does not support coalescing
multiple devices into one push. Matches Synapse. Either we change
Sytest, or remove the group-by-url-and-format logic.
* Write OutputNotificationData from push server. Sync API is already
the consumer.
* Implement read receipt consumers in Pushserver.
Supports m.read and m.fully_read receipts.
* Add clientapi route for /unstable/notifications.
* Rename to UpsertPusher for clarity and handle pusher update
* Fix linter errors
* Ignore body.Close() error check
* Fix push server internal http wiring
* Add 40 newly passing 61push tests to whitelist
* Add next 12 newly passing 61push tests to whitelist
* Send notification data before notifying users in EDU server consumer
* NATS JetStream
* Goodbye sarama
* Fix `NewStreamTokenFromString`
* Consume on the correct topic for the roomserver
* Don't panic, NAK instead
* Move push notifications into the User API
* Don't set null values since that apparently causes Element upsetti
* Also set omitempty on conditions
* Fix bug so that we don't override the push rules unnecessarily
* Tweak defaults
* Update defaults
* More tweaks
* Move `/notifications` onto `r0`/`v3` mux
* User API will consume events and read/fully read markers from the sync API with stream positions, instead of consuming directly
Co-authored-by: Piotr Kozimor <p1996k@gmail.com>
Co-authored-by: Tommie Gannert <tommie@gannert.se>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Don't request state events if we already have the timeline events (Postgres only)
* Rename variable
* nocyclo
* Add SQLite
* Tweaks
* Revert query change
* Don't dedupe if asking for full state
* Update query