This should transactional snapshot isolation for `/sync` etc requests.
For now we don't use repeatable read due to some odd test failures with
invites.
Based on #2480
This actually indexes events based on their event type. They are removed
from the index if we receive a `m.room.redaction` event on the
`OutputRoomEvent` stream.
An admin endpoint is added to reindex all existing events.
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
This PR changes the handling of notifications
- removes the `StreamEvent` and `ReadUpdate` stream
- listens on the `OutputRoomEvent` stream in the UserAPI to inform the
SyncAPI about unread notifications
- listens on the `OutputReceiptEvent` stream in the UserAPI to set
receipts/update notifications
- sets the `read_markers` directly from within the internal UserAPI
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Membership updater refactoring
* Pass in membership state
* Use membership check rather than referring to state directly
* Delete irrelevant membership states
* We don't need the leave event after all
* Tweaks
* Put a log entry in that I might stand a chance of finding
* Be less panicky
* Tweak invite handling
* Don't freak if we can't find the event NID
* Use event NID from `types.Event`
* Clean up
* Better invite handling
* Placate the almighty linter
* Blacklist a Sytest which is otherwise fine under Complement for reasons I don't understand
* Fix the sytest after all (thanks @S7evinK for the spot)
* Don't ask roomserver for events we already have in federation API
* Check number of events returned is as expected
* Preallocate array
* Improve shape a bit
* syncapi: use finer-grained interfaces when making the syncapi
* Use specific interfaces for syncapi-roomserver interactions
* Define query access token api for shared http auth code
* Roomserver input refactoring — again!
* Ensure the actor runs again
* Preserve consumer after unsubscribe
* Another sprinkling of magic
* Rename `TopicFor` to `Prefixed`
* Recreate the stream if the config is bad
* Check streams too
* Prefix subjects, preserve inboxes
* Recreate if subjects wrong
* Remove stream subject
* Reconstruct properly
* Fix mutex unlock
* Comments
* Fix tests
* Don't drop events
* Review comments
* Separate `queueInputRoomEvents` function
* Re-jig control flow a bit
* Don't send `adds_state_events` in roomserver output events anymore
* Set `omitempty` on some output fields that aren't always set
* Add `AddsState` helper function
* No-op if no added state event IDs
* Revert "No-op if no added state event IDs"
This reverts commit 71a0ef3df1.
* Revert "Add `AddsState` helper function"
This reverts commit c9fbe45475.
* Add Pushserver component with Pushers API
Co-authored-by: Tommie Gannert <tommie@gannert.se>
Co-authored-by: Dan Peleg <dan@globekeeper.com>
* Wire Pushserver component
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add PushGatewayClient.
The full event format is required for Sytest.
* Add a pushrules module.
* Change user API account creation to use the new pushrules module's defaults.
Introduces "scope" as required by client API, and some small field
tweaks to make some 61push Sytests pass.
* Add push rules query/put API in Pushserver.
This manipulates account data over User API, and fires sync messages
for changes. Those sync messages should, according to an existing TODO
in clientapi, be moved to userapi.
Forks clientapi/producers/syncapi.go to pushserver/ for later extension.
* Add clientapi routes for push rules to Pushserver.
A cleanup would be to move more of the name-splitting logic into
pushrules.go, to depollute routing.go.
* Output rooms.join.unread_notifications in /sync.
This is the read-side. Pushserver will be the write-side.
* Implement pushserver/storage for notifications.
* Use PushGatewayClient and the pushrules module in Pushserver's room consumer.
* Use one goroutine per user to avoid locking up the entire server for
one bad push gateway.
* Split pushing by format.
* Send one device per push. Sytest does not support coalescing
multiple devices into one push. Matches Synapse. Either we change
Sytest, or remove the group-by-url-and-format logic.
* Write OutputNotificationData from push server. Sync API is already
the consumer.
* Implement read receipt consumers in Pushserver.
Supports m.read and m.fully_read receipts.
* Add clientapi route for /unstable/notifications.
* Rename to UpsertPusher for clarity and handle pusher update
* Fix linter errors
* Ignore body.Close() error check
* Fix push server internal http wiring
* Add 40 newly passing 61push tests to whitelist
* Add next 12 newly passing 61push tests to whitelist
* Send notification data before notifying users in EDU server consumer
* NATS JetStream
* Goodbye sarama
* Fix `NewStreamTokenFromString`
* Consume on the correct topic for the roomserver
* Don't panic, NAK instead
* Move push notifications into the User API
* Don't set null values since that apparently causes Element upsetti
* Also set omitempty on conditions
* Fix bug so that we don't override the push rules unnecessarily
* Tweak defaults
* Update defaults
* More tweaks
* Move `/notifications` onto `r0`/`v3` mux
* User API will consume events and read/fully read markers from the sync API with stream positions, instead of consuming directly
Co-authored-by: Piotr Kozimor <p1996k@gmail.com>
Co-authored-by: Tommie Gannert <tommie@gannert.se>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Put federation client functions into their own file
* Look for missing auth events in RS input
* Remove retrieveMissingAuthEvents from federation API
* Logging
* Sorta transplanted the code over
* Use event origin failing all else
* Don't get stuck on mutexes:
* Add verifier
* Don't mark state events with zero snapshot NID as not existing
* Check missing state if not an outlier before storing the event
* Reject instead of soft-fail, don't copy roominfo so much
* Use synchronous contexts, limit time to fetch missing events
* Clean up some commented out bits
* Simplify `/send` endpoint significantly
* Submit async
* Report errors on sending to RS input
* Set max payload in NATS to 16MB
* Tweak metrics
* Add `workerForRoom` for tidiness
* Try skipping unmarshalling errors for RespMissingEvents
* Track missing prev events separately to avoid calculating state when not possible
* Tweak logic around checking missing state
* Care about state when checking missing prev events
* Don't check missing state for create events
* Try that again
* Handle create events better
* Send create room events as new
* Use given event kind when sending auth/state events
* Revert "Use given event kind when sending auth/state events"
This reverts commit 089d64d271.
* Only search for missing prev events or state for new events
* Tweaks
* We only have missing prev if we don't supply state
* Room version tweaks
* Allow async inputs again
* Apply backpressure to consumers/synchronous requests to hopefully stop things being overwhelmed
* Set timeouts on roomserver input tasks (need to decide what timeout makes sense)
* Use work queue policy, deliver all on restart
* Reduce chance of duplicates being sent by NATS
* Limit the number of servers we attempt to reduce backpressure
* Some review comment fixes
* Tidy up a couple things
* Don't limit servers, randomise order using map
* Some context refactoring
* Update gmsl
* Don't resend create events
* Set stateIDs length correctly or else the roomserver thinks there are missing events when there aren't
* Exclude our own servername
* Try backing off servers
* Make excluding self behaviour optional
* Exclude self from g_m_e
* Update sytest-whitelist
* Update consumers for the roomserver output stream
* Remember to send outliers for state returned from /gme
* Make full HTTP tests less upsetti
* Remove 'If a device list update goes missing, the server resyncs on the next one' from the sytest blacklist
* Remove debugging test
* Fix blacklist again, remove unnecessary duplicate context
* Clearer contexts, don't use background in case there's something happening there
* Don't queue up events more than once in memory
* Correctly identify create events when checking for state
* Fill in gaps again in /gme code
* Remove `AuthEventIDs` from `InputRoomEvent`
* Remove stray field
Co-authored-by: Kegan Dougal <kegan@matrix.org>
* Use named NATS durable consumers
* Build fixes
* Remove dupe call to SetFederationAPI
* Use namespaced consumer name
* Fix namespacing
* Fix unit tests hopefully
* Add NATS JetStream support
Update shopify/sarama
* Fix addresses
* Don't change Addresses in Defaults
* Update saramajetstream
* Add missing error check
Keep typing events for at least one minute
* Use all configured NATS addresses
* Update saramajetstream
* Try setting up with NATS
* Make sure NATS uses own persistent directory (TODO: make this configurable)
* Update go.mod/go.sum
* Jetstream package
* Various other refactoring
* Build fixes
* Config tweaks, make random jetstream storage path for CI
* Disable interest policies
* Try to sane default on jetstream base path
* Try to use in-memory for CI
* Restore storage/retention
* Update nats.go dependency
* Adapt changes to config
* Remove unneeded TopicFor
* Dep update
* Revert "Remove unneeded TopicFor"
This reverts commit f5a4e4a339.
* Revert changes made to streams
* Fix build problems
* Update nats-server
* Update go.mod/go.sum
* Roomserver input API queuing using NATS
* Fix topic naming
* Prometheus metrics
* More refactoring to remove saramajetstream
* Add missing topic
* Don't try to populate map that doesn't exist
* Roomserver output topic
* Update go.mod/go.sum
* Message acknowledgements
* Ack tweaks
* Try to resume transaction re-sends
* Try to resume transaction re-sends
* Update to matrix-org/gomatrixserverlib@91dadfb
* Remove internal.PartitionStorer from components that don't consume keychanges
* Try to reduce re-allocations a bit in resolveConflictsV2
* Tweak delivery options on RS input
* Publish send-to-device messages into correct JetStream subject
* Async and sync roomserver input
* Update dendrite-config.yaml
* Remove roomserver tests for now (they need rewriting)
* Remove roomserver test again (was merged back in)
* Update documentation
* Docker updates
* More Docker updates
* Update Docker readme again
* Fix lint issues
* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)
* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that
* Go 1.16 instead of Go 1.13 for upgrade tests and Complement
* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"
This reverts commit 368675283f.
* Don't report any errors on `/send` to see what fun that creates
* Fix panics on closed channel sends
* Enforce state key matches sender
* Do the same for leave
* Various tweaks to make tests happier
Squashed commit of the following:
commit 13f9028e7a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 15:47:14 2022 +0000
Do the same for leave
commit e6be7f05c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 15:33:42 2022 +0000
Enforce state key matches sender
commit 85ede6d64b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 14:07:04 2022 +0000
Fix panics on closed channel sends
commit 9755494a98
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 13:38:22 2022 +0000
Don't report any errors on `/send` to see what fun that creates
commit 3bb4f87b5d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 13:00:26 2022 +0000
Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"
This reverts commit 368675283f.
commit fe2673ed7b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 12:09:34 2022 +0000
Go 1.16 instead of Go 1.13 for upgrade tests and Complement
commit 368675283f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 11:51:45 2022 +0000
Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that
commit b028dfc085
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 10:29:08 2022 +0000
Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)
* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork
* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers
* Fix consumer component name in federation API
* Add comment explaining where streams are defined
* Tweaks to roomserver input with comments
* Finish that sentence that I apparently forgot to finish in INSTALL.md
* Bump version number of config to 2
* Add comments around asynchronous sends to roomserver in processEventWithMissingState
* More useful error message when the config version does not match
* Set version in generate-config
* Fix version in config.Defaults
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* It's half-alive
* Wakeups largely working
* Other tweaks, typing works
* Fix bugs, add receipt stream
* Delete notifier, other tweaks
* Dedupe a bit, add a template for the invite stream
* Clean up, add templates for other streams
* Don't leak channels
* Bring forward some more PDU logic, clean up other places
* Add some more wakeups
* Use addRoomDeltaToResponse
* Log tweaks, typing fixed?
* Fix timed out syncs
* Don't reset next batch position on timeout
* Add account data stream/position
* End of day
* Fix complete sync for receipt, typing
* Streams package
* Clean up a bit
* Complete sync send-to-device
* Don't drop errors
* More lightweight notifications
* Fix typing positions
* Don't advance position on remove again unless needed
* Device list updates
* Advance account data position
* Use limit for incremental sync
* Limit fixes, amongst other things
* Remove some fmt.Println
* Tweaks
* Re-add notifier
* Fix invite position
* Fixes
* Notify account data without advancing PDU position in notifier
* Apply account data position
* Get initial position for account data
* Fix position update
* Fix complete sync positions
* Review comments @Kegsay
* Room consumer parameters
* Update sync responses
* Fix positions, add ApplyUpdates
* Fix MarshalText as non-pointer, PrevBatch is optional
* Increment by number of read receipts
* Merge branch 'master' into neilalexander/devicelist
* Tweak typing
* Include keyserver position tweak
* Fix typing next position in all cases
* Tweaks
* Fix typo
* Tweaks, restore StreamingToken.MarshalText which somehow went missing?
* Rely on positions from notifier rather than manually advancing them
* Revert "Rely on positions from notifier rather than manually advancing them"
This reverts commit 53112a62cc.
* Give invites their own position, fix other things
* Fix test
* Fix invites maybe
* Un-whitelist tests that look to be genuinely wrong
* Use real receipt positions
* Ensure send-to-device uses real positions too
* Set RewritesState once
* Check if any new state provided
* Obey rewritesState
* Don't nuke everything the sync API knows when purging state
* Fix panic from duplicate insert
* Consistency
* Use HasState
* Remove nolint
* Clean up joined rooms on state rewrite
* Add KindOld
* Don't process latest events/memberships for old events
* Allow federationsender to ignore duplicate key entries when LatestEventIDs is duplicated by RS output events
* Signal to downstream components if an event has become a forward extremity
* Don't exclude from sync
* Soft-fail checks on KindNew
* Don't run the latest events updater at all for KindOld
* Don't make federation sender change after all
* Kind in federation sender join
* Don't send isForwardExtremity
* Fix syncapi
* Update comments
* Fix SendEventWithState
* Update sytest-whitelist
* Generate old output events
* Sync API consumes old room events
* Update comments
* SendEventWithState events as new
* Use cumulative state IDs for final event
* Error wrapping in calculateAndSetState
* Handle overwriting same event type and state key
* Hacky way to spot historical events
* Don't exclude from sync
* Don't generate output events when rewriting forward extremities
* Update output event check
* Historical output events
* Define output room event type
* Notify key changes on state
* Don't send our membership event twice
* Deduplicate state entries
* Tweaks
* Remove unnecessary nolint
* Fix current state upsert in sync API
* Send auth events as outliers, state events as rewrite
* Sync API don't consume state events
* Process events actually
* Improve outlier check
* Fix local room check
* Remove extra room check, it seems to break the whole damn world
* Fix federated join check
* Fix nil pointer exception
* Better comments on DeduplicateStateEntries
* Reflow forced federated joins
* Don't force federated join for possibly even local invites
* Comment SendEventWithState better
* Rewrite room state in sync API storage
* Add TODO
* Clean up all room data when receiving create event
* Don't generate output events for rewrites, but instead notify that state is rewritten on the final new event
* Rename to PurgeRoom
* Exclude backfilled messages from /sync
* Split out rewriting state from updating state from state res
Co-authored-by: Kegan Dougal <kegan@matrix.org>
* Initial pass at refactoring config (not finished)
* Don't forget current state and EDU servers
* More shifting around
* Update server key API tests
* Fix roomserver test
* Fix more tests
* Further tweaks
* Fix current state server test (sort of)
* Maybe fix appservices
* Fix client API test
* Include database connection string in database options
* Fix sync API build
* Update config test
* Fix unit tests
* Fix federation sender build
* Fix gobind build
* Set Listen address for all services in HTTP monolith mode
* Validate config, reinstate appservice derived in directory, tweaks
* Tweak federation API test
* Set MaxOpenConnections/MaxIdleConnections to previous values
* Update generate-config
* Recheck device lists when join/leave events come in
* Add PerformDeviceDeletion
* Notify clients when devices are deleted
* Unbreak things
* Remove debug logging
* Add support for logs in StreamingToken
Tokens now end up looking like `s11_22|dl-0-123|ab-0-12224`
where `dl` and `ab` are log names, `0` is the partition and
`123` and `12224` are the offsets.
* Also test reserialisation
* s/|/./g so tokens url escape nicely
* Add a bit more logging to the fedsender
* bugfix: continue sending PDUs if ones are added whilst sending another PDU
Without this, the queue goes back to sleep on `<-oq.notifyPDUs` which won't
fire because `pendingPDUs` is already > 0. This should fix a flakey sytest.
* Break if no txn is sent
* WIP syncapi work
* More debugging
* Bump GMSL version to pull in working Event.Redact
* Remove logging
* Make redactions work on v3+
* Fix more tests
This is a wrapper around whatever impl we have which then logs
the function name/request/response/error.
Also tweak when we log on kafka streams: only log on the producer
side not the consumer side: we've never had issues with comms and
having 1 message rather than N would be nice.
* s/QueryBackfill/PerformBackfill/g
* OutputEvent now includes AddStateEvents which contain the full event of extra state events
* Only include adds not the current event
* Get adding state right
* syncapi: Rename and split out tokens
Previously we used the badly named `PaginationToken` which was
used for both `/sync` and `/messages` requests. This quickly
became confusing because named fields like `PDUPosition` meant
different things depending on the token type. Instead, we now have
two token types: `TopologyToken` and `StreamingToken`, both of
which have fields which make more sense for their specific situations.
Updated the codebase to use one or the other. `PaginationToken` still
lives on as `syncToken`, an unexported type which both tokens rely on.
This allows us to guarantee that the specific mappings of positions
to a string remain solely under the control of the `types` package.
This enables us to move high-level conceptual things like
"decrement this topological token" to function calls e.g
`TopologicalToken.Decrement()`.
Currently broken because `/messages` seemingly used both stream and
topological tokens, though I need to confirm this.
* final tweaks/hacks
* spurious logging
* Review comments and linting
* Consolidation of roomserver APIs
* Comment out alias tests for now, they are broken
* Wire AS API into roomserver again
* Roomserver didn't take asAPI param before so return to that
* Prevent roomserver asking AS API for alias info
* Rename some files
* Remove alias_test, incoherent tests and unwanted appservice integration
* Remove FS API inject on syncapi component