* db migration: fix#1844 and add additional assertions
- Migration scripts will now check to see if there are any unconverted
snapshot IDs and fail the migration if there are any. This should
prevent people from getting a corrupt database in the event the root
cause is still unknown.
- Add an ORDER BY clause when doing batch queries in the postgres
migration. LIMIT and OFFSET without ORDER BY are undefined and must
not be relied upon to produce a deterministic ordering (e.g row order).
See https://www.postgresql.org/docs/current/queries-limit.html
* Linting
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Hash-deduplicated state storage (and migrations) for PostgreSQL and SQLite
* Refactor droomserver database setup for migrations
* Fix conflict statements
* Update migration names
* Set a boundary for old to new block/snapshot IDs so we don't rewrite them more than once accidentally
* Create sequence if not exists
* Fix boundary queries
* Fix boundary queries
* Use Query
* Break out queries a bit
* More sequence tweaks
* Query parameters are not playing the game
* Injection escaping may not work for CREATE SEQUENCE after all
* Fix snapshot sequence name
* Use boundaried IDs in SQLite too
* Use IFNULL for SQLite
* Use COALESCE in PostgreSQL
* Review comments @Kegsay
* Check membership of room
* Use QueryStateAfterEventsResponse
* Fix complexity
* Add field ShouldHitAppservice to GetRoomIDForAlias
* Hit appservice when trying to join a non-existent alias
* remove unused
* Changes that I made a long time ago
* Rename to appserviceJoinedAtEvent
* Check membership in GetMemberships
* Update QueryMembershipsForRoom
* Tweaks in client API
* Update appserviceJoinedAtEvent
* Comments
* Try QueryMembershipForUser instead
* Undo some changes to client API that shouldn't be needed
* More /event tweaks
* Refactor /event bit
* Go back to QueryMembershipsForRoom because appservices are hard
* Fix bugs in onMessage
* Add comments
* More logical naming, clean up a bit
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Check membership of room
* Use QueryStateAfterEventsResponse
* Fix complexity
* Changes that I made a long time ago
* Rename to appserviceJoinedAtEvent
* Check membership in GetMemberships
* Update QueryMembershipsForRoom
* Tweaks in client API
* Update appserviceJoinedAtEvent
* Comments
* Try QueryMembershipForUser instead
* Undo some changes to client API that shouldn't be needed
* More /event tweaks
* Refactor /event bit
* Go back to QueryMembershipsForRoom because appservices are hard
* Fix bugs in onMessage
* Add comments
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Look up servers less often, don't hit API for missing auth events unless there are actually missing auth events
* Remove ResolveConflictsAdhoc (since it is already in GMSL), other tweaks
* Update gomatrixserverlib to matrix-org/gomatrixserverlib#254
* Fix resolve-state
* Initialise t.servers on first use
MSC1772 needs this because the create event contains info on if
the room is a space or not. The create event itself isn't sensitive
so other people may find this useful too.
* a very very WIP first cut of peeking via MSC2753.
doesn't yet compile or work.
needs to actually add the peeking block into the sync response.
checking in now before it gets any bigger, and to gather any initial feedback on the vague shape of it.
* make PeekingDeviceSet private
* add server_name param
* blind stab at adding a `peek` section to /sync
* make it build
* make it launch
* add peeking to getResponseWithPDUsForCompleteSync
* cancel any peeks when we join a room
* spell out how to runoutside of docker if you want speed
* fix SQL
* remove unnecessary txn for SelectPeeks
* fix s/join/peek/ cargocult fail
* HACK: Track goroutine IDs to determine when we write by the wrong thread
To use: set `DENDRITE_TRACE_SQL=1` then grep for `unsafe`
* Track partition offsets and only log unsafe for non-selects
* Put redactions in the writer goroutine
* Update filters on writer goroutine
* wrap peek storage in goid hack
* use exclusive writer, and MarkPeeksAsOld more efficiently
* don't log ascii in binary at sql trace...
* strip out empty roomd deltas
* re-add txn to SelectPeeks
* re-add accidentally deleted field
* reject peeks for non-worldreadable rooms
* move perform_peek
* fix package
* correctly refactor perform_peek
* WIP of implementing MSC2444
* typo
* Revert "Merge branch 'kegan/HACK-goid-sqlite-db-is-locked' into matthew/peeking"
This reverts commit 3cebd8dbfb, reversing
changes made to ed4b3a58a7.
* (almost) make it build
* clean up bad merge
* support SendEventWithState with optional event
* fix build & lint
* fix build & lint
* reinstate federated peeks in the roomserver (doh)
* fix sql thinko
* todo for authenticating state returned by /peek
* support returning current state from QueryStateAndAuthChain
* handle SS /peek
* reimplement SS /peek to prod the RS to tell the FS about the peek
* rename RemotePeeks as OutboundPeeks
* rename remote_peeks_table as outbound_peeks_table
* add perform_handle_remote_peek.go
* flesh out federation doc
* add inbound peeks table and hook it up
* rename ambiguous RemotePeek as InboundPeek
* rename FSAPI's PerformPeek as PerformOutboundPeek
* setup inbound peeks db correctly
* fix api.SendEventWithState with no event
* track latestevent on /peek
* go fmt
* document the peek send stream race better
* fix SendEventWithRewrite not to bail if handed a non-state event
* add fixme
* switch SS /peek to use SendEventWithRewrite
* fix comment
* use reverse topo ordering to find latest extrem
* support postgres for federated peeking
* go fmt
* back out bogus go.mod change
* Fix performOutboundPeekUsingServer
* Fix getAuthChain -> GetAuthChain
* Fix build issues
* Fix build again
* Fix getAuthChain -> GetAuthChain
* Don't repeat outbound peeks for the same room ID to the same servers
* Fix lint
* Don't omitempty to appease sytest
Co-authored-by: Kegan Dougal <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Light-weight checking of state changes when updating forward extremities
* Only do this for non-state events, since state events will always result in state change at extremities
Squashed commit of the following:
commit e5e2d793119733ecbcf9b85f966e018ab0318741
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Wed Jan 13 17:28:12 2021 +0000
Add dendrite_roomserver_processroomevent_duration_millis to prometheus
* Add RoomInfo cache, remove RoomServerRoomNID cache, ensure caches are thread-safe
* Don't panic if the roomInfo isn't known yet
* LRU package is already threadsafe
* Use RoomInfo cache to find room version if possible in Events()
* Adding comments about RoomInfoCache safety
* Hit the database far less to find room NIDs for event NIDs
* Close the rows
* Fix SQLite selectRoomNIDsForEventNIDsSQL
* Give same treatment to room version lookups
* Update GMSL
* Add MSC2836EventRelationships to fedsender
* Call MSC2836EventRelationships in reqCtx
* auth remote servers
* Extract room ID and servers from previous events; refactor a bit
* initial cut of federated threading
* Use the right client/fed struct in the response
* Add QueryAuthChain for use with MSC2836
* Add auth chain to federated response
* Fix pointers
* under CI: more logging and enable mscs, nil fix
* Handle direction: up
* Actually send message events to the roomserver..
* Add children and children_hash to unsigned, with tests
* Add logic for exploring threads and tracking children; missing storage functions
* Implement storage functions for children
* Add fetchUnknownEvent
* Do federated hits for include_children if we have unexplored children
* Use /ev_rel rather than /event as the former includes child metadata
* Remove cross-room threading impl
* Enable MSC2836 in the p2p demo
* Namespace mscs db
* Enable msc2836 for ygg
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Don't recalculate event IDs so often
* Revert invite change
* Make sure we're using the right NIDs
* Update gomatrixserverlib
* Update to NewEventFromTrustedJSONWithEventID
* Fix go.mod
* Update gomatrixserverlib to matrix-org/gomatrixserverlib#243
* Use BulkSelectEventID
* Add mscs/hooks package, begin work for msc2836
* Flesh out hooks and add SQL schema
* Begin implementing core msc2836 logic
* Add test harness
* Linting
* Implement visibility checks; stub out APIs for tests
* Flesh out testing
* Flesh out walkThread a bit
* Persist the origin_server_ts as well
* Edges table instead of relationships
* Add nodes table for event metadata
* LEFT JOIN to extract origin_server_ts for children
* Add graph walking structs
* Implement walking algorithm
* Add more graph walking tests
* Add auto_join for local rooms
* Fix create table syntax on postgres
* Add relationship_room_id|servers to the unsigned section of events
* Persist the parent room_id/servers in edge metadata
Other events cannot assert the true room_id/servers for the
parent event, only make claims to them, hence why this is
edge metadata.
* guts to pass through room_id/servers
* Refactor msc2836 to allow handling from federation
* Add JoinedVia to PerformJoin responses
* Fix tests; review comments
The previous implementation was only checking if room history was
"shared", which it wasn't for rooms where a user was invited, or world
readable rooms.
This implementation leverages the IsServerAllowed method, which already
implements the complete verification algorithm.
Signed-off-by: `Mayeul Cantan <oss+matrix@mayeul.net>`
Co-authored-by: Kegsay <kegan@matrix.org>
* Add basic storage methods
* Add internal api handler
* Add check for forgotten room
* Add /rooms/{roomID}/forget endpoint
* Add missing rsAPI method
* Remove unused parameters
* Add passing tests
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Add missing file
* Add postgres migration
* Add sqlite migration
* Use Forgetter to forget room
* Remove empty line
* Update HTTP status codes
It looks like the spec calls for these to be 400, rather than 403: https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-rooms-roomid-forget
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add configuration for max_message_bytes for sarama
* Log all errors when sending multiple messages
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Add missing config
* - Better comments on what MaxMessageBytes is used for
- Also sets the size the consumer may use
* Set RewritesState once
* Check if any new state provided
* Obey rewritesState
* Don't nuke everything the sync API knows when purging state
* Fix panic from duplicate insert
* Consistency
* Use HasState
* Remove nolint
* Clean up joined rooms on state rewrite
* Add resolve-state helper
* Tweaks
* Refactor forward extremities, again
* Tweaks
* Minor optimisation
* Make path a bit clearer
* Only process state/membership if forward extremities have changed
* Usage comments in resolve-state
* Fix sqite locking bugs present on sytest
Comments do the explaining.
* Fix deadlock in sqlite mode
Caused by starting a writer whilst within a writer
* Only complain about invalid state deltas for non-overwrite events
* Do not re-process outlier unnecessarily
* Add KindOld
* Don't process latest events/memberships for old events
* Allow federationsender to ignore duplicate key entries when LatestEventIDs is duplicated by RS output events
* Signal to downstream components if an event has become a forward extremity
* Don't exclude from sync
* Soft-fail checks on KindNew
* Don't run the latest events updater at all for KindOld
* Don't make federation sender change after all
* Kind in federation sender join
* Don't send isForwardExtremity
* Fix syncapi
* Update comments
* Fix SendEventWithState
* Update sytest-whitelist
* Generate old output events
* Sync API consumes old room events
* Update comments
* Start Kafka connection for each component that needs one
* Fix roomserver unit tests
* Rename to naffkaInstance (@Kegsay review comment)
* Fix import cycle
* Capture errors
* Don't request only state key tuples needed for auth (we end up discarding room state this way)
* QueryStateAfterEvent returns all state when no tuples supplied
* Resolve state
* Comments
* Adjust backfill to send backward extremity with state before other backfilled events, include prev_events with no state amongst missing events
* Not finished refactor
* Fix test
* Remove isInboundTxn
* Remove debug logging
We expect to have missing events as we walk back in the DAG over federation
as we didn't always create the room. When checking if the server is allowed
to see those events, just give up and stop rather than fail the request.
* Rename serverkeyapi to signingkeyserver
We use "api" for public facing stuff and "server" for internal stuff.
As the server key API is internal only, we call it 'signing key server',
which also clarifies the type of key (as opposed to TLS keys, E2E keys, etc)
* Convert docker/scripts to use signing-key-server
* Rename missed bits
* Deep forward extremity calculation
* Use updater txn
* Update error
* Update error
* Create previous event references in StoreEvent
* Use latest events updater to row-lock prev events
* Fix unexpected fallthrough
* Fix deadlock
* Don't roll back
* Update comments in calculateLatest
* Don't include events that we can't find references for in the forward extremities
* Add another passing test
* Don't send rewrite events
* Remove final traces of rewrite events
* Remove test that is no longer needed
* Revert "Remove test that is no longer needed"
This reverts commit 9a45babff6.
* Update test to use KindOutlier
* Resolve state after event against current room state when determining latest state changes
* Update sytest-whitelist
* Update sytest-whitelist, blacklist
* Try to ask other servers in the room for missing events if the origin won't provide them
* Logging
* More logging
* Implement QueryMissingAuthPrevEvents
* Try to get missing auth events badly
* Use processEvent
* Logging
* Update QueryMissingAuthPrevEvents
* Try to find missing auth events
* Patchy fix for test
* Logging tweaks
* Send auth events as outliers
* Update check in QueryMissingAuthPrevEvents
* Error responses
* More return codes
* Don't return error on reject/soft-fail since it was ultimately handled
* More tweaks
* More error tweaks
* Sanity-check room version on RS event input
* Update gomatrixserverlib
* Reject make_join when no room members are left
* Revert some changes from wrong branch
* Distinguish between room not existing and room being abandoned on this server
* nolint
* Replace all usages of txn.Stmt with sqlutil.TxStmt
Signed-off-by: Sam Day <me@samcday.com>
* Fix sign off link in PR template.
Signed-off-by: Sam Day <me@samcday.com>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* WIP Event rejection
* Still send back errors for rejected events
Instead, discard them at the federationapi /send layer rather than
re-implementing checks at the clientapi/PerformJoin layer.
* Implement rejected events
Critically, rejected events CAN cause state resolution to happen
as it can merge forks in the DAG. This is fine, _provided_ we
do not add the rejected event when performing state resolution,
which is what this PR does. It also fixes the error handling
when NotAllowed happens, as we were checking too early and needlessly
handling NotAllowed in more than one place.
* Update test to match reality
* Modify InputRoomEvents to no longer return an error
Errors do not serialise across HTTP boundaries in polylith mode,
so instead set fields on the InputRoomEventsResponse. Add `Err()`
function to make the API shape basically the same.
* Remove redundant returns; linting
* Update blacklist
* SendEventWithState events as new
* Use cumulative state IDs for final event
* Error wrapping in calculateAndSetState
* Handle overwriting same event type and state key
* Hacky way to spot historical events
* Don't exclude from sync
* Don't generate output events when rewriting forward extremities
* Update output event check
* Historical output events
* Define output room event type
* Notify key changes on state
* Don't send our membership event twice
* Deduplicate state entries
* Tweaks
* Remove unnecessary nolint
* Fix current state upsert in sync API
* Send auth events as outliers, state events as rewrite
* Sync API don't consume state events
* Process events actually
* Improve outlier check
* Fix local room check
* Remove extra room check, it seems to break the whole damn world
* Fix federated join check
* Fix nil pointer exception
* Better comments on DeduplicateStateEntries
* Reflow forced federated joins
* Don't force federated join for possibly even local invites
* Comment SendEventWithState better
* Rewrite room state in sync API storage
* Add TODO
* Clean up all room data when receiving create event
* Don't generate output events for rewrites, but instead notify that state is rewritten on the final new event
* Rename to PurgeRoom
* Exclude backfilled messages from /sync
* Split out rewriting state from updating state from state res
Co-authored-by: Kegan Dougal <kegan@matrix.org>
* Remove QueryBulkStateContent from current state server
Expected fail due to db impl not existing
* Implement query bulk state content
* Fix up rejecting invites over federation
* Fix bulk content marshalling
* Move currentstateserver API to roomserver
Stub out DB functions for now, nothing uses the roomserver version yet.
* Allow it to startup
* Implement some current-state-server storage interface functions
* Add missing package
* Initial FIFOing of roomserver inputs
* Remove EventID response from api.InputRoomEventsResponse
* Don't send back event ID unnecessarily
* Fix ordering hopefully
* Reduce copies, use buffered task channel to reduce contention on other rooms
* Fix error handling
* Add Queryer and use embedded structs
* Add Inputer and factor out more RS API stuff
This neatly splits up the RS API based on the functionality it provides,
whilst providing a useful place for code sharing via the `helpers` package.
* Use federation sender for backfill and getting missing events
* Fix internal URL paths
* Update go.mod/go.sum for matrix-org/gomatrixserverlib#218
* Add missing server implementations in HTTP interface
- New package `perform` which contains all `Perform` functions
- New package `helpers` which contains helper functions used by both
perform and query/input functions.
- Perform invite/leave have no idea how to `WriteOutputEvents` and this
is now returned from `PerformInvite` or `PerformLeave` respectively.
Still to do:
- RSAPI is fed into the inviter/joiner/leaver - this introduces circular
logic so will need to be removed.
- Put query operations in a `query` package.
- Put input operations (and output) in an `input` package.
- Factor out helper functions as much as possible, possibly rejigging the
storage layer in the process.
* Initial work on roomserver NID caches
* Give caches to roomserver storage
* Populate caches
* Fix bugs
* Fix WASM build
* Don't hit cache twice in RoomNIDExcludingStubs
* Store reverse room ID-room NID mapping, consult caches when assigning NIDs
* Offset updates take place using TransactionWriter
* Refactor TransactionWriter in current state server
* Refactor TransactionWriter in federation sender
* Refactor TransactionWriter in key server
* Refactor TransactionWriter in media API
* Refactor TransactionWriter in server key API
* Refactor TransactionWriter in sync API
* Refactor TransactionWriter in user API
* Fix deadlocking Sync API tests
* Un-deadlock device database
* Fix appservice API
* Rename TransactionWriters to Writers
* Move writers up a layer in sync API
* Document sqlutil.Writer interface
* Add note to Writer documentation
* Per-room input mutex
* GetMembership should use transaction when assigning state key NID
* Actually use writer transactions rather than ignoring them
* Limit per-room mutexes to Postgres
* Flip the check in InputRoomEvents
* Updated TransactionWriters, moved locks in roomserver, various other tweaks
* Fix redaction deadlocks
* Fix lint issue
* Rename SQLiteTransactionWriter to ExclusiveTransactionWriter
* Fix us not sending transactions through in latest events updater
* Make PerformJoin send input membership event
* Invite input room events in separate goroutine
* Don't limit roomserver input events using request context
* Synchronous input room events
* Nope, that didn't work
* oops send state key to GetMembership
* Don't generate stripped state in client API more times than necessary, generate output events on receiving end of federated invite
* Commit membership updater changes
* Tweaks
* Initial pass at refactoring config (not finished)
* Don't forget current state and EDU servers
* More shifting around
* Update server key API tests
* Fix roomserver test
* Fix more tests
* Further tweaks
* Fix current state server test (sort of)
* Maybe fix appservices
* Fix client API test
* Include database connection string in database options
* Fix sync API build
* Update config test
* Fix unit tests
* Fix federation sender build
* Fix gobind build
* Set Listen address for all services in HTTP monolith mode
* Validate config, reinstate appservice derived in directory, tweaks
* Tweak federation API test
* Set MaxOpenConnections/MaxIdleConnections to previous values
* Update generate-config
* Modify /state/{eventType}/{stateKey} to return the event at the time the user left
Or live, depending on their current state. Hopefully fixes some sytests!
* Linting
* Set HasBeenInRoom
* Fix cases for world-readable history visibility
* Fix bug in finding the requested state event
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Use TransactionWriter on other component SQLites
* Fix sync API tests
* Fix panic in media API
* Fix a couple of transactions
* Fix wrong query, add some logging output
* Add debug logging into StoreEvent
* Adjust InsertRoomNID
* Update logging
* Add a bit more logging to the fedsender
* bugfix: continue sending PDUs if ones are added whilst sending another PDU
Without this, the queue goes back to sleep on `<-oq.notifyPDUs` which won't
fire because `pendingPDUs` is already > 0. This should fix a flakey sytest.
* Break if no txn is sent
* WIP syncapi work
* More debugging
* Bump GMSL version to pull in working Event.Redact
* Remove logging
* Make redactions work on v3+
* Fix more tests
* Emit redacted_event from the roomserver when redactions are validated
- Consume them in the currentstateserver and act accordingly.
- Add integration test for the roomserver to check that injecting
`m.room.redaction` events result in `redacted_event` being emitted.
* Linting
* Ignore events that redact themselves
* Implement core redaction logic
- Add a new `redactions_table.go` which tracks the mapping of
the redaction event ID and the redacted event ID
- Mark redactions as 'validated' when we have both events.
- When redactions are validated, add `unsigned.redacted_because`
and modify the `eventJSON` accordingly.
Note: We currently do NOT redact the event content - it's gated
behind a feature flag - until we have tested redactions a bit more.
* Linting
* Use content_value instead of membership
* Fix build
* Replace publicroomsapi with a combination of clientapi/roomserver/currentstateserver
- All public rooms paths are now handled by clientapi
- Requests to (un)publish rooms are sent to the roomserver via `PerformPublish`
which are stored in a new `published_table.go`
- Requests for public rooms are handled in clientapi by:
* Fetch all room IDs which are published using `QueryPublishedRooms` on the roomserver.
* Apply pagination parameters to the slice.
* Do a `QueryBulkStateContent` request to the currentstateserver to pull out
required state event *content* (not entire events).
* Aggregate and return the chunk.
Mostly but not fully implemented (DB queries on currentstateserver are missing)
* Fix pq query
* Make postgres work
* Make sqlite work
* Fix tests
* Unbreak pagination tests
* Linting
* Return remote errors from FS.PerformJoin
Follows the same pattern as PerformJoin on roomserver (no error return).
Also return the right format for incompatible room version errors.
Makes a bunch of tests pass!
* Handle network errors better when returning remote HTTP errors
* Linting
* Fix tests
* Update whitelist, pass network errors through in API=1 mode
* Add PerformInvite and refactor how errors get handled
- Rename `JoinError` to `PerformError`
- Remove `error` from the API function signature entirely. This forces
errors to be bundled into `PerformError` which makes it easier for callers
to detect and handle errors. On network errors, HTTP clients will make a
`PerformError`.
* Unbreak everything; thanks Go!
* Send back JSONResponse according to the PerformError
* Update federation invite code too
* Pass join errors through internal API boundaries
Required for certain invite sytests. We will need to think of a
better way of handling this going forwards.
* Include m.room.avatar in stripped state; handle trailing slashes when GETing state events
* Update whitelist
* Update whitelist
* Minor perf/debugging improvements
- publicroomsapi: Don't call QueryEventsByID with no event IDs
- appservice: Consume only if there are 1 or more ASes
- roomserver: don't keep a copy of the request "for debugging" - we trace now
* fedsender: return early if we have no destinations
* Unbreak tests
This is a wrapper around whatever impl we have which then logs
the function name/request/response/error.
Also tweak when we log on kafka streams: only log on the producer
side not the consumer side: we've never had issues with comms and
having 1 message rather than N would be nice.
* s/QueryBackfill/PerformBackfill/g
* OutputEvent now includes AddStateEvents which contain the full event of extra state events
* Only include adds not the current event
* Get adding state right
* Remove clientapi producers which aren't actually producers
They are actually just convenience wrappers around the internal APIs
for roomserver/eduserver. Move their logic to their respective `api`
packages and call them directly.
* Remove TODO
* unbreak ygg
* Split out adding HTTP routes from making internal APIs for clarity
* Split out more components
* Split out more things
* Finish converting
* internal mux for internal routes
* Move Updater structs to shared and use it for postgres
* Add constructors for NewXXXUpdater and a useTxns flag
In sqlite, we set useTxns=false and comment why.
* Handle nil txn
* Handle nil in transaction
* Missed one
* Close the txn at the right time
* Don't close the transaction as we reuse it between calls
* Add missing routing for PerformDirectoryLookupRequest
* Tweak output
* Fix some bugs in devices
* Don't default to federated room joins in response to invite
* Update sytest-whitelist
* Update comments
* Return correct room ID from PerformJoin
* Fix appservice and EDU server API setup, update sytest-whitelist
* Update sytest-whitelist
* Separate muxes for public and internal APIs
* Update client-api-proxy and federation-api-proxy so they don't add /api to the path
* Tidy up
* Consistent HTTP setup
* Set up prefixes properly
* sytest: Make 'Inbound federation can backfill events' pass
This breaks 'Outbound federation can backfill events' because now
we are returning the right number of events, which the previous
test was relying on.
Previously, /messages was backfilling the membership event, causing
the test to pass. Now we are no longer backfilling the membership
event due to the change in this commit, causing the test to fail.
The test should instead be returning the membership event locally
from synacpis database, but it doesn't do it fast enough, resulting
in a no-op /sync response with a next_batch=s0_0 which will never
pick up the local membership event when it rolls in. The test
does attempt to retry, but doesn't take the new next_batch=s1_0
resulting in it missing from the /messages response.
* Linting
* Comment out updaters a bit, add overwrite flag to latest events
* Make sure we don't send fast-forwarded state changes over federation, start with empty set when overwriting
* Remove redundant check for overwrite
* WIP get_missing_events work
* More WIP get_missing_events work
* First working /get_missing_events implementation
Flakey currently due to racing between /sync and /send
* Final tweaks
* Remove log lines
* Linting
* go mod tidy
* Clamp min depth to 0
* sort events by depth because sytest makes me sad
Specifically I think it's
4172585c25/lib/SyTest/Federation/Client.pm (L265)
to blame here.
* Improve federation sender performance and behaviour, add backoff
* Tweaks
* Tweaks
* Tweaks
* Take copies of events before passing to destination queues
* Don't accidentally drop queued messages
* Don't take copies again
* Tidy up a bit
* Break out statistics (tracked component-wide), report success and failures from Perform actions
* Fix comment, use atomic add
* Improve logic a bit, don't block on wakeup, move idle check
* Don't retry sucessful invites, don't dispatch sendEvent, sendInvite etc
* Dedupe destinations, fix other bug hopefully
* Dispatch sends again
* Federation sender to ignore invites that are destined locally
* Loopback invite events
* Remodel a bit with channels
* Linter
* Only loopback invite event if we know the room
* We should tell other resident servers about the invite if we know about the room
* Correct invite signing
* Fix invite loopback
* Check HTTP response codes, push new invites to front of queue
* Review comments
* Add PerformJoin template
* Try roomserver perform join
* Send correct server name to FS API
* Pass through content, try to handle multiple server names
* Fix local server checks
* Don't refer to non-existent error
* Add directory lookups of aliases
* Remove unneeded parameters
* Don't repeat join events into the roomserver
* Unmarshal the content, that would help
* Check if the user is already in the room in the fedeationapi too
* Return incompatible room version error
* Use Membership, don't try more servers than needed
* Review comments, make FS API take list of servernames, dedupe them, break out of loop properly on success
* Tweaks
* Limit database connections (#564)
- Add new options to the config file database:
max_open_conns: 100
max_idle_conns: 2
conn_max_lifetime: -1
- Implement connection parameter setup on the *DB (database/sql) in internal/sqlutil/trace.go:Open()
- Propagate the values in the form of DbProperties interface via all the
Open() and NewDatabase() functions
Signed-off-by: Tomas Jirka <tomas.jirka@email.cz>
* Fix wasm builds
* Remove file accidentally added from working tree
Co-authored-by: Tomas Jirka <tomas.jirka@email.cz>
* Consolidation of roomserver APIs
* Comment out alias tests for now, they are broken
* Wire AS API into roomserver again
* Roomserver didn't take asAPI param before so return to that
* Prevent roomserver asking AS API for alias info
* Rename some files
* Remove alias_test, incoherent tests and unwanted appservice integration
* Remove FS API inject on syncapi component
* Make backfill work for shared history visibility
* fetch missing state on backfill to remember snapshots correctly
* Fix gmsl to not mux in auth events into room state
* Whoops
* Linting
* Define an input API for the federationsender
* Wiring for rooomserver input API and federation sender input API
* Whoops, commit common too
* Merge input API into query API
* Rename FederationSenderQueryAPI to FederationSenderInternalAPI
* Fix dendritejs
* Rename Input to Perform
* Fix a couple of inputs -> performs
* Remove needless storage interface, add comments
* Initial cut for backfilling
The syncserver now asks the roomserver via QueryBackfill (which already
existed to *handle* backfill requests) which then makes federation requests
via gomatrixserverlib.RequestBackfill.
Currently, tests fail on subsequent /messages requests because we don't know
which servers are in the room, because we are unable to get state snapshots
from a backfilled event because that code doesn't exist yet.
* WIP backfill, doesn't work
* Make initial backfill pass checks
* Persist backfilled events with state snapshots
* Remove debug lines
* Linting
* Review comments
* Update gomatixserverlib
* Try to build invite stripped state if not given to us
* SendInvite improvements
* Transpose invite_room_state into invite_state.events for sync API
* Remove syncapi debugging output
* Use RespInviteV2
* Update gomatrixserverlib
* Send the invite event as a normal roomserver event too, for incorporating into room (should this be done by the roomserver automatically for invite inputs?)
* Federation sender use invite_room_state, room server try to insert membership state
* Check supported room versions on the invite endpoint
* Prevent roomserver query API from trying to handle requests for stub rooms
* Adding a nolint
* Replace IsRoomStub with RoomNIDExcludingStubs, fix query API to use that instead
* Review comments
* Experimental LRU cache for room versions
* Don't accidentally try to type-assert nil
* Also reduce hits on query API
* Use hashicorp implementation which mutexes for us
* Define const for max cache entries
* Rename to be specifically immutable, panic if we try to mutate a cache entry
* Review comments
* Remove nil guards, give roomserver integration test a cache
* go mod tidy
* Set default room version to 4
* Default to v1 when room_version key missing
Signed-off-by: Alex Chen <minecnly@gmail.com>
* Squashed commit of the following:
commit c1bca95adb
Author: Kegsay <kegan@matrix.org>
Date: Thu Apr 16 10:06:55 2020 +0100
Add SQL tracing via DENDRITE_TRACE_SQL (#968)
* Add SQL tracing via DENDRITE_TRACE_SQL
Add this to `internal/sqlutil` in preparation for #897
* Not entirely
commit c2ea961909
Author: Kegsay <kegan@matrix.org>
Date: Wed Apr 15 17:48:40 2020 +0100
Add HTTP trace logging (#965)
* Dump all requests/response server-side
* Wrap in DENDRITE_TRACE
* DENDRITE_TRACE_HTTP is better
* Bugfix for response body and linting
* False is true and true is false
* Linting
* How did this get missed
* More linting
* Update gomatrixserverlib
* Remove unneeded imports
Co-authored-by: Cnly <minecnly@gmail.com>
* Update gomatrixserverlib
* Default to room version 4
* Update gomatrixserverlib
* Limit prev_events and auth_events
* Fix auth_events, prev_events
* Fix linter issues
* Update gomatrixserverlib
* Fix getState
* Update sytest-whitelist
* Squashed commit of the following:
commit 067b875063
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Fri Apr 3 14:29:06 2020 +0100
Invites v2 endpoint (#952)
* Start converting v1 invite endpoint to v2
* Update gomatrixserverlib
* Early federationsender code for sending invites
* Sending invites sorta happens now
* Populate invite request with stripped state
* Remodel a bit, don't reflect received invites
* Handle invite_room_state
* Handle room versions a bit better
* Update gomatrixserverlib
* Tweak order in destinationQueue.next
* Revert check in processMessage
* Tweak federation sender destination queue code a bit
* Add comments
commit 955244c092
Author: Ben B <benne@klimlive.de>
Date: Fri Apr 3 12:40:50 2020 +0200
use custom http client instead of the http DefaultClient (#823)
This commit replaces the default client from the http lib with a custom one.
The previously used default client doesn't come with a timeout. This could cause
unwanted locks.
That solution chosen here creates a http client in the base component dendrite
with a constant timeout of 30 seconds. If it should be necessary to overwrite
this, we could include the timeout in the dendrite configuration.
Here it would be a good idea to extend the type "Address" by a timeout and
create an http client for each service.
Closes#820
Signed-off-by: Benedikt Bongartz <benne@klimlive.de>
Co-authored-by: Kegsay <kegan@matrix.org>
* Update sytest-whitelist, sytest-blacklist
* Update go.mod/go.sum
* Add some error wrapping for debug
* Add a NOTSPEC to common/events.go
* Perform state resolution at send_join
* Set default room version to v2 again
* Tweak GetCapabilities
* Add comments to ResolveConflictsAdhoc
* Update sytest-blacklist
* go mod tidy
* Update sytest-whitelist, sytest-blacklist
* Update versions
* Updates from review comments
* Update sytest-blacklist, sytest-whitelist
* Check room versions compatible at make_join, add some comments, update gomatrixserverlib, other tweaks
* Set default room version back to v2
* Update gomatrixserverlib, sytest-whitelist
* Start converting v1 invite endpoint to v2
* Update gomatrixserverlib
* Early federationsender code for sending invites
* Sending invites sorta happens now
* Populate invite request with stripped state
* Remodel a bit, don't reflect received invites
* Handle invite_room_state
* Handle room versions a bit better
* Update gomatrixserverlib
* Tweak order in destinationQueue.next
* Revert check in processMessage
* Tweak federation sender destination queue code a bit
* Add comments
This commit replaces the default client from the http lib with a custom one.
The previously used default client doesn't come with a timeout. This could cause
unwanted locks.
That solution chosen here creates a http client in the base component dendrite
with a constant timeout of 30 seconds. If it should be necessary to overwrite
this, we could include the timeout in the dendrite configuration.
Here it would be a good idea to extend the type "Address" by a timeout and
create an http client for each service.
Closes#820
Signed-off-by: Benedikt Bongartz <benne@klimlive.de>
Co-authored-by: Kegsay <kegan@matrix.org>
* Room version 2 by default, other wiring updates, update gomatrixserverlib
* Fix nil pointer exception
* Fix some more nil pointer exceptions hopefully
* Update gomatrixserverlib
* Send all room versions when joining, not just stable ones
* Remove room version cquery
* Get room version when getting events from the roomserver database
* Reset default back to room version 2
* Don't generate event IDs unless needed
* Revert "Remove room version cquery"
This reverts commit a170d58733.
* Query room version in federation API, client API as needed
* Improvements to make_join send_join dance
* Make room server producers use headered events
* Lint tweaks
* Update gomatrixserverlib
* Versioned SendJoin
* Query room version in syncapi backfill
* Handle transaction marshalling/unmarshalling within Dendrite
* Sorta fix federation (kinda)
* whoops commit federation API too
* Use NewEventFromTrustedJSON when getting events from the database
* Update gomatrixserverlib
* Strip headers on federationapi endpoints
* Fix bug in clientapi profile room version query
* Update gomatrixserverlib
* Return more useful error if room version query doesn't find the room
* Update gomatrixserverlib
* Update gomatrixserverlib
* Maybe fix federation
* Fix formatting directive
* Update sytest whitelist and blacklist
* Temporarily disable room versions 3 and 4 until gmsl is fixed
* Fix count of EDUs in logging
* Update gomatrixserverlib
* Update gomatrixserverlib
* Update gomatrixserverlib
* Rely on EventBuilder in gmsl to generate the event IDs for us
* Some review comments fixed
* Move function out of common and into gmsl
* Comment in federationsender destinationqueue
* Update gomatrixserverlib
* Implement history visibility checks for /backfill
Required for p2p to show history correctly.
* Add sytest
* Logging
* Fix two backfill bugs which prevented backfill from working correctly
- When receiving backfill requests, do not send the event that was in the original request.
- When storing backfill results, correctly update the backwards extremity for the room.
* hack: make backfill work multiple times
* add sqlite impl and remove logging
* Linting
* Rearrange state package a bit, add some code to look up the right state resolution algorithm
* Remove shared
* Add GetRoomVersionForRoomNID
* Try to use room version to get correct state resolution algorithm
* Fix room joins over federation
* nolint resolveConflictsV2 because all attempts to break it up so far just result in it being awfully less obvious how it works
* Rename Prepare to NewStateResolution
* Update comments
* Re-add missing tests
* Use HeaderedEvent in syncapi
* Update notifier test
* Fix persisting headered event
* Clean up unused API function
* Fix overshadowed err from linter
* Write headered JSON to invites table too
* Rename event_json to headered_event_json in syncapi database schemae
* Fix invites_table queries
* Update QueryRoomVersionCapabilitiesResponse comment
* Fix syncapi SQLite
* Retrieve room version where known in roomserver
* Get room versions in alias code
* Increase gocyclothreshold to 13, since we hit that number a lot
* Remove gocyclo nolint from StoreEvent
* Update interface to get room version from room ID instead of NID
* Remove new API
* Fixed this query for SQLite but not for Postgres
* bugfix: Fix a bug which caused failures to join rooms over federation
The cause of this was the semantics of `/send_join`'s `auth_chain` response.
Previously, we would only send back the auth chain *for the join event* and
not the entire room state. However, we would then try to check that the
room state is valid, and then be missing auth events.
Now, we send back the entire auth chain for all room state in `/send_join`.
The spec needs to be clarified that this is what the chain should be.
* refactor: split out grabbing state to reduce cyclo complexity
* Add room version into createRoomReq
* Extract room version from m.room.create event when persisting
* Reduce cyclomatic complexity
* Update whitelist, gomatrixserverlib, tweaks to roomserver
* Update sytest-whitelist again
* bugfix: Fix#908 by setting the correct state after the event
Previously, this would only happen if the state already existed
previously!
* Structured logging
* Implement gomatrixserverlib.HeaderedEvent, which should allow us to store room version headers along with the event across API boundaries and consumers/producers, and intercept unmarshalling to get the event structure right
* Add federationsender to previous
* Update room version descriptors, add error handling
* Fix database queries
* Drop Get from version package
* Fix database wrapping, add comments for version descriptions
* Don't set default room_version value in SQL
* Use a fork of pq which supports userCurrent on wasm
* Use sqlite3_js driver when running in JS
* Add cmd/dendritejs to pull in sqlite3_js driver for wasm only
* Update to latest go-sqlite-js version
* Replace prometheus with a stub. sigh
* Hard-code a config and don't use opentracing
* Latest go-sqlite3-js version
* Generate a key for now
* Listen for fetch traffic rather than HTTP
* Latest hacks for js
* libp2p support
* More libp2p
* Fork gjson to allow us to enforce auth checks as before
Previously, all events would come down redacted because the hash
checks would fail. They would fail because sjson.DeleteBytes didn't
remove keys not used for hashing. This didn't work because of a build
tag which included a file which no-oped the index returned.
See https://github.com/tidwall/gjson/issues/157
When it's resolved, let's go back to mainline.
* Use gjson@1.6.0 as it fixes https://github.com/tidwall/gjson/issues/157
* Use latest gomatrixserverlib for sig checks
* Fix a bug which could cause exclude_from_sync to not be set
Caused when sending events over federation.
* Use query variadic to make lookups actually work!
* Latest gomatrixserverlib
* Add notes on getting p2p up and running
Partly so I don't forget myself!
* refactor: Move p2p specific stuff to cmd/dendritejs
This is important or else the normal build of dendrite will fail
because the p2p libraries depend on syscall/js which doesn't work
on normal builds.
Also, clean up main.go to read a bit better.
* Update ho-http-js-libp2p to return errors from RoundTrip
* Add an LRU cache around the key DB
We actually need this for P2P because otherwise we can *segfault*
with things like: "runtime: unexpected return pc for runtime.handleEvent"
where the event is a `syscall/js` event, caused by spamming sql.js
caused by "Checking event signatures for 14 events of room state" which
hammers the key DB repeatedly in quick succession.
Using a cache fixes this, though the underlying cause is probably a bug
in the version of Go I'm on (1.13.7)
* breaking: Add Tracing.Enabled to toggle whether we do opentracing
Defaults to false, which is why this is a breaking change. We need
this flag because WASM builds cannot do opentracing.
* Start adding conditional builds for wasm to handle lib/pq
The general idea here is to have the wasm build have a `NewXXXDatabase`
that doesn't import any postgres package and hence we never import
`lib/pq`, which doesn't work under WASM (undefined `userCurrent`).
* Remove lib/pq for wasm for syncapi
* Add conditional building to remaining storage APIs
* Update build script to set env vars correctly for dendritejs
* sqlite bug fixes
* Docs
* Add a no-op main for dendritejs when not building under wasm
* Use the real prometheus, even for WASM
Instead, the dendrite-sw.js must mock out `process.pid` and
`fs.stat` - which must invoke the callback with an error (e.g `EINVAL`)
in order for it to work:
```
global.process = {
pid: 1,
};
global.fs.stat = function(path, cb) {
cb({
code: "EINVAL",
});
}
```
* Linting
* bugfix: fix panic on new invite events from sytest
I'm unsure why the previous code didn't work, but it's
clearer, quicker and easier to read the `LastInsertID()` way.
Previously, the code would panic as the SELECT would fail
to find the last inserted row ID.
* sqlite: Fix UNIQUE violations and close more cursors
- Add missing `defer rows.Close()`
- Do not have the state block NID as a PRIMARY KEY else it breaks for blocks
with >1 state event in them. Instead, rejig the queries so we can still
have monotonically increasing integers without using AUTOINCREMENT (which
mandates PRIMARY KEY).
* sqlite: Add missing variadic function
* Use LastInsertId because empirically it works over the SELECT form (though I don't know why that is)
* sqlite: Fix invite table by using the global stream pos rather than one specific to invites
If we don't use the global, clients don't get notified about any invites
because the position is too low.
* linting: shadowing
* sqlite: do not use last rowid, we already know the stream pos!
* sqlite: Fix account data table in syncapi by commiting insert txns!
* sqlite: Fix failing federation invite
Was failing with 'database is locked' due to multiple write txns
being taken out.
* sqlite: Ensure we return exactly the number of events found in the database
Previously we would return exactly the number of *requested* events, which
meant that several zero-initialised events would bubble through the system,
failing at JSON serialisation time.
* sqlite: let's just ignore the problem for now....
* linting
* Move current work into single branch
* Initial massaging of clientapi etc (not working yet)
* Interfaces for accounts/devices databases
* Duplicate postgres package for sqlite3 (no changes made to it yet)
* Some keydb, accountdb, devicedb, common partition fixes, some more syncapi tweaking
* Fix accounts DB, device DB
* Update naffka dependency for SQLite
* Naffka SQLite
* Update naffka to latest master
* SQLite support for federationsender
* Mostly not-bad support for SQLite in syncapi (although there are problems where lots of events get classed incorrectly as backward extremities, probably because of IN/ANY clauses that are badly supported)
* Update Dockerfile -> Go 1.13.7, add build-base (as gcc and friends are needed for SQLite)
* Implement GET endpoints for account_data in clientapi
* Nuke filtering for now...
* Revert "Implement GET endpoints for account_data in clientapi"
This reverts commit 4d80dff458.
* Implement GET endpoints for account_data in clientapi (#861)
* Implement GET endpoints for account_data in clientapi
* Fix accountDB parameter
* Remove fmt.Println
* Fix insertAccountData SQLite query
* Fix accountDB storage interfaces
* Add empty push rules into account data on account creation (#862)
* Put SaveAccountData into the right function this time
* Not sure if roomserver is better or worse now
* sqlite work
* Allow empty last sent ID for the first event
* sqlite: room creation works
* Support sending messages
* Nuke fmt.println
* Move QueryVariadic etc into common, other device fixes
* Fix some linter issues
* Fix bugs
* Fix some linting errors
* Fix errcheck lint errors
* Make naffka use postgres as fallback, fix couple of compile errors
* What on earth happened to the /rooms/{roomID}/send/{eventType} routing
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Always defer *sql.Rows.Close and consult with Err
database/sql.Rows.Next() makes sure to call Close only after exhausting
result rows which would NOT happen when returning early from a bad Scan.
Close being idempotent makes it a great candidate to get always deferred
regardless of what happens later on the result set.
This change also makes sure call Err() after exhausting Next() and
propagate non-nil results from it as the documentation advises.
Closes#764
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
* Override named result parameters in last returns
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
* Do the same over new changes that got merged
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add wiring for querying the roomserver for the default room version
* Try to implement /capabilities for room versions
* Update copyright notices
* Update sytests, add /capabilities endpoint into CS API
* Update sytest-whitelist
* Add GetDefaultRoomVersion
* Fix cases where state package was shadowed
* Fix version formatting
* Update Dockerfile to Go 1.13.6
* oh yes types I remember
* And fix the default too
* Rough first pass at adding room version abstractions
* Define newer room versions
* Update room version metadata
* Fix roomserver/versions
* Try to fix whitespace in roomsSchema
* Merge forward
* Tidy up a bit
* TODO: What to do with NextBatch here?
* Replace SyncPosition with PaginationToken throughout syncapi
* Fix PaginationTokens
* Fix lint errors
* Add a couple of missing functions into the syncapi external storage interface
* Some updates based on review comments from @babolivier
* Some updates based on review comments from @babolivier
* argh whitespacing
* Fix opentracing span
* Remove dead code
* Don't overshadow err (fix lint issue)
* Handle extremities after inserting event into topology
* Try insert event topology as ON CONFLICT DO NOTHING
* Prevent OOB error in addRoomDeltaToResponse
* Thwarted by gocyclo again
* Fix NewPaginationTokenFromString, define unit test for it
* Update pagination token test
* Update sytest-whitelist
* Hopefully fix some of the sync batch tokens
* Remove extraneous sync position func
* Revert to topology tokens in addRoomDeltaToResponse etc
* Fix typo
* Remove prevPDUPos as dead now that backwardTopologyPos is used instead
* Fix selectEventsWithEventIDsSQL
* Update sytest-blacklist
* Update sytest-whitelist
* Fall back to postgres when parsing the database connection string for a URI schema fails
* Fix behaviour so that it really tries postgres when URL parsing fails and it complains about unknown schema if it succeeds
A conditional is added to wrap the call to appserviceAPI if a local alias is not found in the database.
Fixes#631
Signed-off-by: Serra Allgood <serra@allgood.dev>