The server ACL code on startup will grab all known rooms from
the rooms_table and then call `GetStateEvent` with each found
room ID to find the server ACL event. This can fail for stub
rooms, which will be present in the rooms table. Previously
this would result in an error being returned and the server
failing to start (!). Now we just return no event for stub
rooms.
* Add more logs
To help debug the migration issue in #1924 along with manual data-loss-inducing fixes.
Also log the origin server on processed txns to help debug buggy server origins.
* Fix query
* Do not store 'null' in the database for empty JSON arrays
This can cause issues, though it should be noted that the majority
of the time this will marshal/unmarshal just fine, see
https://play.golang.org/p/Doe2NZUgv7Q
* bugfix: sqlite migration should handle create events as having no 'before' snapshot
The state snapshot for any given event in the roomserver represents the state _before_
the event. For the create event, this is nothing, so the state snapshot nid should be 0.
In some cases this wasn't happening, resulting in a nice mix of possible options including:
- A state snapshot without any state blocks `[]` or `null`.
- A state snapshot with a single state block with a single event, the create event, causing
a circular loop. This is incorrect as it represents the state before the event, not after.
* Add state key check
* Topologically sort outliers in SendEventWithState
* Knock in membership updater
* Update gomatrixserverlib
* Update gomatrixserverlib
* Get the NID of the knock event properly for the membership updater
* Generate m.room.canonical_alias instead of legacy m.room.aliases
* Add omitempty tags
* Add aliases endpoint to client API
* Check power levels when setting aliases
* Don't return null on /aliases
* Don't return error if the state event fails
* Update sytest-whitelist
* Don't send updated m.room.canonical_alias events
* Don't check PLs after all because for local aliases they are apparently irrelevant
* Fix some bugs
* Allow deleting a local alias with enough PL
* Fix some more bugs
* Update sytest-whitelist
* Fix copyright notices
* Review comments
* Add room membership and powerlevel checks for func SendBan
* Added non-error return to func GetStateEvent when no state events with the specified state key are found
* Add passing tests to whitelist
* Fixed formatting
* Update roomserver/storage/shared/storage.go
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: kegsay <kegsay@gmail.com>
* Add more optimised code path for checking if we're in a room
* Fix database queries
* Fix federation API test
* Fix logging
* Review comments
* Make separate API call for room membership
* db migration: fix#1844 and add additional assertions
- Migration scripts will now check to see if there are any unconverted
snapshot IDs and fail the migration if there are any. This should
prevent people from getting a corrupt database in the event the root
cause is still unknown.
- Add an ORDER BY clause when doing batch queries in the postgres
migration. LIMIT and OFFSET without ORDER BY are undefined and must
not be relied upon to produce a deterministic ordering (e.g row order).
See https://www.postgresql.org/docs/current/queries-limit.html
* Linting
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Hash-deduplicated state storage (and migrations) for PostgreSQL and SQLite
* Refactor droomserver database setup for migrations
* Fix conflict statements
* Update migration names
* Set a boundary for old to new block/snapshot IDs so we don't rewrite them more than once accidentally
* Create sequence if not exists
* Fix boundary queries
* Fix boundary queries
* Use Query
* Break out queries a bit
* More sequence tweaks
* Query parameters are not playing the game
* Injection escaping may not work for CREATE SEQUENCE after all
* Fix snapshot sequence name
* Use boundaried IDs in SQLite too
* Use IFNULL for SQLite
* Use COALESCE in PostgreSQL
* Review comments @Kegsay
* Add RoomInfo cache, remove RoomServerRoomNID cache, ensure caches are thread-safe
* Don't panic if the roomInfo isn't known yet
* LRU package is already threadsafe
* Use RoomInfo cache to find room version if possible in Events()
* Adding comments about RoomInfoCache safety
* Hit the database far less to find room NIDs for event NIDs
* Close the rows
* Fix SQLite selectRoomNIDsForEventNIDsSQL
* Give same treatment to room version lookups
* Don't recalculate event IDs so often
* Revert invite change
* Make sure we're using the right NIDs
* Update gomatrixserverlib
* Update to NewEventFromTrustedJSONWithEventID
* Fix go.mod
* Update gomatrixserverlib to matrix-org/gomatrixserverlib#243
* Use BulkSelectEventID
* Add basic storage methods
* Add internal api handler
* Add check for forgotten room
* Add /rooms/{roomID}/forget endpoint
* Add missing rsAPI method
* Remove unused parameters
* Add passing tests
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Add missing file
* Add postgres migration
* Add sqlite migration
* Use Forgetter to forget room
* Remove empty line
* Update HTTP status codes
It looks like the spec calls for these to be 400, rather than 403: https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-rooms-roomid-forget
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Fix sqite locking bugs present on sytest
Comments do the explaining.
* Fix deadlock in sqlite mode
Caused by starting a writer whilst within a writer
* Only complain about invalid state deltas for non-overwrite events
* Do not re-process outlier unnecessarily
* Deep forward extremity calculation
* Use updater txn
* Update error
* Update error
* Create previous event references in StoreEvent
* Use latest events updater to row-lock prev events
* Fix unexpected fallthrough
* Fix deadlock
* Don't roll back
* Update comments in calculateLatest
* Don't include events that we can't find references for in the forward extremities
* Add another passing test
* Replace all usages of txn.Stmt with sqlutil.TxStmt
Signed-off-by: Sam Day <me@samcday.com>
* Fix sign off link in PR template.
Signed-off-by: Sam Day <me@samcday.com>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* WIP Event rejection
* Still send back errors for rejected events
Instead, discard them at the federationapi /send layer rather than
re-implementing checks at the clientapi/PerformJoin layer.
* Implement rejected events
Critically, rejected events CAN cause state resolution to happen
as it can merge forks in the DAG. This is fine, _provided_ we
do not add the rejected event when performing state resolution,
which is what this PR does. It also fixes the error handling
when NotAllowed happens, as we were checking too early and needlessly
handling NotAllowed in more than one place.
* Update test to match reality
* Modify InputRoomEvents to no longer return an error
Errors do not serialise across HTTP boundaries in polylith mode,
so instead set fields on the InputRoomEventsResponse. Add `Err()`
function to make the API shape basically the same.
* Remove redundant returns; linting
* Update blacklist
* Remove QueryBulkStateContent from current state server
Expected fail due to db impl not existing
* Implement query bulk state content
* Fix up rejecting invites over federation
* Fix bulk content marshalling
* Move currentstateserver API to roomserver
Stub out DB functions for now, nothing uses the roomserver version yet.
* Allow it to startup
* Implement some current-state-server storage interface functions
* Add missing package
* Initial work on roomserver NID caches
* Give caches to roomserver storage
* Populate caches
* Fix bugs
* Fix WASM build
* Don't hit cache twice in RoomNIDExcludingStubs
* Store reverse room ID-room NID mapping, consult caches when assigning NIDs
* Offset updates take place using TransactionWriter
* Refactor TransactionWriter in current state server
* Refactor TransactionWriter in federation sender
* Refactor TransactionWriter in key server
* Refactor TransactionWriter in media API
* Refactor TransactionWriter in server key API
* Refactor TransactionWriter in sync API
* Refactor TransactionWriter in user API
* Fix deadlocking Sync API tests
* Un-deadlock device database
* Fix appservice API
* Rename TransactionWriters to Writers
* Move writers up a layer in sync API
* Document sqlutil.Writer interface
* Add note to Writer documentation
* Per-room input mutex
* GetMembership should use transaction when assigning state key NID
* Actually use writer transactions rather than ignoring them
* Limit per-room mutexes to Postgres
* Flip the check in InputRoomEvents
* Updated TransactionWriters, moved locks in roomserver, various other tweaks
* Fix redaction deadlocks
* Fix lint issue
* Rename SQLiteTransactionWriter to ExclusiveTransactionWriter
* Fix us not sending transactions through in latest events updater
* Initial pass at refactoring config (not finished)
* Don't forget current state and EDU servers
* More shifting around
* Update server key API tests
* Fix roomserver test
* Fix more tests
* Further tweaks
* Fix current state server test (sort of)
* Maybe fix appservices
* Fix client API test
* Include database connection string in database options
* Fix sync API build
* Update config test
* Fix unit tests
* Fix federation sender build
* Fix gobind build
* Set Listen address for all services in HTTP monolith mode
* Validate config, reinstate appservice derived in directory, tweaks
* Tweak federation API test
* Set MaxOpenConnections/MaxIdleConnections to previous values
* Update generate-config
* Use TransactionWriter on other component SQLites
* Fix sync API tests
* Fix panic in media API
* Fix a couple of transactions
* Fix wrong query, add some logging output
* Add debug logging into StoreEvent
* Adjust InsertRoomNID
* Update logging
* Add a bit more logging to the fedsender
* bugfix: continue sending PDUs if ones are added whilst sending another PDU
Without this, the queue goes back to sleep on `<-oq.notifyPDUs` which won't
fire because `pendingPDUs` is already > 0. This should fix a flakey sytest.
* Break if no txn is sent
* WIP syncapi work
* More debugging
* Bump GMSL version to pull in working Event.Redact
* Remove logging
* Make redactions work on v3+
* Fix more tests
* Emit redacted_event from the roomserver when redactions are validated
- Consume them in the currentstateserver and act accordingly.
- Add integration test for the roomserver to check that injecting
`m.room.redaction` events result in `redacted_event` being emitted.
* Linting
* Ignore events that redact themselves
* Implement core redaction logic
- Add a new `redactions_table.go` which tracks the mapping of
the redaction event ID and the redacted event ID
- Mark redactions as 'validated' when we have both events.
- When redactions are validated, add `unsigned.redacted_because`
and modify the `eventJSON` accordingly.
Note: We currently do NOT redact the event content - it's gated
behind a feature flag - until we have tested redactions a bit more.
* Linting
* Use content_value instead of membership
* Fix build
* Replace publicroomsapi with a combination of clientapi/roomserver/currentstateserver
- All public rooms paths are now handled by clientapi
- Requests to (un)publish rooms are sent to the roomserver via `PerformPublish`
which are stored in a new `published_table.go`
- Requests for public rooms are handled in clientapi by:
* Fetch all room IDs which are published using `QueryPublishedRooms` on the roomserver.
* Apply pagination parameters to the slice.
* Do a `QueryBulkStateContent` request to the currentstateserver to pull out
required state event *content* (not entire events).
* Aggregate and return the chunk.
Mostly but not fully implemented (DB queries on currentstateserver are missing)
* Fix pq query
* Make postgres work
* Make sqlite work
* Fix tests
* Unbreak pagination tests
* Linting
* Move Updater structs to shared and use it for postgres
* Add constructors for NewXXXUpdater and a useTxns flag
In sqlite, we set useTxns=false and comment why.
* Handle nil txn
* Handle nil in transaction
* Missed one
* Close the txn at the right time
* Don't close the transaction as we reuse it between calls
* Limit database connections (#564)
- Add new options to the config file database:
max_open_conns: 100
max_idle_conns: 2
conn_max_lifetime: -1
- Implement connection parameter setup on the *DB (database/sql) in internal/sqlutil/trace.go:Open()
- Propagate the values in the form of DbProperties interface via all the
Open() and NewDatabase() functions
Signed-off-by: Tomas Jirka <tomas.jirka@email.cz>
* Fix wasm builds
* Remove file accidentally added from working tree
Co-authored-by: Tomas Jirka <tomas.jirka@email.cz>
* Make backfill work for shared history visibility
* fetch missing state on backfill to remember snapshots correctly
* Fix gmsl to not mux in auth events into room state
* Whoops
* Linting
* Initial cut for backfilling
The syncserver now asks the roomserver via QueryBackfill (which already
existed to *handle* backfill requests) which then makes federation requests
via gomatrixserverlib.RequestBackfill.
Currently, tests fail on subsequent /messages requests because we don't know
which servers are in the room, because we are unable to get state snapshots
from a backfilled event because that code doesn't exist yet.
* WIP backfill, doesn't work
* Make initial backfill pass checks
* Persist backfilled events with state snapshots
* Remove debug lines
* Linting
* Review comments
* Update gomatixserverlib
* Try to build invite stripped state if not given to us
* SendInvite improvements
* Transpose invite_room_state into invite_state.events for sync API
* Remove syncapi debugging output
* Use RespInviteV2
* Update gomatrixserverlib
* Send the invite event as a normal roomserver event too, for incorporating into room (should this be done by the roomserver automatically for invite inputs?)
* Federation sender use invite_room_state, room server try to insert membership state
* Check supported room versions on the invite endpoint
* Prevent roomserver query API from trying to handle requests for stub rooms
* Adding a nolint
* Replace IsRoomStub with RoomNIDExcludingStubs, fix query API to use that instead
* Review comments
* Set default room version to 4
* Default to v1 when room_version key missing
Signed-off-by: Alex Chen <minecnly@gmail.com>
* Squashed commit of the following:
commit c1bca95adb
Author: Kegsay <kegan@matrix.org>
Date: Thu Apr 16 10:06:55 2020 +0100
Add SQL tracing via DENDRITE_TRACE_SQL (#968)
* Add SQL tracing via DENDRITE_TRACE_SQL
Add this to `internal/sqlutil` in preparation for #897
* Not entirely
commit c2ea961909
Author: Kegsay <kegan@matrix.org>
Date: Wed Apr 15 17:48:40 2020 +0100
Add HTTP trace logging (#965)
* Dump all requests/response server-side
* Wrap in DENDRITE_TRACE
* DENDRITE_TRACE_HTTP is better
* Bugfix for response body and linting
* False is true and true is false
* Linting
* How did this get missed
* More linting
* Update gomatrixserverlib
* Remove unneeded imports
Co-authored-by: Cnly <minecnly@gmail.com>
* Start converting v1 invite endpoint to v2
* Update gomatrixserverlib
* Early federationsender code for sending invites
* Sending invites sorta happens now
* Populate invite request with stripped state
* Remodel a bit, don't reflect received invites
* Handle invite_room_state
* Handle room versions a bit better
* Update gomatrixserverlib
* Tweak order in destinationQueue.next
* Revert check in processMessage
* Tweak federation sender destination queue code a bit
* Add comments
* Room version 2 by default, other wiring updates, update gomatrixserverlib
* Fix nil pointer exception
* Fix some more nil pointer exceptions hopefully
* Update gomatrixserverlib
* Send all room versions when joining, not just stable ones
* Remove room version cquery
* Get room version when getting events from the roomserver database
* Reset default back to room version 2
* Don't generate event IDs unless needed
* Revert "Remove room version cquery"
This reverts commit a170d58733.
* Query room version in federation API, client API as needed
* Improvements to make_join send_join dance
* Make room server producers use headered events
* Lint tweaks
* Update gomatrixserverlib
* Versioned SendJoin
* Query room version in syncapi backfill
* Handle transaction marshalling/unmarshalling within Dendrite
* Sorta fix federation (kinda)
* whoops commit federation API too
* Use NewEventFromTrustedJSON when getting events from the database
* Update gomatrixserverlib
* Strip headers on federationapi endpoints
* Fix bug in clientapi profile room version query
* Update gomatrixserverlib
* Return more useful error if room version query doesn't find the room
* Update gomatrixserverlib
* Update gomatrixserverlib
* Maybe fix federation
* Fix formatting directive
* Update sytest whitelist and blacklist
* Temporarily disable room versions 3 and 4 until gmsl is fixed
* Fix count of EDUs in logging
* Update gomatrixserverlib
* Update gomatrixserverlib
* Update gomatrixserverlib
* Rely on EventBuilder in gmsl to generate the event IDs for us
* Some review comments fixed
* Move function out of common and into gmsl
* Comment in federationsender destinationqueue
* Update gomatrixserverlib
* Rearrange state package a bit, add some code to look up the right state resolution algorithm
* Remove shared
* Add GetRoomVersionForRoomNID
* Try to use room version to get correct state resolution algorithm
* Fix room joins over federation
* nolint resolveConflictsV2 because all attempts to break it up so far just result in it being awfully less obvious how it works
* Rename Prepare to NewStateResolution
* Update comments
* Re-add missing tests
* Use HeaderedEvent in syncapi
* Update notifier test
* Fix persisting headered event
* Clean up unused API function
* Fix overshadowed err from linter
* Write headered JSON to invites table too
* Rename event_json to headered_event_json in syncapi database schemae
* Fix invites_table queries
* Update QueryRoomVersionCapabilitiesResponse comment
* Fix syncapi SQLite
* Retrieve room version where known in roomserver
* Get room versions in alias code
* Increase gocyclothreshold to 13, since we hit that number a lot
* Remove gocyclo nolint from StoreEvent
* Update interface to get room version from room ID instead of NID
* Remove new API
* Fixed this query for SQLite but not for Postgres
* Add room version into createRoomReq
* Extract room version from m.room.create event when persisting
* Reduce cyclomatic complexity
* Update whitelist, gomatrixserverlib, tweaks to roomserver
* Update sytest-whitelist again
* Update room version descriptors, add error handling
* Fix database queries
* Drop Get from version package
* Fix database wrapping, add comments for version descriptions
* Don't set default room_version value in SQL
* Use a fork of pq which supports userCurrent on wasm
* Use sqlite3_js driver when running in JS
* Add cmd/dendritejs to pull in sqlite3_js driver for wasm only
* Update to latest go-sqlite-js version
* Replace prometheus with a stub. sigh
* Hard-code a config and don't use opentracing
* Latest go-sqlite3-js version
* Generate a key for now
* Listen for fetch traffic rather than HTTP
* Latest hacks for js
* libp2p support
* More libp2p
* Fork gjson to allow us to enforce auth checks as before
Previously, all events would come down redacted because the hash
checks would fail. They would fail because sjson.DeleteBytes didn't
remove keys not used for hashing. This didn't work because of a build
tag which included a file which no-oped the index returned.
See https://github.com/tidwall/gjson/issues/157
When it's resolved, let's go back to mainline.
* Use gjson@1.6.0 as it fixes https://github.com/tidwall/gjson/issues/157
* Use latest gomatrixserverlib for sig checks
* Fix a bug which could cause exclude_from_sync to not be set
Caused when sending events over federation.
* Use query variadic to make lookups actually work!
* Latest gomatrixserverlib
* Add notes on getting p2p up and running
Partly so I don't forget myself!
* refactor: Move p2p specific stuff to cmd/dendritejs
This is important or else the normal build of dendrite will fail
because the p2p libraries depend on syscall/js which doesn't work
on normal builds.
Also, clean up main.go to read a bit better.
* Update ho-http-js-libp2p to return errors from RoundTrip
* Add an LRU cache around the key DB
We actually need this for P2P because otherwise we can *segfault*
with things like: "runtime: unexpected return pc for runtime.handleEvent"
where the event is a `syscall/js` event, caused by spamming sql.js
caused by "Checking event signatures for 14 events of room state" which
hammers the key DB repeatedly in quick succession.
Using a cache fixes this, though the underlying cause is probably a bug
in the version of Go I'm on (1.13.7)
* breaking: Add Tracing.Enabled to toggle whether we do opentracing
Defaults to false, which is why this is a breaking change. We need
this flag because WASM builds cannot do opentracing.
* Start adding conditional builds for wasm to handle lib/pq
The general idea here is to have the wasm build have a `NewXXXDatabase`
that doesn't import any postgres package and hence we never import
`lib/pq`, which doesn't work under WASM (undefined `userCurrent`).
* Remove lib/pq for wasm for syncapi
* Add conditional building to remaining storage APIs
* Update build script to set env vars correctly for dendritejs
* sqlite bug fixes
* Docs
* Add a no-op main for dendritejs when not building under wasm
* Use the real prometheus, even for WASM
Instead, the dendrite-sw.js must mock out `process.pid` and
`fs.stat` - which must invoke the callback with an error (e.g `EINVAL`)
in order for it to work:
```
global.process = {
pid: 1,
};
global.fs.stat = function(path, cb) {
cb({
code: "EINVAL",
});
}
```
* Linting
* bugfix: fix panic on new invite events from sytest
I'm unsure why the previous code didn't work, but it's
clearer, quicker and easier to read the `LastInsertID()` way.
Previously, the code would panic as the SELECT would fail
to find the last inserted row ID.
* sqlite: Fix UNIQUE violations and close more cursors
- Add missing `defer rows.Close()`
- Do not have the state block NID as a PRIMARY KEY else it breaks for blocks
with >1 state event in them. Instead, rejig the queries so we can still
have monotonically increasing integers without using AUTOINCREMENT (which
mandates PRIMARY KEY).
* sqlite: Add missing variadic function
* Use LastInsertId because empirically it works over the SELECT form (though I don't know why that is)
* sqlite: Fix invite table by using the global stream pos rather than one specific to invites
If we don't use the global, clients don't get notified about any invites
because the position is too low.
* linting: shadowing
* sqlite: do not use last rowid, we already know the stream pos!
* sqlite: Fix account data table in syncapi by commiting insert txns!
* sqlite: Fix failing federation invite
Was failing with 'database is locked' due to multiple write txns
being taken out.
* sqlite: Ensure we return exactly the number of events found in the database
Previously we would return exactly the number of *requested* events, which
meant that several zero-initialised events would bubble through the system,
failing at JSON serialisation time.
* sqlite: let's just ignore the problem for now....
* linting