Clarify `(room_id, event_id)` global uniqueness (#13701)

Summarized from @richvdh's reply at https://github.com/matrix-org/synapse/pull/13589#discussion_r961116999
This commit is contained in:
Eric Eastwood 2022-09-02 14:05:39 -05:00 committed by GitHub
parent 36b184b782
commit 877bdfa889
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 25 additions and 0 deletions

1
changelog.d/13701.doc Normal file
View File

@ -0,0 +1 @@
Clarify `(room_id, event_id)` global uniqueness and how we should scope our database schemas.

View File

@ -191,3 +191,27 @@ There are three separate aspects to this:
flavour will be accepted by SQLite 3.22, but will give a column whose
default value is the **string** `"FALSE"` - which, when cast back to a boolean
in Python, evaluates to `True`.
## `event_id` global uniqueness
In room versions `1` and `2` it's possible to end up with two events with the
same `event_id` (in the same or different rooms). After room version `3`, that
can only happen with a hash collision, which we basically hope will never
happen.
There are several places in Synapse and even Matrix APIs like [`GET
/_matrix/federation/v1/event/{eventId}`](https://spec.matrix.org/v1.1/server-server-api/#get_matrixfederationv1eventeventid)
where we assume that event IDs are globally unique.
But hash collisions are still possible, and by treating event IDs as room
scoped, we can reduce the possibility of a hash collision. When scoping
`event_id` in the database schema, it should be also accompanied by `room_id`
(`PRIMARY KEY (room_id, event_id)`) and lookups should be done through the pair
`(room_id, event_id)`.
There has been a lot of debate on this in places like
https://github.com/matrix-org/matrix-spec-proposals/issues/2779 and
[MSC2848](https://github.com/matrix-org/matrix-spec-proposals/pull/2848) which
has no resolution yet (as of 2022-09-01).