synapse/docs/user_directory.md
David Robertson 318162f5de
Easy refactors of the user directory (#10789)
No functional changes here. This came out as I was working to tackle #5677
2021-09-10 10:54:38 +01:00


User Directory API Implementation
=================================
The user directory is currently maintained based on the 'visible' users
on this particular server - i.e. those who share a room with your account, or
who are in a publicly viewable room on the server.

The directory info is stored in various tables, which can (typically after
DB corruption) get stale or out of sync. If this happens, the current fix
is to execute the SQL in [user_dir_populate.sql](https://github.com/matrix-org/synapse/blob/master/synapse/storage/schema/main/delta/53/user_dir_populate.sql)
and then restart Synapse. This should then start a background task to
flush the current tables and regenerate the directory.

Data model
----------
There are five relevant tables that collectively form the "user directory".
Three of them track a master list of all the users we could search for.
The last two (collectively called the "search tables") track who can
see who.

From all of these tables we exclude three types of local user:
  - support users
  - appservice users
  - deactivated users
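The exclusion rule above can be sketched as a simple predicate. This is a minimal illustration rather than Synapse's actual code: the `LocalUser` record, its boolean fields, and the `should_include_in_directory` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalUser:
    """Hypothetical record for a local user; not a Synapse type."""

    user_id: str
    is_support: bool = False
    is_appservice: bool = False
    is_deactivated: bool = False


def should_include_in_directory(user: LocalUser) -> bool:
    """Return True unless the user is one of the three excluded types."""
    return not (user.is_support or user.is_appservice or user.is_deactivated)


users = [
    LocalUser("@alice:example.org"),
    LocalUser("@helpdesk:example.org", is_support=True),
    LocalUser("@bridge_bot:example.org", is_appservice=True),
    LocalUser("@gone:example.org", is_deactivated=True),
]

# Only ordinary, active local users remain candidates for the directory tables.
directory_candidates = [u.user_id for u in users if should_include_in_directory(u)]
```

In this sketch only `@alice:example.org` remains a candidate; the other three users each hit one of the exclusion cases.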
* `user_directory`. This contains the `user_id`, display name and avatar we'll
  return when you search the directory.
  - Because there's only one directory entry per user, it's important that we only
    ever put publicly visible names here. Otherwise we might leak a private
    nickname or avatar used in a private room.
  - Indexed on rooms. Indexed on users.
* `user_directory_search`. To be joined to `user_directory`. It contains an extra
  column that enables full text search based on user ids and display names.
  The schemas differ between SQLite and Postgres, with different code paths to match.
  - Indexed on the full text search data. Indexed on users.
* `user_directory_stream_pos`. When the initial background update to populate
  the directory is complete, we record a stream position here. This indicates
  that Synapse should now listen for room changes and incrementally update
  the directory where necessary.
* `users_in_public_rooms`. Contains associations between users and the public rooms they're in.
  Used to determine which users are in public rooms and should be publicly visible in the directory.
* `users_who_share_private_rooms`. Rows are triples `(L, M, room id)` where `L`
  is a local user and `M` is a local or remote user. `L` and `M` should be
  different, but this isn't enforced by a constraint.
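To make the roles of these tables more concrete, here is a rough sketch of how a search might combine them, using an in-memory SQLite database. It rests on several assumptions: the column names (`user_id`, `other_user_id`, `room_id`, `display_name`, `avatar_url`) are illustrative and may not match the real schema, the `search_visible_users` helper is hypothetical, and a plain `LIKE` match stands in for the full text search that `user_directory_search` provides.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-ins for the real tables; column names are assumed.
    CREATE TABLE user_directory (
        user_id TEXT PRIMARY KEY, display_name TEXT, avatar_url TEXT
    );
    CREATE TABLE users_in_public_rooms (user_id TEXT, room_id TEXT);
    CREATE TABLE users_who_share_private_rooms (
        user_id TEXT, other_user_id TEXT, room_id TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO user_directory VALUES (?, ?, ?)",
    [
        ("@alice:example.org", "Alice", None),
        ("@bob:example.org", "Bob", None),
        ("@carol:remote.example", "Carol", None),
        ("@dave:example.org", "Dave", None),
    ],
)
# Bob is in a public room, so any searcher may find him.
conn.execute(
    "INSERT INTO users_in_public_rooms VALUES (?, ?)",
    ("@bob:example.org", "!public:example.org"),
)
# Alice (local user L) shares a private room with Carol (remote user M).
conn.execute(
    "INSERT INTO users_who_share_private_rooms VALUES (?, ?, ?)",
    ("@alice:example.org", "@carol:remote.example", "!private:example.org"),
)


def search_visible_users(searcher: str, term: str) -> List[str]:
    """Return directory user ids matching `term` that `searcher` may see."""
    rows = conn.execute(
        """
        SELECT d.user_id FROM user_directory AS d
        WHERE d.display_name LIKE ?
          AND (
            d.user_id IN (SELECT user_id FROM users_in_public_rooms)
            OR d.user_id IN (
                SELECT other_user_id FROM users_who_share_private_rooms
                WHERE user_id = ?
            )
          )
        ORDER BY d.user_id
        """,
        (f"%{term}%", searcher),
    ).fetchall()
    return [row[0] for row in rows]
```

In this sketch, Alice searching with an empty term would find Bob (via the public room) and Carol (via the shared private room), while Dave, who shares no rooms with anyone, would find only Bob.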