forked from MirrorHub/synapse
ba7a91aea5
The major change is moving the decision of whether to use oEmbed further up the call-stack. This reverts the _download_url method to being a "dumb" functionwhich takes a single URL and downloads it (as it was before #7920). This also makes more minor refactorings: * Renames internal variables for clarity. * Factors out shared code between the HTML and rich oEmbed previews. * Fixes tests to preview an oEmbed image.
56 lines
2.6 KiB
Markdown
56 lines
2.6 KiB
Markdown
URL Previews
|
|
============
|
|
|
|
The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API
|
|
for URLs which outputs [Open Graph](https://ogp.me/) responses (with some Matrix
|
|
specific additions).
|
|
|
|
This does have trade-offs compared to other designs:
|
|
|
|
* Pros:
|
|
* Simple and flexible; can be used by any clients at any point
|
|
* Cons:
|
|
* If each homeserver provides one of these independently, all the HSes in a
|
|
room may needlessly DoS the target URI
|
|
* The URL metadata must be stored somewhere, rather than just using Matrix
|
|
itself to store the media.
|
|
* Matrix cannot be used to distribute the metadata between homeservers.
|
|
|
|
When Synapse is asked to preview a URL it does the following:
|
|
|
|
1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the
|
|
config).
|
|
2. Checks the in-memory cache by URLs and returns the result if it exists. (This
|
|
is also used to de-duplicate processing of multiple in-flight requests at once.)
|
|
3. Kicks off a background process to generate a preview:
|
|
1. Checks the database cache by URL and timestamp and returns the result if it
|
|
has not expired and was successful (a 2xx return code).
|
|
2. Checks if the URL matches an [oEmbed](https://oembed.com/) pattern. If it
|
|
does, update the URL to download.
|
|
3. Downloads the URL and stores it into a file via the media storage provider
|
|
and saves the local media metadata.
|
|
4. If the media is an image:
|
|
1. Generates thumbnails.
|
|
2. Generates an Open Graph response based on image properties.
|
|
5. If the media is HTML:
|
|
1. Decodes the HTML via the stored file.
|
|
2. Generates an Open Graph response from the HTML.
|
|
3. If an image exists in the Open Graph response:
|
|
1. Downloads the URL and stores it into a file via the media storage
|
|
provider and saves the local media metadata.
|
|
2. Generates thumbnails.
|
|
3. Updates the Open Graph response based on image properties.
|
|
6. If the media is JSON and an oEmbed URL was found:
|
|
1. Convert the oEmbed response to an Open Graph response.
|
|
2. If a thumbnail or image is in the oEmbed response:
|
|
1. Downloads the URL and stores it into a file via the media storage
|
|
provider and saves the local media metadata.
|
|
2. Generates thumbnails.
|
|
3. Updates the Open Graph response based on image properties.
|
|
7. Stores the result in the database cache.
|
|
4. Returns the result.
|
|
|
|
The in-memory cache expires after 1 hour.
|
|
|
|
Expired entries in the database cache (and their associated media files) are
|
|
deleted every 10 seconds. The default expiration time is 1 hour from download.
|