mirror of https://github.com/matrix-construct/construct synced 2024-06-03 18:48:55 +02:00

Add deployment review section; various.

Jason Volk 2020-06-13 19:36:50 -07:00
parent 163100a0ed
commit 51c4873cde

@@ -22,17 +22,43 @@ guide for how to recover from configuration errors.
##### Table of Contents
- [Event Cache Memory Locking](#Event-Cache-Memory-Locking)
- [Event Cache Size](#Event-Cache-Size)
- [Client Pool Size](#Client-Pool-Size)
1. [Deployment Review](#Deployment-Review)
2. [Cache Memory Locking](#Cache-Memory-Locking)
3. [Event Cache Size](#Event-Cache-Size)
4. [Client Pool Size](#Client-Pool-Size)
### Event Cache Memory Locking
## Deployment Review
On Linux systems (or systems which support `mlock2(2)`) Construct has a feature which can prevent swapping of event caches by locking them into RAM. To enable this feature, the resource limit for locked memory must be set to `unlimited`. This can be achieved by running `ulimit -l unlimited` before executing. Note that _any_ limit on locked memory (even if larger than the cache sizes) will disable this feature.
Administrators are strongly advised to review these items before continuing with the remainder of this guide.
#### Asynchronous Filesystem I/O
The most impactful actions an administrator can take come from providing Construct with a suitable I/O environment. The server's workload relies heavily on random access to local files. Using a solid-state drive (SSD) rather than a mechanical hard disk (HD) is preferred but not required for great performance. When using a low-latency storage device, the impact from this section is much less pronounced, but adherence to it is still advised. **When using high-latency storage devices, adhering to this section is essential.**
- **Operating system must support asynchronous filesystem I/O.**
Currently, only Linux is viable, as Construct makes use of the Linux AIO interface. FreeBSD and Windows actually have superior asynchronous I/O support, but Construct has not been ported to them yet; this section will be updated when it is. On Linux, Construct takes advantage of incremental improvements introduced between Linux 4.4 and Linux 5.3. Always use the newest possible kernel for best performance.
- **Filesystem must support Direct-IO.**
Ext4 and family support Direct-IO (`O_DIRECT`). Many experimental filesystems *do not* support Direct-IO, including ZFS (with the exception of some very recent releases). It is strongly advised that you place your database directory on a filesystem which fully supports Direct-IO. It is important that Construct _does not_ read data through the operating system's page-cache. Direct-IO allows the server to submit many fine-grained read requests to high-latency storage in parallel without impeding execution.
- **Hardware device properties must be detected.**
Construct probes information about block devices using `sysfs(5)`. It must acquire the device's queue depth and a few other characteristics for optimal operation. When not found, the queue depth defaults to `32`. This value controls request parallelism, and the default may not suit the backplane. For multi-disk RAID arrays, it is often too low and bandwidth is wasted. For limited virtual machines and cheaper hardware, it may be too high, causing `io_submit(2)` to block, stalling the server.
- Review device information with the console command `fs dev`. If detected device information is incorrect or absent, manually configure the appropriate queue depth with the environment variable `ircd_fs_aio_max_events` before execution.
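Two of the items in this list can be checked from a shell before the server is started. The following is a minimal sketch, not part of Construct. The helper names `queue_nr_requests` and `supports_direct_io` are hypothetical. It assumes the kernel's `queue/nr_requests` sysfs attribute is the figure of interest (this guide does not say which attributes Construct reads), and the Direct-IO probe relies on GNU `dd` supporting `iflag=direct`.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not part of Construct.

# Print the request-queue size the kernel reports for block device $1
# (e.g. "sda"). $2 optionally overrides the sysfs root, for testing.
# Assumption: queue/nr_requests is the attribute of interest.
queue_nr_requests() {
    attr="${2:-/sys}/block/$1/queue/nr_requests"
    [ -r "$attr" ] || return 1
    cat "$attr"
}

# Return 0 if a scratch file created in directory $1 can be read with
# O_DIRECT, non-zero otherwise. Relies on GNU dd's iflag=direct.
supports_direct_io() {
    probe=$(mktemp "$1/.directio.XXXXXX" 2>/dev/null) || return 1
    printf 'probe' > "$probe"
    dd if="$probe" of=/dev/null iflag=direct bs=4096 count=1 2>/dev/null
    rc=$?
    rm -f "$probe"
    return "$rc"
}
```

For example, `queue_nr_requests sda` prints the value for a device named `sda`, and `supports_direct_io /path/to/db` succeeds only if a file under that (placeholder) directory can be read with `O_DIRECT`. If the detected value turns out to be wrong, the override is exported before execution, e.g. `export ircd_fs_aio_max_events=64`, where `64` is only an illustrative number.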
#### Optimizing Dynamic Memory
Construct automatically detects `jemalloc(3)` on the system at `./configure` time (see: [BUILD](https://github.com/matrix-construct/construct/wiki/BUILD)) and marks it as `DT_NEEDED` on ELF systems to load it as the default allocator. **Do not `LD_PRELOAD` jemalloc** as this will override our configuration and increase memory usage.
- Confirm jemalloc is in use at the console with the command: `mem get version string`.
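A launch wrapper could guard against the `LD_PRELOAD` mistake described above. This is a minimal sketch under that assumption; the function name `preloads_jemalloc` is hypothetical and nothing here is provided by Construct.

```shell
#!/bin/sh
# Sketch only: hypothetical launch guard, not part of Construct.

# Return 0 if the given LD_PRELOAD value mentions jemalloc.
# With no argument, the current environment's LD_PRELOAD is checked.
preloads_jemalloc() {
    case "${1-${LD_PRELOAD-}}" in
        *jemalloc*) return 0 ;;
        *) return 1 ;;
    esac
}

# Warn before the server is started from this shell.
if preloads_jemalloc; then
    echo "warning: jemalloc is in LD_PRELOAD; unset it before starting the server" >&2
fi
```

The check is a plain substring match on the variable, so it only catches preloads whose path contains the word `jemalloc`.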
## Cache Memory Locking
On Linux systems (or systems which support `mlock2(2)`), and when `jemalloc(3)` is being used, Construct has a feature which can prevent swapping of database caches by locking them into RAM. To enable this feature, the resource limit for locked memory must be set to `unlimited`. This can be achieved by running `ulimit -l unlimited` before executing. Note that _any_ limit on locked memory (even if larger than the cache sizes) will disable this feature.
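Because any finite limit disables the feature, it may be worth verifying the limit in the same shell that starts the server. A minimal sketch follows; the function name `memlock_unlimited` is hypothetical and the check is not part of Construct.

```shell
#!/bin/sh
# Sketch only: hypothetical launch check, not part of Construct.

# Return 0 only if the given limit string is exactly "unlimited".
# With no argument, this shell's locked-memory limit from `ulimit -l`
# is checked.
memlock_unlimited() {
    [ "${1-$(ulimit -l)}" = "unlimited" ]
}

# Report a finite limit before the server is started from this shell.
if ! memlock_unlimited; then
    echo "locked-memory limit is $(ulimit -l), not unlimited; cache locking will be disabled" >&2
fi
```

The comparison is deliberately exact rather than numeric, since a large finite limit disables the feature just as a small one does.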
> A page-fault for swapped data will block the entire server until the data is read back into RAM. This is inferior to Construct's normal operation which reads data from the disk asynchronously without blocking. There is never a good reason to swap cache data; it is always better to simply drop it. In the future, Construct will support trimming caches under high memory pressure as reported by Linux Pressure-Stall-Information.
### Event Cache Size
## Event Cache Size
Most of Construct's runtime footprint in RAM consists of a cache of Matrix
events read from the database. The data in many of these events may be
@@ -103,6 +129,6 @@ value is several times higher than the cache size and growing, consider
increasing that cache's size.
### Client Pool Size
## Client Pool Size
(TODO)