mirror of
https://github.com/matrix-construct/construct
synced 2024-11-16 15:00:51 +01:00
123 lines
6.7 KiB
Markdown
123 lines
6.7 KiB
Markdown
## Userspace Context Switching
|
|
|
|
The `ircd::ctx` subsystem is a userspace threading library meant to regress
|
|
the asynchronous callback pattern back to synchronous suspensions. These are
|
|
stackful coroutines which provide developers with more intuitive control in
|
|
environments which conduct frequent I/O which would otherwise break up a single
|
|
asynchronous stack into callback-hell.
|
|
|
|
### Motivation
|
|
|
|
Userspace threads are an alternative to using posix kernel threads as a way
|
|
to develop intuitively-stackful programs in applications which are primarily
|
|
I/O-bound rather than CPU-bound. This is born out of a recognition that a
|
|
single CPU core has enough capacity to compute the entirety of all requests for
|
|
an efficiently-written network daemon if I/O were instantaneous; if one
|
|
*could* use a single thread it is advantageous to do so right up until the
|
|
compute-bound is reached, rather than introducing more threads for any other
|
|
reason. The limits to single-threading and scaling beyond a single CPU is then
|
|
pushed to higher-level application logic: either message-passing between
|
|
multiple processes (or machines in a cluster), and/or threads which have
|
|
extremely low interference.
|
|
|
|
`ircd::ctx` allows for a very large number of contexts to exist, on the order
|
|
of thousands or more, and still efficiently make progress without the overhead
|
|
of kernel context switches. As an anecdotal example, a kernel context switch
|
|
from a contended mutex could realistically be five to ten times more costly
|
|
than a userspace context switch if not significantly more, and with effects
|
|
that are less predictable. Contexts will accomplish as much work as possible
|
|
in a "straight line" before yielding to the kernel to wait for the completion
|
|
of any I/O event.
|
|
|
|
> There is no preemptive interleaving of contexts. This makes every sequence
|
|
of instructions executed a natural transaction requiring no other method of
|
|
exclusion. It also allows for introspective conditions, i.e: check if context
|
|
switch occurred: if so, refresh value; if not, the old value is good. This is
|
|
impossible in a preemptive environment as the result may have changed at
|
|
instruction boundaries rather than at cooperative boundaries.
|
|
|
|
### Foundation
|
|
|
|
This library is based in `boost::coroutine / boost::context` which wraps
|
|
the register save/restores in a cross-platform way in addition to providing
|
|
properly `mmap(NOEXEC)'ed` etc memory appropriate for stacks on each platform.
|
|
|
|
`boost::asio` has then added its own comprehensive integration with the above
|
|
libraries eliminating the need for us to worry about a lot of boilerplate to
|
|
de-async the asio networking calls. See: [boost::asio::spawn](http://www.boost.org/doc/libs/1_65_1/boost/asio/spawn.hpp).
|
|
|
|
This is a nice boost, but that's as far as it goes. The rest is on us here to
|
|
actually make a threading library.
|
|
|
|
### Interface
|
|
|
|
We mimic the standard library `std::thread` suite as much as possible (which
|
|
mimics the `boost::thread` library) and offer alternative threading primitives
|
|
for these userspace contexts rather than those for operating system threads in
|
|
`std::` such as `ctx::mutex` and `ctx::condition_variable` and `ctx::future`
|
|
among others.
|
|
|
|
* The primary user object is `ircd::context` (or `ircd::ctx::context`) which has
|
|
an `std::thread`-like interface. By default the lifetime of this instance has
|
|
significance, and when `context` goes out of scope, a termination is sent to
|
|
the child context and the parent yields until it has joined (see DETACH and
|
|
WAIT_JOIN flags to change this behavior). Take note that this means by default
|
|
a child context may never be able to complete (or even be entered at all!)
|
|
if the parent constructs and then desctructs `context` without either any flags
|
|
or any strategy by the parent to wait for completion.
|
|
|
|
### Context Switching
|
|
|
|
A context switch has the overhead of a heavy function call -- a function with
|
|
a bunch of arguments (i.e the registers being saved and restored). We consider
|
|
this _fast_ and our philosophy is to not think about the context switch
|
|
_itself_ as a bad thing to be avoided for its own sake.
|
|
|
|
This system is also fully integrated both with the IRCd core
|
|
`boost::asio::io_service` event loop and networking systems. There are actually
|
|
several types of context switches going on here built on two primitives:
|
|
|
|
* Direct jump: This is the fastest switch. Context `A` can yield to context `B`
|
|
directly if `A` knows about `B` and if it knows that `B` is in a state ready to
|
|
resume from a direct jump _and_ that `A` will also be further resumed somehow.
|
|
This is not always suitable in practice so other techniques may be used instead.
|
|
|
|
* Queued wakeup: This is the common default and safe switch. This is where
|
|
the context system integrates with the `boost::asio::io_service` event loop.
|
|
The execution of a "slice" as we'll call a yield-to-yield run of non-stop
|
|
computation is analogous to a function posted to the `io_service` in the
|
|
asynchronous pattern. Context `A` can enqueue context `B` if it knows about `B`
|
|
and then choose whether to yield or not to yield. In any case the `io_service`
|
|
queue will simply continue to the next task which isn't guaranteed to be `B`.
|
|
|
|
The most common context switching encountered and wielded by developers is the
|
|
`ctx::dock`, a non-locking condition variable. The power of the dock is in its:
|
|
|
|
1. Lightweight. It's just a ctx::list; two pointers for a list head, where the
|
|
nodes are the contexts themselves participating in the list.
|
|
|
|
2. Cooperative-condition optimized. When a context is waiting on a condition
|
|
it provides, the ctx system can run the function itself to test the condition
|
|
without waking up the context.
|
|
|
|
#### When does Context Switching (yielding) occur?
|
|
|
|
Bottom line is that this is simply not javascript. There are no
|
|
stack-clairvoyant keywords like `await` which explicitly indicate to everyone
|
|
everywhere that the overall state of the program before and after any totally
|
|
benign-looking function call will be different. This is indeed multi-threaded
|
|
programming but in a very PG-13 rated way. You have to assume that if you
|
|
aren't sure some function has a "deep yield" somewhere way up the stack that
|
|
there is a potential for yielding in that function. Unlike real concurrent
|
|
threading everything beyond this is much easier.
|
|
|
|
* Anything directly on your stack is safe (same with real MT anyway).
|
|
|
|
* `static` and global assets are safe if you can assert no yielding. Such
|
|
an assertion can be made with an instance of `ircd::ctx::critical_assertion`.
|
|
Note that we try to use `thread_local` rather than `static` to still respect
|
|
any real multi-threading that may occur now or in the future.
|
|
|
|
* Calls which may yield and do IO may be marked with `[GET]` and `[SET]`
|
|
conventional labels but they may not be. Some reasoning about obvious yields
|
|
and a zen-like awareness is always recommended.
|