0
0
Fork 0
mirror of https://github.com/matrix-construct/construct synced 2024-12-11 16:13:01 +01:00
construct/include/ircd/ctx/README.md
2020-01-11 23:31:53 -08:00

6.7 KiB

Userspace Context Switching

The ircd::ctx subsystem is a userspace threading library meant to regress the asynchronous callback pattern back to synchronous suspensions. These are stackful coroutines which provide developers with more intuitive control in environments which conduct frequent I/O which would otherwise break up a single asynchronous stack into callback-hell.

Motivation

Userspace threads are an alternative to using posix kernel threads as a way to develop intuitively-stackful programs in applications which are primarily I/O-bound rather than CPU-bound. This is born out of a recognition that a single CPU core has enough capacity to compute the entirety of all requests for an efficiently-written network daemon if I/O were instantaneous; if one could use a single thread it is advantageous to do so right up until the compute-bound is reached, rather than introducing more threads for any other reason. The limits to single-threading and scaling beyond a single CPU is then pushed to higher-level application logic: either message-passing between multiple processes (or machines in a cluster), and/or threads which have extremely low interference.

ircd::ctx allows for a very large number of contexts to exist, on the order of thousands or more, and still efficiently make progress without the overhead of kernel context switches. As an anecdotal example, a kernel context switch from a contended mutex could realistically be five to ten times more costly than a userspace context switch if not significantly more, and with effects that are less predictable. Contexts will accomplish as much work as possible in a "straight line" before yielding to the kernel to wait for the completion of any I/O event.

There is no preemptive interleaving of contexts. This makes every sequence of instructions executed a natural transaction requiring no other method of exclusion. It also allows for introspective conditions, i.e: check if context switch occurred: if so, refresh value; if not, the old value is good. This is impossible in a preemptive environment as the result may have changed at instruction boundaries rather than at cooperative boundaries.

Foundation

This library is based in boost::coroutine / boost::context which wraps the register save/restores in a cross-platform way in addition to providing properly mmap(NOEXEC)'ed etc memory appropriate for stacks on each platform.

boost::asio has then added its own comprehensive integration with the above libraries eliminating the need for us to worry about a lot of boilerplate to de-async the asio networking calls. See: boost::asio::spawn.

This is a nice boost, but that's as far as it goes. The rest is on us here to actually make a threading library.

Interface

We mimic the standard library std::thread suite as much as possible (which mimics the boost::thread library) and offer alternative threading primitives for these userspace contexts rather than those for operating system threads in std:: such as ctx::mutex and ctx::condition_variable and ctx::future among others.

  • The primary user object is ircd::context (or ircd::ctx::context) which has an std::thread-like interface. By default the lifetime of this instance has significance, and when context goes out of scope, a termination is sent to the child context and the parent yields until it has joined (see DETACH and WAIT_JOIN flags to change this behavior). Take note that this means by default a child context may never be able to complete (or even be entered at all!) if the parent constructs and then desctructs context without either any flags or any strategy by the parent to wait for completion.

Context Switching

A context switch has the overhead of a heavy function call -- a function with a bunch of arguments (i.e the registers being saved and restored). We consider this fast and our philosophy is to not think about the context switch itself as a bad thing to be avoided for its own sake.

This system is also fully integrated both with the IRCd core boost::asio::io_service event loop and networking systems. There are actually several types of context switches going on here built on two primitives:

  • Direct jump: This is the fastest switch. Context A can yield to context B directly if A knows about B and if it knows that B is in a state ready to resume from a direct jump and that A will also be further resumed somehow. This is not always suitable in practice so other techniques may be used instead.

  • Queued wakeup: This is the common default and safe switch. This is where the context system integrates with the boost::asio::io_service event loop. The execution of a "slice" as we'll call a yield-to-yield run of non-stop computation is analogous to a function posted to the io_service in the asynchronous pattern. Context A can enqueue context B if it knows about B and then choose whether to yield or not to yield. In any case the io_service queue will simply continue to the next task which isn't guaranteed to be B.

The most common context switching encountered and wielded by developers is the ctx::dock, a non-locking condition variable. The power of the dock is in its:

  1. Lightweight. It's just a ctx::list; two pointers for a list head, where the nodes are the contexts themselves participating in the list.

  2. Cooperative-condition optimized. When a context is waiting on a condition it provides, the ctx system can run the function itself to test the condition without waking up the context.

When does Context Switching (yielding) occur?

Bottom line is that this is simply not javascript. There are no stack-clairvoyant keywords like await which explicitly indicate to everyone everywhere that the overall state of the program before and after any totally benign-looking function call will be different. This is indeed multi-threaded programming but in a very PG-13 rated way. You have to assume that if you aren't sure some function has a "deep yield" somewhere way up the stack that there is a potential for yielding in that function. Unlike real concurrent threading everything beyond this is much easier.

  • Anything directly on your stack is safe (same with real MT anyway).

  • static and global assets are safe if you can assert no yielding. Such an assertion can be made with an instance of ircd::ctx::critical_assertion. Note that we try to use thread_local rather than static to still respect any real multi-threading that may occur now or in the future.

  • Calls which may yield and do IO may be marked with [GET] and [SET] conventional labels but they may not be. Some reasoning about obvious yields and a zen-like awareness is always recommended.