ircd:Ⓜ️ Update README.

2024-11-18 16:00:57 +01:00 · 2017-10-25 09:26:25 -07:00 · 2017-10-25 09:26:25 -07:00 · deaea72f9a
commit deaea72f9a
parent 4ccc610bfe
1 changed files with 224 additions and 0 deletions
--- a/include/ircd/m/README.md
+++ b/include/ircd/m/README.md
@ -98,3 +98,227 @@ room. Rooms may contain events of any type, but we don't invent new `m.room.*`
 type events ourselves. This project tends to create events in the namespace
 `ircd.*` These events should not alter the room's functionality for a client
 with knowledge of only the published `m.room.*` events wouldn't understand.
 #### Coherence
 Matrix is specified as a directed acyclic graph of messages. The conversation of
 messages moves in one direction: past to future. Messages only reference other
 messages which have a lower degree of separation (depth) from the first message
 in the graph (m.room.create). Specifically, each message makes a reference to all
 known messages at the last depth.
 * The strong ordering of this system contributes to an intuitive "light cone"
 read coherence. Knowledge of any piece of information (like an event) offers
 coherent knowledge of all known information which preceded it at that point.
 * Write consistency is relaxed. Multiple messages may be issued at the same
 depth from independent actors and multiple reference chains may form
 independent of others. This provides the scalar for performance in a large
 distributed internet system.
 * Write incoherence must then be resolved with entry consistency because of the
 relaxed release sequence. While parties broadcast all of their new messages,
 they make no guarantees for their arrival integration with destinations at
 the point of release. This wouldn't be as practical. This means a write which
 wishes to be coherent can only use the best available state they have been
 made aware of and commit a new message to it.
 The system has no other method of resolving incoherence. As a future thought,
 some form of release commitment will have to be integrated among at least a
 subset of actors for a few important updates to the graph. For example, a
 two-phase commit of an important state event *or the re-introduction of
 the classic IRC mode change indicating a commitment to change state.*
 References to previous events:
 [A0] <-- [A1] <-- [A2]       | A has seen B1 and includes a reference in A2
  ^                 |
  |        <---<----<
  |        |
  ^------ [B1] <-- [B2]       | B hasn't yet seen A1 or A2
 [T0]  A release A0  :
 [T1]  A release A1  :  B acquire A0
 [T2]                :  B release B1
 [T3]  A acquire B1  :  B release B2
 [T4]  A release A2  :
 Both actors will have their clock (depth) now set to 2 and will issue the
 next new message at clock cycle 3 referencing all messages from cycle 2 to
 merge the split in the illustration above which is happening.
 [A0] <-- [A1] <-- [A2]            [A4]      | A now sees B3, B2, and B1
 ^                 |  |            |
 |        <---<----<  ^--<--<   <--<
 |        |                 |   |
 ^------- [B1] <-- [B2] <-- [B3]             | B now sees A2, A1, and A0
 ### Implementation
 This is a single-writer/multiple-reader approach. The "core" is the only writer.
 The write itself is just the saving of an event. This serves as a transaction
 advancing the state of the machine with effects visible to all future
 transactions and external actors.
 The core takes the pattern of
 `evaluate + exclude -> write commitment -> release sequence`. The single
 writer approach means that we resolve all incoherence using exclusion or
 reordering or rejection on entry and before any writing and release of the
 event. Many ircd::ctx's can orbit the inner core resolving their evaluation
 with the tightest exclusion occurring around the write at the inner core.
 This also gives us the benefit of a total serialization at this point.
       :::::::
       |||||||      <-- evaluation + rejection
         \|/        <-- evaluation + exclusion / reordering
          !
          *         <-- actor serialized core write commitment
       //|||\\
     //|// \\|\\
    :::::::::::::   <-- release sequence propagation cone
 The evaluation phase ensures the event commitment will work: that the event
 is valid, and that the event is a valid transition of the machine according
 to the rules. This process may take some time and many yields and IO, even
 network IO -- if the server lacks a warm cache. During the evaluation phase
 locks and exclusions may be acquired to maintain the validity of the
 evaluation state through writing at the expense of other contexts contending
 for that resource.
 > Many ircd::ctx are concurrently working their way through the core. The
 > "velocity" is low when an ircd::ctx on this path may yield a lot for various
 > IO and allow other events to be processed. The velocity increases when
 > concurrent evaluation and reordering is no longer viable to maintain
 > coherence. Any yielding of an ircd::ctx at a higher velocity risks stalling
 > the whole core.
       :::::::       <-- event input              (low velocity)
       |||||||       <-- evaluation process       (low velocity)
         \|/         <-- serialization process    (higher velocity)
 The write commitment saves the event to the database. This is a relatively
 fast operation which probably won't even yield the ircd::ctx, and all
 future reads to the database will see this write.
          !          <-- serial write commitment  (highest velocity)
 The release sequence broadcasts the event so its effects can be consumed.
 This works by yielding the ircd::ctx so all consumers can view the event
 and apply its effects for their feature module or send the event out to
 clients. This is usually faster than it sounds, as the consumers try not to
 hold up the release sequence for more than their first execution-slice,
 and copy the event if their output rate is slower.
          *         <-- event revelation (higher velocity)
       //|||\\
     //|// \\|\\
    :::::::::::::   <-- release sequence propagation cone (low velocity)
 The entire core commitment process relative to an event riding through it
 on an ircd::ctx has a duration tolerable for something like a REST interface,
 so the response to the user can wait for the commitment to succeed or fail
 and properly inform them after.
 The core process is then optimized by the following facts:
 	* The resource exclusion zone around most matrix events is either
 	  small or non-existent because of its relaxed write consistency.
 	* Writes in this implementation will not delay.
 "Core dilation" is a phenomenon which occurs when large numbers of events
 which have relaxed dependence are processed concurrently because none of
 them acquire any exclusivity which impede the others.
       :::::::
       |||||||
       |||||||   <-- Core dilation; flow shape optimized for volume.
       |||||||
       /|||||\
      ///|||\\\
     //|/|||\|\\
    :::::::::::::
 Close up of the charybdis's write head when tight to one schwarzschild-radius of
 matrix room surface which propagates only one event through at a time.
 Vertical tracks are contexts on their journey through each evaluation and exclusion
 step to the core.
    Input Events                                            Phase
    ::::::::::::::::::::::::::::::::::::::::::::::::::::::  validation / dupcheck
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||  identity/key resolution
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||  verification
    |||| ||||||||||||||| ||||||||||||||| |||||||||||||||||  head resolution
    --|--|----|-|---|--|--|---|---|---|---------|---|---|-  graph resolutions
    ----------|-|---|---------|-------|-----------------|-  module evaluations
     \          |   |         |       |                  /
       ==       ==============|       |               ==    Lowest velocity locks
          \                   |       |             /
            ==                |       |          ==         Mid velocity locks
               \              |       |        /
                 ==           |      /      ==              High velocity locks
                    \         |     /     /
                      ==      =====/=  ==                   Highest velocity lock
                         \       /   /
                          \__   / __/
                             _ | _
                               !                            Write commitment
 Above, two contexts are illustrated as contending for the highest velocity
 lock. The highest velocity lock is not held for significant time, as the
 holder has very little work left to be done within the core, and will
 release the lock to the other context quickly. The lower velocity locks
 may have to be held longer, but are also less exclusive to all contexts.
                               *                            Singularity
                             [   ]
               /-------------[---]-------------\
            /                :   :                \         Federation send
         /         /---------[---]---------\         \
                 /           :   :           \              Client sync
        out    /      /------[---]------\      \   out
             /       /       :   :       \       \
           /   out  /        |   |        \  out   \
                   /   out   /   \   out   \
                            /     \
                            return
                         | result to |
                         | evaluator |
                         -------------
 Above, a close-up of the release sequence. The new event is being "viewed" by
 each consumer context separated by the horizontal lines representing a context
 switch from the perspective of the event travelling down. Each consumer
 performs its task for how to propagate the commissioned event.
 Each consumer has a shared-lock of the event which will hold up the completion
 of the commitment until all consumers release that. The ideal consumer will only
 hold their lock for a single context-slice while they play their part in applying
 the event, like non-blocking copies to sockets etc. These consumers then go on
 to do the rest of their output without the original event data which was memory
 supplied by the evaluator (like an HTTP client). Then all locks acquired on
 the entry side of the core can be released. The evaluator then gets the result
 of the successful commitment.
 #### Scaling
 Scaling beyond the limit of a single CPU core can be done with multiple instances
 of IRCd which form a cluster of independent actors. This cluster can extend
 to other machines on the network too. The independent actors leverage the weak
 write consistency and strong ordering of the matrix protocol to scale the same
 way the federation scales.
 Interference pattern of two IRCd'en:
  ::::::::::::::::::::::::::::::::::::
  --------\:::::::/--\:::::::/--------
           |||||||    |||||||
             \|/        \|/
              !          !
              *          *
           //|||\\    //|||\\
         //|// \\|\\//|// \\|\\
        /|/|/|\|\|\/|/|/|\|\|\|\