diff --git a/.gitmodules b/.gitmodules index cacb2f450..c418dc866 100644 --- a/.gitmodules +++ b/.gitmodules @@ -15,3 +15,6 @@ [submodule "deps/pbc"] path = deps/pbc url = https://github.com/blynn/pbc.git +[submodule "doc"] + path = doc + url = https://github.com/matrix-construct/construct.wiki.git diff --git a/doc b/doc new file mode 160000 index 000000000..a91f7dd0d --- /dev/null +++ b/doc @@ -0,0 +1 @@ +Subproject commit a91f7dd0d9f4dbc6bb42a8117814db9589a73bd2 diff --git a/doc/.gitignore b/doc/.gitignore deleted file mode 100644 index b791d9359..000000000 --- a/doc/.gitignore +++ /dev/null @@ -1,5 +0,0 @@ -doxygen -html -TAGS -latex -xml diff --git a/doc/ARCHITECTURE.md b/doc/ARCHITECTURE.md deleted file mode 100644 index 495bfedfc..000000000 --- a/doc/ARCHITECTURE.md +++ /dev/null @@ -1,65 +0,0 @@ -# Architectural Philosophy - -### libircd - -##### Single-threaded✝ - -The design of `libircd` is fully-asynchronous, oriented around a single-thread -event-loop. No code in the library _blocks_ the process. All operations are -conducted on top of a single `boost::asio::io_service` which must be supplied -by the executable linking to `libircd`. That `io_service` must be run by the -executable at its discretion; typically the embedder's call to `ios.run()` is -the only place the process will _block_. - -The single-threaded approach ensures there is an _uninterrupted_, _uncontended_, -_predictable_ execution which is easy for developers to reason about intuitively -with sequential-consistency. This is ideal for the I/O-bound application being -facilitated. If there are periods of execution which are computationally intense -like parsing, hashing, cryptography, etc: this is absorbed in lieu of thread -synchronization and bus contention. - -This system achieves scale through running multiple independent instances which -synchronize at the application-logic level through passing the application's own -messages. 
- -✝ However, do not assume a truly threadless execution for the entire address -space. If there is ever a long-running background computation or a call to a -3rd party library which will block the event loop, we may use an additional -`std::thread` to "offload" such an operation. Thus we do have a threading model, -but it is heterogeneous. - -##### Introduces userspace threading - -IRCd presents an interface introducing stackful coroutines, a.k.a. userspace -context switching, a.k.a. green threads, a.k.a. fibers. The library avoids -callbacks as the way to break up execution when waiting for events. Instead, we -harken back to the simple old ways of synchronous programming where control -flow and data are easy to follow. If there are certain cases where we don't -want a stack to linger which may jeopardize the c10k'ness of the daemon the -asynchronous pattern is still used (this is a hybrid system). - -Consider coroutines like "macro-ops" and asynchronous callbacks like -"micro-ops." The pattern tends to use a coroutine to perform a large and -complex operation which may involve many micro-ops behind the scenes. This -approach relegates the asynchronous callback pattern to simple tasks contained -within specific units which require scale, encapsulating the complexity away -from the rest of the project. - -##### Runs only one server at a time - -Keeping with the spirit of simplicity of the original architecture, `libircd` -continues to be a "singleton" object which uses globals and keeps actual server -state in the library itself. In other words, **only one IRC daemon can exist -within a process's address space at a time.** Whether or not this was a pitfall -of the original design, it has emerged over the decades as a very profitable -decision for making IRCd an accessible open source internet project. 
- -##### Formal grammars, RTTI, exceptions - -We utilize the `boost::spirit` system of parsing and printing through -compile-time formal grammars, rather than writing our own parsers manually. -In addition, we build several tools on top of such formal devices like a -type-safe format string library acting as a drop-in for `::sprintf()`, but -accepting objects like `std::string` without `.c_str()` and prevention of -outputting unprintable/unwanted characters that may have been injected into -the system somewhere prior. diff --git a/doc/BUILD.md b/doc/BUILD.md deleted file mode 100644 index 49c64a72f..000000000 --- a/doc/BUILD.md +++ /dev/null @@ -1,176 +0,0 @@ -## BUILD (standalone) - -##### Compatibility Primer - -This section is intended to allow building with dependencies that have not -made their way to mainstream systems. Important notes that may affect you: - -- Boost: The required version is available through `apt` as `libboost-all-dev` on -Ubuntu Cosmic (18.10). All earlier releases (including 18.04 LTS) can configure -with `--with-included-boost` as instructed below. - -- RocksDB: THE COMPLETE SOURCE-CODE OF ROCKSDB MUST BE AVAILABLE TO BUILD CONSTRUCT. -This is different from the `include/` and `lib/` files installed by your -distribution's package system. You do not have to build the source, but it must -be available. ALL UBUNTU USERS MUST BUILD THE SOURCE AS WELL (SKIP TO NEXT BULLET). - -``` -git submodule update --init deps/rocksdb -cd deps/rocksdb -git fetch --tags --force -git checkout v5.17.2 -``` -> For best performance and stability, please check for the version available on -your system for the above `git checkout`. - -- RocksDB: All Ubuntu users on all releases must configure Construct with the -option `--with-included-rocksdb`. This will fetch and properly build rocksdb. - -> Ubuntu builds their library with `-Bsymbolic-functions`. This conflicts with -the requirements of Construct's embedding. 
- -##### Installation Primer - -A general overview of what construct will build and install is given here. At -this time it is suggested to supply `./configure` with a `--prefix` path, -especially for development. Example `--prefix=~/.local/`. - -- Binary executable `$prefix/bin/construct` -- Shared library `$prefix/lib/libircd.so` -- Shared library modules `$prefix/lib/modules/construct/*.so` -- Header files `$prefix/include/ircd/*` -- Read-only shared assets `$prefix/share/construct/*` -- Database directory may be established at `$prefix/var/db/construct/` - -``` -Do not set your `--prefix` path to a directory inside your git repository or -an invocation of `git clean` will erase your database in $prefix/var/db/. -``` - -#### STANDALONE BUILD PROCEDURE - -``` -./autogen.sh -./configure --prefix=$PWD/build -make install -``` - -> The `--with-included-*` will fetch, configure **and build** the dependencies included -as submodules. The result will not be installable on the system without this repository -remaining intact. Please read the compatibility primer first to understand which options -you need or don't need on your system. - - -### Additional build options - -#### Debug mode - -``` ---enable-debug -``` -Full debug mode. Includes additional code within `#ifdef RB_DEBUG` sections. -Optimization level is `-Og`, which is still valgrind-worthy. Debugger support -is `-ggdb`. Log level is `DEBUG` (maximum). Assertions are enabled. No -sanitizer instrumentation is generated by default in this mode. - - -#### Generic mode binary (for distribution packages) - -Construct developers have set the default compilation to generate native -hardware operations which may only be supported on very specific targets. For -a generic mode binary, package maintainers may require this option. - -``` ---enable-generic -``` -Sets `-mtune=generic` as `native` is otherwise the default. 
- - -#### Compact mode (experimental) - -``` ---enable-compact -``` -Create the smallest possible resulting output. This will optimize for size -(if optimization is enabled), remove all debugging, strip symbols, and apply -any toolchain-feature or #ifdef in code that optimizes the output size. - -_This feature is experimental. It may not build or execute on all platforms -reliably. Please report bugs._ - - -#### Manually enable assertions - -``` ---enable-assert -``` -Implied by `--enable-debug`. This is useful to specifically enable `assert()` -statements when `--enable-debug` is not used. - -``` ---with-assert=trap -``` -Recommended when using `--enable-assert` for debugging. This replaces the -default mechanism of assertion with traps rather than aborts; allowing -developers to explore an unterminated program. - -#### Manually enable optimization - -``` ---enable-optimize -``` -This manually applies full release-mode optimizations even when using -`--enable-debug`. Implied when not in debug mode. - - -#### Disable third-party dynamic allocator libraries - -``` ---disable-malloc-libs -``` -`./configure` will detect alternative `malloc()` implementations found in -libraries installed on the system (jemalloc/tcmalloc/etc). Construct developers -may enable these to be configured by default, if detected. To always prevent -any alternative to the default standard library allocator specify this option. - - -#### Enable third-party dynamic allocator libraries - -Currently: - -``` ---enable-jemalloc -``` - -`./configure` will detect alternative `malloc()` implementations found in -libraries installed on the system (jemalloc/tcmalloc/etc). Construct developers -may not enable these to be configured by default, falling back on the default -allocator. To always use one of the alternative allocators use one option here. - - -#### Logging level - -``` ---with-log-level= -``` -This manually sets the level of logging. All log levels at or below this level -will be available. 
When a log level is not available, all code used to generate -its messages will be entirely eliminated via *dead-code-elimination* at compile -time. - -The log levels are (from logger.h): -``` -7 DEBUG Maximum verbosity for developers. -6 DWARNING A warning but only for developers (more frequent than WARNING). -5 DERROR An error but only worthy of developers (more frequent than ERROR). -4 INFO A more frequent message with good news. -3 NOTICE An infrequent important message with neutral or positive news. -2 WARNING Non-impacting undesirable behavior user should know about. -1 ERROR Things that shouldn't happen; user impacted and should know. -0 CRITICAL Catastrophic/unrecoverable; program is in a compromised state. -``` - -When `--enable-debug` is used `--with-log-level=DEBUG` is implied. Otherwise -for release mode `--with-log-level=INFO` is implied. Large deployments with -many users may consider lower than `INFO` to maximize optimization and reduce -noise. diff --git a/doc/FAQ.md b/doc/FAQ.md deleted file mode 100644 index 161a44fa9..000000000 --- a/doc/FAQ.md +++ /dev/null @@ -1,46 +0,0 @@ -# FREQUENTLY ASKED QUESTIONS - - -##### Why does it say IRCd everywhere? - -This is a long story which is not covered in full here. The short version -is that this project was originally intended to implement an IRC federation -using an extended superset of the rfc1459/rfc2812 protocol. This concept went -through several iterations. The Atheme Services codebase was first considered -for development into a "gateway" for IRC networks to connect to each other. -That was succeeded by the notion of eliminating separate services-daemons in -favor of IRCd-meshing for redundancy and scale. At that point Charybdis/4 was -chosen as a basis for the project. - -Around this time, the Matrix protocol was emerging as a potential candidate -for federating synchronous-messaging. 
Though far from perfect, it had enough -potential to outweigh the troubles of inventing and promoting yet another -messaging protocol in a wildly diverse and already saturated space. - -Somewhile after, the original collaborators of this endeavor became -disillusioned by many of the finer details of Matrix. Many red-flags observed -about its stewards, community, and the overall engineering requirements placed -on implementations made it clear this project's goals would never be reached in -a timely or cost-effective way. Coupled with the political situation and -death-spiral of IRC itself, the original collaborators disbanded. - -One developer decided to continue by simplifying the mission down to just -creating a Matrix server first, and worrying about IRC later, maybe through -TS6, or maybe never. This reasoning was bolstered by the ongoing poor -performance of Matrix's principal reference implementation in python+pgsql. -Today there is virtually nothing left of any original IRCd. The project -namespaces like "ircd::" and IRCD_ remain but they too might be replaced by -"ctor" etc at some time in the future. - - -##### Why is there a SpiderMonkey JavaScript embedding? - -One of the goals of this project is realtime team collaboration and -development inside chat rooms. The embedding is intended to replace the -old notion of running a "bot" which is just a single instance of a program -that some user connects. The embedding facilitates a cloud-esque or so-called -"lambda" ecosystem of many untrusted user-written modules that are stored -and managed by the server. - -*The SpiderMonkey embedding is defunct and no longer developed. It is planned -to be succeeded by WASM.* diff --git a/doc/SETUP.md b/doc/SETUP.md deleted file mode 100644 index 5d3a40c74..000000000 --- a/doc/SETUP.md +++ /dev/null @@ -1,73 +0,0 @@ -## SETUP - -This guide will help you execute Construct for the first time. 
If you are -building from source code and have not already done so please follow the -instructions in [BUILD](BUILD.md) before continuing here. - -#### NOTES - -- We will refer to your server as `host.tld`. For those familiar with matrix: -this is your _origin_ and mxid `@user:host.tld` hostpart. If you delegate -your server's location to something like `matrix.host.tld:1234` we refer to -this as your _servername_. - -> Construct clusters all share the same _origin_ but each individual instance -of the daemon has a unique _servername_. - -- If you built construct yourself as a standalone build you will need to add -the included library directories before executing: -`export LD_LIBRARY_PATH=/path/to/src/deps/boost/lib:$LD_LIBRARY_PATH` -`export LD_LIBRARY_PATH=/path/to/src/deps/rocksdb:$LD_LIBRARY_PATH` - -### PROCEDURE - -1. Execute - - There are two arguments: ` [servername]`. If the _servername_ - argument is missing, the _origin_ will be used for it instead. - - ``` - bin/construct host.tld - ```` - > There is no configuration file. - - > Log messages will appear in terminal concluding with notice `IRCd RUN`. - - -2. Strike ctrl-c on keyboard - > The command-line console will appear. - - -3. Create a general listener socket by entering the following command: - - ``` - net listen matrix * 8448 privkey.pem cert.pem chain.pem - ``` - - `matrix` is your name for this listener; you can use any name. - - `*` and `8448` is the local address and port to bind. - - `privkey.pem` and `cert.pem` and `chain.pem` are paths (ideally - absolute paths) to PEM-format files for the listener's TLS. - - > The Matrix Federation Tester should now pass. Browse to - https://matrix.org/federationtester/api/report?server_name=host.tld and - verify `"AllChecksOK": true` - -4. 
To use a web-based client like Riot, configure the "web root" directory -to point at Riot's `webapp/` directory by entering the following: - ``` - conf set ircd.web.root.path /path/to/riot-web/webapp/ - mod reload web_root - ``` - -6. Browse to `https://host.tld:8448/` and register a user. - - -### ADDENDUM - -* If you are employing a reverse-proxy you must review the apropos section in -the [TROUBLESHOOTING](TROUBLESHOOTING.md#trouble-with-reverse-proxies-and-middlewares) -guide or the server may not operate correctly. - -* Logging to files is only enabled by default for CRITICAL, ERROR, and WARNING. -It is not enabled by default for the INFO level. To enable, use `conf set -ircd.log.info.file.enable true`. diff --git a/doc/STYLE.md b/doc/STYLE.md deleted file mode 100644 index e0e9b2a0e..000000000 --- a/doc/STYLE.md +++ /dev/null @@ -1,455 +0,0 @@ -# How to CPP for IRCd - - -In the post-C++11 world it is time to leave C99+ behind and seriously consider -C++ as C proper. It has been a hard 30 year journey to finally earn that, but -now it is time. This document is the effective style guide for how Charybdis -will integrate -std=gnu++17 and how developers should approach it. - - -### C++ With Respect For C People - -Remember your C heritage. There is nothing wrong with C, it is just incomplete. -There is also no overhead with C++, that is a myth. If you write C code in C++ -it will be the same C code. Think about it like this: if C is like a bunch of -macros on assembly, C++ is a bunch of macros on C. This guide will not address -any more myths and for that we refer you [here](https://isocpp.org/blog/2014/12/myths-3). - - -#### Direct initialization - -Use `=` only for assignment to an existing object. *Break your C habit right now.* -Use bracket initialization `{}` of all variables and objects. 
Fall back to parens `()` -if brackets conflict with an initializer_list constructor (such as with STL containers) -or if absolutely necessary to quash warnings about conversions. - -> Quick note to preempt a confusion for C people: -> Initialization in C++ is like C but you don't have to use the `=`. -> -> ```C++ -> struct user { const char *nick; }; -> struct user you = {"you"}; -> user me {"me"}; -> ``` -> - -* Use Allman style for complex/long initialization statements. It's like a function - returning the value to your new object; it is easier to read than one giant line. - -> ```C++ -> const auto sum -> { -> 1 + (2 + (3 * 4) + 5) + 6 -> }; -> ``` - -* Do not put uninitialized variables at the top of a function and assign them -later. - -* Even though C++17 mandates [copy elision](https://en.cppreference.com/w/cpp/language/copy_elision) -this project does not relax its comprehensive use of direct initialization. - - -#### Use full const correctness - -`const` correctness should extend to all variables, pointers, arguments, and -functions- not just "pointed-to" data. If it *can* be `const` then make it -`const` and relax it later if necessary. - - -#### Use auto - -Use `auto` whenever it is possible to use it; specify a type when you must. -If the compiler can't figure out the auto, that's when you indicate the type. - - -#### RAII will be in full force - -All variables, whether they're function-local, class-members, even globals, -must always be under some protection at all times. There must be the -expectation at *absolutely any point* including *between those points* -everything will blow up randomly and the protection will be invoked to back-out -the way you came. That is, essentially, **the juice of why we are here.** - -**This is really serious business.** You have to do one thing at a time. When you -move on to the next thing the last thing has to have already fully succeeded -or fully failed. Everything is a **transaction**. Nothing in the future exists. 
-There is nothing you need from the future to give things a consistent state. - -* The program should be effectively reversible -- should be able to "go backwards" -or "unwind" from any point. Think in terms of stacks, not linear procedures. -This means when a variable, or member (a **resource**) first comes into scope, -i.e. it is declared or accessible (**acquired**), it must be **initialized** -to a completely consistent state at that point. - -> -> Imagine pulling down a window shade to hide the sun. As you pull down, the canvas -> unrolls from its spool at the top. Your goal is to hook the shade on to the nail -> at the bottom of the window: that is reaching the return statement. If you slip -> and let go, the shade will roll back up into the spool at the top: that is an -> exception. -> -> What you can't do is prepare work on the way down which needs _any_ further pulling -> to be in a consistent state and not leak. You might slip and let go at any time for -> any reason. A `malloc()` on one line and a `free()` following it is an example of -> requiring more pulling. -> -> Indeed slipping and letting go is an accident -- but the point is that *accidents -> happen*. They're not always your fault, and many times are in other parts of the -> code which are outside of your control. This is a good approach for robust and -> durable code over long-lived large-scale projects. - - -#### Exceptions will be used - -Wait, you were trolling "respect for C people" right? **No.** If you viewed -the above section merely through the prism avoiding classic memory leaks, and -can foresee how to now write stackful, reversible, protected programs without -even calling free() or delete: you not only have earned the right, but you -**have** to use exceptions. This is no longer a matter of arguing for or -against `if()` statement clutter and checking return types and passing errors -down the stack. 
- -* Object construction (logic in the initialization list, constructor body, etc) -is actual real program logic. Object construction is not something to just -prepare some memory, like initializing it to zero, leaving an instance -somewhere for further functions to conduct operations on. Your whole program -could be running - the entire universe could be running - in some member -initializer somewhere. The only way to error out of this is to throw, and it -is perfectly legitimate to do so. - -* Function bodies and return types should not be concerned with error -handling and passing of such. They only cause and generate the errors. - -* Try/catch style note: We specifically discourage naked try/catch blocks. -In other words, **most try-catch blocks are of the -[function-try-catch](http://en.cppreference.com/w/cpp/language/function-try-block) -variety.** The style is simply to piggyback the try/catch where another block -would have been. - -> ```C++ -> while(foo) try -> { -> ... -> } -> catch(exception) -> { -> } -> ``` - -* We extend this demotion style of keywords to `do` as well, which should - avoid having its own line if possible. - -> ```C++ -> int x; do -> { -> ... -> } -> while((x = foo()); -> ``` - - -#### Encapsulation will be relaxed - -To summarize, most structures will default to being fully public unless there -is a very pressing reason to create a private section. Such a reason is not -"the user *could* break something by touching this," instead it is "the user -*will only ever* break something by touching this." - -* Do not use the keyword `class` unless your sole intent is to have the members -immediately following it be private. Using `class` followed by a `public:` -label is nubile. - -Note that public interfaces and private implementation patterns are still -widely used and encouraged, even expected, but not purely using the C++ -language features. The intent here is to allow hacking on the project to be -easy. 
We don't want to stifle creativity by getting in the way of developers -implementing new ideas which do things that weren't originally intended. -In practice, interfaces try to expose as much as possible, but require only -a tiny surface by default for actual intended use. - - -#### Pointers and References - -* The `&` or `*` prefixes the variable name; it does not postfix the type. -This is evidenced by comma-delimited declarations. There is only one exception -to this for universal references which is described later. - -> ```C++ -> int a, &b{a}, *c{&b}, *const d{&b}, *const *const e{&c}; -> ``` - -* Biblical maxim: Use references when you can, pointers when you must. - -* Pass arguments by const reference `const foo &bar` preferably, non-const -reference `foo &bar` if you must. - -* Use const references even if you're not referring to anything created yet. -const references can construct, contain, and refer to an instance of the type -with all in one magic. This style has no sympathy for erroneously expecting -that a const reference is not a local construction; expert C++ developers -do not make this error. See reasons for using a pointer below. - -* Passing by value indicates some kind of need for object construction in -the argument, or that something may be std::move()'ed to and from it. Except -for some common patterns, this is generally suspect. - -* Passing to a function with an rvalue reference argument `foo &&bar` indicates -something will be std::move()'ed to it, and ownership is now acquired by that -function. - -* In a function with a template `template`, an rvalue reference in -the prototype for something in the template `void func(foo &&bar)` is actually -a [universal reference](https://isocpp.org/blog/2012/11/universal-references-in-c11-scott-meyers) -which has some differences from a normal rvalue reference. To make this clear -our style is to move the `&&` like so `void func(foo&& bar)`. 
This is actually -useful because a variadic template foo `template` will require -the prototype `void func(foo&&... bar)`. - -* Passing a pointer, or pointer arguments in general, indicates something may -be null (optional), or to explicitly prevent local const construction which is -a rare reason. Otherwise suspect. - -* Avoid using references as object members, you're most likely just limiting -the ability to assign, move, and reuse the object because references cannot be -reseated; then the "~~big three~~" "big five" custom constructors have to be -created and maintained, and it becomes an unnecessary mess. - - -#### Miscellaneous - - -* Prefer "locality" rather than "centrality." In other words, we keep things -in as local of a scope or file as possible to where it is used. - -* new and delete should rarely if ever be seen. This is more true than ever with -C++14 std::make_unique() and std::make_shared(). - -* We allow some C-style arrays, especially on the stack, even C99 dynamic sized ones; -there's no problem here, just be responsible. - -* `alloca()` will not be used. - -* C format strings are still acceptable. This is an IRC project, with heavy -use of strings and complex formats and all the stringencies. We even have -our own custom *protocol safe* format string library, and that should be used -where possible. - -* streams and standard streams are generally avoided in this project. We could have -have taken the direction to customize C++'s stream interface to make it -performant, but otherwise the streams are generally slow and heavy. Instead we -chose a more classical approach with format strings and buffers -- but without -sacrificing type safety with our RTTI-based fmt library. - -* ~~varargs are still legitimate.~~ There are just many cases when template -varargs, now being available, are a better choice; they can also be inlined. - - * Our template va_rtti is starting to emerge as a suitable replacement - for any use of varags. 
- -* When using a `switch` over an `enum` type, put what would be the `default` case after/outside -of the `switch` unless the situation specifically calls for one. We use -Wswitch so changes to -the enum will provide a good warning to update any `switch`. - -* Prototypes should name their argument variables to make them easier to understand, except if -such a name is redundant because the type carries enough information to make it obvious. In -other words, if you have a prototype like `foo(const std::string &message)` you should name -`message` because std::string is common and *what* the string is for is otherwise opaque. -OTOH, if you have `foo(const options &options, const std::string &message)` one should skip -the name for `options &` as it just adds redundant text to the prototype. - -* Consider any code inside a runtime `assert()` statement to **entirely** -disappear in optimized builds. If some implementations of `assert()` may only -elide the boolean check and thus preserve the inner statement and the effects -of its execution: this is not standard; we do not rely on this. Do not use -`assert()` to check return values of statements that need to be executed in -optimized builds. - - -#### Comments - -* `/* */` Multi-line comments are not normally used. We reserve this for -debugging and temporary multi-line grey-outs. The goal for rarely using this -is to not impede anybody attempting to refactor or grey-out a large swath of -code. - -* `//` Primary developer comment; used even on multiple lines. - -* `///` Documentation comment; the same style as the single line comment; the -documentation is applied to code that follows the comment block. - -* `///<` Documentation comment; this documents code preceding the comment. 
- -##### Documentation will be pedantic, windy and even patronizing - -This is considered a huge anti-pattern in most other contexts where comments -and documentation are minimal, read by experts, end up being misleading, tend -to diverge from their associated code after maintenance, etc. This project is -an exception. Consider two things: - -1. This is a free and open source public internet project. The goal here -is to make it easy for many-eyeballs to understand everything. Then, -many-eyeballs can help fix comments which become misleading. - -2. Most free and open source public internet projects are written in C -because C++ is complicated with a steep learning curve. It is believed -C++ reduces the amount of many-eyeballs. A huge number of contributions -to these projects come from people with limited experience working on -their "first project." - -Therefor, writers of documentation will consider a reader which has -encountered IRCd as their first project, specifically in C++. Patronizing -explanations of common/standard C++ patterns and intricacies can be made. - - -### Art & Tableaux - - -* Tab style is **tabs before spaces**. Tabs set an indentation level and -then spaces format things *at that level*. This is one of the hardest styles -to get right and then enforce, but it looks the best for everyone. The point -here is that the tab-width becomes a personal setting -- nobody has to argue -whether it's worth 2 or 4 or 8 spaces... Remember, tabs are never used to -align things that would fall out of alignment if the tab-width changed. - -* Only one blank line at a time. While an entire section could be devoted to -*where* to create whitespace, for now, just know to only use a single blank -line to do so. There are ways to cheat. I am a huge fan of whitespace and I -will share some of these ways. For example, a comment block may end in a -line starting with `//` with no text after it. 
Combined with the allowed -completely blank line after that you now have more whitespace. - - -### Conventions - -These are things you should know when mulling over the code as a whole. Knowing -these things will help you avoid various gotchas and not waste your tim -debugging little surprises. You may or may not agree with some of these -choices (specifically the lack of choices in many cases) but that's why they're -explicitly discussed here. Conventions are not laws: they can be ignored or -overruled on a case basis. One should follow them by default. - - -#### Null termination - -- We don't rely on null terminated strings. We always carry around two points -of data to indicate such vectoring. Ideally this is a pair of pointers -indicating the `begin`/`end` like an STL iterator range. `string_view` et al -and the `buffer::` suite work this way. - -- Null terminated strings can still be used and we even still create them in -many places on purpose just because we can. - -- Null terminated creations use the BSD `strl*` style and *not* the `strn*` -style. Take note of this. When out of buffer space, such an `strl*` style -will *always* add a null to the end of the buffer. Since we almost always -have vectoring data and don't really need this null, a character of the string -may be lost. This can happen when creating a buffer tight to the length of an -expected string without a `+ 1`. This is actually the foundation of a case -to move *back* to `strn*` style but it's not prudent at this time. - -- Anything named `print*` like `print(mutable_buffer, T)` always composes null -terminated output into the buffer. These functions usually return a size_t -which count characters printed *not including null*. They may return a -`string_view`/`const_buffer` of that size (never viewing the null). - - -#### Iteration protocols - -When not using STL-iterators, you may encounter some closure/callback-based -iterator functions. Usually that's a `for_each()`. 
If we want to break out -of the loop, our conventions are as follows: - -- *find protocol* for `find()` functions. The closure returns true to break -the loop at that element, false to continue. The `find()` function itself -then returns a pointer or reference to that element. If the end of the -iteration is reached then a `find()` usually returns `nullptr` or throws an -exception, etc. - -- *test protocol* for `test()` functions (this has nothing to do with unit- -tests or development testing). This is the same logic as the find protocol -except the `test()` function itself returns true if the closure broke the -loop by returning true, or false if the end of the iteration was reached. - -- *until protocol* for `until()` functions. The closure "remains true 'till -the end." When the end is reached, true is returned. The closure returns false -to break the loop, and then false is returned from until() as well. - -Overloads of `for_each()` may be encountered accepting closures that return -`void` and others that return `bool`. The `bool` overloads use the -*until protocol* as that matches the same logic in a `for(; bool;)` loop. - - -#### nothrow is not noexcept - -Often a function is overloaded with an std::nothrow_t argument or our -util::nothrow overload template. This means the function **will not throw -a specific exception expected from the overload alternative** (or set of -exceptions, etc). Any exception may still come out of that nothrow overload; -technically including the specific exception if it came from somewhere else! - -Use the noexcept keyword with tact, not by default. Most of the project -propagates exceptions. Functions that handle their errors and are expected to -return (i.e since they catch `std::exception`), still throw special exceptions -like `ircd::ctx::terminated`. If the `catch(...)` and `noexcept` features are -used: developers must cooperate by handling ctx interruptions and propagating -terminations. 
This is not an issue on leaf and simple functions where we tend -to make use of `noexcept`, especially for non-inlines allowing for better -compiler optimizations to occur. - - -#### Indications of yielding and IO's - -There is a section on how yielding and IO can occur far up the stack from a -benign-looking callsite in ctx/README. We try to make comments to indicate -these things directly in the definitions and certainly in documentation. - -Some of those indications may say nothing more than `[GET]` and `[SET]` without -any other comment. That is the minimum acceptable marking for something which -will likely do read or write IO respectively to disk or even the network. In -any such case the `ircd::ctx` will definitely yield if that happens. - - -#### Nothing ticks - -The project makes considerable use of userspace threads which may be spawned by -various subsystems to perform tasks: some of those tasks tend to be performed at -intervals or in some cases may require scanning data at an interval (i.e timeout -check). Our style is to not wake up a context (or similarly queue a callback in -the plain event loop) for an empty dataset. In other words, when there is no -work, the program should be entirely comatose and not woken up by the OS. -For example: if you were to `strace(1)` construct and then pull the network -cable: eventually there would be complete silence. - - - -### Git / Development related - -Commits in this project tend to have a `prefix:` like `ircd::m:`. This is -simply an indicator of where the change occurred. If multiple areas of the -project are changed: first determine if the change in each area can stand on -its own and break what you're doing into multiple commits; this is generally -the case when adding a low-level feature to support something built at a higher -level. Otherwise, prefix the commit with the largest/most-fundamental area -being changed. -- Prefixes tend to just be the namespace where the change is occurring.
-- Prefixes can be an actual class name if that class has a lot of nested -assets and pretty much acts as a namespace. -- Prefixes for changes in `modules/` where code is not in any namespace tend -to be the path to the module i.e `modules/s_conf:` or `modules/client/sync:` -- Prefixes for other areas of the project can just be the directory like `doc:` -or `tools:` or `README:` - -Existing conventions for commit wording are documented here as follows: -Generally after the prefix, the most frequent words a commit starts with -are "Add", "Fix", "Move", "Remove" and "Improve" and though it is not -required, if you can classify what you're doing with one of those that -is ideal. -- The use of the word "minor" indicates that no application logic was -affected by a commit: i.e code formatting changes and "minor cleanup" etc. -- The use of the word "various" indicates many not-very-related changes -or very spread-out changes: i.e "various fixes" etc; this tends not to be -something one is proud of using. -- The use of the word "checkpoint" indicates something sloppy and -incomplete is being committed; it compiles and runs; there is a pressing -need to get it out of the dirty head for the time being. diff --git a/doc/TROUBLESHOOTING.md b/doc/TROUBLESHOOTING.md deleted file mode 100644 index 8781ac041..000000000 --- a/doc/TROUBLESHOOTING.md +++ /dev/null @@ -1,70 +0,0 @@ - -# TROUBLESHOOTING - -##### Useful program options - -Start the daemon with one or more of the following program options to make it -easier to troubleshoot and perform maintenance: - -- *-single* will start in "single user mode" which is a convenience combination -of *-nolisten -wa -console* options described below. - -- *-nolisten* will disable the loading of any listener sockets during startup. - -- *-wa* write-avoid will discourage (but not deny) writes to the database. This -prevents a lot of background tasks and other noise for any maintenance.
- -- *-console* convenience to immediately drop to the administrator console -after startup. - -- *-debug* enables full debug log output. - -##### Recovering from broken configurations - -If your server ever fails to start from an errant conf item: you can override -any item using an environment variable before starting the program. To do -this simply replace the '.' characters with '_' in the name of the item when -setting it in the environment. The name is otherwise the same, including its -lower case. - -Otherwise, the program can be run with the option `-defaults`. This will -prevent initial loading of the configuration from the database. It will -not prevent environment variable overrides (as mentioned above). Values -will not be written back to the database unless they are explicitly set by -the user in the console. - - -##### Recovering from database corruption - -In very rare cases after a hard crash the journal cannot completely restore -data before the crash. Due to the design of rocksdb and the way we apply it -for Matrix, data is lost in reverse chronological order starting from the most recent -transaction (matrix event). The database is consistent for all events up until -the first corrupt event, called the point-in-time. - -When any loss has occurred the daemon will fail to start normally. To enable -point-in-time recovery use the command-line option `-pitrecdb` at the next -invocation. - -##### Trouble with reverse proxies and middlewares - -Construct is designed to be capable internet service software and should -perform best when directly interfacing with remote parties. Nevertheless, -some users wish to employ middlewares known as "reverse-proxies" through -which all communication is forwarded. This gives the appearance, from the -server's perspective, that all clients are connecting from the same IP -address on different ports.
- -At this time there are some known issues with reverse proxies which may be -mitigated by administrators having reviewed the following: - -1. The connection limit from a single remote IP address must be raised from -its default, for example by entering the following in !control or console: - -``` -conf set ircd.client.max_client_per_peer 65535 -``` - -2. The server does not yet support non-SSL listening sockets. Administrators -may have to generate locally signed certificates for communication from the -reverse-proxy. diff --git a/doc/TUNING.md b/doc/TUNING.md deleted file mode 100644 index ecafccf05..000000000 --- a/doc/TUNING.md +++ /dev/null @@ -1,97 +0,0 @@ -## TUNING - -This guide is intended for system administrators to optimize Construct and -maximize its performance for their environment. This does not cover [BUILD](BUILD.md) -tuning, and it is expected that Construct is already installed and the [SETUP](SETUP.md) -has been completed. - -- Some instructions may reference Construct's configuration system. This is -accessed via the administrator's console which can be reached by striking -`ctrl-c (SIGINT)` and then using the `conf` command (see: `help conf`). The -console can also be reached interactively through your preferred client in -the `!control` room. Alternatively, configuration state can be manipulated -directly through the `!conf` room. Configuration changes take effect as a -result of state events sent to the `!conf` room, thus all aforementioned -methods to change configuration are the same. - -- CHANGES TO CONFIGURATION ARE EFFECTIVE IMMEDIATELY. ERRONEOUS VALUES MAY -CAUSE UNEXPECTED BEHAVIOR AND RESULT IN PROGRAM TERMINATION. CONFIGURATION -ERRORS MAY ALSO PREVENT STARTUP. Please see the -[TROUBLESHOOTING](TROUBLESHOOTING.md#recovering-from-broken-configurations) -guide for how to recover from configuration errors. 
- - -### Event Cache Tuning - -Most of Construct's runtime footprint in RAM consists of a cache of Matrix -events read from the database. The data in many of these events may be -directly accessed for fundamental server operations; for example, a client's -access-token and user information is stored with events in special server -rooms. The event cache is a set of LRU (Least Recently Used) caches. The size -of these caches should be tuned to at least the "working-set size" expected -by the server. If these caches are too small, load will be placed -on the next storage tier. For storage devices with poor random access -characteristics it is important these caches cover the server's working-set -size. - -To list the event cache information, try the following commands (example output -shown): - -``` -> db cache events * - -COLUMN PCT HITS MISSES INSERT CACHED CAPACITY INSERT TOTAL LOCKED -* 61.94% 18742243 3818637 3814446 1.41 GiB (1517280856) 2.28 GiB (2449473536) 4.46 GiB (4787594200) 4.41 MiB (4628512) -``` - -``` -> db cache events ** - -COLUMN PCT HITS MISSES INSERT CACHED CAPACITY INSERT TOTAL LOCKED -content 17.85% 2113271 85256 83255 22.85 MiB (23962992) 128.00 MiB (134217728) 569.37 MiB (597026848) 0.00 B (0) -depth 90.71% 11292 96431 96431 58.06 MiB (60876968) 64.00 MiB (67108864) 59.68 MiB (62575248) 0.00 B (0) -event_id 9.24% 191518 153523 153523 5.92 MiB (6202768) 64.00 MiB (67108864) 865.07 MiB (907093240) 0.00 B (0) -origin_server_ts 99.99% 9852 566483 566258 64.00 MiB (67103832) 64.00 MiB (67108864) 353.29 MiB (370455584) 0.00 B (0) -room_id 99.99% 1015939 216695 216694 63.99 MiB (67102496) 64.00 MiB (67108864) 132.05 MiB (138467768) 1.93 MiB (2019088) -sender 39.18% 56357 80879 80879 50.16 MiB (52592768) 128.00 MiB (134217728) 50.36 MiB (52809616) 0.00 B (0) -state_key 40.49% 7336 89035 87181 25.91 MiB (27171856) 64.00 MiB (67108864) 383.42 MiB (402049648) 0.00 B (0) -type 99.92% 1716885 66485 66485 31.97 MiB (33527264) 32.00 MiB (33554432) 40.69 MiB 
(42667312) 0.00 B (0) -_event_idx 99.99% 652575 505956 505955 255.98 MiB (268418416) 256.00 MiB (268435456) 635.40 MiB (666268064) 23.45 KiB (24016) -_room_events 62.14% 308312 13144 13144 79.54 MiB (83405864) 128.00 MiB (134217728) 79.73 MiB (83608112) 284.73 KiB (291560) -_room_joined 52.73% 2087968 6789 6789 4.22 MiB (4422936) 8.00 MiB (8388608) 4.23 MiB (4431280) 0.00 B (0) -_room_state 25.40% 2038549 21590 21590 16.25 MiB (17044504) 64.00 MiB (67108864) 52.26 MiB (54793600) 0.00 B (0) -_room_head 26.41% 7986 9435 9435 2.11 MiB (2215192) 8.00 MiB (8388608) 37.56 MiB (39389688) 0.00 B (0) -_event_json 62.79% 82254 1166164 1166153 642.96 MiB (674189112) 1024.00 MiB (1073741824) 736.76 MiB (772552224) 3.52 MiB (3690824) -_event_refs 79.17% 54501 112508 112505 50.67 MiB (53127080) 64.00 MiB (67108864) 68.76 MiB (72098088) 0.00 B (0) -_event_type 99.77% 22 8215 8215 15.96 MiB (16738848) 16.00 MiB (16777216) 17.27 MiB (18109240) 73.93 KiB (75704) -_event_sender 0.00% 0 23453 23453 0.00 B (0) 16.00 MiB (16777216) 15.01 MiB (15739768) 0.00 B (0) -_event_horizon 99.96% 15722 18296 18296 15.99 MiB (16769768) 16.00 MiB (16777216) 18.91 MiB (19833200) 0.00 B (0) -_room_state_space 67.24% 3997 24712 24712 86.06 MiB (90241400) 128.00 MiB (134217728) 92.28 MiB (96762256) 0.00 B (0) -``` - -To view the configuration item for the size of a cache, which should match your -output from the above command, use the following command where `` is -replaced by one of the names under `COLUMN` in the above output: - -``` -conf ircd.m.dbs..cache.size -``` - -To alter a cache size, set the configuration item with a byte value. In the -example below we will set the `_event_json` cache size to 256 MiB. This change -will take effect immediately and the cache will grow or shrink to that size. - -``` -conf set ircd.m.dbs._event_json.cache.size 268435456 -``` - -> Tip: The best metric to figure out which caches are inadequate is not -necessarily the utilization percentage. 
Caches that are too small generally -exhibit high values under `INSERT TOTAL` as well as full utilization. If this -value is several times higher than the cache size and growing, consider -increasing that cache's size. - - -### Client Pool Tuning - -(TODO)