Upgrades Hydra to the latest master/flake branch. To perform this
upgrade, it's needed to do a non-trivial db-migration which provides a
massive performance-improvement[1].
The basic ideas behind multi-step upgrades of services between NixOS versions
have been gathered already[2]. For further context it's recommended to
read this first.
Basically, the following steps are needed:
* Upgrade to a non-breaking version of Hydra with the db-changes
(columns are still nullable here). If `system.stateVersion` is set to
something older than 20.03, the package will be selected
automatically, otherwise `pkgs.hydra-migration` needs to be used.
* Run `hydra-backfill-ids` on the server.
* Deploy either `pkgs.hydra-unstable` (for Hydra master) or
`pkgs.hydra-flakes` (for flakes-support) to activate the optimization.
The steps are also documented in the release-notes and in the module
using `warnings`.
`pkgs.hydra` has been removed as latest Hydra doesn't compile with
`pkgs.nixStable` and to ensure a graceful migration using the newly
introduced packages.
To verify the approach, a simple vm-test has been added which verifies
the migration steps.
[1] https://github.com/NixOS/hydra/pull/711
[2] https://github.com/NixOS/nixpkgs/pull/82353#issuecomment-598269471
While our ETag patch works pretty fine if it comes to serving data off
store paths, it unfortunately broke something that might be a bit more
common, namely when using regexes to extract path components of
location directives for example.
Recently, @devhell has reported a bug with a nginx location directive
like this:
location ~^/\~([a-z0-9_]+)(/.*)?$" {
alias /home/$1/public_html$2;
}
While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:
*2 http script copy: "/home/"
*2 http script capture: "foo"
*2 http script copy: "/public_html/"
*2 http script capture: "bar.txt"
In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:
lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)
During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.
Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently skip return value if realpath(3)
has failed.
However since we're using the unaltered root from the config we have
another issue, consider this root:
/nix/store/...-abcde/$1
Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.
While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.
I added a small NixOS VM test, which uses the example above as a
regression test.
It seems that my memory is failing these days, since apparently I *knew*
about this issue since digging for existing issues in nixpkgs, I found
this similar pull request which I even reviewed:
https://github.com/NixOS/nixpkgs/pull/66532
However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.
[1]: https://github.com/NixOS/nixpkgs/pull/48337
Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
Acked-by: @7c6f434c
Acked-by: @yorickvP
Merges: https://github.com/NixOS/nixpkgs/pull/80671
Fixes: https://github.com/NixOS/nixpkgs/pull/66532
Dropbear lags behind OpenSSH significantly in both support for modern
key formats like `ssh-ed25519`, let alone the recently-introduced
U2F/FIDO2-based `sk-ssh-ed25519@openssh.com` (as I found when I switched
my `authorizedKeys` over to it and promptly locked myself out of my
server's initrd SSH, breaking reboots), as well as security features
like multiprocess isolation. Using the same SSH daemon for stage-1 and
the main system ensures key formats will always remain compatible, as
well as more conveniently allowing the sharing of configuration and
host keys.
The main reason to use Dropbear over OpenSSH would be initrd space
concerns, but NixOS initrds are already large (17 MiB currently on my
server), and the size difference between the two isn't huge (the test's
initrd goes from 9.7 MiB to 12 MiB with this change). If the size is
still a problem, then it would be easy to shrink sshd down to a few
hundred kilobytes by using an initrd-specific build that uses musl and
disables things like Kerberos support.
This passes the test and works on my server, but more rigorous testing
and review from people who use initrd SSH would be appreciated!
This mirrors the behaviour of systemd - It's udev that parses `.link`
files, not `systemd-networkd`.
This was originally applied in 36ef112a47,
but was reverted due to 1115959a8d causing
evaluation errors on hydra.
...even when networkd is disabled
This reverts commit ce78f3ac70, reversing
changes made to dc34da0755.
I'm sorry; Hydra has been unable to evaluate, always returning
> error: unexpected EOF reading a line
and I've been unable to reproduce the problem locally. Bisecting
pointed to this merge, but I still can't see what exactly was wrong.
- Fix misspelled option. mkRenamedOptionModule is not used because the
option hasn't really worked before.
- Add missing cfg.telemetryPath arg to ExecStart.
- Fix mkdir invocation in test.
Add a cage module to nixos. This can be used to make kiosk-style
systems that boot directly to a single application. The user (demo by
default) is automatically logged in by this service and the
program (xterm by default) is automatically started.
This is useful for some embedded, single-user systems where we want
automatic booting. To keep the system secure, the user should have
limited privileges.
Based on the service provided in the Cage wiki here:
https://github.com/Hjdskes/cage/wiki/Starting-Cage-on-boot-with-systemd
Co-Authored-By: Florian Klink <flokli@flokli.de>
* prometheus-nginx-exporter: 0.5.0 -> 0.6.0
* nixos/prometheus-nginx-exporter: update for 0.6.0
Added new option constLabels and updated virtualHost name in the
exporter's test.
This is useful when buildLayeredImage is called in a generic way
that should allow simple (base) images to be built, which may not
reference any store paths.
I am not sure how this ever passed on hydra but 30s is barely enough to
pass the configure phase of opensmtpd. It is likely the package was
built as part of another jobset. Whenever it is built as part of the
test execution the timeout propagates and 30s is clearly not enough for
that.
The subtest was mainly written to demonstrate the VRF-issues with a
5.x-kernel. However this breaks the entire test now as we have 5.4 as
default kernel. Disabling the test for now, I still need to find some
time to investigate.
Depending on the network management backend being used, if the interface
configuration in stage 1 is not cleared, there might still be some old
addresses or routes from stage 1 present in stage 2 after network
configuration has finished.
This makes predictable interfaces names available as soon as possible
with udev by adding the default network link units to initrd which are read
by udev. Also adds some udev rules that are needed but which would normally
loaded from the udev store path which is not included in the initrd.
invalid test was introduced in 297d1598ef
and it is disabled in the shipped daemon.conf.
I forgot to reflect that in the module, which caused the daemon to print the following on start-up:
FuEngine invalid has incorrect built version invalid
and the command to warn:
WARNING: The daemon has loaded 3rd party code and is no longer supported by the upstream developers!
To reduce the change of this happening in the future, I moved the list of default disabled plug-ins to the package expression.
I also set the value of the NixOS module option in the config section of the module instead of the default value used previously,
which will allow users to not care about these plug-ins.
In 87a19e9048 I merged staging-next into master using the GitHub gui as intended.
In ac241fb7a5 I merged master into staging-next for the next staging cycle, however, I accidentally pushed it to master.
Thinking this may cause trouble, I reverted it in 0be87c7979. This was however wrong, as it "removed" master.
This reverts commit 0be87c7979.
I merged master into staging-next but accidentally pushed it to master.
This should get us back to 87a19e9048.
This reverts commit ac241fb7a5, reversing
changes made to 76a439239e.
The old Quake3 NixOS test was removed in
50ea99cbc1 which served as a nice demo to
showcase what NixOS tests are capable of. This commit adds the same
functionality to run real openarena clients.