forgejo/modules/keying/keying.go

// Copyright 2024 The Forgejo Authors. All rights reserved.
// SPDX-License-Identifier: MIT

// Keying is a module that allows for subkeys to be deterministically generated
// from the same master key. It allows for domain separation to take place by
// using new keys for new subsystems/domains. These subkeys are provided with
// an API to encrypt and decrypt data. The module panics if a bad interaction
// happens; such a panic should be treated as a non-recoverable error.
//
// HKDF (per RFC 5869) is used to derive new subkeys in a safe manner. It
// provides a KDF security property, which is required for Forgejo, as the
// secret key would be an ASCII string and isn't a uniformly random bit string.
// XChaCha20-Poly1305 (per draft-irtf-cfrg-xchacha-01) is used as AEAD to
// encrypt and decrypt messages. A fresh random nonce is generated for every
// encryption. The nonce gets prepended to the ciphertext.
package keying

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"

	"golang.org/x/crypto/chacha20poly1305"
	"golang.org/x/crypto/hkdf"
)

var (
	// The hash used for HKDF.
	hash = sha256.New
	// The AEAD used for encryption/decryption.
	aead          = chacha20poly1305.NewX
	aeadKeySize   = chacha20poly1305.KeySize
	aeadNonceSize = chacha20poly1305.NonceSizeX
	// The pseudorandom key generated by HKDF-Extract.
	prk []byte
)

// Init sets the main IKM (input keying material) for this module. It must be
// called before DeriveKey.
func Init(ikm []byte) {
	// Salt is intentionally left empty; it's not useful for Forgejo's use case.
	prk = hkdf.Extract(hash, ikm, nil)
}

// Context specifies the context for which a subkey should be derived.
// This must be a hardcoded string and must not be arbitrarily constructed.
type Context string

// Used for the `push_mirror` table.
var ContextPushMirror Context = "pushmirror"

// DeriveKey derives *the* key for a given context. This is a deterministic
// function: the same key will be provided for the same context. It panics if
// Init has not been called.
func DeriveKey(context Context) *Key {
	if len(prk) == 0 {
		panic("keying: not initialized")
	}

	r := hkdf.Expand(hash, prk, []byte(context))

	key := make([]byte, aeadKeySize)
	// This should never return an error, but if it does, panic.
	if _, err := r.Read(key); err != nil {
		panic(err)
	}

	return &Key{key}
}

// Key is a subkey derived for a specific context, see DeriveKey.
type Key struct {
	key []byte
}

// Encrypt encrypts the specified plaintext with some additional data that is
// tied to this plaintext. The additional data can be seen as the context in
// which the data is being encrypted. This is different from the context for
// which the key was derived and allows for more granularity without deriving
// new keys. Avoid passing any user-generated data into the additional data.
// The most common usage of this would be to encrypt a database field; in that
// case use the ID and database column name as additional data. The additional
// data isn't appended to the ciphertext and may be publicly known, but it must
// be available when decrypting the ciphertext.
func (k *Key) Encrypt(plaintext, additionalData []byte) []byte {
	// Construct a new AEAD with the key.
	e, err := aead(k.key)
	if err != nil {
		panic(err)
	}

	// Generate a random nonce.
	nonce := make([]byte, aeadNonceSize)
	if _, err := rand.Read(nonce); err != nil {
		panic(err)
	}

	// Returns the ciphertext of this plaintext.
	return e.Seal(nonce, nonce, plaintext, additionalData)
}

// Decrypt decrypts the ciphertext and authenticates it against the additional
// data that was given when it was encrypted. It returns an error if the
// authentication failed and panics if the ciphertext is too short to contain
// a nonce.
func (k *Key) Decrypt(ciphertext, additionalData []byte) ([]byte, error) {
	if len(ciphertext) <= aeadNonceSize {
		panic("keying: ciphertext is too short")
	}

	e, err := aead(k.key)
	if err != nil {
		panic(err)
	}

	nonce, ciphertext := ciphertext[:aeadNonceSize], ciphertext[aeadNonceSize:]

	return e.Open(nil, nonce, ciphertext, additionalData)
}

// ColumnAndID generates a context that can be used as additional data for
// encrypting and decrypting data. It requires the column name and the row ID
// (which must be known beforehand). Be careful when using this, as the table
// name isn't part of this context, so the result isn't bound to a particular
// table. The table should be part of the context that the key was derived
// for, in which case it binds through that.
func ColumnAndID(column string, id int64) []byte {
	return binary.BigEndian.AppendUint64(append([]byte(column), ':'), uint64(id))
}
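
The derivation done by `Init` and `DeriveKey` can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: `extract` and `expand` are hypothetical helper names, the IKM and the second context string are made up, and `expand` only produces a single HKDF output block (enough for a 32-byte key with SHA-256). It shows that the same context always yields the same subkey and that a different context yields a different one.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// extract is HKDF-Extract (RFC 5869, section 2.2) with an empty salt.
// RFC 5869 treats a missing salt as HashLen zero bytes.
func extract(ikm []byte) []byte {
	salt := make([]byte, sha256.Size)
	mac := hmac.New(sha256.New, salt)
	mac.Write(ikm)
	return mac.Sum(nil)
}

// expand is HKDF-Expand (RFC 5869, section 2.3) restricted to one output
// block: T(1) = HMAC(prk, info || 0x01). With SHA-256 that is 32 bytes.
func expand(prk []byte, info string) []byte {
	mac := hmac.New(sha256.New, prk)
	mac.Write([]byte(info))
	mac.Write([]byte{0x01})
	return mac.Sum(nil)
}

func main() {
	// Hypothetical IKM; in Forgejo this would be the configured secret key.
	prk := extract([]byte("example SECRET_KEY value"))

	a1 := expand(prk, "pushmirror")
	a2 := expand(prk, "pushmirror")
	b := expand(prk, "some-other-context") // hypothetical second context

	fmt.Println("same context, equal keys:", hmac.Equal(a1, a2))
	fmt.Println("different context, equal keys:", hmac.Equal(a1, b))
}
```

Because the context is the only input that varies between subkeys, keeping it a hardcoded constant is what makes the domain separation predictable.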

[SEC] Add `keying` module

The keying module tries to solve two problems: the lack of key separation and the lack of an AEAD being used for encryption. The currently used `secrets` doesn't provide this and is hard to adjust to provide this functionality.

For encryption, the additional data is now a parameter that can be used, as the underlying primitive is an AEAD construction. This allows for context binding to happen and can be seen as defense-in-depth; it ensures that if a value X is encrypted for context Y (e.g. ID=3, Column="private_key") it will only decrypt if that context Y is also given in the Decrypt function. This makes confused deputy attacks harder to exploit.[^1]

For key separation, HKDF is used to derive subkeys from some IKM, which is the value of the `[service].SECRET_KEY` config setting. The contexts for subkeys are hardcoded; any variable should be shuffled into the additional data parameter when encrypting.

[^1]: This is still possible, because the used AEAD construction is not key-committing. For Forgejo's current use case this risk is negligible, because the subkeys aren't known to a malicious user (which is required for such an attack), unless they also have access to the IKM (at which point you can assume the whole system is compromised). See https://scottarc.blog/2022/10/17/lucid-multi-key-deputies-require-commitment/

2024-08-20 23:13:04 +02:00
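
The context binding described in this commit message can be demonstrated with a small standalone sketch. To stay within the standard library it swaps in AES-256-GCM for XChaCha20-Poly1305, so it is not the cipher the module uses, and the names `columnAndID`, `newAEAD`, `seal` and `open` as well as the all-zero demo key are assumptions made for illustration only. The additional data is built the same way as `ColumnAndID` builds it.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// columnAndID mirrors the encoding of ColumnAndID: the column name,
// a ':' separator, then the row ID as 8 big-endian bytes.
func columnAndID(column string, id int64) []byte {
	return binary.BigEndian.AppendUint64(append([]byte(column), ':'), uint64(id))
}

// newAEAD builds AES-256-GCM, used here only as a standard-library
// stand-in for XChaCha20-Poly1305.
func newAEAD(key []byte) cipher.AEAD {
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		panic(err)
	}
	return gcm
}

// seal returns nonce || ciphertext, with a fresh random nonce.
func seal(a cipher.AEAD, plaintext, ad []byte) []byte {
	nonce := make([]byte, a.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		panic(err)
	}
	return a.Seal(nonce, nonce, plaintext, ad)
}

// open splits off the nonce and authenticates against ad.
func open(a cipher.AEAD, ct, ad []byte) ([]byte, error) {
	n := a.NonceSize()
	if len(ct) < n {
		return nil, errors.New("ciphertext too short")
	}
	return a.Open(nil, ct[:n], ct[n:], ad)
}

func main() {
	// All-zero demo key; a real key would be derived via HKDF.
	a := newAEAD(make([]byte, 32))
	ct := seal(a, []byte("secret"), columnAndID("private_key", 3))

	_, err := open(a, ct, columnAndID("private_key", 3))
	fmt.Println("same context decrypts:", err == nil)

	_, err = open(a, ct, columnAndID("private_key", 4))
	fmt.Println("other row ID decrypts:", err == nil)
}
```

Moving the ciphertext to another row or column changes the additional data, so authentication fails even though the key is the same.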

[FEAT] Allow pushmirror to use publickey authentication

- Continuation of https://github.com/go-gitea/gitea/pull/18835 (by @Gusted, so it's fine to change copyright holder to Forgejo).
- Add the option to use SSH for push mirrors. This allows the deploy keys feature to be used instead of tokens, which cannot be limited to a specific repository. The private key is stored encrypted (via the `keying` module) in the database and NEVER given to the user, to avoid accidental exposure and misuse.
- CAVEAT: This does require the `ssh` binary to be present, which may not be available in containerized environments. This could be solved by adding an SSH client into forgejo itself and using the forgejo binary as SSH command, but should be done in another PR.
- CAVEAT: Mirroring of LFS content is not supported. This would require the previously stated problem to be solved due to LFS authentication (an attempt was made at forgejo/forgejo#2544).
- Integration test added.
- Resolves #4416

2024-08-04 20:46:05 +02:00