Authors: Aljosha Judmayer, Philipp Schindler
Status: DRAFT
Version: 0.1
Last revision: 2021-05-07
Initial release: -
This document describes a secret sharing scheme with a particular focus on the use case of sharing random mnemonic phrases. For example, this scheme may be used within the context of hierarchical deterministic (HD) cryptocurrency wallets. With this focus in mind, this specification is designed to be fully compatible with existing encodings of mnemonic phrases, in particular BIP39. It can be used to split a given (already existing) BIP39 mnemonic phrase of 12, 15, 18, 21, or 24 words, into up to shares, such that any configurable subset of shares of size can be used to recover the original mnemonic phrase from the shares. However, this specification is not limited to the use of BIP39 for encoding and decoding of mnemonic phrases, but rather specifies the secret sharing on the level of byte sequences. Alternative encodings such as, e.g., bytewords, may be used instead of BIP39. The design is intentionally kept as simple as possible, and only adds minor modifications, in particular an integrated checksum mechanism, to the original secret sharing approach described by Shamir (1979).
This document is not an official standard; it is published without any warranty for informational purposes and aims to: consolidate community best practices, foster public review as well as contribution and motivate the creation of official standards based on the herein provided material.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://github.com/de-centralized-systems/sssmp.
This work is licensed under the Creative Commons Attribution 4.0 International License.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 (RFC2119, RFC8174) when, and only when, they appear in all capitals, as shown here.
The demand of deriving and managing an increasing number of secret keys has risen in recent years. This is mostly due to the widened adoption of cryptocurrencies and their increase in valuation. These developments have fostered the creation of several different approaches to derive, encode and manage cryptographic keying material, within the respective community.
This specification attempts to consolidate community best practices and well known and established cryptographic techniques, in relation to the secret sharing of cryptographic keying material. Therefore, core requirements and design goals are formulated and separated in into different layers and processing steps of managing and deriving cryptographic keying material. This structure should provide the necessary bigger picture and define the context in which this specification has to be interpreted. The aim of this document is to specify a minimalistic and interoperable method for the secret sharing of cryptographic keying material intended for manual/human processing, e.g., as used in private offline/paper backups of mnemonic seed phrases.
To clarify the scope of this document, the relation to other processing steps in deriving and managing cryptographic keying material is outlined in the following.
Generation of a cryptographic key At this point a new key is generated from a cryptographically secure source of randomness. In most cases, this marks the first step in a hierarchically deterministic derivation of further keying material. Therefore, an initial key is sometimes also called seed-key. This step is not within the scope of this document and MUST have taken place before the herein specified scheme can be applied to a key .
Derivation of a cryptographic key A generated, or derived, key can be used to derive one or multiple further keys from the input key according to a specified scheme like for example BIP32. or BIP44. The resulting key is then called derived key. This step is not within the scope of this document and MAY have taken place before the here specified scheme can be applied to a derived key .
Hidden derivation of a cryptographic key A hidden derivation step can be applied to a generated, or derived, key to allow for hidden derivation paths, plausible deniability through multiple derivation paths, or prevent the computation of subsequent derivation steps in the key hierarchy. Therefore, this step requires an additional secret/key as input, such as the utilization of a passphrase in BIP39. The additional user supplied passphrase is then used as a message in PBKDF2 with the reconstructed secret as the password. This step is out-of-scope of this document and MAY be use in combination with the herein specified scheme. For example, such a step can be applied after a secret has been reconstructed and before it is processed in further derivation steps.
Mnemonic encodeing/decodeing of a cryptographic key A generated or derived key can be represented as a mnemonic phrase to improve human readability and easier manual copying of keying material or key shares. Examples for such schemes are BIP39, or bytewords. This step is out-of-scope of this document and such techniques SHOULD be used in combination with the herein specified scheme.
Secret sharing of a cryptographic key A generated or derived key is shared among a set of participants, where should be able to reconstruct the original key . In this specification we focus exclusively on this step.
This specification is based on Shamir Secret Sharing (SSS) and adds a checksum mechanism in the construction step. The added checksum does not affect the reconstruction of classical SSS if it is not checked. The specified scheme is intended to share high min-entropy cryptographic keying material, i.e., a secret , among a set of participants, such that at least are required to reconstruct the original secret . A detailed comparison between related implementations of secret sharing schemes based on SSS and utilized in this context of cryptocurrencies (like BIP39, BC-SSKR, Shamir39, etc.) is out of scope of this document.
The following table provides an overview of the used notation in the remainder of this document.
| Symbol | Description |
|---|---|
| total number of shares to be created | |
| minimum number of shares required for the recovery process | |
| number of bytes of a given secret or share | |
| the secret value to be shared, represented by a sequence of bytes | |
| an index variable used for the secret shares; or depending on the context | |
| an index variable used for the coefficient of the secret sharing polynomials; | |
| an index variable used to denote the byte in the secret, a share or the secret sharing polynomial; | |
| a share of the secret ; consists a share index (an integer between and ) and share value (a sequence of bytes) | |
| the coefficients of the secret sharing polynomials as a sequence of bytes; the byte represents the coefficient of the secret sharing polynomial | |
| the secret sharing polynomial used to share the byte of the secret | |
| concatenation of two sequences of bytes and | |
| the byte of the sequence of bytes , 0-based indices used | |
| the first bytes of the sequence of bytes |
In the following, we specify the process of creating shares for a given secret (a sequence of bytes) such that any subset of at least of those shares can be used to recover the shared value.
Input:
Output:
Note that typical values for and are in the range of . If , the number of shares required for the recovery is one. In this case, each share created is just the value being shared. If , all created shares are required for the recovery process. If, for example, and , then shares are created in total. and each combination of at least shares can be used to recover the shared secret.
The process of secret sharing is executed on a byte-per-byte basis, i.e., each of the bytes of the secret to be shared is processed individually. This allows for a simple and efficient implementation using the finite field with 256 elements. In this field addition and subtraction of two field elements (two bytes) are defined using the bitwise xor operation. Multiplication of two field elements (again two bytes) is defined using the AES reducing polynomial . For additional background information regarding finite field arithmetic we refer the reader to the Wikipedia page on Finite Field Arithmetic. The used secret sharing polynomials of degree are evaluated over :
Note that here the values of are used to represent lists of coefficients, whereas the element of each of the lists of coefficients is used to define the secret sharing polynomial for sharing the byte of the secret .
Each list of coefficients is defined as follows:
Hereby, the function creates a list of random bytes. The function MUST use a cryptographically secure source of randomness. If , the last coefficient is derived from the secret and a sequence of random bytes. This is done as defined by the function for the purpose of including a checksum as the last bytes of .
Here, is used to denote the first (leftmost) bytes of the given HMAC. The use of as argument to the HMAC message, ensures that all bits of the coefficients are (pseudo-) random. Therefore, even if a the same secret is shared multiple times, the last coefficient is different with high probability. Given this definition of coefficients (and thus of the secret sharing polynomials), the shares are computed as follows:
Note that a share is a tuple of two values and MUST include the share index as well as the share value, obtained by evaluating the secret sharing polynomials and concatenating the results.
The only difference to Shamir's original secret sharing is the definition of coefficient . In Shamir's original description, all coefficients are selected at random, whereas in our approach all but the last bytes of the list of coefficients are selected at random, whereas the last bytes are pseudo-randomly derived from the secret and all but the last bytes of using an HMAC.
This section describes how to reconstruct a secret from a set of given shares, computed with the method specified in Section Sharing. Implementations MUST be able to successfully recover the shared secret , if all given shares are valid and a sufficient number of shares is provided.
Implementation MAY provide additional functionality to aid with the recovery process in case invalid shares are provided. In this case implementations MUST provide a suitable warning message to the user to indicate the issue(s) with the provided shares. Advanced recovery functionality MUST explicitly be invoked by the user.
Input:
Output:
Implementations MUST reject inputs prior to attempting the recovery process, and display a corresponding error message in the following cases:
If the inputs are successfully validated according to these plausibility checks, implementations attempt to recover the shared secret from the given set of shares . For this purpose the underlying secret sharing polynomials are evaluated using the Lagrange interpolation formula:
To recover the shared secret we apply the above formula for :
Neglecting the notational differences covering the byte-per-byte execution of the process, this approach is essentially equal to Shamir's original description.
For the case where , each generated share is equal to the shared secret itself. In this case there is no integrated checksum and the checksum verification steps are skipped. The value is directly returned as result of the recovery process.
Otherwise, after recovery, but before returning the recovered secret to the user, the integrated checksum MUST be verified. In case the verification fails, the user MUST be informed about the invalid checksum, and the recovered secret MUST NOT be displayed without such a warning.
The last list of coefficients used to define the secret sharing polynomials were generated to include a (randomized) checksum using the HMAC mechanism introduced in Section Sharing. In order to verify this checksum, first the value of the coefficients are computed from the set of shares supplied for the recovery process. This, e.g., MAY be accomplished using the schemata of divided differences illustrated in the following. The process of recovering the values of the coefficients using the schemata of divided differences, as the secret sharing process itself, is performed on a byte-per-byte basis for each byte in the list of coefficients using as index variable:
Here, denotes the dividing difference of the secret sharing polynomial and is recursively defined as given below:
Note that, as with the secret sharing process itself, all computations are performed in . Subtraction in is the same as addition and implemented using a binary xor operation.
With the value of recovered using the above process, and the values of obtained during the recovery process, the integrated checksum is verified by first getting the value as all but the last bytes of and then recomputing and comparing the value of using the function:
The checksum is valid if, and only if, the above equality holds. If the checksum is found valid, the value is returned as result of the recovery process. Otherwise the user MUST be informed of the failed recovery attempt. In this case, implementations MAY choose to return the recovery secret anyway, but they MUST NOT do so without provided the user with a clear indication of the failed checksum verification.
Besides a set of at least valid shares with their associated indices, no additional data is needed to reconstruct the previously shared secret. Although, for practical purposes it is RECOMMENDED that implementations also generate a unique ID which represents the individual secret sharing session.
This secret sharing session identifier SHOULD be written down together with the mnemonic encoding of the share as well as its index. This ensures that shares of multiple different sessions do not get mixed and thus confused with each other.
Note that, if shares of multiple different sessions with large values become mixed, this might lead to a situation where it is no longer computationally feasible to perform a brute-force search to find the valid combinations that allow for the reconstruction of the original secrets.
The minimal input size of the secret to be shared with the secret sharing mechanism specified in this document is limited to bits ( bytes). This is equivalent to the lower bound for cryptographic secrets considered by a range of other designs in this field. Any implementation of this specification MUST NOT allow secrets smaller than this size to be shared. In contrast to Shamir's original secret sharing description, which provides information theoretic security, our variant (due to the inclusion of an integrated checksum) is not designed for sharing low-entropy secrets and may become insecure under these circumstances. Also this specification MUST NOT be used as basis for an implementation for secret sharing of arbitrary data. It is instead intended to share cryptographic secrets with a minimum of bits of entropy. If against this rule, a very low entropy secret were to be shared, the inclusion of a bit checksum may allow an attacker with possession of shares to brute-force the shared secret, using the checksum as a means to verify if a guess of the secret is correct.
To share arbitrary data, it is RECOMMENDED to encrypt the data using a suitable (symmetric) encryption algorithm and only share the (symmetric) encryption key using the method specified in this document.
The specification defines the secret sharing process on the level of bytes. It is intended to be used with encoding schemes such as BIP39 or Bytewords to encode/decode the in- and outputs. These encoding schemes are already designed to add additional safeguards to detect errors, e.g., they include a dedicated checksum and represent bytes as words with a minimum required Damerau-Levenshtein distance etc. Therefore, no additional checksums and metadata is appended to the raw data output by the secret sharing algorithm in this specification.
Unfortunately, these encoding-level checksums used in mnemonics do not allow for error detection in cases where syntactically correct, but semantically invalid, shares are provided to the recovery process. These cases include, the following scenarios:
In both cases the recovery process would fail silently resulting in a recovered secret that is different to the secret originally shared, if no additional countermeasures are implemented. It is highly desirable to detect, and potentially be able to recover from, these scenarios. Therefore, this specification includes a checksum mechanism, by deriving the last coefficient of the secret sharing polynomial in a specific way. While there are many different approaches to cover these scenarios, the approach used in this specification achieves a range of desirable properties while minimizing potential drawbacks:
A theoretical drawback is the loss of information theoretical security as the value of includes a bit checksum, derived cryptographically from (i) the secret to shared and (ii) at least 64 additional bits of entropy.
The specification is intentionally designed such that it can be used with different mnemonic schemes and different key derivation schemes building on top of the shared cryptographic keying material. This approach follows the separation of concerns principle as generation, derivation, mnemonic encoding/decoding and secret sharing of cryptographic key material are separate tasks and can thus be defined and implemented separately (see Section Scope for more details). As long as the intermediate formats are well defined, this loose coupling allows for easier replacement and updates of individual components in software stacks and tool chains. The document at hand specifies a method for secret sharing of cryptographic keying material.
The specified secret sharing scheme in this document MAY safely be invoked multiple times with the same secret as input. The resulting shares for each invocation are different and independent from the shares of the other invocations (for the special of case all generated shares are equal to the secret being shared). Shares from different secret sharing instances cannot (and are by design not supposed to) be combined to recover a secret shared this way.
Even in the worst case, i.e., the shortest possible secret ( bytes) with the recovery threshold of , the probability of picking the same secret sharing polynomial (in this case only defined by the secret and random bits) is low for reasonable numbers of invocations. The probability can be computed via the formula of the birthday problem given below, using to denote the number of secret sharing invocations:
For for example, this results in a probability of .
The specification implicitly covers the use case of sharing the same secret among different (independent) groups of stakeholders and/or further sharing shares of a shared secret.
We consider this as an advanced use case, and in contrast to, e.g., the SLIP39 proposal, deliberately do not complicate this specification by introducing an additional notation for groups to explicitly capture such use cases. Yet, our specification transparently allows to re-share the shares for a given secret in a natural way: This is simply accomplished by feeding a share, which should be further split up, as secret into the secret sharing procedure. While special care must be taken to ensure that the way secrets are shared and may later be recombined is noted, our approach in principle allows for an arbitrary number of nested secret sharing layers.
This design performs all secret sharing and recovery computations in the finite field . This allows the specified algorithms to process the data on a byte-per-byte level and simplifies implementations in programming languages or on hardware platform with no immediate support for arbitrary precision integer arithmetics. Using this approach, share indices are restricted to integers from to . Thus the maximum number of shares is limited to . However in practice, we only see this as minor limitation for the indented use case of sharing mnemonic phrases, where human interaction is involved.
The files
test_vectors.json and
test_vectors.py in
./src directory of the
Github repository
of this specification contain a list of secret sharing computation performed with the reference implementation.
An example test vector is given below:
{
"s": "243f6a8885a308d313198a2e03707344a4093822299f31d0082efa98ec4e6c89",
"n": 5,
"t": 3,
"c": [
"243f6a8885a308d313198a2e03707344a4093822299f31d0082efa98ec4e6c89",
"b7e151628aed2a6abf7158809cf4f3c762e7160f38b4da56a784d9045190cfef",
"324e7738926cfbe5f4bf8d8d8c31d763da06c80abb1185eb676dea5f8d95cb78"
],
"shares": [
[1, "a1904cd29d22d95c58d75f2313b557e01ce8e627aa3a6e6dc8c7c9c3304b681e"],
[2, "99c50facf4c99dbe8b3138372647ff4625c4191483a8bcfdda92d6f74c17e8b7"],
[3, "1c6a29f6ec484c31c0ffed3a3682dbe29d25c711000de3401a7be5ac9012ec20"],
[4, "c31a04b678a089b200c3f9105db04d1f38d854be8c72fca18882910fbbab79d9"],
[5, "46b522ec6021583d4b0d2c1d4d7569bb80398abb0fd7a31c486ba25467ae7d4e"]
]
},
The test vectors include secret sharing examples for all values of . As the secret sharing process is inherently randomized, the values of the coefficients used for the secret sharing process are given in the test vectors. This allows other implementations to check if the secret sharing leads to the expected shares for the given coefficients. The last 8 bytes of the coefficients in the test vectors are computed according to this specification and can be used to check the correctness of the checksum computation. Implementations MUST NOT use the provided values for purposes other than testing the correctness of the implementation.
All implementation MUST support the following functionality as specified later in this document:
Implementations MAY support additionally support functionality to aid the recovery process in the following circumstances given as non-exhaustive list below:
Depending on the number of shares and the secret sharing threshold used, in particular for small values, recovery under these (and other) circumstances is possible with high probability due to the included checksum.
Feel free to contact us if you have also implemented the specification and want your implementation to be listed on this page.
The reference implementation can be found in the
./src directory of the
Github repository of this specification.
An implementation which uses SSSMP as a component is the BIP39 Toolkit. The BIP39 toolkit was developed alongside this specification and provides a standalone Python implementation with a CLI-based interface using BIP39 as input and output format. Moreover it supports the generation of cryptographic keys from scratch.
2021-05-07