# Cryptography¶

Uplink does not implement any new cryptographic protocols, the authentication, encryption and validation routines used are well-established protocols that are adopted across industry and recognized by standards bodies as acceptable for secure handling of commerce, trade and financial transactions.

The main cryptographic building blocks in our system are: collision-resistant hash function for the Merkle trees, pseudo-random functions, commitments, and encryption. These are outlined below:

## Hash Functions¶

Internally Uplink uses two primary hash functions for it’s block hashing, transaction hashing, Merkle tree construction, and address derivation. These are used both for data integrity and as a random oracle process.

- SHA-256 (returns 256 bit unsigned integers)
- RIPEMD-160 (returns 160 bit unsigned integers)

The chosen hash functions are cryptographic hash functions which have the properties:

**Preimage Resistence**: Given a hash value x it is computationally difficult to find a preimage y such that.

**Second Preimage Resistence**: Given an input x0 is computationally difficult to find another input x1 such that:

**Collision Resistence**: For all inputs, it is computationally difficult to find two different preimages x0 and x1 such that their hashes collide:

## Elliptic Curves¶

Elliptic curves are algebraic curves defined over an underlying field \(\mathbb{F}\). The set of all points \((x,y) \in \mathbb{F}^2\) on the elliptic curve E over the field \(\mathbb{F}\) together with a special “point at infinity” \(\mathcal{O}\) is written as \(E(\mathbb{F})\). An algebraic group structure emerges for the set \(E(\mathbb{F})\) with the point \(\mathcal{O}\) taken to be the zero element with chord-and-tangent taken as the sum operation between two points \(P\) and \(Q\).

Elliptic curve cryptography is a modern standard for constructing key pairs
based on the intractability of the elliptic curve discrete logarithm problem
(*ECDLP*). Given a multiple \(Q\) of \(P\), the elliptic curve discrete
log problem is to find \(n\in\mathbb{Z}\) such that \(nP=Q\). This
problem is widely believed not to be computable in sub-exponential time.

Elliptic curves allow the same degree of security as classic integer techniques
(i.e. RSA) but are more space efficient in the resulting keys and computations.
All transactions in the Uplink digitally signed using the Elliptic Curve Digital
Signature Algorithm (*ECDSA*) to verify the originator of all submitted
transactions on the system. This prevents repudiation of transactions and
associates all users with a unique universally identifiable address which is
used by the system to verify integrity of the ledger state.

The fields typically used are defined over a finite integer field modulo a large prime \(p\).

The parameters of the curve a and b are the coefficients in the elliptic
curve and define the space of points. In practice, several standard named curves
are used which are widely known and implemented. The parameters that define a
curve are it’s *domain parameters*.

These parameters entirely define the elliptic curve implementation:

```
p = prime field
a = curve polynomial coefficient
b = curve polynomial coefficient
g = generator base point
n = order
h = cofactor
```

Given a random integer \(k\) in \(\mathbb{F}_p\) we can construct a
point \(Q = k G\). In our elliptic curve cryptosystems the *private key* is
k and the *public key* is Q. Recovering the private key from the public key
would involve solving the elliptic curve discrete logarithm problem for
\(Q\). Elliptic curve multiplication also provides a continent set of
trapdoor functions which use the public and private keys to perform operations
with computationally irreversible inputs.

The three curves used in Uplink’s cryptographic protocol are:

### Secp256k1¶

The curve secp25k1 is a 256-bit Koblitz curve with verifiably random parameters over the finite prime field \(\mathbb{F}_p\). Where p is the prime:

The curve E: \(y^2 = x^3+ax+b\) over \(F_p\) is defined by:

```
a = 0x0
b = 0x7
```

With base point:

```
G_x = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
G_y = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
```

Finally the order n of G and the cofactor are:

```
n = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141
```

```
h = 01
```

Reference: SECG Elliptic Curve Parameters

### Curve25519¶

Curve25519 is Montgomery curve providing 128 bits of security defined as:

over prime field p where:

```
p = 0x7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED
```

```
a = 0x76D06
```

With base point g, order n, and cofactor h.:

```
G_x = 0x9
G_y = 0x20AE19A1B8A086B4E01EDD2C7748D14C923D4D7E6D7C61B229E9C5A27ECED3D9
```

```
n = 0x1000000000000000000000000000000000000000000000005812631A5CF5D3ED
```

```
h = 0x8
```

Reference: IETF Elliptic Curves for Security

### BN128¶

Bn128 is Barreto-Naehrig pairing friendly curve providing 128 bits of security defined as:

over the prime field \(\mathbb{F}_q\) where:

```
q = 21888242871839275222246405745257275088696311157297823662689037894645226208583
r = 21888242871839275222246405745257275088548364400416034343698204186575808495617
b = 3
```

This curve choice admits a pairing construction over the two groups \(\mathbb{G}_1\) and \(\mathbb{G}_2\) where

- \(\mathbb{G}_1\) is a Barreto-Naehrig curve over \(\mathbb{F}_q\).
- \(\mathbb{G}_2\) is a the subgroup of order r in the sextic twist of \(\mathbb{G}_2\) over \(\mathbb{F}_{q^2}\) with equation \(y^2 = x^3 + \frac{b}{\zeta}\).
- \(\mathbb{G}_T\) is the subgroup of the \(r^{\text{th}}\) roots of unity in \(\mathbb{F}_{q^{12}}\).

With the bilinear pairing e with:

## Signature schemes¶

A digital signature scheme is a triple of probabilistic polynomial time algorithms:

**Key generation**: Generates a public and private key pair.**Signing**: Given a message and a private key, produces a signature.**Signature verifying**: Given the message, public key and signature, either accepts or rejects the message’s claim to authenticity.

### ECDSA¶

Uplink uses the Elliptic Curve Digital Signature Algorithm (ECDSA) to sign and verify blocks and transactions within the network, such that the origin of blocks or transactions can not be disputed within the network.

**Sign**

To sign a document is to provide two numbers (computed through a hash function). Inputs for the signing function are:

- The document as a stream of bytes, \(\text{msg}\)
- The elliptic curve private-key, \(d_A\)

And the outputs of the signing function is a two-tuple \((r,s)\) referred to as the signature.

The signing algorithm will be described in terms of the curve Secp256k1, where \(p\) is the secp256k1 prime, and \(G\) is it’s respective generator point.

- Generate a random value \(k \in {1,..,n-1}\)
- Hash the document byte stream such that \(z = \mathcal{H}(\text{msg})\)
- Compute \((x,y) = kG\)
- Compute \(r = x \bmod n\) , if \(r = 0\) go back to step 3
- Compute \(s = k^{-1} (z + rd_A) \bmod n\) , if \(s = 0\):

go back to step 3

The resulting \((r,s)\) pair is the signature.

**Verify**

Anyone with the possession of public-key, any ledger participant, can validate that the signature by using a verification function. Inputs to the verification function are:

- The document as a stream of bytes, \(\text{msg}\)
- The signature, \((r,s)\)
- The elliptic curve public-key, \(H_A\)

The verification algorithm then yields a true or false whether the document was signed by the private key associated with the public key.

- Compute \(u = (s^{-1} z) \bmod n\)
- Compute \(u' = (s^{-1} r) \bmod n\)
- Let \((x,y) = uG + u'H_A\)
- Verify that \(r = x \bmod n\)

If the result of the verification is True, the document has been signed by holder of the private key with respect to the public key \(H_A\).

**Key Recovery**

For secp25k1 curve, public keys can be recovered from the signature \((r,s)\) via the following technique.

For a given signature compute two points \(R\), \(R′\) which have the value \(r\) as the \(x\)-coordinate. Then compute \(r^{−1}\), which is the multiplicative inverse of the value \(r\) from the signature (modulo the order of the generator of the curve). Then compute \(z\) which is the lowest \(n\) bits of the hash of the message (where \(n\) is the bit size of the curve). The two public key candidates are then:

The two candidates are then run through the usual verification routine for the given message to determine which yields the original public key of the signer. This allows transactions inside the system to only include signature of the transactions without having to transmit the public keys while still maintaining the non-repudiation of digital signature verification.

### Schnorr Signatures¶

Schnorr signatures are an efficient and compact form of digital signatures.

The Schnorr’s identification protocol is an example of a Sigma protocol, a three-step protocol in which communication between prover and verifier goes forwards once, then backwards, then forwards again.

The prover aims to convince the verifier that he knows some value a. Let a be her private key and P her public key. In order to prove knowledge of it, the prover interacts with the verifier in three passes:

- The prover commits to a random private commitment value v, chosen in the range \([1, n-1]\). This is the first message \(commitment = v * G\).
- The verifier replies with a \(challenge\) chosen at random from \([0, 2^t - 1]\).
- After receiving the challenge, the prover sends the third and last message (the response) \(response = (v - challenge * a) \bmod n\).

The verifier accepts, if:

- The prover’s public key, P, is a valid public key. It means that it must be a valid point on the curve and \(h * P\) is not a point at infinity, where h is the cofactor of the curve.
- The prover’s \(commitment\) value is equal to \(response * G + challenge * P\)

**Schnorr NIZK proof**

The Schnorr NIZK proof is a non-interactive variant of the three-pass Schnorr identification scheme.

The original Schnorr identification scheme is made non-interactive through a Fiat-Shamir transformation, assuming that there exists a secure cryptographic hash function (i.e. the so-called random oracle model).

An oracle is considered to be a black box that outputs unpredictable but deterministic random values in response to a certain input. That means that, given the same input, the oracle will give back the same random output. The input to the random oracle, in the Fiat-Shamir heuristic, is specifically the transcript of the interaction up to that point. The challenge is then redefined as \(challenge = H(g || V || A)\), where H is a secure cryptographic hash function like SHA-256.

## Elliptic Curve Diffie–Hellman¶

Elliptic Curve Diffie–Hellman (ECDH) algorithm is an anonymous key agreement protocol that allows two parties, each having an elliptic curve public–private key pair, to establish a shared secret over an insecure channel. This shared secret may be directly used as a key, or to derive another key which can then be used to encrypt subsequent communications using a symmetric key cipher. It is a variant of the classical Diffie–Hellman protocol instead using elliptic curve cryptography.

In order for both parties to be able to agree on a secred sharet, they need to
be in agreement over the *domain parameters* beforehand. This means, both
parties have an elliptic curve public-private keypair using the *same* curve.
Domain parameters are:

Consider two parties, Alice and Bob, both having agreed on the domain parameters to use and having created public-private keypairs on these each. Each has access to their own keypair and the counterparty’s public key. Alice’s keypair is denoted by \((d_A, Q_A)\), Bob’s by \((d_B, Q_B)\) where \(Q_X = d_XG\). Alice computes the values \((x_k, y_k) = d_A Q_B\) using her private key \(d_A\) and Bob’s public key \(Q_B\). Bob computes \((x_k, y_k) = d_BQ_A\) using the appropriate keys. \(x_k\) is used as the shared secret between Alice and Bob. Both parties computations result in the same value \(x_k\) due to the following holding: \(d_A Q_B = d_A d_B G = d_B d_A G = d_B Q_A\).

## Double Ratchet¶

The Double Ratchet algorithm specifies an encryption scheme with rolling encryption keys between two counterparties. Rolling encryption keys are used in order to avoid decryption of all encrypted messages when specific information is leaked. All messages encrypted in the rolling scheme before the information leak will remain undecipherable. This is in contrast to regular encryption keys, where all previous and future encrypted data could be decrypted.

Reference: The Double Ratchet Algorithm

## Data Structures¶

### Hashchains¶

A hashchain is made of blocks of data which are linked together using the hashes of previous blocks. In public chains, the data happens to be a list of transactions, which were performed since the last block. Anybody can add blocks and link them into the hashchain. The hashchain remains a very authentic source of data. A few rogue elements will not be able to disrupt the authenticity of the data. Even if they try to add blocks and link them in a disruptive way, the rest of the network will catch up and discard or neutralise their rogue blocks. To understand how this is done, we must first understand how the hashchain is formed and how blocks are added to it.

The blocks in a hashchain serve two purposes:

- Discretize finite sets of transactions into time intervals.
- Produce a proposed sequence of transactions during this time interval.

Blocks themselves are formed of a block header, which includes metadata such as the block reference number, and a link back to the previous block; and the block contents where the list of validating transactions is stored, each containing the transaction data, sender, and recipient address. Every transaction included in each block is public and immutable.

The consensus algorithm on the network consists of two processes for incorporating blocks into the chain. A consensus voting mechanism by which a participant(s) vote on the inclusion of a block as the next element in the chain. And a chain scoring mechanism by which forks in the chain are assigned a rank based on the computational or authority power involved in their length.

A chain reconfiguration is the process by which a majority of network participants agree to move to a proposed chain fork based on a higher rank chain being presented to the network. At which point sibling chain is unwound and transactions are propagated to the network memory pool to be included in new blocks.

### Merkle Trees¶

A Merkle tree is a tamper-resistant data structure that allows a large amount of data to be compressed into a single number and can be queried for the presence of specific elements in the data with a proof constructed in logarithmic space.

A Merkle tree is a binary tree of hashes, in which all the leaf nodes are the individual data elements in the block. To construct a merkle tree, the initial data elements are first hashed using the merkle tree hash function (usually sha256) to generate the leaf nodes of the tree. The resulting hashed data are subsequently hashed together in pairs to create the parent nodes of the leaf nodes. This process continues until it results in a single hash known as the merkle root.

Nodes on a distributed ledger network are usually distinguished by being either “heavy” or “light”; Where heavy nodes store the entire list of transactions on each block in the form of a merkle tree, light nodes store only a merkle root hash of each block.

The transaction list in the block impacts the block-header. The block-header changes if the sequence of the same list of transactions is altered. The block-header contains a field called hashMerkleRoot. This is a 32 byte hash value of all the transactions in the block.

Let us assume there are 3 transactions (T1, T2, T3) in the block. A double hash is applied on each transaction. A double hash is exactly like the double hash in prevBlockHeader:

```
d1 = sha256( sha256(T1) )
d2 = sha256( sha256(T2) )
d3 = sha256( sha256(T3) )
```

Now d1 and d2 are concatenated and another double hash is generated.:

```
d12 = sha256( sha256 (d1 concat d2) )
```

As we are left with only one double hash corresponding to transaction T3, we concatenate d3 with itself and generate the double hash again.:

```
d34 = sha256( sha256 (d3 concat d3) )
```

We follow this process until only one double hash is left, and we call it the merkle root. To generate the merkle root:

```
hashMerkleRoot = sha256( sha256(d12 concat d34) )
```

Perhaps the most important result of using a merkle tree to represent the block data is the ability to construct a merkle proof and query any node to verify the inclusion of a transaction t in a specific block b. Since light nodes in the network do not store the entire transaction list of each block, they do not have the ability to verify the inclusion of the transaction t in a particular block b without first querying a heavy node for a merkle proof.

A merkle proof consists of a list of hashes, beginning with the hash of the transaction in question, and followed by the sibling hashes of each parent node in the merkle tree. For a light node to verify the validity of a transaction using a merkle proof it received from a heavy node, it must simply verify that the hashing of the transaction in question and it’s sibling match the next hash element in the merkle proof, repeatedly, until the final merkle root is constructed. If at any point the resulting hash from the previous pair of hashes does not match the next hash element in the merkle proof, the light node can assert that the transaction in question was not included in the block from which the merkle proof was constructed.

This implementation allows for minimal redundant storage of transactions in the block chain network as full merkle trees (containing, at the base, a list of transactions) do not need to be stored on every node in the network. Subsequently, transaction inclusion can be quickly verified using minimal data transfer between nodes, dramatically reducing network bandwidth and redundant data storage.

## Hash Based Message Authentication¶

A message authentication code (MAC) is a short piece of information used to
authenticate a message—in other words, to confirm that the message came from the
stated sender (its authenticity) and has not been changed in transit (its
integrity). Keyed-hash message authentication code (HMAC) provides stronger
security properties than regular MAC. Specifically speaking, HMACs are resistant
to *length extension attacks*.

A HMAC is defined as:

where:

- H : a cryptographic hash function
- K : the secret key
- m : the message to be authenticated
- K’ : is another secret key, derived from the original key K
- || : concatenation,
- ⊕ : exclusive or (XOR),
- opad : outer padding (one-block-long hexadecimal constant),
- ipad : inner padding (one-block-long hexadecimal constant).

K’ is derived as follows: It is derived from the original key K (by padding K to the right with extra zeroes to the input block size of the hash function, or by hashing K if it is longer than that block size).

## Shamir Secret Sharing¶

Shamir’s Algorithm for secret sharing allows us to split a secret S into any M shares S(1) to S(M) such that any N of those shares can be used to reconstruct S but any less than N shares yields no information whatsoever.

We use the Galois Field \(\text{GF}(256)\) which is the values 0 to 255. Addition in any Galois Field of the form \(\text{GF}(2^n)\) is directly equivalent to bitwise exclusive-or.

## Advanced Encryption Standard (AES)¶

The Advanced Encryption Standard (AES) is a block cipher specification for the encryption of electronic data established by the U.S. National Institute of Standards and Technology in 2001. The cipher AES-256 is an industry standard and used among other places in SSL/TLS for a significant fraction of all internet traffic.

AES is is the first publicly accessible cipher approved by the National Security Agency (NSA) for top secret information and meets the standards specified by the European Union for data protection laws. It supersedes the Data Encryption Standard (DES), which was published in 1977. The algorithm described by AES is a symmetric-key algorithm, meaning the same key is used for both encrypting and decrypting the data. In the United States, AES was announced by the NIST as U.S. FIPS PUB 197. This announcement followed a five-year standardization process in which fifteen competing designs were presented and evaluated, before the Rijndael cipher was selected as the most suitable.

AES is covered in the following industry standard bodies:

## Commitment Schemes¶

Commitment schemes are a way for one counterparty to commit to a value such that
the value committed remains private, but can be revealed at a later time when
the committing party divulges a necessary parameter of the commitment process.
Strong commitment schemes must be both information *hiding* and computationally
*binding*. Hiding refers the the notion that a given value \(x\) and a
commitment of that value \(C(x)\) should be unrelatable; That is,
\(C(x)\) should reveal no information about \(x\). A commitment scheme
is binding if there is no plausible way that two different values can result in
the same commitment.

### Pedersen Commitment Scheme¶

The Pedersen Commitment scheme is an information-theoretically, computationally
binding, non-interactive commitment scheme in which there is only one way
interaction; One counterparty *a* sends the commited value and the commitment
parameters to the other counterparty *b*, and when the commitment is to be
revealed, send the revealing commitment parameter. Formally:

- let \(q\) be a large prime
- let \(p\) be a large, safe prime such that \(p = 2q + 1\)

Randomly choose a generator \(g\), such that:

Randomly choose a number \(a\) and compute \(h\) such that:

**Pedersen Commitment**

To commit a value \(x\), another random value \(r\) is chosen such that:

and the value \(x\) is commited by computing:

The generator \(g\) and resulting number \(h\) are known as the commitment bases, and should be shared along with \(C(x)\) with whomever wishes to open the value.

**Pedersen Decommitment**

To open the value, the counterparty who committed the value must share the original value \(x\) that completes the commitment equation. The counterparty wishing to open the value \(C(x)\) will then compute the commitment again to verify that the original value shared indeed matches the commitment \(C(x)\) initially received.

Pedersen commitments are also additionally homomorphic, such that for messages \(m_0\) and \(m_1\) and blinding factors \(r_0\) and \(r_1\) we have:

Reference: Non-Interactive and Information-Theoretic Secure Verifaible Secret Sharing.

**Vector Pedersen Commitments**

The Vector Pedersen Commitment is a more powerful variant of the previous Pedersen commitment. It commits to a vector v instead of a scalar. This extended form is defined as:

where \(v_1\), \(v_2\), …, \(v_n\) are scalars that multiply each point \(G_1\), \(G_2\), …, \(G_n\) respectively in the elliptic curve. It is the result of the dot product between two vectors v and G of arbitrary large number of elements. Each element \(G_i\) is a NUMS (“Nothing Up My Sleeve”) generator that can be created using a hash function H such that H(encode(G) || i) and a “coerce-hash-to-point” function to construct the point from the randomized hash value.

The new commitment C’ is still a point in the curve and a valid Pedersen commitment. It also holds the hiding and binding properties and the same additive homomorphic properties as the Pedersen Commitment:

where v and w are now vectors.

### Mutually Independent Commitment Protocol¶

The Pedersen commitment scheme works well when one party wishes to commit to a value and send it to another party, but it breaks down when both parties wish to commit to a value. For a trivial example, if Alice wishes to commit to her lowest price \(a\) and send that commitment \(C(a)\) to Bob, then Bob can commit his highest bid to be \(a\) by copying Alice’s commitment. Thus Bob would force Alice to always sell at her lowest price.

The Mutually Independent Commitment Protocol (MICP) provides *non-correlation*,
or the property that “A dishonest party cannot commit to a value that is in some
significant way correlated to the honest party’s value.”

The pseudo code for the MICP can be found here. It should be noted that the implementation used in Uplink differs in the fact that the protocol is initially presented to mutually and independently commit a single bit of information. We have altered the protocol such that values of an arbitrary number of bits can be committed.

## Pairing-based cryptography¶

Let \(G_1\), \(G_2\) and \(G_T\) be abelian groups of prime order q and let g and h elements of \(G_1\) and \(G_2\) respectively . A pairing is a non-degenerate bilinear map \(e : G_1 \times G_2 \rightarrow G_T\).

This bilinearity property is what makes pairings such a powerful primitive in cryptography. It satisfies:

The non-degeneracy property guarantees non-trivial pairings for non-trivial arguments. In other words, being non-degenerate means that:

The Tate pairing is a map:

where \(P \in F_{q^k}[r]\), \(Q\) is any representative in a equivalence class in \(E(F_{q^k}) / r E \left ( F_{q^k} \right )\) and \(F^*_{q^k} / \left (F^*_{q^k} \right )^r\) is the set of equivalence classes of \(F^*_{q^k}\) under the equivalence relation \(a ≡ b\) iff \(a / b \in \left ( F^*_{q^k} \right )^r\). By exponentiating elements in \(F^*_{q^k} / \left ( F^*_{q^k} \right )^r\) to the power of \((q^k - 1) / r\) we map all elements in an equivalence class to the same value.

Tate pairings use Miller’s algorithm, which is essentially the double-and-add algorithm for elliptic curve point multiplication combined with evaluation of the functions used in the addition process.

Miller’s algorithm in the Tate pairing iterates as far as the prime group order r, which is a large prime number. The ate pairing comes up as an optimization of the Tate pairing. It achieves a much shorter loop of length \(T = t - 1\) on an ordinary curve, where t is the trace of the Frobenius endomorphism. The ate pairing is defined as: