Bitcoin Fundamentals: Cryptography

One of the reasons why Bitcoin keeps attracting so much attention is its exceptional "math". Satoshi Nakamoto has managed to create a system that can function with a complete absence of trust between its participants. All interactions are based on strict mathematics, no human factor involved. That was the revolutionary nature of the idea, and not the peer-to-peer networking, as many people may think. Therefore, we’ve decided to devote the first part of the article to the mathematical basics of Bitcoin.

Below we’ll explain the most basic things like elliptic curves, ECC (elliptic-curve cryptography), private/public keys and so on. Whenever possible, we will illustrate with examples of code, mostly in Python 2.7.

Introduction

Cryptography is the fundamental part of Bitcoin; nothing would work without it. And that's why we will start here.

The so-called elliptic-curve cryptography is used in Bitcoin. It’s based on a special function—an elliptic curve (not to be confused with an ellipse). We’ll describe what this function is like and why it’s so remarkable.

Elliptic curve

According to Wikipedia, an elliptic curve over field $K$ is a non-singular cubic curve on the projective plane over $\hat{K}$ (the algebraic closure of field $K$) given by a third-degree equation with coefficients from the field $K$ and a "point at infinity."

In lay terms, an elliptic curve may seem to be a rather simple function. As a rule, it can be written in the so-called Weierstrass’s form: $y^2 = x^3 + ax + b$

Depending on the values of parameters $a$ and $b$, the graph of this function may look different:

import numpy as np
import matplotlib.pyplot as plt

def main():
    a = -1
    b = 1

    y, x = np.ogrid[-5:5:100j, -5:5:100j]
    plt.contour(x.ravel(), y.ravel(), pow(y, 2) - pow(x, 3) - x * a - b, [0])
    plt.grid()
    plt.show()

if __name__ == '__main__':
    main()

According to Wikipedia, this function was first seen in the works of Diophantus. Later, in the 17th century, Newton became interested in it. His research, that we’re going to discuss, has largely led to formulas for the addition of points on an elliptic curve. From this point on, we shall consider an elliptic curve $\alpha$.

Let there be two points $P, Q \in \alpha$. Their sum is point $R \in \alpha$ that is defined in the simplest case as follows: draw a line through $P$ and $Q$ —it will intersect the $\alpha$ curve  at a single point, let’s call it $R$. By changing the $y$ coordinate of point $-R$ to the opposite in sign, we obtain point $R$ that we will call the sum of $P$ and $Q$, that is, $P + Q = R$.

It’s important to note that we're introducing this addition operation. If you add points in the usual sense, that is, by adding the appropriate coordinates, you will get a completely different point $R'(x_1 + x_2, y_1 + y_2)$, which most likely has nothing to do with $R$ or $-R$ and doesn’t even lie on the $\alpha$ curve.

Careful readers have already asked themselves what will happen if we draw, for example, a straight line through two points with coordinates of form $P(a, b)$ and $Q(a, -b)$. That is, the straight line going through them will be parallel to the y-axis (number three in the picture below).

It is easy to see that in this case there is no third intersection with the $\alpha$ curve we called $-R$. To avoid this event, we introduce the so-called point of infinity usually denoted as $O$ or simply $0$ as in the picture. We will say that if there’s no intersection, $P + Q = O$.

Of particular interest to us is the case when we want to add a point to itself (number two in the picture above, point $Q$). In this case, simply draw a tangent to the $Q$ point and reflect the resulting intersection point with respect to $y$.

Now, with a subtle movement of the hand, we can introduce an operation of multiplying the point by the $\mathbb{N}$ number.  As a result, we obtain a new point $K = G * k$, that is, $K = G + G + ... + G, k$ times. This picture should make it all clear:

Elliptic curve over a finite field

Absolutely the same curve is used as in the ECC, but it’s considered over some finite field $F_p = \mathbb{Z}/\mathbb{Z}_p = \{0, 1, ..., p - 1\}$, $p$ is a prime number. That is,

$y^2\ mod\ p = x^3 + ax + b\ (mod\ p)$

All the above properties (addition, multiplication, a point at infinity) remain valid for this function. However, if we try to render this function, it will only vaguely (at best) resemble the usual elliptic curve. As for the concept of the "tangent to a function at a point," it loses its meaning, but there’s nothing to worry about. Here is an example of the function $y^2 = x^3 + 7$ for $p = 17$:

As for $p = 59$, it’s almost a random set of points. The only thing that still resembles the origin of this graph is the symmetry regarding the $X$ axis.

P. S. In case you want to know how to compute the coordinates of  point $R(x_3, y_3)$ in the instance of a curve over a finite field knowing coordinates of $P(x_1, y_1)$ and $Q(x_2, y_2)$, take a look at “An Introduction to Bitcoin, Elliptic Curves and the Mathematics of ECDSA” by N. Mistry. Everything is described there in detail; it’s understandable if you know 8th grade level Math.

P.P.S. In case our examples did not satisfy your inquisitive mind, here’s a website for drawing curves of all kinds, so have some fun.

SECP256k1

Getting back to Bitcoin, the SECP256k1 curve is used in it. It has the form of $y^2 = x^3 + 7$ and is considered over field $F_p$, where $p$ is a huge prime number, namely $2^{256} - 2^{32} - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1$.

The so-called base point, aka generator, is defined for SECP256k1. It’s just a point, usually denoted as $G$, lying on this curve. It’s necessary for creating a public key that will be described below.

Here’s a simple example: using Python, let’s check whether point $G(x, y)$ belongs to the SECP256k1 curve.

>>> p = 115792089237316195423570985008687907853269984665640564039457584007908834671663
>>> x = 55066263022277343669578718895168534326250603453777594175500187360389116729240
>>> y = 32670510020758816978083085130507043184471273380659243275938904335757337482424
>>> (x ** 3 + 7) % p == y**2 % p
True

Digital signature

A digital signature is a mathematical scheme for demonstrating the authenticity of digital messages or documents. A valid digital signature gives a recipient reason to believe that the message was created by a known sender (authentication), that the sender cannot deny having sent the message (non-repudiation), and that the message was not altered in transit (integrity) — Wikipedia

The general idea is this: Alice wants to transfer 1 BTC to Bob. To do this, she creates a message like:

{
    "from" : 1FXySbm7jpJfHEJRjSNPPUqnpRTcSuS8aN, // Alice's address
    "to" : 1Eqm3z1yu6D4Y1c1LXKqReqo1gvZNrmfvN, // Bob's address
    "amount" : 1 // Send 1 BTC
}

Then Alice takes her private key (for now you can assume that it’s a number known only to Alice), a hash of the message, and a function of form $sign\_text(private\_key, text)$. At the output, she obtains the signature of her message. In the case of ECDSA, it will be a pair of integers, while the signature may look different for other algorithms. After that, she sends out the original message, the signature and her public key to all participants of the network

As a result, every Bob can optionally take this trio, the function of form $validate\_signature(public\_key, signature, text)$ and check whether the owner of the private key has signed this message. And if everyone inside the network knows that the $public\_key$ belongs to Alice, you can understand whether she has sent this money or someone else is trying to do it on her behalf.

Moreover, suppose that there is a person who got in between Alice and the rest of the network. Let’s say he has intercepted Alice's message and changed something in it, literally 1 bit out of a billion. But even in this case the signature verification for validity $validate\_signature(public\_key, signature, text')$ will show that the message has been changed.

This is a very important feature for Bitcoin since the network is distributed. We cannot know in advance who gets our transaction with the requirement to transfer 1000 BTC. But he cannot change it (for example, specify his address as recipient) since the transaction has been signed by your private key and other participants of the network will immediately understand that there is something wrong.

ATTENTION! In reality, the process is quite different from the one described above. We just described in lay terms what a digital signature is and why we need it. The real algorithm is described in the part about Transaction.

Private key

Private key is quite a general term and different types of private keys can be used in various electronic signature algorithms.

As you may have noticed, Bitcoin uses the ECDSA algorithm. In its case, a private key is a natural 256 bit number, that is, an integer from $1$ to $2^{256}$. Technically, even number 123456 will be a valid private key, but you will soon learn that your coins only "belong" to you until the attacker has your private key, and values like 123456 are very easy to traverse.

It is important to note that it’s impossible to traverse all the keys due to the fact that $2^{256}$ is an enormously large number.

But let’s try to imagine it: according to this article, there are a bit less than $10^{22}$ grains of sand on the Earth. Let's use the fact that $2^{10} \approx 10^3$, that is $10^{22} \approx 10^{80}$ grains of sand. And we have in the total of $2^{256}$ addresses, about $2^{80^3}$.

Hence we can take all the sand on the Earth and turn every grain of sand into a new Earth.  Then, we will turn every grain of sand on each planet in the resulting amount of planets into a new Earth. The total number of grains of sand will still be much less than the number of possible private keys.

For the same reason, most Bitcoin clients simply take 256 random bits when creating a private key—the probability of collision is extremely low. The following are three examples:

Python

>>> import random
>>> private_key = ''.join(['%x' % random.randrange(16) for x in range(0, 64)])
>>> private_key
'9ceb87fc34ec40408fd8ab3fa81a93f7b4ebd40bba7811ebef7cbc80252a9815'
>>> # or
>>> import os
>>> private_key = os.urandom(32).encode('hex')
>>> private_key
'0a56184c7a383d8bcce0c78e6e7a4b4b161b2f80a126caa48bde823a4625521f'

Python, ECDSA

>>> import binascii
>>> import ecdsa # sudo pip install ecdsa
>>> private_key = ecdsa.SigningKey.generate(curve=ecdsa.SECP256k1)
>>> binascii.hexlify(private_key.to_string()).decode('ascii').upper()
u'CE47C04A097522D33B4B003B25DD7E8D7945EA52FA8931FD9AA55B315A39DC62'

Bitcoin-cli

$ bitcoin-cli getnewaddress
14RVpC4su4PzSafjCKVWP2YBHv3f6zNf6U
$ bitcoin-cli dumpprivkey 14RVpC4su4PzSafjCKVWP2YBHv3f6zNf6U
L3SPdkFWMnFyDGyV3vkCjroGi4zfD59Wsc5CHdB1LirjN6s2vii9

Public key

Let $k$ be our private key and $G$ is the base point. Then the public key is $K = G * k$. So the public key is actually a point lying on the SECP256k1 curve.

There are two important things here. First, it's easy to see that the operation of obtaining a public key is uniquely defined. That is, there is always only one public key for a particular private key. Second, the reverse operation is computationally difficult. Normally we can obtain a private key from the public one only by a complete traversal of the first one.

Below you will find out that there’s exactly the same connection between the public key and the address, but it’s about the irreversibility of hash functions.

Examples of obtaining a Public Key:

Python, ECDSA

>>> import binascii
>>> import ecdsa
>>> private_key = ecdsa.SigningKey.generate(curve=ecdsa.SECP256k1)
>>> public_key = private_key.get_verifying_key()
>>> binascii.hexlify(public_key.to_string()).decode('ascii').upper()
u'D5C08F1BFC9C26A5D18FE9254E7923DEBBD34AFB92AC23ABFC6388D2659446C1F04CCDEBB677EAABFED9294663EE79D71B57CA6A6B76BC47E6F8670FE759D746'

C++, libbitcoin

#include <bitcoin/bitcoin.hpp>
#include <iostream>

int main() {
    // Private key
    bc::ec_secret secret = bc::decode_hash(
    "038109007313a5807b2eccc082c8c3fbb988a973cacf1a7df9ce725c31b14776");
    // Get public key
    bc::ec_point public_key = bc::secret_to_public_key(secret);
    std::cout << "Public key: " << bc::encode_hex(public_key) << std::endl;
}

For compilation and startup, we use (after previously installing libbitcoin):

$ g++ -o public_key <filename> $$(pkg-config --cflags --libs libbitcoin)
$ ./public_key
Public key: 0202a406624211f2abbdc68da3df929f938c3399dd79fac1b51b0e4ad1d26a47aa

You can see that public key formats in the first and second examples are different (at least in length). We'll discuss this in more detail below.

Formats & address

Base58Check encoding

We will see this encoding throughout the book. That's why we’d better understand how it works and why we need it.

The idea is to write the sequence of bytes as short as possible in a readable format and, at the same time, make the probability of possible misprints even less. You understand that safety cannot be superfluous with Bitcoin. Use one wrong character and money will go to an address, the keys to which will most likely never be found. Here’s a comment about encoding in base58:

/**
 * Why base-58 instead of standard base-64 encoding?
 * - Don't want 0OIl characters that look the same in some fonts and
 *      could be used to create visually identical looking data.
 * - A string with non-alphanumeric characters is not as easily accepted as input.
 * - E-mail usually won't line-break if there's no punctuation to break at.
 * - Double-clicking selects the whole string as one word if it's all alphanumeric.
 */

The easiest way to implement shortness of the record is by using the fairly common Base64 encoding. That is, using the 64 base numeral system, where digits 0,1, ..., 9, and letters a-z and A-Z  are used for the record. This gives 62 characters and the remaining two may be anything, depending on the implementation.

The first difference of Base58Check is that the characters 0, O, l, I have been removed in case someone decides to confuse them. Thus, there are 58 characters, you can check it yourself:

b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def base58encode(n):
    result = ''
    while n > 0:
        result = b58[n%58] + result
        n /= 58
    return result

# print "Base58 encode for '123123':", base58encode(123123)
# # Base58 encode for '123123': dbp

The second difference is the check itself. checksum is added again at the end of the string—the first four bytes of SHA256(SHA256(str)). Well, we also need to add as many ones as the number of leading zeros there used to be before encoding in base58 but that’s easy.

import hashlib

def base58encode(n):
    b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
    result = ''
    while n > 0:
        result = b58[n % 58] + result
        n /= 58
    return result

# Will be used to decode raw bytes and then encode them to the base58
def base256decode(s):
    result = 0
    for c in s:
        result = result * 256 + ord(c)
    return result

def countLeadingZeroes(s):
    count = 0
    for c in s:
        if c == '\0':
            count += 1
        else:
            break
    return count

def base58CheckEncode(prefix, payload):
    s = chr(prefix) + payload
    checksum = hashlib.sha256(hashlib.sha256(s).digest()).digest()[0:4]
    result = s + checksum
    return '1' * countLeadingZeroes(result) + base58encode(base256decode(result))

Private key formats

The most obvious way to store a private key is to write 256 bits in the form of a heap of zeros and ones. But any technically literate person understands that it’s going to be much easier to provide the same sequence in the form of 32 bytes where each byte corresponds to two characters in the hexadecimal notation. Reminding you that  numbers 0,1,...,9 and letters A,B,C,D,E,F are used in this case. We’ve used this format in the examples above. It’s sometimes even separated by spaces for clarity.

E9 87 3D 79 C6 D8 7D C0 FB 6A 57 78 63 33 89 F4 45 32 13 30 3D A6 1F 20 BD 67 FC 23 3A A3 32 62

Another more progressive format is WIF (Wallet Import Format). It’s quite simply built:

  1. Take a private key, say 0C28FCA386C7A227600B2FE50B7CAE11EC86D3BF1FBE471BE89827E19D72AA1D
  2. Write it in Base58Check with the prefix 0x80. That’s all.
private_key = '0a56184c7a383d8bcce0c78e6e7a4b4b161b2f80a126caa48bde823a4625521f'

def privateKeyToWif(key_hex):
    return base58CheckEncode(0x80, key_hex.decode('hex'))
    
# print "Private key in WIF format:", privateKeyToWif(private_key)
# # Private key in WIF format: 5HtqcFguVHA22E3bcjJR2p4HHMEGnEXxVL5hnxmPQvRedSQSuT4

Public key formats

First recall that a public key is just a point on the SECP256k1 straight line. The first and most common variant of writing it is the uncompressed format, 32 bytes for X and Y coordinates. To avoid confusion, the 0x04 prefix is used for a total of 65 bytes.

import ecdsa

private_key = '0a56184c7a383d8bcce0c78e6e7a4b4b161b2f80a126caa48bde823a4625521f'

def privateKeyToPublicKey(s):
    sk = ecdsa.SigningKey.from_string(s.decode('hex'), curve=ecdsa.SECP256k1)
    vk = sk.verifying_key
    return ('\04' + sk.verifying_key.to_string()).encode('hex')

uncompressed_public_key = privateKeyToPublicKey(private_key)

# print "Uncompressed public key: {}, size: {}".format(uncompressed_public_key, len(uncompressed_public_key) / 2)
# # Uncompressed public key: 045fbbe96332b2fc2bcc1b6a267678785401ee3b75674e061ca3616bbb66777b4f946bdd2a6a8ce419eacc5d05718bd718dc8d90c497cee74f5994681af0a1f842, size: 65

However, as you might have guessed from the name of it, it’s not the best way for storing a public key.

You will be surprised but the second format is called compressed. Its idea is as follows: a public key is a point on a curve, that is, a pair of numbers satisfying the equation $y^2\ mod\ p = x^2 + ax + b\ (mod\ p)$. This means you can write only the X coordinate and we’ll simply solve the equation if we need the Y coordinate. By doing so, we reduce the size of the public key by almost 50%!

There’s just one problem—if a point lies on a curve, there exist two solutions to solve the equation for its X coordinate (look at the graphs above in case you are in doubt). Usually we would simply save the sign for the Y coordinate, but when it comes to a function over a finite field, we should use the following feature: if there are two solutions of the equation for the X coordinate, one of the points will have an even Y coordinate and the second will have an odd one (but then again, you can make sure yourself).

Prefix 0x02 is used in the first case, and 0x03 in the second one. Here is an illustration of the process:

Address

As has already been mentioned, the address is obtained from a public key in a unique way. Moreover, it is impossible to reverse the operation, since cryptographically persistent hash functions are used—RIPEMD160 and SHA256. Here is the algorithm for converting a public key into the address:

  1. Take a private key, for example 45b0c38fa54766354cf3409d38b873255dfa9ed3407a542ba48eb9cab9dfca67
  2. Obtain the public key from it in the uncompressed format, in our case it is 04162ebcd38c90b56fbdb4b0390695afb471c944a6003cb334bbf030a89c42b584f089012beb4842483692bdff9fcab8676fed42c47bffb081001209079bbcb8db.
  3. Calculate RIPEMD160(SHA256(public_key)), and obtain 5879DB1D96FC29B2A6BDC593E67EDD2C5876F64C
  4. Convert in Base58Check with prefix 0x0017JdJpDyu3tB5GD3jwZP784W5KbRdfb84X. This is the address.
def pubKeyToAddr(s):
    ripemd160 = hashlib.new('ripemd160')
    ripemd160.update(hashlib.sha256(s.decode('hex')).digest())
    return base58CheckEncode(0, ripemd160.digest())

def keyToAddr(s):
    return pubKeyToAddr(privateKeyToPublicKey(s))

# print keyToAddr("45b0c38fa54766354cf3409d38b873255dfa9ed3407a542ba48eb9cab9dfca67")
# # '17JdJpDyu3tB5GD3jwZP784W5KbRdfb84X'

Sign & verify

You really don't need to know the technical details of exactly how ECDSA signs and verifies messages as you will still use ready-made libraries. The main thing is that you have a general understanding of why you need it. In case you’re still curious, read Layman’s Guide to Elliptic Curve Digital Signatures. It has a clear visualization of the entire process. You can try it yourself.