Blockchain

The basics of blockchain technology

The principles of blockchains

Blockchain is a political technology advocating the following general principles:

decentralization: the organization of work is horizontal and not hierarchical
responsibility: individuals assume responsibilities themselves, they do not delegate them. This implies a higher level of commitment and knowledge
disintermediation: interaction takes place directly between the parties involved, without intermediaries
autonomy: the system, once initiated and settled, evolves with minimal intervention from the outside
transparency: information is public and verifiable by all
incentive: every rational individual has both intrinsic and extrinsic incentives to participate actively and honestly in the system, increasing its value
sovereignty of the individual: individuals own what they create; they can value and monetize it

Instead of command-and-control structures, blockchains set the incentives in a way that individual actors optimize their own utility if they cooperate according to the rules of the game.

Blockchains are interdisciplinary

A blockchain is:

a distributed system
using cryptography
to secure an evolving consensus
about a token with social or economic value

Blockchains bring together:

mathematics (cryptography)
computer science (distributed systems)
politics (mechanisms for reaching consensus)
economics (exchange of valued tokens)

The blockchain spectrum. Blockchain technology merges distributed systems, cryptography, politics and economics.

Origins of blockchain

The origins of blockchain go back to the crypto-anarchism and cypherpunk movements of the late 80s, both influenced by the Austrian school of economics. These movements advocate widespread use of strong cryptography (highly resistant to cryptanalysis) in an effort to protect privacy, political and economic freedom.

An excerpt from the Crypto Anarchist Manifesto by Timothy C. May (1988):

Combined with emerging information markets, crypto anarchy will create a liquid market for any and all material which can be put into words and pictures. Timothy C. May, Crypto Anarchist Manifesto, 1988

The technical specification of blockchain was proposed in 1991 by Stuart Haber, a cryptographer, and Scott Stornetta, a physicist. They published their work that year in the Journal of Cryptology under the title How to Time-Stamp a Digital Document, and one year later they registered it with a US patent.

Haber and Stornetta were trying to deal with the epistemological problem of truth in the digital age:

The prospect of a world in which all text, audio, picture and video documents are in digital form on easily modifiable media raises the issue of how to certify when a document was created or last changed. Haber and Stornetta, How to Time-Stamp a Digital Document, 1991

In particular, they started from two questions:

If it is so easy to manipulate a digital file on a personal computer, how will we know what was true about the past?
How can we trust what we know of the past without having to trust a central authority to keep the record?

On the first page of the paper they include the following quotation:

Time's glory is to calm contending kings, To unmask falsehood, and bring truth to light, To stamp the seal of time in aged things, To wake the morn, and sentinel the night, To wrong the wronger till he render right.
William Shakespeare - The Rape of Lucrece

Notable cypherpunks

Eric Hughes: co-founder of the Cypherpunks movement; author of the Cypherpunk Manifesto
Timothy C. May: co-founder of the Cypherpunks movement; author of the Crypto Anarchist Manifesto
John Gilmore: co-founder of the Cypherpunks movement
Jude Milhon: co-founder of the Cypherpunk mailing list; patron saint of hackers by the name of St. Jude
Satoshi Nakamoto: pseudonym of the inventor of Bitcoin. In Japanese, "Satoshi" means "wisdom"; "Naka" can mean "tool"; "Moto" can mean "creation"
David Lee Chaum: digital currency pioneer, author of the first proposal for a blockchain protocol
Stuart Haber: co-author of the first academic paper on blockchain technology
Scott Stornetta: co-author of the first academic paper on blockchain technology
Adam Back: inventor of Hashcash (the proof-of-work scheme on which Bitcoin mining is based; Ethereum also used proof-of-work until 2022)
Ralph Merkle: pioneer of asymmetric cryptography; inventor of cryptographic hashing and the Merkle tree
Ron Rivest: co-creator of RSA (asymmetric cryptography algorithm)
Nick Szabo: inventor of smart contracts (programs on blockchain); designer of bit gold (a precursor to Bitcoin)
Wei Dai: creator of B-money (a precursor to Bitcoin)
Philip Zimmermann: creator of the original version of PGP (the world's most widely adopted cryptosystem)
Hal Finney: author of PGP 2.0 and Satoshi Nakamoto's first collaborator; recipient of the first transaction on Bitcoin
Bram Cohen: creator of BitTorrent (peer-to-peer protocol aimed at exchanging files in the network)
Richard Stallman: founder of the Free Software Foundation
Julian Assange: founder of WikiLeaks
Tim Berners-Lee: inventor of the Web

Blockchain components

The numerous components of blockchain technology can make it challenging to understand. However, each component can be described simply and used as a building block to understand the larger complex system:

Blocks

A block is a container for data

In its simplest form it contains:

an identification number
a timestamp of block creation
a bunch of data (usually, transactions)

Hashes

Each block has a fingerprint, called a hash, that is used to certify the information content of the block.


Hashes of blocks are created using cryptographic hash functions, which are mathematical algorithms that map data of arbitrary size to a bit string of fixed size
a popular hash algorithm is SHA-256, designed by the United States National Security Agency (NSA)
it produces a hash of 256 bits (32 bytes), represented by a hexadecimal string of 64 digits
$2^{256} \approx 10^{77}$ is huge (within a few orders of magnitude of the estimated number of atoms in the observable universe), effectively infinite for any practical purpose

The ideal cryptographic hash function

it is deterministic so the same message always results in the same hash
it is quick to compute the hash value for any given message
it is chaotic meaning a small change to a message should change the hash value extensively
it is infeasible (but not impossible) to invert the function: generate a message from its hash value
it is infeasible (but not impossible) to generate a collision: two different messages with the same hash value

Let's create some hashes from quite similar lines of a famous poem:

Code to create a hash from a string

# load library
library(digest)

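# hash a string (serialize = FALSE hashes the characters themselves,
# not R's serialized representation of the string)
digest("Così tra questa immensità s'annega il pensier mio", "sha256", serialize = FALSE)

# hash a slightly different string
digest("Così tra questa infinità s'annega il pensier mio", "sha256", serialize = FALSE)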
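The chaotic behaviour described above can be made visible by counting how many hexadecimal digits differ between the two hashes. The following is a minimal sketch, assuming the raw strings are hashed (`serialize = FALSE`); the variable names are illustrative only.

```r
library(digest)

# Hash two lines that differ by a single word (hashing the raw strings).
h1 <- digest("Così tra questa immensità s'annega il pensier mio", "sha256", serialize = FALSE)
h2 <- digest("Così tra questa infinità s'annega il pensier mio", "sha256", serialize = FALSE)

# Split each 64-character hexadecimal digest into single characters.
c1 <- strsplit(h1, "")[[1]]
c2 <- strsplit(h2, "")[[1]]

# Count the positions at which the two digests differ.
n_diff <- sum(c1 != c2)
cat("hex digits that differ:", n_diff, "out of", length(c1), "\n")
```

For a well-behaved hash function, two unrelated digests agree on any given hexadecimal digit only by chance (one time in sixteen), so most of the 64 positions should differ.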

Chain

Blocks are chronologically concatenated into a chain by adding to the block a field with the hash of the previous block in the chain
it follows that the hash of each block also depends on the hash of the previous block
this means that the hash of a block encodes the entire previous history of the blockchain
moreover, if you alter one block, you need to recompute not only its hash but also the hashes of all following blocks for the chain to remain valid
the first block of the chain is called the genesis block and represents the initial state of the system

Each block is linked to the previous one using a previous hash field that contains the hash of the previous block

Two notable genesis blocks are:

the genesis block of Bitcoin blockchain
the genesis block of Ethereum blockchain

Next we write some code to mine a new block and to create a chain of blocks:

A function that mines a new block

mine <- function(previous_block, genesis = FALSE){
  if (genesis) {
    # define genesis block
    new_block <- list(number = 0,
                      timestamp = Sys.time(),
                      data = "I'm the genesis block",
                      parent_hash = "0")  
  } else {
    # create new block
    current_number <- previous_block$number + 1
    new_block <- list(number = current_number,
                      timestamp = Sys.time(),
                      data = paste0("I'm block ", current_number),
                      parent_hash = previous_block$hash)
  }
  # add hash 
  new_block$hash <- digest(new_block, "sha256")
  return(new_block)
}

A function that creates a chain of blocks using the mine function

chain <- function(nblocks) {
  # mine genesis block
  block_genesis <- mine(NULL, TRUE)   
  
  # first block is the genesis block
  blockchain <- list(block_genesis)

  if (nblocks >= 2) {
    # add new blocks to the chain
    for (i in 2:nblocks){
      blockchain[[i]] <- mine(blockchain[[i-1]], FALSE) 
    }
  }
  
  return(blockchain)
}
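The claim that altering one block breaks the chain can be checked with a small validation routine. The sketch below is self-contained and assumes blocks are hashed as in the mine function above, that is, over the block content without the hash field. The helper names `make_block`, `block_hash` and `validate_chain` are hypothetical and not part of the original code.

```r
library(digest)

# Hypothetical helper: build a block and hash its content (hash field excluded).
make_block <- function(number, data, parent_hash) {
  block <- list(number = number,
                timestamp = Sys.time(),
                data = data,
                parent_hash = parent_hash)
  block$hash <- digest(block, "sha256")
  block
}

# Recompute the hash of a block from its content, ignoring the stored hash.
block_hash <- function(block) {
  block$hash <- NULL
  digest(block, "sha256")
}

# A chain is valid if every stored hash matches the block content and
# every parent_hash equals the hash stored in the preceding block.
validate_chain <- function(blockchain) {
  for (i in seq_along(blockchain)) {
    block <- blockchain[[i]]
    if (!identical(block$hash, block_hash(block))) return(FALSE)
    if (i > 1 && !identical(block$parent_hash, blockchain[[i - 1]]$hash)) return(FALSE)
  }
  TRUE
}

# Build a three-block chain, then tamper with the data of the second block.
b0 <- make_block(0, "I'm the genesis block", "0")
b1 <- make_block(1, "I'm block 1", b0$hash)
b2 <- make_block(2, "I'm block 2", b1$hash)
blockchain <- list(b0, b1, b2)
tampered <- blockchain
tampered[[2]]$data <- "I'm a forged block"

validate_chain(blockchain)  # untouched chain
validate_chain(tampered)    # chain with a modified block
```

The untouched chain passes both checks, while the tampered one fails because the stored hash of the second block no longer matches its content. Repairing that hash would in turn break the parent_hash link of the third block, which is why an attacker has to recompute every following block.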

Consensus

Byzantine Generals Problem

Several generals are on the verge of attacking an enemy city during a siege. They are located in different strategic areas and can only communicate via messengers in order to coordinate the decisive attack
However, among these messengers, it is highly probable that there are traitors. The traitors carry messages that contradict the army's strategy
The problem, therefore, lies in the ability to carry out the attack effectively despite the risk of treason. This is known as decentralized consensus

The problem faced by the Byzantine generals is the same as that faced by distributed computing systems, such as blockchain systems.

How can consensus be reached on a distributed network where some nodes may be faulty or deliberately malicious?

In the blockchain setting, the problem is solved using one of the following consensus mechanisms:

proof-of-work (currently used by Bitcoin)
proof-of-stake (currently used by Ethereum)

In proof-of-work blockchains, miners work to find a solution to a computational problem that is hard to solve and easy to verify
this is a cryptographic puzzle that can be attacked only with a brute-force approach (trying many possibilities), so that only computational power counts
typically, the proof of work problem involves finding a number (called nonce) that once added to the block is such that the corresponding block hash starts with a string of leading zeros of a given length called difficulty
the average work that a miner needs to perform in order to find a valid nonce is exponential in the difficulty, while anyone can verify the validity of the block efficiently by computing a single hash
the implementation of this consensus model uses resource intensive computations

Here is the code implementing the proof-of-work method, as well as the updated mine and chain functions using proof-of-work:

A function that creates a block using proof-of-work

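proof_of_work <- function(block, difficulty) {
  # the hash is computed on the block content only, so drop any stored hash
  block$hash <- NULL
  block$nonce <- 0
  hash <- digest(block, "sha256")
  zero <- paste(rep("0", difficulty), collapse = "")
  # try successive nonces until the hash starts with `difficulty` zeros
  while (substr(hash, 1, difficulty) != zero) {
    block$nonce <- block$nonce + 1
    hash <- digest(block, "sha256")
  }
  # store the valid hash once; it can be checked with a single digest() call
  block$hash <- hash
  return(block)
}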
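Finding a nonce is expensive, but checking one is cheap. The sketch below illustrates this asymmetry under the assumption that the hash is computed over the block content with the hash field excluded. The names `content_hash`, `find_nonce` and `verify_pow` are hypothetical helpers introduced only for this illustration.

```r
library(digest)

# Hash the block content only: the stored hash is excluded from the input.
content_hash <- function(block) {
  block$hash <- NULL
  digest(block, "sha256")
}

# Hypothetical miner: try nonces 0, 1, 2, ... until the hash has the prefix.
find_nonce <- function(block, difficulty) {
  target <- strrep("0", difficulty)
  block$nonce <- 0
  while (substr(content_hash(block), 1, difficulty) != target) {
    block$nonce <- block$nonce + 1
  }
  block$hash <- content_hash(block)
  block
}

# Verification: one hash computation and one prefix comparison.
verify_pow <- function(block, difficulty) {
  h <- content_hash(block)
  identical(h, block$hash) && substr(h, 1, difficulty) == strrep("0", difficulty)
}

block <- list(number = 1, data = "I'm block 1", parent_hash = "0")
mined <- find_nonce(block, difficulty = 2)

# Changing the data afterwards invalidates the proof of work.
forged <- mined
forged$data <- "forged"

verify_pow(mined, difficulty = 2)   # the mined block
verify_pow(forged, difficulty = 2)  # the altered block
```

The mined block passes the check and the altered one does not, since its content no longer hashes to the stored value. A low difficulty is used here only to keep the search fast.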

A function that mines a new block using proof-of-work

mine <- function(previous_block, difficulty = 3, genesis = FALSE) {  
  if (genesis) {
    # define genesis block
    new_block <-  list(number = 0,
                       timestamp = Sys.time(),
                       data = "I'm the genesis block",
                       parent_hash = "0")  
  } else {
    # create new block
    current_number <- previous_block$number + 1
    new_block <- list(number = current_number,
                      timestamp = Sys.time(),
                      data = paste0("I'm block ", current_number),
                      parent_hash = previous_block$hash)
  }
  
  # add nonce and hash with proof of work
  new_block <- proof_of_work(new_block, difficulty)
  
  return(new_block)
}

A function that creates a chain of blocks using the mine function

chain <- function(nblocks, difficulty = 3) {
  # mine genesis block
  block_genesis <- mine(NULL, difficulty, TRUE)
  
  # first block is the genesis block
  blockchain <- list(block_genesis)

  if (nblocks >= 2) {
    # add new blocks to the chain
    for (i in 2:nblocks){
      blockchain[[i]] <- mine(blockchain[[i-1]], difficulty) 
    }  
  }
  return(blockchain)
}

Hashrate

Hashrate measures the speed at which a machine – or a collective of machines – can process a proof-of-work algorithm. It is expressed in number of hashes per second that a machine can generate.

Total hashrate generally refers to the aggregate computing power of all mining hardware attempting to solve the puzzle at a given point in time.

Bitcoin total hashrate is measured in EH/s (exahashes per second, where 1 exa = $10^{18}$ ). Here is a chart of bitcoin total hashrate.
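To get a feel for the unit, one can measure the hashrate of a single R session. This is only a rough sketch: the loop size `n_hashes` is an arbitrary choice, and interpreted R code is many orders of magnitude slower than dedicated mining hardware. The estimate of mining time assumes about $16^d$ attempts on average for $d$ leading hexadecimal zeros.

```r
library(digest)

# Time a fixed number of SHA-256 evaluations on this machine.
n_hashes <- 2000
elapsed <- system.time(
  for (nonce in seq_len(n_hashes)) digest(nonce, "sha256")
)[["elapsed"]]

# Hashes per second (guard against a zero reading from a coarse timer).
hashrate <- n_hashes / max(elapsed, 1e-6)
cat("local hashrate:", round(hashrate), "H/s\n")

# Expected time to find a nonce at difficulty d (d leading hex zeros),
# assuming about 16^d attempts on average.
d <- 3
expected_seconds <- 16^d / hashrate
cat("expected seconds to mine at difficulty", d, ":", round(expected_seconds, 2), "\n")
```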

Transactions

A block contains a header with metadata (like block number and timestamp) and a data field with a certain number of transactions
a transaction represents an interaction between parties, typically a transfer from sender to receiver of cryptocurrencies or of any other token, possibly mediated by a smart contract
each transaction has a fee that must be paid by the sender and depends on the network congestion and the complexity of the transaction
here is a real transaction on the Ethereum blockchain:
- user 0x9674 (the buyer) interacts with SuperRare smart contract 0x65b4 (an NFT marketplace) and buys an NFT sold by user 0xf8b3 (the seller)
- the buyer pays 2.06 ETH (1.7 goes to the seller and 0.36 to the marketplace as a fee)
- the seller transfers the NFT to the buyer
- the transaction fee is 0.02 ETH and goes to the miner

Digital signature

Blockchain uses asymmetric cryptography (also known as public-key cryptography) to implement digital signatures of transactions
asymmetric cryptography uses a pair of keys: a public key and a private key
the public key is made public, but the private key must remain secret
each transaction is signed with the sender's private key and anyone can verify the authenticity of the transaction using the sender's public key


RSA

RSA (Rivest–Shamir–Adleman) is one of the first asymmetric cryptography algorithms and is widely used for secure data transmission
in RSA, the asymmetry between private and public keys is based on the practical difficulty of the factorization of the product of two large prime numbers, the factoring problem
there are currently no published methods to defeat the system if a large enough key is used
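The arithmetic behind RSA can be followed on a toy key pair built from two very small primes. This is a textbook-style sketch, not how RSA is used in practice: real keys are thousands of bits long, and real signatures are computed on a padded hash of the message. The function `mod_pow` is a helper written for this example.

```r
# Modular exponentiation by repeated squaring (exact for small moduli).
mod_pow <- function(base, exponent, modulus) {
  result <- 1
  base <- base %% modulus
  while (exponent > 0) {
    if (exponent %% 2 == 1) result <- (result * base) %% modulus
    base <- (base * base) %% modulus
    exponent <- exponent %/% 2
  }
  result
}

# Toy key pair built from two small primes (never use such sizes in practice).
p <- 61; q <- 53
n <- p * q          # public modulus
e <- 17             # public exponent
d <- 2753           # private exponent: (e * d) %% ((p - 1) * (q - 1)) == 1

m <- 65             # the "message", a number smaller than n

# Sign with the private exponent, verify with the public one.
s <- mod_pow(m, d, n)
mod_pow(s, e, n) == m

# The same signature does not verify for a different message.
mod_pow(s, e, n) == 66
```

Anyone who could factor n into p and q could recompute d from e, which is why the security of the scheme rests on the difficulty of factoring.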

Here is a code chunk that implements a digital signature:

Asymmetric cryptography

# load library
library(openssl)

# generate a private key (prikey) and a public key (pubkey)
prikey <- rsa_keygen()
pubkey <- prikey$pubkey

# Write keys in Privacy-Enhanced Mail (PEM) format
write_pem(pubkey)
write_pem(prikey)

# build a transaction
trans = list(sender = "A", receiver = "B", amount = "100")

# serialize data
data <- serialize(trans, NULL)

# sign (a hash of) the transaction with private key
sig <- signature_create(data, sha256, key = prikey)

# verify the message with public key
signature_verify(data, sig, sha256, pubkey = pubkey)
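A signature is only useful if it stops verifying when the transaction is altered. The following sketch checks this with the same openssl functions. The wrapper `is_authentic` is a hypothetical helper that treats a failed verification as FALSE whether the library signals it with an error or a return value.

```r
library(openssl)

# Generate a fresh key pair for this sketch.
key <- rsa_keygen()
pubkey <- key$pubkey

# Sign a serialized transaction.
trans <- list(sender = "A", receiver = "B", amount = "100")
sig <- signature_create(serialize(trans, NULL), sha256, key = key)

# Hypothetical wrapper returning TRUE/FALSE for a transaction and signature.
is_authentic <- function(transaction, sig, pubkey) {
  tryCatch(
    isTRUE(signature_verify(serialize(transaction, NULL), sig, sha256, pubkey = pubkey)),
    error = function(e) FALSE
  )
}

# An attacker changes the amount but cannot produce a matching signature.
forged <- trans
forged$amount <- "1000000"

is_authentic(trans, sig, pubkey)   # original transaction
is_authentic(forged, sig, pubkey)  # altered transaction
```

Only the original transaction verifies. Producing a valid signature for the altered one would require the sender's private key.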

Asymmetric encryption is also employed to secure (encrypt) messages, as explained in the following video from Simply Explained that digs deeper into symmetric vs. asymmetric encryption:

Peer-to-peer network

Finally, the blockchain ledger is distributed over a peer-to-peer network of nodes (computers serving the blockchain and running its protocol software). In this way, no central authority has control over the blockchain.

The steps to run the network are as follows:

new transactions are broadcast to all nodes using a gossip protocol: the software wallet that emits the transaction is connected to some nodes of the network and shares the transaction with all connected peers, which in turn do the same. This ensures the transaction spreads rapidly in a breadth-first visit of the network
each node collects some transactions into a block
on a proof-of-work blockchain, each node works on finding a difficult proof of work for its block; the first node that finds the solution becomes the miner of the block
on a proof-of-stake blockchain, a validator node is randomly selected: the likelihood of a node being chosen is proportional to the stake of the node
the miner/validator broadcasts the block to its peers in the network using the same gossip protocol as above
nodes accept the block only if all transactions in it are authentic and not already spent
nodes express their acceptance of the block by working on creating the next block in the chain, using the hash of the accepted block as the previous hash (notice that all previous work is lost in case of PoW since miners need to work on a different block)
the reward for the miner/validator is inserted as a first transaction (called coinbase) of the mined block; in this way the miner/validator has an incentive to remain honest
here is a (funny) blockchain transaction visualizer
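The stake-proportional selection mentioned in the steps above can be sketched as weighted random sampling. The validators and stakes below are invented for illustration, and real proof-of-stake protocols add further machinery (committees, verifiable randomness, penalties) that is not modelled here.

```r
# Hypothetical validators and their stakes (illustrative numbers only).
stake <- c(alice = 50, bob = 30, carol = 20)

# Pick one validator with probability proportional to its stake.
select_validator <- function(stake) {
  sample(names(stake), size = 1, prob = stake / sum(stake))
}

# Over many rounds, selection frequencies should approach the stake shares.
set.seed(42)
rounds <- 10000
picks <- replicate(rounds, select_validator(stake))
frequencies <- table(picks) / rounds
print(round(frequencies, 3))
```

With these stakes, the observed frequencies should be close to 0.5, 0.3 and 0.2: a larger stake buys a proportionally larger chance of validating the next block, not a guarantee.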

Dig deeper: Building a blockchain in JavaScript

All the code shown in this part is contained in the following R Markdown document:

Private blockchains

While a public blockchain brings transparency, decentralization, and security, a private blockchain enacts governing rules to write, edit, or even delete blockchain entries. The ledger is maintained by very few validators known as trusted intermediaries. To become a trusted intermediary, actors must disclose their identity and receive approval from a consortium. Blocks are then validated at the discretion of these validators, relying on, e.g., proof-of-authority (PoA), a consensus mechanism in which blocks are individually signed by nodes and whose security rests on trust in the validators rather than on capital or energy expenditure.

Proof-of-authority enables a high level of scalability owing to the limited number of nodes verifying and adding new blocks. However, access restrictions inevitably lead to a centralization of node operators and thus to a strong dependence on one or a few actors, as well as transparency issues, rendering private solutions unsuitable for many of the applications that public blockchains serve.

The impact of quantum computing on blockchain

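the cryptographic algorithms used for digital signatures will need to be replaced, since Shor's algorithm would let a sufficiently large quantum computer solve the factoring and discrete logarithm problems they rely on
the hashing algorithms are much less susceptible but are still weakened: Grover's algorithm gives a quadratic speed-up for brute-force search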

Game theory and blockchain

Game theory is one of the fundamental ingredients of blockchain, which can be viewed as a game where players are miners or validators. Let's explore how two important concepts in game theory - Nash equilibrium and Pareto optimality - have an impact on blockchain systems.

Nash equilibrium is a concept in game theory, named after the mathematician John Nash. It represents a situation in which each participant in a strategic interaction makes decisions, taking into account the choices of others, and no player has an individual incentive to unilaterally change their strategy given the choices of the others.

Nash equilibrium is a fundamental concept in various fields, including economics, political science, and computer science. It is widely used to analyze and understand strategic interactions among rational decision-makers.

Ron Howard's movie A Beautiful Mind tells the story of John Nash, played by Russell Crowe.

On the other hand, Pareto optimality, named after economist Vilfredo Pareto, concentrates on overall social welfare and efficiency. While Nash equilibrium focuses on stable individual strategies, a Pareto optimal outcome signifies an efficient allocation of resources where no further changes can be made to improve overall welfare without adversely affecting someone.

Pareto optimality is related to maximizing collective welfare without making any individual worse off.
On the other hand, Nash equilibrium focuses on individual self-interest and lack of incentive to deviate from one's strategy given the strategy of others.

In a game context, Nash equilibria do not always lead to Pareto optimal outcomes. This means that a situation in which each player maximizes their own gain may not be the best for overall welfare.

Let's explore the two different concepts using the famous prisoner's dilemma. The prisoner's dilemma is a game theory thought experiment that involves two rational agents, each of whom can cooperate for mutual benefit or betray their partner for individual reward. It models many real-world situations involving strategic behavior.

Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of speaking to or exchanging messages with the other. The police admit they don't have enough evidence to convict the pair on the principal charge. They plan to sentence both to a year in prison on a lesser charge. Simultaneously, the police offer each prisoner a Faustian bargain. If the prisoner testifies against the partner, he will go free while the partner will get three years in prison on the main charge. If both prisoners testify against each other, both will be sentenced to two years in jail. The prisoners are given a little time to think this over, but in no case may either learn what the other has decided until he has irrevocably made his decision. Each is informed that the other prisoner is being offered the very same deal.

This leads to four different possible outcomes for prisoners A and B:

If A and B both remain silent, they will each serve 1 year in prison.
If A testifies against B but B remains silent, A will be set free (0 years in prison) while B serves 3 years in prison.
If A remains silent but B testifies against A, A will serve 3 years in prison and B will be set free (0 years in prison).
If A and B testify against each other, they will each serve 2 years.

The Nash equilibrium is the strategy profile in which both prisoners testify against (betray) each other: given the other's choice, neither prisoner can do better by switching. Indeed, if A betrays B, then:

A serves 0 years in prison if B stays silent, or
A serves 2 years in prison if B also betrays A.

Hence, assuming B is equally likely to make either choice, A serves 1 year in prison on average.

On the other hand, if A remains silent, then

A serves 1 year in prison if B also stays silent, or
A serves 3 years in prison if B betrays A.

Hence, on average, A serves 2 years in prison. Notice that whatever B does, A's sentence is shorter when A testifies (0 or 2 years) than when A remains silent (1 or 3 years): testifying is a strictly dominant strategy.

In this case the Nash equilibrium is not Pareto efficient: the best solution for the group of two prisoners is clearly achieved when both remain silent (do not betray), collecting 2 years of prison overall. All the other solutions are worse (namely, 3, 3, and 4 years).
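Both concepts can be checked mechanically on the payoff table of the prisoner's dilemma. The sketch below encodes the sentences listed above and tests every outcome. The helper names `is_nash` and `is_pareto` are illustrative, and lower values are better because the payoffs are years in prison.

```r
# Years in prison for A and B; rows are A's move, columns are B's move.
moves <- c("silent", "testify")
years_A <- matrix(c(1, 0, 3, 2), nrow = 2, dimnames = list(A = moves, B = moves))
years_B <- matrix(c(1, 3, 0, 2), nrow = 2, dimnames = list(A = moves, B = moves))

outcomes <- expand.grid(A = moves, B = moves, stringsAsFactors = FALSE)

# Nash equilibrium: neither player can reduce their own sentence by
# changing move while the other player's move stays fixed.
is_nash <- function(a, b) {
  years_A[a, b] == min(years_A[, b]) && years_B[a, b] == min(years_B[a, ])
}

# Pareto optimal: no other outcome is at least as good for both players
# and strictly better for at least one of them.
is_pareto <- function(a, b) {
  dominated <- (years_A <= years_A[a, b]) & (years_B <= years_B[a, b]) &
    ((years_A < years_A[a, b]) | (years_B < years_B[a, b]))
  !any(dominated)
}

outcomes$nash <- mapply(is_nash, outcomes$A, outcomes$B, USE.NAMES = FALSE)
outcomes$pareto <- mapply(is_pareto, outcomes$A, outcomes$B, USE.NAMES = FALSE)
outcomes$total_years <- mapply(function(a, b) years_A[a, b] + years_B[a, b],
                               outcomes$A, outcomes$B, USE.NAMES = FALSE)
print(outcomes)
```

Mutual betrayal is the only Nash equilibrium and also the only outcome that is not Pareto optimal, since mutual silence is better for both prisoners. Among the Pareto optimal outcomes, mutual silence is the one with the smallest total sentence.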

Let's investigate some examples of these game theoretic concepts concerning the blockchain. Imagine a blockchain where miners compete to validate transactions and add blocks to the chain. Each miner decides whether to participate in a mining pool or mine independently. The reward structure favors participation in a pool due to more consistent returns, but there are concerns about centralization.

The Nash equilibrium might occur when a majority of miners join mining pools because, given the choices of others, individual miners find it more profitable to pool their resources for consistent rewards. Deviating from this strategy (e.g., mining independently) could lead to a lower expected return. However, this is not a Pareto optimum: the best outcome for the community as a whole is decentralization of the blockchain, hence individual mining or mining in small pools, because a decentralized system is not controlled by any single entity and is therefore more robust.

As another example, consider mining as a competitive game, where each miner challenges others to create a new block and receive a reward, with no binding agreements between participants. Personal incentive is maximized by following the rules of the mining game. In fact, if a miner who has just mined a brand new block misbehaves, such as by changing the coinbase transaction that rewards them by inserting a larger amount, then that block will be discarded by the network. Consequently the miner will lose the reward and also their reputation as a miner. Note that the Nash equilibrium strategy of maximum individual profit, which corresponds to a sound blockchain running stably, is also the Pareto optimum, i.e., it also realizes the best scenario for the entire community of blockchain users, including miners themselves.

In summary, a clever economic incentive design that promotes honesty over cheating underpins the proof-of-work consensus process. Miners voluntarily incur financial costs ex ante in the expectation of a potential future reward. The threat of sunk costs (i.e., not receiving the block reward because of dishonest behaviour while having already paid for the work performed) creates the financial incentive for miners to play by the rules.
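The sunk-cost argument can be written as a small comparison of expected profits. The symbols are assumptions introduced here for illustration: $C$ is the cost a miner pays for the work, $R$ is the block reward, and $p$ is the probability that the miner's block is the one accepted by the network. An invalid block is discarded, so a dishonest miner pays the cost without any chance of the reward:

$$
\mathbb{E}[\pi_{\text{honest}}] = pR - C, \qquad \mathbb{E}[\pi_{\text{dishonest}}] = -C, \qquad \mathbb{E}[\pi_{\text{honest}}] - \mathbb{E}[\pi_{\text{dishonest}}] = pR > 0
$$

Under these simplified assumptions, honest mining yields a strictly higher expected profit whenever the miner has any chance of winning the block.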

Assuming miners are profit-maximising economic agents, honesty is the most rational strategy. As a result, Bitcoin may be considered less a technical innovation and more a carefully calibrated socio-economic system that relies on a complex combination of economic incentives, game theory, and a solid technical foundation.

As Vitalik Buterin points out, however, money is not the only incentive in blockchain systems:

We've seen time and time again that purely financial incentives do not yield stable systems. Power concentrations, pump and dump schemes and rug pulls are all profitable to some, while damaging to most. When participants act only for their own profit, with no regard for the long game, every system is doomed.
We could say that Bitcoin miners don't collude because they are worried about their reputation, and the reputation of the Bitcoin blockchain as a whole. The Dilemma of Soulbound Tokens

In this podcast Vito Lops interviews Federico Rivi about the importance of game theory for the blockchain (in Italian).

Major blockchains

Bitcoin was proposed in 2008 by Satoshi Nakamoto (a pseudonym) and launched in 2009.

Transactions in Bitcoin blockchain contain:

one or more inputs
one or more outputs
an amount to be transferred

To allow value to be split and combined, transactions contain multiple inputs and outputs. Normally there will be either a single input from a larger previous transaction or multiple inputs combining smaller amounts, and at most two outputs: one for the payment, and one returning the change, if any, back to the sender — Satoshi Nakamoto, Bitcoin white paper

The fee paid by the sender for a transaction on the Bitcoin network depends on how congested the network is at the time of the transaction and on the size of the transaction, which is affected primarily by the number of inputs.
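The input and output structure can be illustrated with a small calculation. The amounts below are invented and expressed in satoshi (1 BTC = 100,000,000 satoshi). In Bitcoin the fee is not a separate field: it is the part of the inputs that is not assigned to any output.

```r
# Hypothetical transaction, amounts in satoshi (illustrative values only).
inputs  <- c(30000000, 25000000)                    # two earlier outputs being spent
outputs <- c(payment = 50000000, change = 4900000)  # payment plus change to sender

# The fee is implicit: whatever the inputs carry beyond the outputs.
fee <- sum(inputs) - sum(outputs)

# A transaction cannot spend more than its inputs provide.
is_valid <- sum(outputs) <= sum(inputs)

cat("fee:", fee, "satoshi =", fee / 1e8, "BTC\n")
```

Here two smaller inputs are combined to fund a payment of 0.5 BTC, the change returns to the sender, and the remaining 100,000 satoshi (0.001 BTC) is left as the fee.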

The corresponding cryptocurrency is bitcoin (ticker: BTC).

Cryptocurrencies

A cryptocurrency is a permissionless system that operates without a central authority. Users are free to use the network and transact without prior approval by others. Like physical cash, users can transact pseudonymously and remain in full control of their own funds, a feature called self-custody. No single actor can unilaterally seize assets, reverse transactions, or change the ruleset. The blockchain also operates around the clock and is cross-jurisdictional by nature. A cryptocurrency is secured by cryptography, which makes it nearly impossible to counterfeit or double-spend.

These properties do come at significant costs, however – a large network with massive redundancies, scalability and performance constraints, slow coordination and decision-making, and, sometimes, an expensive and energy-intensive consensus mechanism.

The market cap of a cryptocurrency (or of a company) is the circulating supply multiplied by the current price of the crypto (or company's share). Here is a comparison of market caps of cryptos and largest US companies:

the market cap of principal cryptocurrencies
the market cap of the largest US companies
