Blockchain

The basics of blockchain technology

The principles of blockchains

Blockchain is a political technology advocating the following general principles:

  • decentralization: the organization of work is horizontal and not hierarchical

  • responsibility: individuals assume responsibilities themselves rather than delegating them. This implies a higher level of commitment and knowledge

  • disintermediation: interaction takes place directly between the parties involved, without intermediaries

  • autonomy: the system, once initiated and settled, evolves with minimal intervention from the outside

  • transparency: information is public and verifiable by all

  • incentive: every rational individual has both intrinsic and extrinsic incentives to participate actively and honestly in the system, increasing its value

  • sovereignty of the individual: individuals own what they create; they can value and monetize it

Instead of command-and-control structures, blockchains set the incentives in such a way that individual actors optimize their own utility when they cooperate according to the rules of the game.

Blockchains are interdisciplinary

A blockchain is:

  • a distributed system

  • using cryptography

  • to secure an evolving consensus

  • about a token with social or economic value

Blockchains bring together:

  • mathematics (cryptography)

  • computer science (distributed systems)

  • politics (mechanisms for reaching consensus)

  • economics (exchange of valued tokens)

Origins of blockchain

The origins of blockchain go back to the crypto-anarchism and cypherpunk movements of the late 1980s, both influenced by the Austrian school of economics. These movements advocate widespread use of strong cryptography (highly resistant to cryptanalysis) in an effort to protect privacy and political and economic freedom.

An excerpt from the Crypto Anarchist Manifesto by Timothy C. May (1988):

Combined with emerging information markets, crypto anarchy will create a liquid market for any and all material which can be put into words and pictures. Timothy C. May, Crypto Anarchist Manifesto, 1988

The technical specification of blockchain was proposed in 1991 by Stuart Haber, a cryptographer, and Scott Stornetta, a physicist. They published their work in the Journal of Cryptology in 1991 under the title How to Time-Stamp a Digital Document, and one year later they registered it as a US patent.

Haber and Stornetta were trying to deal with the epistemological problem of truth in the digital age:

The prospect of a world in which all text, audio, picture and video documents are in digital form on easily modifiable media raises the issue of how to certify when a document was created or last changed. The problem is to time-stamp the data, not the medium. Haber and Stornetta, How to Time-Stamp a Digital Document, 1991

In particular, they started from two questions:

  1. If it is so easy to manipulate a digital file on a personal computer, how will we know what was true about the past?

  2. How can we trust what we know of the past without having to trust a central authority to keep the record?

On the front page of the paper they include the following quotation:

Time's glory is to calm contending kings, To unmask falsehood, and bring truth to light, To stamp the seal of time in aged things, To wake the morn, and sentinel the night, To wrong the wronger till he render right.

William Shakespeare - The Rape of Lucrece

Notable cypherpunks
  1. Eric Hughes: co-founder of the Cypherpunks movement; author of the Cypherpunk Manifesto

  2. Timothy C. May: co-founder of the Cypherpunks movement; author of the Crypto Anarchist Manifesto

  3. John Gilmore: co-founder of the Cypherpunks movement

  4. Jude Milhon: co-founder of the Cypherpunk mailing list; patron saint of hackers by the name of St. Jude

  5. Satoshi Nakamoto: pseudonym of the inventor of Bitcoin. In Japanese, "Satoshi" means "wisdom"; "Naka" can mean "tool"; "Moto" can mean "creation"

  6. David Lee Chaum: digital currency pioneer, author of the first proposal for a blockchain protocol

  7. Stuart Haber: co-author of the first academic paper on blockchain technology

  8. Scott Stornetta: co-author of the first academic paper on blockchain technology

  9. Adam Back: inventor of Hashcash (a proof-of-work scheme; proof-of-work is the consensus mechanism used by Bitcoin and, until 2022, also by Ethereum)

  10. Ralph Merkle: pioneer of asymmetric cryptography; inventor of cryptographic hashing and the Merkle tree

  11. Ron Rivest: co-creator of RSA (asymmetric cryptography algorithm)

  12. Nick Szabo: inventor of smart contracts (programs on blockchain); designer of bit gold (a precursor to Bitcoin)

  13. Wei Dai: creator of B-money (a precursor to Bitcoin)

  14. Philip Zimmermann: creator of the original version of PGP (the world's most widely adopted cryptosystem)

  15. Hal Finney: author of PGP 2.0 and Satoshi Nakamoto's first collaborator; recipient of the first transaction on Bitcoin

  16. Bram Cohen: creator of BitTorrent (peer-to-peer protocol aimed at exchanging files in the network)

  17. Richard Stallman: founder of the Free Software Foundation

  18. Julian Assange: founder of WikiLeaks

  19. Tim Berners-Lee: inventor of the Web

Blockchain components

The numerous components of blockchain technology can make it challenging to understand. However, each component can be described simply and used as a building block to understand the larger complex system:

Blocks

A block is a container for data

In its simplest form it contains:

  • an identification number

  • a timestamp of block creation

  • a bunch of data (usually, transactions)

Hashes

Each block has a fingerprint called hash that is used to certify the information content of the block.

  • Hashes of blocks are created using cryptographic hash functions, which are mathematical algorithms that map data of arbitrary size to a bit string of a fixed size

  • a popular hash algorithm is SHA-256, designed by the United States National Security Agency (NSA)

  • it uses a hash of 256 bits (32 bytes), represented by a hexadecimal string of 64 digits

  • the number of possible hashes, $2^{256} \approx 10^{77}$, is huge (roughly comparable to the estimated number of atoms in the observable universe), effectively infinite for any practical purpose

The ideal cryptographic hash function
  1. it is deterministic so the same message always results in the same hash

  2. it is quick to compute the hash value for any given message

  3. it is chaotic meaning a small change to a message should change the hash value extensively

  4. it is infeasible (but not impossible) to generate a message from its hash value

  5. it is infeasible (but not impossible) to find two different messages with the same hash value

Let's create some hashes from quite similar lines of a famous poem:

Code to create a hash from a string
# load library
library(digest)

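# hash a string (serialize = FALSE hashes the raw text rather than the serialized R object)
digest("Così tra questa immensità s'annega il pensier mio", "sha256", serialize = FALSE)

# hash a slightly different string
digest("Così tra questa infinità s'annega il pensier mio", "sha256", serialize = FALSE)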
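
The properties listed above can be checked informally on short strings. The following is a minimal sketch, assuming the digest package is installed; the helper name sha256_text is hypothetical and only wraps digest so that the raw text is hashed.

```r
library(digest)

# hypothetical helper: SHA-256 of the raw text of a string (no R serialization)
sha256_text <- function(x) digest(x, algo = "sha256", serialize = FALSE)

h1 <- sha256_text("abc")
h2 <- sha256_text("abc")
h3 <- sha256_text("abd")   # one character changed

# deterministic: hashing the same message twice gives the same hash
identical(h1, h2)

# fixed size: 64 hexadecimal digits, whatever the length of the input
nchar(h1)
nchar(sha256_text(strrep("a", 10000)))

# chaotic: count how many of the 64 hex digits differ after a one-character change
sum(strsplit(h1, "")[[1]] != strsplit(h3, "")[[1]])
```

For a well-behaved hash function, most of the 64 digits are expected to differ in the last comparison, even though the two inputs differ by a single character.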

Chain

  • Blocks are chronologically concatenated into a chain by adding to the block a field with the hash of the previous block in the chain

  • it follows that the hash of each block is computed using also the hash of the previous block

  • this means that the hash of a block encodes all previous history of the blockchain

  • moreover, if you alter one block you need to modify not only its hash but also the hashes of all following blocks for the chain to be valid

  • the first block of the chain is called the genesis block and represents the initial state of the system

Two notable genesis blocks are:

  1. the genesis block of Bitcoin blockchain

  2. the genesis block of Ethereum blockchain

Next we write some code to mine a new block:

A function that mines a new block
mine <- function(previous_block, genesis = FALSE) {
  if (genesis) {
    # define the genesis block: it has no parent, so parent_hash is a placeholder
    new_block <- list(number = 0,
                      timestamp = Sys.time(),
                      data = "I'm the genesis block",
                      parent_hash = "0")
  } else {
    # create a new block linked to the previous one through its hash
    current_number <- previous_block$number + 1
    new_block <- list(number = current_number,
                      timestamp = Sys.time(),
                      data = paste0("I'm block ", current_number),
                      parent_hash = previous_block$hash)
  }
  # add the hash, computed over all the other fields of the block
  new_block$hash <- digest(new_block, "sha256")
  return(new_block)
}
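
The chaining described above can be illustrated by building a short chain and checking that tampering with one block is detected. This is only a sketch under stated assumptions: the helpers block_hash, append_block and is_valid_chain are hypothetical names introduced here, and the block layout simply mirrors the fields used in this section.

```r
library(digest)

# hypothetical helper: hash of a block's content, ignoring any stored hash
block_hash <- function(block) {
  block$hash <- NULL
  digest(block, "sha256")
}

# hypothetical helper: append a new block that points to the last block of the chain
append_block <- function(chain, data) {
  parent <- chain[[length(chain)]]
  block <- list(number = parent$number + 1,
                timestamp = Sys.time(),
                data = data,
                parent_hash = parent$hash)
  block$hash <- block_hash(block)
  c(chain, list(block))
}

# hypothetical helper: check every stored hash and every link to the parent block
is_valid_chain <- function(chain) {
  for (i in seq_along(chain)) {
    if (chain[[i]]$hash != block_hash(chain[[i]])) return(FALSE)
    if (i > 1 && chain[[i]]$parent_hash != chain[[i - 1]]$hash) return(FALSE)
  }
  TRUE
}

# start from a genesis block and add three blocks
genesis <- list(number = 0, timestamp = Sys.time(),
                data = "I'm the genesis block", parent_hash = "0")
genesis$hash <- block_hash(genesis)
chain <- list(genesis)
for (i in 1:3) chain <- append_block(chain, paste0("I'm block ", i))

is_valid_chain(chain)      # the untouched chain passes the check

# altering the data of one block without recomputing the hashes breaks the chain
tampered <- chain
tampered[[2]]$data <- "forged"
is_valid_chain(tampered)   # the altered chain fails the check
```

To make the tampered chain valid again, the hash of the altered block and the parent_hash and hash fields of every following block would all have to be recomputed.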

Consensus

Byzantine Generals Problem
  • Several generals are on the verge of attacking an enemy city during a siege. They are located in different strategic areas and can only communicate via messengers in order to coordinate the decisive attack

  • however, among these messengers, it is highly probable that there are traitors. The traitors carry messages that contradict the army's strategy

  • the problem, therefore, lies in the ability to carry out the attack effectively despite the risk of treason. This is known as decentralized consensus

The problem faced by the Byzantine generals is the same as that faced by distributed computing systems, such as blockchain systems.

How can consensus be reached on a distributed network where some nodes may be faulty or deliberately malicious?

In the blockchain setting, the problem is solved using one of the following consensus mechanisms:

  1. proof-of-work (currently used by Bitcoin)

  2. proof-of-stake (currently used by Ethereum)

  • In proof-of-work blockchains, miners work to find a solution to a computational problem that is hard to solve and easy to verify

  • this is a cryptographic puzzle that can be attacked only with a brute-force approach (trying many possibilities), so that only computational power counts

  • typically, the proof of work problem involves finding a number (called nonce) that once added to the block is such that the corresponding block hash starts with a string of leading zeros of a given length called difficulty

  • the average work that a miner needs to perform in order to find a valid nonce is exponential in the difficulty, while one can verify the validity of the block by computing a single hash

  • the implementation of this consensus model uses resource intensive computations

Here is the code implementing the proof-of-work method:

A function that creates a block using proof-of-work
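proof_of_work <- function(block, difficulty) {
  # drop any stored hash, so that the hash depends only on the block content and the nonce
  block$hash <- NULL
  block$nonce <- 0
  # the target prefix: a string of `difficulty` zeros
  zero <- strrep("0", difficulty)
  hash <- digest(block, "sha256")
  # brute force: increment the nonce until the hash starts with the target prefix
  while (substr(hash, 1, difficulty) != zero) {
    block$nonce <- block$nonce + 1
    hash <- digest(block, "sha256")
  }
  # store the valid hash; it can be verified by hashing the block without this field
  block$hash <- hash
  return(block)
}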
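
The asymmetry between mining and verifying can be sketched as follows. This is a self-contained illustration, not the updated mine function of the original notes: the helpers pow_hash, find_nonce and verify_pow are hypothetical names, and the timestamp is a fixed string chosen here so that the run is reproducible. Since each hexadecimal digit is zero with probability 1/16, each extra unit of difficulty multiplies the expected number of attempts by 16.

```r
library(digest)

# hypothetical helper: hash of the block content (any stored hash is excluded)
pow_hash <- function(block) {
  block$hash <- NULL
  digest(block, "sha256")
}

# hypothetical helper: try nonces until the hash has `difficulty` leading zeros
find_nonce <- function(block, difficulty) {
  target <- strrep("0", difficulty)
  block$nonce <- 0
  while (substr(pow_hash(block), 1, difficulty) != target) {
    block$nonce <- block$nonce + 1
  }
  block$hash <- pow_hash(block)
  block
}

# hypothetical helper: verification needs a single hash computation
verify_pow <- function(block, difficulty) {
  h <- pow_hash(block)
  identical(h, block$hash) && startsWith(h, strrep("0", difficulty))
}

block <- list(number = 1, timestamp = "2024-02-20 12:00:00",
              data = "I'm block 1", parent_hash = "0")

# mining gets harder as the difficulty grows; verification stays cheap
for (d in 1:3) {
  mined <- find_nonce(block, d)
  cat("difficulty", d, "nonce", mined$nonce, "valid", verify_pow(mined, d), "\n")
}
```

The nonce printed for each difficulty gives a rough idea of how many hashes were tried before a valid one was found.
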
Hashrate

Hashrate measures the speed at which a machine – or a collective of machines – can process a proof-of-work algorithm. It is expressed in number of hashes per second that a machine can generate.

Total hashrate generally refers to the aggregate computing power of all mining hardware attempting to solve the puzzle at a given point in time. As of February 20th, 2024, Bitcoin’s total hashrate is 537.84 EH/s (exahashes per second, where 1 exa = $10^{18}$).
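
To get a feeling for this figure, one can estimate how many hashes the whole network computes per block. The sketch below assumes Bitcoin's ten-minute target block interval, a value that is not stated above and is introduced here only for the arithmetic.

```r
# total hashrate from the text: 537.84 EH/s, i.e. 537.84 * 10^18 hashes per second
hashrate <- 537.84e18

# assumption: one block every ten minutes on average (target block interval)
block_interval <- 10 * 60

# expected number of hashes computed by the whole network for each block
hashes_per_block <- hashrate * block_interval
format(hashes_per_block, scientific = TRUE, digits = 3)   # roughly 3.2e23 hashes

# the same quantity as a power of two, for comparison with the 2^256 possible hashes
log2(hashes_per_block)                                     # about 78
```

Under this assumption, the network performs on the order of 2^78 hash computations per block, a large number that is still negligible compared with the 2^256 possible hash values.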

Transactions

  • A block contains a header with metadata (like block number and timestamp) and a data field with a certain number of transactions

  • a transaction represents an interaction between parties, typically a transfer from sender to receiver of cryptocurrencies or of any other token, possibly mediated by a smart contract

  • each transaction has a fee that must be paid by the sender and depends on the network congestion and the complexity of the transaction

  • here is a real transaction on the Ethereum blockchain:

    • user 0x9674 (the buyer) interacts with SuperRare smart contract 0x65b4 (an NFT marketplace) and buys an NFT sold by user 0xf8b3 (the seller)

    • the buyer pays 2.06 ETH (1.7 goes to the seller and 0.36 to the marketplace as a fee)

    • the seller transfers the NFT to the buyer

    • the transaction fee is 0.02 ETH and goes to the miner

Digital signature

  • Blockchain uses asymmetric cryptography (also known as public-key cryptography) to implement digital signatures of transactions

  • asymmetric cryptography uses a pair of keys: a public key and a private key

  • the public key is made public, but the private key must remain secret

  • each transaction is signed with the sender's private key and anyone can verify the authenticity of the transaction using the sender's public key

RSA
  • RSA (Rivest–Shamir–Adleman) is one of the first asymmetric cryptography algorithms and is widely used for secure data transmission

  • in RSA, the asymmetry between private and public keys is based on the practical difficulty of factoring the product of two large prime numbers (the factoring problem)

  • there are currently no published methods to defeat the system if a large enough key is used

Here is a code chunk that implements a digital signature:

Asymmetric cryptography
# load library
library(openssl)

# generate a private key (prikey) and a public key (pubkey)
prikey <- rsa_keygen()
pubkey <- prikey$pubkey

# Write keys in Privacy-Enhanced Mail (PEM) format
write_pem(pubkey)
write_pem(prikey)

# build a transaction
trans = list(sender = "A", receiver = "B", amount = "100")

# serialize data
data <- serialize(trans, NULL)

# sign (a hash of) the transaction with private key
sig <- signature_create(data, sha256, key = prikey)

# verify the message with public key
signature_verify(data, sig, sha256, pubkey = pubkey)
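
A signature is tied to the exact content that was signed. The following sketch, which assumes the openssl package is installed, shows that a signature created for one transaction does not validate an altered one. The helper is_authentic is a hypothetical name; it wraps signature_verify in tryCatch on the assumption that a mismatch may be reported as an error rather than as a FALSE value.

```r
library(openssl)

# generate a key pair and sign a transaction
key <- rsa_keygen()
pubkey <- key$pubkey
trans <- list(sender = "A", receiver = "B", amount = "100")
sig <- signature_create(serialize(trans, NULL), sha256, key = key)

# hypothetical helper: TRUE if the signature matches the transaction, FALSE otherwise
is_authentic <- function(trans, sig, pubkey) {
  isTRUE(tryCatch(
    signature_verify(serialize(trans, NULL), sig, sha256, pubkey = pubkey),
    error = function(e) FALSE
  ))
}

# the original transaction verifies with the sender's public key
is_authentic(trans, sig, pubkey)

# a transaction with an altered amount does not verify with the same signature
forged <- trans
forged$amount <- "1000"
is_authentic(forged, sig, pubkey)
```

Only the holder of the private key can produce a signature that verifies, while anyone with the public key can run the check.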

Asymmetric encryption is also employed to secure (encrypt) messages, as explained in a video from Simply Explained that digs deeper into symmetric vs. asymmetric encryption.

Peer-to-peer network

Finally, the blockchain ledger is distributed over a peer-to-peer network of nodes (computers serving the blockchain). In this way, no central authority has control over the blockchain.

The steps to run the network are as follows:

  1. new transactions are broadcast to all nodes

  2. each node collects some transactions into a block

  3. on a proof-of-work blockchain, each node works on finding a difficult proof of work for its block; the first node that finds the solution becomes the miner of the block

  4. in a proof-of-stake blockchain, a validator node is randomly selected; the likelihood of a node being chosen is proportional to the stake of the node

  5. the miner/validator broadcasts the block to all nodes of the network

  6. nodes accept the block only if all transactions in it are authentic and not already spent

  7. nodes express their acceptance of the block by working on creating the next block in the chain, using the hash of the accepted block as the previous hash (notice that all previous work is lost in case of PoW)

  8. the reward for the miner/validator is inserted as a first transaction (called coinbase) of the mined block; in this way the miner/validator has an incentive to remain honest

  9. here is a (funny) blockchain transaction visualizer
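
The stake-weighted selection of step 4 can be sketched with base R's sample function, which accepts unnormalized weights through its prob argument. The node names and stakes below are hypothetical, and real proof-of-stake protocols use more elaborate selection procedures; this only illustrates the proportionality.

```r
set.seed(42)  # for a reproducible illustration

# hypothetical validators and their stakes
stakes <- c(alice = 50, bob = 30, carol = 20)

# select the validator of the next block with probability proportional to stake
sample(names(stakes), size = 1, prob = stakes)

# over many blocks, the share of blocks validated approaches the share of stake
picks <- sample(names(stakes), size = 10000, replace = TRUE, prob = stakes)
shares <- table(picks) / length(picks)
round(shares, 2)
```

With these stakes, the observed shares should be close to 0.5, 0.3 and 0.2.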

All the code shown in this part is contained in the following R Markdown document:

Private blockchains

While a public blockchain brings transparency, decentralization, and security, a private blockchain enacts governing rules to write, edit, or even delete blockchain entries. The ledger is maintained by a small number of validators known as trusted intermediaries. To become a trusted intermediary, actors must disclose their identity and receive approval from a consortium. Block validation is then carried out at the discretion of these validators, relying on, e.g., proof-of-authority (PoA), a consensus mechanism in which blocks are signed by individual approved nodes and which rests on trust in the validators rather than on capital or energy.

Proof-of-authority enables a high level of scalability owing to the limited number of nodes verifying and adding new blocks. However, access restrictions inevitably lead to a centralization of node operators and thus to a strong dependence on one or a few actors, as well as transparency issues, rendering private solutions unsuitable for many of the applications that run on public blockchains.

The impact of quantum computing on blockchain
  • the cryptographic algorithms used within most blockchain technologies for asymmetric key pairs (digital signatures) will need to be replaced

  • the hashing algorithms used by blockchain networks are much less susceptible but are still weakened

Game theory and blockchain

Game theory is one of the fundamental ingredients of blockchain, which can be viewed as a game where players are miners or validators. Let's explore how two important concepts in game theory - Nash equilibrium and Pareto optimality - have an impact on blockchain systems.

Nash equilibrium is a concept in game theory, named after the mathematician John Nash. It represents a situation in which each participant in a strategic interaction makes decisions, taking into account the choices of others, and no player has an incentive to unilaterally change their strategy given the choices of the others.

Nash equilibrium is a fundamental concept in various fields, including economics, political science, and computer science. It is widely used to analyze and understand strategic interactions among rational decision-makers.

Ron Howard's movie A Beautiful Mind tells the story of John Nash, played by Russell Crowe.

On the other hand, Pareto optimality, named after economist Vilfredo Pareto, concentrates on overall social welfare and efficiency. While Nash equilibrium focuses on stable individual strategies, a Pareto optimal outcome signifies an efficient allocation of resources in which no participant can be made better off without making someone else worse off.

The relationship between Nash and Pareto is often explored in game theory and economics.

  • Pareto optimality is related to maximizing collective welfare without making any individual worse off.

  • On the other hand, Nash equilibrium focuses on individual self-interest and lack of incentive to deviate from one's strategy given the strategy of others.

In a game context, an outcome can be a Nash equilibrium, a Pareto optimum, both, or neither. In particular, Nash equilibria do not always lead to Pareto optimal outcomes. This means that a situation in which each player maximizes his or her own gain may not be the best for overall welfare.

Let's explore the two different concepts using the famous prisoner's dilemma. The prisoner's dilemma is a game theory thought experiment that involves two rational agents, each of whom can cooperate for mutual benefit or betray their partner for individual reward. It models many real-world situations involving strategic behavior. In casual usage, the label prisoner's dilemma may be applied to any situation in which two entities could gain important benefits from cooperating or suffer from failing to do so, but find it difficult or expensive to coordinate their activities.

Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of speaking to or exchanging messages with the other. The police admit they don't have enough evidence to convict the pair on the principal charge. They plan to sentence both to a year in prison on a lesser charge. Simultaneously, the police offer each prisoner a Faustian bargain. If the prisoner testifies against the partner, he will go free while the partner will get three years in prison on the main charge. If both prisoners testify against each other, both will be sentenced to two years in jail. The prisoners are given a little time to think this over, but in no case may either learn what the other has decided until he has irrevocably made his decision. Each is informed that the other prisoner is being offered the very same deal. Each prisoner is concerned only with his own welfare—with minimizing his own prison sentence.

This leads to four different possible outcomes for prisoners A and B:

  1. If A and B both remain silent, they will each serve 1 year in prison.

  2. If A testifies against B but B remains silent, A will be set free (0 years in prison) while B serves 3 years in prison.

  3. If A remains silent but B testifies against A, A will serve 3 years in prison and B will be set free (0 years in prison).

  4. If A and B testify against each other, they will each serve 2 years.

The Nash equilibrium is the strategy in which both prisoners testify against (betray) each other. This strategy optimizes individual self-interest. Indeed, if A betrays B, then:

  • A serves 0 years in prison if B stays silent, or

  • A serves 2 years in prison if B also betrays A.

Hence, on average (assuming B is equally likely to stay silent or to betray), A serves 1 year in prison.

On the other hand, if A remains silent, then

  • A serves 1 year in prison if B also stays silent, or

  • A serves 3 years in prison if B betrays A.

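Hence, on average, A serves 2 years in prison. Notice that the penalties when A remains silent, (1, 3), are worse case by case than the penalties when A testifies, (0, 2): whatever B does, A serves one year less by testifying. Testifying is therefore a dominant strategy for A and, by symmetry, for B.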

In this case the Nash equilibrium is not Pareto efficient: both prisoners would be better off if both remained silent (did not betray), collecting 2 years of prison overall, which is the best solution for the group. All the other solutions are worse overall (namely, 3, 3, and 4 years).
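
The analysis above can be checked mechanically by encoding the sentences as payoff matrices and testing every outcome against the two definitions. This is a minimal sketch; the function names is_nash and is_pareto are hypothetical, and lower values are better because the payoffs are years in prison.

```r
# years in prison: rows are A's action, columns are B's action
actions <- c("silent", "testify")
years_A <- matrix(c(1, 3,
                    0, 2), nrow = 2, byrow = TRUE,
                  dimnames = list(A = actions, B = actions))
# the game is symmetric: B's sentence is A's sentence with the roles swapped
years_B <- t(years_A)
dimnames(years_B) <- dimnames(years_A)

# Nash equilibrium: neither player can reduce their own sentence by deviating alone
is_nash <- function(a, b) {
  years_A[a, b] == min(years_A[, b]) && years_B[a, b] == min(years_B[a, ])
}

# Pareto optimum: no other outcome is at least as good for both and better for one
is_pareto <- function(a, b) {
  no_worse <- years_A <= years_A[a, b] & years_B <= years_B[a, b]
  better <- years_A < years_A[a, b] | years_B < years_B[a, b]
  !any(no_worse & better)
}

for (a in actions) for (b in actions) {
  cat(sprintf("A %-7s B %-7s years=(%d,%d) nash=%-5s pareto=%s\n",
              a, b, years_A[a, b], years_B[a, b], is_nash(a, b), is_pareto(a, b)))
}
```

Under these definitions, mutual betrayal is the only Nash equilibrium and also the only outcome that is not Pareto optimal, since mutual silence is better for both prisoners.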

Let's investigate some examples of these game theoretic concepts concerning the blockchain. Imagine a blockchain where miners compete to validate transactions and add blocks to the chain. Each miner decides whether to participate in a mining pool or mine independently. The reward structure favors participation in a pool due to more consistent returns, but there are concerns about centralization.

The Nash equilibrium might occur when a majority of miners join mining pools because, given the choices of others, individual miners find it more profitable to pool their resources for consistent rewards. Deviating from this strategy (e.g., mining independently) could lead to a lower expected return. However, this is not a Pareto optimum, since the best outcome for the whole community is decentralization of the blockchain, hence individual mining or mining in small pools, because a decentralized system is not controlled by any single entity and hence is more robust.

As another example, consider mining as a competitive game, where each miner challenges others to create a new block and receive a reward, with no binding agreements between participants. Personal incentive is maximized by following the rules of the mining game. In fact, if a miner who has just mined a brand new block misbehaves, such as by changing the coinbase transaction that rewards them by inserting a larger amount, then that block will be rejected by the network. Consequently the miner will lose the reward and also their reputation as a miner. Note that the Nash equilibrium strategy of maximum individual profit, which corresponds to a sound blockchain running stably, is also the Pareto optimum, i.e., it also realizes the best scenario for the entire community of blockchain users, including miners themselves.

In summary, a clever economic incentive design that promotes honesty over cheating underpins the blockchain consensus process. Miners voluntarily incur financial costs ex ante in the expectation of a potential future reward. The threat of sunk costs (i.e., not receiving the block reward because of dishonest behaviour despite having already paid for the performed work) creates the financial incentive for miners to play by the rules.

Assuming miners are profit-maximising economic agents, honesty is the most rational strategy. As a result, Bitcoin may be considered less a technical innovation and more a carefully calibrated socio-economic system that relies on a complex combination of economic incentives, game theory, and a solid technical foundation.

As Vitalik Buterin points out, however, money is not the only incentive in blockchain systems:

We've seen time and time again that purely financial incentives do not yield stable systems. Power concentrations, pump and dump schemes and rug pulls are all profitable to some, while damaging to most. When participants act only for their own profit, with no regard for the long game, every system is doomed.

We could say that Bitcoin miners don't collude because they are worried about their reputation, and the reputation of the Bitcoin blockchain as a whole. The Dilemma of Soulbound Tokens

In this podcast Vito Lops interviews Federico Rivi about the importance of game theory for the blockchain (in Italian).

Bitcoin / Ethereum / Tezos

The two major blockchains are Bitcoin and Ethereum. Tezos is a playground.

Bitcoin was proposed in 2008 by Satoshi Nakamoto (a pseudonym) and launched in 2009.

Transactions in the Bitcoin blockchain contain:

  1. one or more inputs

  2. one or more outputs

  3. an amount to be transferred

To allow value to be split and combined, transactions contain multiple inputs and outputs. Normally there will be either a single input from a larger previous transaction or multiple inputs combining smaller amounts, and at most two outputs: one for the payment, and one returning the change, if any, back to the sender — Satoshi Nakamoto, Bitcoin white paper

The fee paid by the sender for a transaction on the Bitcoin network depends on how congested the network is at the time of the transaction and on the size of the transaction, which is affected primarily by the number of inputs.
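
The splitting and combining of value described in the white paper excerpt can be sketched with a small numeric example. All amounts below are hypothetical; the sketch combines two smaller inputs, pays a receiver, returns the change to the sender, and leaves the fee as the difference between total inputs and total outputs.

```r
# hypothetical amounts (in BTC) of two previous outputs owned by the sender
inputs <- c(0.30, 0.25)

# hypothetical payment and fee
payment <- 0.40
fee <- 0.0001

# the change goes back to the sender as a second output
change <- sum(inputs) - payment - fee
stopifnot(change >= 0)   # the inputs must cover the payment and the fee

# at most two outputs: one for the payment and one returning the change
outputs <- c(receiver = payment, sender_change = change)
outputs

# the fee is what remains once the outputs are subtracted from the inputs
sum(inputs) - sum(outputs)
```

Adding more inputs to cover the same payment would make the transaction larger and, other things being equal, more expensive in fees.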

The corresponding cryptocurrency is bitcoin (ticker: BTC).

Cryptocurrencies

A cryptocurrency is a digital currency secured by cryptography, which makes it nearly impossible to counterfeit or double-spend.

A defining feature of cryptocurrencies is that they are not issued by any central authority, as opposed to centralized digital currency or fiat money. Instead, they work on decentralized networks based on blockchain technology that serve as a public financial transaction database. This renders them theoretically immune to government interference or manipulation.

How are cryptocurrencies different from traditional electronic payment systems?

Electronic payments are typically enabled through closed book-entry systems where customer accounts are centrally maintained by operators such as commercial banks or credit card companies. These institutions act as gatekeepers that exercise discretionary control over the payment network. As a result, users may be denied access, have accounts closed, or see transactions flagged and reversed.

In contrast, a cryptocurrency is a permissionless system that operates without a central authority. Users are free to use the network and transact without prior approval by others. Like physical cash, users can transact pseudonymously and remain in full control of their own funds (self-custody). No single actor can unilaterally seize assets, reverse transactions, or change the ruleset. The blockchain also operates around the clock and is cross-jurisdictional by nature.

These properties do come at significant costs, however – a large network with massive redundancies, scalability and performance constraints, slow coordination and decision-making, and, sometimes, an expensive and energy-intensive consensus mechanism.

What is the financial nature of cryptocurrencies?

Commodities or securities? Bitcoin, at least, does not have a formal issuer and thus is not anyone’s liability. This makes it more akin to a digital commodity (by analogy with gold and oil) than to digital currencies or securities (like stocks and bonds). Indeed, bitcoin is for many reasons likened to digital gold.

The market cap of a cryptocurrency (or of a company) is the circulating supply multiplied by the current price of the crypto (or of the company's share). Here is a comparison of market caps of cryptos and largest US companies:

An ETF, or Exchange-Traded Fund, is an investment vehicle that operates like a stock, but it tracks the performance of an underlying asset or index rather than an individual company.

ETFs allow investors to gain exposure to various assets, such as gold or oil, without directly owning them. These funds trade on traditional stock exchanges, and their values typically mirror the fluctuations of the underlying asset's price.
