Stanford CS251: Cryptocurrencies, blockchains, and smart contracts

Lectures

  1. Introduction
  2. Creating a Digital currency
  3. Bitcoin Overview
  4. Bitcoin Blockchain
  5. Bitcoin Mining
  6. Bitcoin Miner interactions and Game Theory
  7. Cryptocurrencies: Community, Economics, and Politics
  8. Alternative Consensus
  9. Wallet & Anonymity
  10. Anonymity on Blockchain
  11. Altcoins
  12. Ethereum
  13. Ethereum
  14. Ethereum Governance
  15. Bitcoin Side-chains (guest talk)
  16. Bitcoin Payment channel
  17. Guest talk on Legal by Ben Lawsky  – does not seem worthy of transcribing
  18. Advanced Topics – Quantum Computing, Threshold Signatures, and storing secret state on public chains
  19. Advanced Topics – Smart property, publicly verifiable randomness, and prediction markets
  20. Guest talk by Adam Ludwin (CEO, chain.com) – does not seem worthy of transcribing

The notes are based on the 2016 version of the course CS251

Stanford CS251: Lecture 19

Lecture 19: Advanced Topics

Topic 1: Smart Property

Manage ownership of some property like stocks on the blockchain. Colored coins allow arbitrary properties on Smart Contract. Similar to Namecoin, there cannot be a light node/SPV for this. Another example is rental, car’s ownership goes from Alice to Carol in a 2-of-2 transaction from Alice to Carol and locked transaction to return the car’s possession after a fixed time. One still has to trust the car’s hardware and manufacturer.

Topic 2: Public Randomness (Lotteries)

  1. Multi-party randomness: Alice sends a commitment to x to Bob, Bob sends y, Alice reveals x and the random number is x XOR y. One can have deposits to force Alice to reveal x.
  2. Public Protocol: Random (unpredictable) beacon, consensus and is manipulation-resistant. One example is weather data coming from a trusted observer. Alternatively, use data from blockchain to get a PRNG. Puzzle solutions are not predictable, but it is not manipulation-resistant since the miner can withhold the block if they don’t like the result (cost is the loss of mining fee). The other issue is a collision. There is a workaround, apply a very slow hash function; for example, SHA-256 applied 240 times for the result. This means that by the time someone calculates the hash, the blockchain would have advanced on one of the chains and then the hash of the block on the main chain counts.

Topic 3: Prediction markets

Idea: Trade shares in future events Share(X) will be worth one if X happens and zero otherwise. Today the share is worth prob(X) assuming efficient markets. Augur is the most famous one on top of Ethereum.

How to trade?

One option is to have order books, so, you need someone else to take the bet on the other side. A better option is “automated market makers” which will always be willing to trade based on a formula.

How to decide what happened?

  1. Arbitration
  2. Reality Keys data feed
  3. m-of-n arbitration
  4. Decentralized approaches – half-baked theories like all stakeholders will vote and if they don’t agree everyone loses.

Stanford CS251: Lecture 18

Lecture 18: Advanced Topics

Three topics are chosen by students (another three for the next lecture)

Topic 1: Quantum Computing

An electron has two states top and bottom spins, represented as |1> and |0>. An electron is in a superposition of those two states with wave functions Ψ0 and Ψ1, so, the combined wave function is Ψ = Ψ0.|0> + Ψ1.|1> with |Ψ0|2 + Ψ12 = 1. |Ψ0|2 is the proability of seeing state |0> and Ψ12 is the probability of seeing state |1> respectively. Thus, Ψ = [Ψ0, Ψ1] is the overall state matrix. The state evolves using a 2X2 Hamiltonian matrix H, such that, second-degree norm won’t change, || H.V || = || V ||. This ensures |Ψ0|2 + Ψ12 = 1 always holds.

Ψn = sum of 2n states = vector of dimension 2n states = Σ Ψw. |w> where w ∊ [0, 1]n with Σ |Ψw|2 = 1, |Ψw|2 is the probability of seeing the state Ψw

Quantum computer
  1. System starts with a simple state says 1.|000...>
  2. Apply a sequence of Hamiltonians. There are two universal Hamiltonians. One flips a single electron, the other swaps the state of two electrons.
  3. Observe final state. Highest amplitude is on the correct answer.

Isolating quantum bits (qubits) from environment is hard. And therefore, building quantum computers is hard.

Two applications
  1. Grover’s algorithm – Consider a problem for a function which takes an n-bit input and a binary output, find the input which produces 1 as output. Classical algorithm takes 2n. Grover’s algorithm takes 2n/2 => quadratic speedup. This has an implication of solving proof of work way faster => difficulty has to adjust from say 270 to 2140=> Death of mining pools, the only people who can mine will be the one who owns quantum computer.
  2. Shor’s Algorithm – Given a periodic function F : G -> Y whose period P is unknown, it will take O(G) time to find the period P via brute-force. Shor’s algorithm can find P in O(Log|G| + time(F)) => exponential speedup. Finding period is sufficient to break Discrete logs => Breaks ECDSA => Given public key one can find the private key. Bitcoin has to move to post-quantum crypto like hash-based signatures which are 1024 bytes, much longer than 64-bytes ECDSA or Lattice-based signatures which are 1024 bytes as well.

Topic 2: Threshold Signatures

Bitcoin uses multi-sig. Standard approach is inflexibles since one has to commit to signers and thresholds in advance. It is not private since script has to be revealed for spent.

Threshold signatures solves this problem. For any signing algorithm like RSA and BLS (but not ECDSA) where signature = H(m)Sk, one can commit to a particular Sk and then find split it across signers. Multiplying the two signed will verify the correctness. If the parties are trutworthy enough to discard their share, one can easily move from 2-of-2 to 2-of-3 or 3-of-3 setup without changing the commitment. ECDSA is hard to thresholdize.

Topic 3: Ethereum Contracts with Secret State

For example, playing Blackjack on Ethereum.

Setup using El-Gamal Public Key encryption. Fix G and g in G. There are two private keys sk1 and sk2 with public values gsk1and gsk2. Public keys = gsk1 + sk2. El-Gamal has an interesting property called re-randomization, so, a ciphertext, can be randomized to another unlinkable ciphertext using the public key. Decryption can happen in phases using sk1 and sk2.

Contract

  1. Store sorted deck of cards – E(pk, 1), E(pk, 2)…
  2. Alice shuffle deck and re-randomize – E(pk, a1), E(pk, a2)… and also prove using Zero knowledge that shuffle is valid.
  3. Bob does the same shuffle and re-randomize. Now contracts stores a shuffled deck ct1, ct2…
  4. Contract deals two cards to Alice. Bob partially decrypts the two cards and then Alice decrypts the cards. Bob does not know which card Alice has.
  5. At the end, reveal sk1 and sk2. Contract re-verifies integrity and pays winner.

Stanford CS251: Lecture 16

Lecture 16: Bitcoin payment channel

Visa ~ 10, 000 transactions per second Bitcoin ~ 3 transactions per second => 60 GB of blockchain data per year

Waiting for 6 blocks ~ 60 mins is a huge wait for Bitcoin. Therefore, tipping or having an ongoing channel of payments on the blockchain is hard. Payment channels help with that.

Funding channel – unidirectional payment channels

Alice is planning to pay Bob.

  1. Alice creates a funding transaction. She puts, say, 100 bitcoin in 2-of-2 multi-sig account C.
  2. When Alice wants to pay Bob, she gives Bob a single funding transaction which sends money from C, 1 Bitcoin to Bob and 99 to Alice.
  3. A later transaction would be sending 2 to Bob and 98 to Alice. This overwrites the previous transaction.

Finally… say the last transaction is 40 go to Alice and 60 to Bob then Bob signs the last transaction and submits it to the chain. Since the last transaction is the best value for Bob.

Bob can host money hostage. Time-locked transactions ensure that Bob can hold Alice’s 100 Bitcoin hostage.

To build a bidirectional channel, one can have two unidirectional channels but they have a limitation, once Alice has exhausted her channel, she cannot send more money to Bob even if Bob has sent Alice money via the other channel.

Alternatively, one funding channel with time lock of 100 can be forked into another funding channel of time lock of 99, which can further be forked into funding channels with the smaller time locks. It is recommended not to go beyond the depth of 11 and time lock of 50 for such an approach since in the worst case, all the funding transactions will have to be committed.

Payment Channel Network (Lightning)

A -> D can be settled as A -> B -> C -> D off-chain using HTLC (Hash-time locked contracts). To do this, D sends a secret S to A and A commits P2SH with H(s), B and C do the same. D will reveal S for C -> D and now C reveals S and uses than for B -> C transaction and B uses that for A -> B transaction.

Stanford CS251: Lecture 15

Lecture 15: Bitcoin guest talk (Greg Maxwell & Pieter Wuille – Blockstream) on sidechains

Forking does not advance Bitcoin since forks suffer from economic acceptance.

UTXO model

UTXO model is less intuitive, more private, and smaller persistent storage footprint. UTXO implicitly prevents a replay attack. Ethereum carries nonce around even for empty accounts to prevent replay attacks.

Validation not computation

Bitcoin addresses are a 160-bit hash of the public key since the public key is unusually long (512-bit). Bitcoin payments can be made to scripts. These scripts are not for computation but spendability conditions. Rather than scripts, a hash of the script is added to the blockchain as a privacy improvement. 10% of Bitcoins are stored using P2SH scripts. MAST is meant to make transactions even smaller. One does not need a Turing-complete language since one only needs to verify and not compute on the blockchain.

Settling outside the main chain

Bitcoin is like a court, you go there for resolution, but you don’t carry all your business in front of the court. Transaction cut-through allows cooperating parties to reduce their fees by eliminating intermediate transactions, A -> B -> C becomes A -> C. Sidechains are like lower courts while Bitcoin main chain is like supreme court of the settlement.

Cut-through + Confidential transactions + Aggregation = Mimblewimble

Cut-through + Payment channels + Hash locked transactions = Lightning

One-way peg like Betacoin gives users a chance to burn Bitcoins and get Betacoins. But given that it is one-way, there is no way for the user to get the Bitcoins back. Therefore, Betacoins become less valuable of choice.

CoinWitness produced first two-way side chain but is not practical right now since the whole system has to be verified under a Zero-knowledge proof, but with SPV it is possible since SPV proofs are simple enough to verify, but there are tons of complications associated with long-chain reorgs.

Zero Knowledge Proofs are slow, but they can be used outside Bitcoin. For example, Alice gives Bob the hash of a key K and a piece of data encrypted with K. Using Zero-knowledge, Bob certified that encrypted solution is correct. Now, Bob sends money using P2SH to who-so-ever who can provide K which hashes to a given hash. A detailed example of a Bitcoin side-chain can be seen at https://github.com/elementsproject

Segregated Witness

Consider a 1-of-2 multi-sig; the transaction ID will depend on who signs, this invalidates successor transactions and is therefore troubling. SegWit solves this in a backward compatible way by making existing signature fields empty and moving them out using P2SH. Non-witness data costs more to store to discourage someone from storing too outside the chain.

Stanford CS251: Lecture 13

Lecture 13: Ethereum

Code: ROM (Read-only memory) calldata: arguments

There are two types of instructions: Arithmetic including SHA3 and sys operations like create [contract], call [contract], and delegate call, etc.

CALL – called code is executed in the context of called contract
CALLLOAD – called code is executed in the context of the current contract, msg.sender is calling contract
DELEGATECALL – similar to callload except for msg.sender remains unchanged

Exceptions: leads to execution halting

Applications

  1. DB: NameCoin
  2. Auctions: decentralized eBay but since everything is public, requires commitments to have sealed bid auctions.

How to safely transfer funds?

check-effects-interaction paradigm

if (amount > 0) {  // check
    pendingReturns[msg.sender] = 0;  // effects
    if (!msg.sender.send(amount)) {  // interaction
        pendingReturns[msg.sender] = amount;
        return false; 
    }
}

Events can be used for getting callbacks and executing a task in response to a change on the chain.

Use selfdestruct(beneficiary) to kill a contract and send the leftover money to the beneficiary.

Stanford CS251: Lecture 14

Lecture 14: Ethereum Governance

When contracts call other contracts, there are four major parameters, g – gas, v – value, in – in size of inputs, out – out size of outputs. The gas must come from the initial transaction, the ongoing calls to different cannot refuel the gas.

By default, all the gas is passed during the contract call and the value passed is 0.

A contract can receive money via contract.send(<money_in_wei>) only if it defines a fallback function

// This function is called with 2300 wei gas by default. This is sufficient for logging.
// Usually left blank
function () {
}
// gas 0 = 2300 wei
f.send(x) = f.value(x).gas(0)();  
The LHS and the RHS are same except for one subtle difference. If send fails it returns false, if the call on the right side fails, it throws an exception.

This has led to subtle bugs, for example, if the call f.send() is made after the stack is already 1024 levels deep then the call to send will fail. A contract not checking its return value can be in trouble.

f.send(100) is safe since it sends only 2300 gas, but f.call.value(100)() is unsafe against reentrancy attacks since it does not have a gas limit by default.

There are three ways to avoid reentrancy attacks – use contract.send, use a mutex to make all public calls non-entrant, and third, use the check-effects-interaction paradigm.

The DAO

The DAO was “the” Decentralized Autonomous Organization launched on April 30, 2016, tokens were available to buy for 27 days. By May 26, 2016, 10.1M Ether was invested in it (10% of all ether). Anyone can table an investment proposal and vote over 14 days, 20-53% was a quorum to put the money in an investment. This itself had a 53% attack, so, anyone with 53% can do whatever he wants. To prevent that 5 of 11 curators have to sign off the proposal. The only way to leave was to do a split which had 7 day signup period, everyone who signs up will leave with you, and then there is 27 day buy-in period. This suffered from a stalking attack since anyone who has majority shares can leave alongside you and get shares in the new DAO as well. The other problem was ambush voting, voting “no” locks one’s shares, so, it was best to not vote till the last moment. 3.6M Eth (5% of all Eth) was stolen via a reentrancy attack on DAO’s splitting code on June 16. July 20 was the deadline when all the forked DAOs to steal money would have been finalized. Hark fork was the only way out to save the lost funds. 81% voted in favor of the hard fork based on the polling by the Ethereum foundation. Some stayed on Ethereum classic on the old chain. Others moved to the new chain “Ethereum”.

Stanford CS251: Lecture 12

Recap: alt-coins

Bitcoin is a replicated state machine, the system moves within S States with I inputs producing O outputs. For Bitcoin, S is the set of UTXOs. For Namecoin, the state consists (name, value).

Ethereum’s goal was to implement this functionality in a general way by building a “consensus computer” expressed in a Turing-complete language.

Ethereum

State: Great arbitrary storage space, arbitrary code (isolated memory space), and account balance. Inputs: (address, input data) Transition: update storage and change account balance

Issues: process isolation ensured via signatures and resource consumption limits ensured by requiring a payment for everything.

An Ethereum block consists of merkle-like trees hashes – a tree hash of state (code, account balance, nonce, and storage) – a tree hash for updates which is a collection of transactions. Each transactions has a sender, money, Pid, and code. – a tree hash of receipts. Each transaction has a corresponding receipt. A receipt contains the final state, gas used, and log data. – Just like Bitcoin, all storage is public.

Accounts

Ethereum has (externally owned) accounts controlled by a private key and contracts (Dapp).

Address of account = SHA3(public key)[:20]Adddress of dapp = SHA3(creator's address, nonce)

Both have account balance and a nonce. EOA has public key, Dapp don’t. Dapp has code, EOA don’t.

Message Format

(to, from, value, data, startgas, gasprice)

Three important message types:

  1. Payment: from:sender, to:recipient, sender, value:$, (value) optional: data, gas_price:transaction fee
  2. Contract call: to: contract address, from:sender, value:$, data: f(), args, start_gas: how much computation are you willing to pay for
  3. Contract create: to: null, from:sender, data: code, value: initial balance, start_gas: pay for contract creation

Note: Nonce goes up every sent transaction to prevent replay attack.

EVM

Code is written in Ethereum Virtual Machine (EVM) bytecode – RAM is 32-bytes (256-bit), and persistent storage is 32-bytes addressable (2256 bytes). All memory is initialized to 0. Ethereum call stack size is limited to 1024. Storage can be word-addresssable, each key is 256-bits in size. Features: crypto (SHA3), interaction with blockchain, send messages, logging/output. Missing features: No RNG or else txns are not reproducible. No floating points.

No one writes EVM code directly. Solidity (Javascript-like) is more popular now. Serpent (Python-like) is less popular. Mutan (C/Go-like) is under development, Visual Basic is under development as well.

Gas (transaction fees)

Every message specifies STARTGAS and GASPRICE. Current gas price is about 30 billion wei (3 * 10-8 ether). You pay start gas * gas price, deducted at the start. If you are out of gas, execution halts, state reverts but miner keeps the gas as the fee.

Contract creation = 32K gas ~ 0.01$
Storage ~ 0.005$ per word

Stanford CS251: Lecture 11

Lecture 11: Altcoins

Three ways to improve Bitcoin

  1. Code update – This can update or change the P2P network
  2. Soft fork – To introduce a stricter verification for example P2SH
  3. Hard fork – transaction improvements and consensus change

Altcoin = Any cryptocurrency except Bitcoin

Launching an altcoin

  1. Sales pitch – new features
  2. Value/exchange rate
  3. Code
  4. Miners – the value of the currency will bring them or go for merge mining (explained below)
  5. Genesis block – For bootstrapping the right blockchain, it can be rooted in Bitcoin as well

Examples,

  1. Mazacoin – For sovereign tribes in the US
  2. Auroracoin – For Iceland. Only 30K out of 300K claimed it. Price immediately tanked.

How to do the initial allocation?

  1. Just start mining – Bitcoin approach.
  2. Pre-mine – allocated to the founders or “pre-mine with delay” to allocate after a certain time has passed
  3. Auction
  4. Hard fork Bitcoin – everyone who owns Bitcoin gets a proportion of the new currency
  5. One-way peg – Proof-by-burn of Bitcoin. The person burns Bitcoin by sending it to H(pk) = “Altcoin Id|Ka”. XCP (Counterparty) did this. One gets the coins by publishing the proof of burn onto a new chain. This sets up the ceil for the exchange rate as well as ceil for the new coin’s price.
  6. Two-way peg – side chains.
    This requires a soft fork of BTC.

Mining

Mining new coin is risky since there is no mining power backing the new coin, coiledcoin was killed by 51% attack. Alternatively, launch using a new Proof-of-work.

Another alternative is merge mining – BTC miner can mine altcoins for free. Miners including the hash of the altcoin block in the coinbase of the BTC block. Altcoin becomes a little less efficient since one has to check both the validity of the altcoin and that the bitcoin block contains the hash of the altcoin block – the bitcoin block does not even have to be valid. That’s why it is possible to merge mine altcoin block faster than bitcoin as well.

Overlay currency

Use Bitcoin blockchain as a ledger. For example, Mastercoin and Counterparty. The only problem is that one cannot prevent double-spending of an altcoin like that, so, one has to parse the full chain of Bitcoin to verify to ensure that the altcoin is not being double-spent => no light nodes are possible.

Application-specific Cryptocurrencies (Namecoin)

Namecoin’s goal was decentralized name-value mapping. Added three op-codes, NAME_NEW to add a new hash(name). NAME_FIRST_UPDATE to add (name, value) pair, and NAME_UPDATE to update the value for an existing name. Name claims expire after one year (unless updated). Hash was done to avoid front-running attack but a randomized commitment would have been better to avoid brute-forcing. This didn’t work that well in practice since all the good names were taken by squatters.

Stanford CS251: Lecture 10

Lecture 10: Anonymity on Blockchain

(Coinjoin continued from the previous lecture)

  1. Each participant writes an input transaction (input address, change address) on say Pastebin. Over Tor, each participant writes an output address. These two entries and not linkable to each other.
  2. Now, each participant signs a CoinJoin transaction only if they like their own input and output entries.
  3. Someone sends the final transaction to the miner.

DoS attacks on CoinJoin:

  1. If anyone drops out or goes offline, then the CoinJoin fails.
  2. If the UTXO mentioned in #1 is spent between #1 and #2,  then the CoinJoin transaction (T_cj) fails.

CoinJoin works only for the small sets ~40. Everyone can collude and unmask a single participant (multiple mixing eliminates it). All outputs must have the same value or else anonymity is lost.

Crypto mixing mechanisms mix over a much larger set. There are two mechanisms for it.

  1. Commitment schemes – One commits to a value without revealing the value. commit(m) -> (c, r). c is commitment and r is a secret to reveal it. verify(c, m, r) -> true/false verifies the commitment. This scheme should have the binding property, so that, no other m can be revealed for the same c. The scheme should also have the hiding property, so that, c should not reveal any information about m. For example, H(m, r) -> c where H is a collision-resistant hash function. A simple example of commitment is coin-flipping where Alice sends a commitment to a bit, Bob sends their bit in clear, and then Alice reveals its bit. XOR of the two bits is the final coin toss outcome.
  2. ZKPK (Zero Knowledge Proof of Knowledge) – Alice wants to prove that for a given X and P(X,w) -> {0,1}, she knows witness w s.t. P(X,w) = 1. She wants to prove this to verifier Bob without revealing w to Bob. ZKPK is possible for all of NP and beyond. Anything that can be proven, can be proven in Zero-Knowledge. Proof can be non-interactive, so, one single message from Alice to Bob is sufficient. Size of the proof ~ size of the program/predicate P. Even simple example is complicated. ZK-SNARK, a succinct non-interactive argument of knowledge, is based on a CRS (common reference string). Alice produces proof using CRS, X, and w. The proof is ~ 300 bytes which are a huge advantage. CRS is huge ~ 1 GB. Verifier Bob can verify the proof and X using CRS. All SNARKs require a trusted setup. CRS creator can prove wrong statements. So, shorter proofs have CRS size penalty.

Confidential Transactions – replace amounts by commitments along with ZKPK (not SNARK) that sum of inputs >= sum of the outputs and the numbers are non-negative. This proof is ~ 2.5KB. Miners can mine the transaction along with the proof. This is implemented in the Bitcoin core but not enabled.

Zerocoin

Let every Zerocoin = 0.1 BTC. Let Bob chooses a 256-bit sequence number s_n and generates a commitment (c, r) for sn.

  1. Bob sends 0.1 BTC to address zc:c. c is linked to Bob but no one knows s_n since r is secret. There are tons of such coins now, c_1, c_2, … c_n. We will create a Merkle tree with root h.
  2. To redeem, Bob constructs (L, sn, π) where L is a fresh Bitcoin address, π is a ZKPK (SNARK) that proves Bob knows (r, c) s. t. c is in Merkle Tree rooted in h and verify(c, sn, r) is true. So, Bob is claiming that he owns one bitcoin in the Merkle tree and he knows how to open it without revealing which bitcoin is his. Miners with check π and zero coin s_n is unspent. If this holds, miners post s_n -> L: 0.1 BTC.  s_n is public now but no one can connect s_n back to Bob. Binding property ensures no double spending. Hiding property ensures that no one can link s_n to c. Note that, if someone maliciously generates the same s_n as an existing one, the only the first withdrawal would succeed.

Note:

  1. Zcash evolved further than Zero coins to allow different amounts.
  2. Some other systems worth looking at are CryptoNote and Monero.