gwylim

Blockchain storage scaling

January 04, 2017

After discovering the Ethereum project, I’ve recently become interested in the idea of the blockchain again. I’ve spent a fair amount of time thinking about blockchain scalability, and have come up with a mechanism which can potentially solve the problem of every node having to store the state of all accounts. This is done by changing the trie used to store the state to a different data structure, in such a way that nodes only need to store the (logarithmic-sized) part of this data structure containing their own accounts. Essentially this would allow all nodes to be light nodes.

This does not address the other scalability issue, which is that all nodes need to process all transactions.

Briefly summarized, these are the benefits and tradeoffs involved:

  • In order to create or verify new blocks, nodes only need to store the block headers of past blocks.
  • Each account will have some state data associated with it which is needed to use that account in transactions. This data will be logarithmic in size in the number of transactions since the account’s creation. Nodes will need to store this logarithmic amount of data for each account they hold, and the data will change with every block.
  • In order to use an account, nodes will need to process all blocks which occur after the last time the account was used (in order to keep track of the account specific state mentioned above). In particular, just having the address of an account will not be enough to use it. This is relevant in the context of weak subjectivity, where we might want to bootstrap a node starting at a block other than the genesis block.
  • The same data will also need to be included in transactions.

Background: Merkle trees

Merkle trees essentially allow us to hash a set of items, producing a hash H, and then prove that some item exists in the set with hash H using a logarithmic-size proof.
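
To make this concrete, here is a minimal sketch of how such a proof is checked. Everything here is illustrative: Digest is a stand-in type, and hashLeaf and hashBranch would be a cryptographic hash (e.g. SHA-256) in a real system, not Data.Hashable.

import Data.Hashable (Hashable, hash)

-- Stand-in digest and hash functions, for illustration only
type Digest = Int

hashLeaf :: Hashable a => a -> Digest
hashLeaf x = hash ("leaf", x)

hashBranch :: Digest -> Digest -> Digest
hashBranch l r = hash ("branch", l, r)

-- Which side of the path each sibling digest is on
data Side = SibLeft | SibRight

-- A proof is the list of sibling digests on the path from a leaf up to
-- the root; its length is the tree height, logarithmic in the set size
type Proof = [(Side, Digest)]

-- Recompute the root from the claimed item and its proof, and compare
-- the result with the known hash H of the set
verify :: Hashable a => Digest -> a -> Proof -> Bool
verify root item proof = foldl step (hashLeaf item) proof == root
  where
    step acc (SibLeft, sib)  = hashBranch sib acc
    step acc (SibRight, sib) = hashBranch acc sib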

In the blockchain context, these were first used in Bitcoin to allow proving the existence of a transaction in a block without having to download the whole block. While this is plausibly useful, Ethereum went a lot further, using a Merkle Patricia trie (a form of self-balancing Merkle tree) to keep track of the entire blockchain state.

This has some uses for light clients, and could also be used to implement a fully verifying client which doesn’t store the blockchain state, instead downloading parts of the state as needed from a third party to verify new blocks.

We can even imagine extending the protocol to always send the part of the state necessary to verify a block along with the block. In this case, the only nodes which would need to store the state would be miners. However, with this kind of Merkle tree it’s not possible to get rid of the requirement that miners store the whole state, since they need the whole tree in order to be able to modify and insert new values.

Thus with this state tree, while not all nodes have to store the whole state, some still do.

Functional lists

Balanced binary trees are not the only data structure that can be “Merkleized”. The blockchain itself is another example of such a data structure; in that case, a linked list. Unlike a binary search tree, accessing an arbitrary element of a linked list takes O(n) time; this corresponds to the fact that we need to supply O(n) block headers to prove that some arbitrary block is a predecessor of another block.

However, there are some constraints on what can be usefully implemented as a Merkle data structure. Firstly, if we want the proofs to be small and updates to be cheap, then the data structure must consist of small separate values which point to each other (like a binary search tree), rather than large monolithic values (like a hash table). Secondly, there can be no cycles, since then we would have no way to compute a hash.

The constraints are essentially identical to those on purely functional data structures in a strict language. In such a data structure, we can’t mutate existing memory but only allocate new memory, so we cannot implement things like hash tables efficiently, and values can only point to older values, so there are no cycles. For any strict purely functional data structure, there is a corresponding Merkle data structure. In general, if an operation takes O(f(n)) time in the original data structure, we will be able to prove what the result of the operation is in the Merkle data structure with an O(f(n))-sized proof.
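
To make the correspondence concrete: the Merkle version of a data structure replaces each pointer with the digest of the node it points to. Here is a sketch for a binary tree, reusing the stand-in hashing helpers from above (how nodes are looked up by digest, e.g. from a content-addressed store, is left as an assumption):

-- Each pointer becomes the digest of the child it points to, so the
-- digest of a node commits to the entire substructure beneath it
data MerkleNode a
  = MerkleLeaf a
  | MerkleBranch Digest Digest  -- digests of the left and right children

digest :: Hashable a => MerkleNode a -> Digest
digest (MerkleLeaf x)     = hashLeaf x
digest (MerkleBranch l r) = hashBranch l r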

We can thus look to functional data structure research for inspiration. In particular, we will be interested in a random access list from Chris Okasaki’s “Purely Functional Data Structures”.

Okasaki describes this as a “numerical representation” since it is similar to a number stored as a list of digits. The data structure consists of a list of complete binary trees of distinct height, in order of height, with values stored at the leaves. We can append to the list by adding a new tree of height 0 to the list, then merging binary trees into larger trees until all the heights are distinct, similar to carrying in binary addition.

To be explicit, here is a simple (partial) implementation:

-- List of complete binary trees together with their heights
type List a = [(Int, BinaryTree a)]

data BinaryTree a
  = Leaf a
  | Branch (BinaryTree a) (BinaryTree a)

-- Adds an element to the head of the list
cons :: a -> List a -> List a
cons x xs = compact ((0, Leaf x) : xs)
  where
    -- Merge the first two trees while they have equal height, like
    -- carrying in binary addition
    compact ((i, t1) : (j, t2) : rest)
      | i == j = compact ((i + 1, Branch t1 t2) : rest)
    compact list = list

The key point with this data structure is that, unlike with a trie, where we could potentially need to access any part of the tree in order to insert an arbitrary value (and thus need to store the whole tree), in order to add a value to a random access list we only need to access the (logarithmic number of) roots at the top level. Thus in the Merkle version of the data structure, we can maintain the ability to compute what the hash will be after adding a value while only keeping track of a logarithmic amount of data.
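
To illustrate, here is a sketch of the Merkle version of cons, operating on root digests alone. It reuses the stand-in hashLeaf and hashBranch from earlier and mirrors the compact logic above:

-- The Merkle version of the list keeps only the height and root digest
-- of each tree; the trees themselves need not be stored
type MerkleList = [(Int, Digest)]

-- Adding an element reads and writes only these root digests, so a node
-- holding just this logarithmic summary can compute the new hash
consDigest :: Hashable a => a -> MerkleList -> MerkleList
consDigest x xs = compact ((0, hashLeaf x) : xs)
  where
    compact ((i, d1) : (j, d2) : rest)
      | i == j = compact ((i + 1, hashBranch d1 d2) : rest)
    compact list = list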

Application to blockchains

This could be used in Ethereum by making the following changes:

  • Instead of the state trie storing a set of accounts, there would be a state random access list storing an ordered list of accounts.
  • Every node would store the hashes of the roots of the trees in the random access list.
  • Each node would have a set of accounts which they are interested in, and for each block they would store a proof of the existence of those accounts and their contents in the state list.
  • A transaction must specify in advance which accounts it might access.
  • When a node broadcasts a transaction they would include a proof of the contents of the accounts used in that transaction as of the previous block.
  • When a miner broadcasts a block they would similarly include proofs for all the accounts used in that block.
  • When a node processes a block they would compute new proofs for the accounts they are interested in based on the changes to the state list from that block.
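
To give a rough idea of the data involved, here is one possible shape for the proofs that transactions and blocks would carry under the changes above. All of these types and field names are illustrative; none of them come from an existing protocol:

-- Illustrative account contents
data Account = Account
  { accountAddress :: Integer
  , accountBalance :: Integer
  , accountNonce   :: Integer
  }

-- Proof that an account sits at a particular position in the state
-- list: its index (which determines the tree that holds it), the
-- claimed contents, and a Merkle path within that tree
data AccountProof = AccountProof
  { accountIndex :: Integer  -- position in the state list
  , accountData  :: Account  -- the claimed account contents
  , merklePath   :: Proof    -- sibling digests up to the tree root
  }

-- A transaction declares up front which accounts it may touch, and
-- carries proofs of their contents as of the previous block
data Transaction = Transaction
  { payload       :: [Integer]  -- application-specific data, elided
  , accountProofs :: [AccountProof]
  }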

Tradeoffs

There are two main drawbacks to this approach:

  1. The size of transactions is increased by the size of the proofs.
  2. Nodes have to process all blocks from the last time an account was used in order to know the proof for that account.

For (1), it’s obvious that we can freely trade off transaction size against the amount of storage each node needs. For example, if each node stores a fraction O(1/c) of the state, then the amount of data added to each transaction is O(log(c)) per account used. Moreover, this decision does not need to be made at the core protocol level, but can be made at the individual node level: each node can just store the state down to some depth, and tell its peers what depth it has stored down to so they can exclude that from the proofs they send.
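
As a sketch of how this could look (assuming proofs are ordered leaf-to-root, as in the earlier verification sketch): a node storing all digests in the top d levels of every tree holds at most about 2^d digests per tree, can check the upper part of any path itself, and so its peers can simply drop the last d siblings from each proof.

-- A peer that stores the top d levels of each tree only needs the part
-- of the proof below depth d; it then compares the recomputed digest
-- against its stored digest at that depth instead of the tree root
trimProof :: Int -> Proof -> Proof
trimProof d siblings = take (max 0 (length siblings - d)) siblings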

For (2), there are ways of dealing with this. For example, nodes could provide account storage as a service, either storing specific accounts or all accounts meeting some criteria. A node could easily store proofs for all accounts with balances over some fixed amount (e.g. 10 eth), while still maintaining a bound on the maximum storage they could need. Another possibility is to have each node store a random part of the state according to their storage capacity, and then have a protocol for discovering nodes which hold the data for a particular account.