In the previous chapter, we addressed user onboarding challenges, one of the two main issues for Ethereum mass adoption. The second of them, which we will tackle in this chapter, is scalability. The Ethereum network, as it is today, can handle about 15 transactions per second – and this throughput must be shared among all Ethereum applications globally. This has led to single applications cluttering the entire network due to a spike in their usage to the point of rendering all dapps unusable for brief periods. In this chapter, we will introduce state channels and sidechains, two of the most widely used scalability solutions.
What is Layer 2?
The core limitation is that public blockchains like ethereum require every transaction to be processed by every single node in the network. (...) This is by design — it’s part of what makes public blockchains authoritative. Nodes don’t have to rely on someone else to tell them what the current state of the blockchain is. (...) This puts a fundamental limit on ethereum’s transaction throughput: it cannot be higher than what we are willing to require from an individual node.
—Josh Stark, “Making Sense of Ethereum’s Layer 2 Scaling Solutions: State Channels, Plasma, and Truebit”1
But what if we do not require every transaction to be run through the whole network? For instance, a set of transactions, run between a small group of participants, could be processed on a separate network. And only after a certain period the resulting balances could be uploaded to the main Ethereum network.
Channels are short-lived closed networks, typically between two participants, where they exchange multiple transactions between each other. Each party must acknowledge every transaction by signing it. To open the channel, they first must make a deposit on a smart contract on the Ethereum network. This contract can then validate their signatures to execute the payouts when needed.
Sidechains are parallel networks that use a different consensus algorithm than the main network, such as proof of authority or proof of stake. These are usually bridged to the main network, allowing users to move assets between the sidechain and the main chain. Plasma chains are a variant of sidechains in which the good behavior of the sidechain can be fully enforced by a smart contract on the main network.
External computation solutions do not provide a higher transaction throughput, but they do allow for more interesting tasks to be performed in each transaction. They run computing-intensive tasks outside the main network, tasks that would be prohibitively expensive to run on the EVM, and then inject the result back.
In this chapter, we will explore the first two solutions. Today there are several teams working on implementations or new variations of each of them. We will mention some of them along the way, but not without making our own attempts at each solution first.2
Channels
Channels are a family of layer 2 scalability solutions that span many different variants from unidirectional payment channels to counterfactual generalized state channels. They can also be extrapolated to full channel networks instead of isolated peer-to-peer solutions.
We will begin with payment channels.3 In payment channels, two or more participants open a channel by making an initial deposit on the main network and then perform multiple payments off-chain over the channel. These payments are then settled trustlessly on a smart contract on the main network.
Unidirectional Payment Channels
The simplest variant of payment channels is the unidirectional payment channel. Here, there are two distinct parties involved: a recipient and a sender. These are usually a provider that collects multiple payments in exchange for a service provided over time and a user performing these payments. A good example of this is a player performing microtransactions in a game.
How do Channels Work?
Let’s suppose a scenario where a user needs to make several small purchases to a service provider. It does not matter what the service provider is offering, only that the user will need to perform multiple payments to the same recipient over a period of time and that the provider needs a proof of each small payment to continue providing the service.
If each and every one of these small payments is done as a transaction on the blockchain, the accumulated gas fees would probably become considerable relative to the actual payments. Paying a 20-cent fee to the network for each 20-cent payment is not a good deal. Furthermore, since the service provider requires a proof after each payment, the confirmation times would constantly add significant delays to the service.
A solution could be to have a trusted third party collect a large initial deposit from the user and monitor the service being provided. The user then signs each of these microtransactions with their private key, acknowledging each of the payments to be made. After all micropayments have been made, the third party issues a single on-chain transaction that includes the total payout to the service provider and returns the remainder of the initial deposit to the user. Assuming both the user and the service provider trust this party, this can reduce all micropayments to just two transactions on the network: one for the deposit and the other for the payout.
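To put numbers on that saving, the following sketch compares the total fees paid when every micropayment is its own on-chain transaction against the two-transaction scheme just described. The 20-cent fee comes from the example above; the function names and the figure of 50 payments are illustrative assumptions.

```javascript
// Amounts are in integer cents to avoid floating-point rounding.
const FEE_PER_TX_CENTS = 20; // per-transaction network fee, from the example above

// Every micropayment is sent as its own on-chain transaction.
function onChainFees(paymentCount) {
  return paymentCount * FEE_PER_TX_CENTS;
}

// One deposit transaction plus one payout transaction, regardless of
// how many signed micropayments are exchanged in between.
function depositAndPayoutFees() {
  return 2 * FEE_PER_TX_CENTS;
}

const payments = 50; // hypothetical number of 20-cent purchases
console.log(onChainFees(payments));  // 1000 cents in fees for 1000 cents of payments
console.log(depositAndPayoutFees()); // 40 cents in fees for the same payments
```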
Within a payment channel, most transactions happen completely off-chain, being sent directly from the user to the recipient. This is why channels are considered a layer 2 solution, built on top of the main Ethereum network, the layer 1, while inheriting many of its security properties.
Channels have some very interesting advantages over layer 1. After a channel has been opened, any transaction sent through it incurs no gas fees, and once sent, it can be considered instantly finalized, since there is no need to wait for any blocks to be mined to confirm it. Additionally, since transactions are exchanged only between the two participants, they are entirely private until they are submitted to the blockchain.
Implementing a Unidirectional Channel
Definition, state variables, and constructor for the unidirectional payment channel contract. We will be using the ECDSA library from OpenZeppelin to verify the signatures on the contract. Note that the constructor is payable, so the sender can make the initial deposit upon deployment
Since most transactions in a payment channel occur off-chain, we will need to implement only two methods. The first method, close, will be called by the service provider to submit the sender’s signature with the payout, collect their funds, and close the channel (Listing 8-2).
Note
We will require that the sender always signs messages for the total to be paid out to the recipient. This allows us to just submit a single signed message to the contract, instead of having to process multiple ones.
This method will only be callable by the recipient to prevent the sender from trying to prematurely close the channel with a message signed by them with zero value. Also, since the sender could sign a message for a total value greater than the deposit present in the channel, we need to limit the value transferred to the contract’s balance. It is up to the recipient to decide whether they will accept a payment note for more value than can be actually paid by the channel.4
Closing the payment channel by the recipient. This requires submitting a signed message by the sender with the value to be transferred. Note that the signed message also includes the address of the contract to prevent replay attacks on other channels with the same sender
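As a rough off-chain model of the checks described above (not the contract code itself), the following JavaScript sketch closes a channel with a single signed total. It is written under several assumptions: signatures are verified against a known public key with Node's crypto module instead of recovering an Ethereum address, the remainder of the deposit goes back to the sender, and all class and function names are hypothetical.

```javascript
const crypto = require("crypto");

// Stand-in key pair for the sender. Ethereum recovers the signer address
// from the signature; here we simply verify against a known public key.
const sender = crypto.generateKeyPairSync("ec", { namedCurve: "secp256k1" });

// The signed message binds the payment to one specific channel contract.
function paymentMessage(channelAddress, totalWei) {
  return Buffer.from(`${channelAddress}:${totalWei.toString()}`);
}

function signPayment(privateKey, channelAddress, totalWei) {
  return crypto.sign("sha256", paymentMessage(channelAddress, totalWei), privateKey);
}

// Off-chain model of the channel contract (hypothetical names throughout).
class ChannelModel {
  constructor(address, senderPublicKey, recipient, depositWei) {
    this.address = address;
    this.senderPublicKey = senderPublicKey;
    this.recipient = recipient;
    this.balance = depositWei;
    this.closed = false;
  }

  // Returns the amounts paid to each party when the channel closes.
  close(caller, totalWei, signature) {
    if (this.closed) throw new Error("channel already closed");
    if (caller !== this.recipient) throw new Error("only the recipient can close");
    const message = paymentMessage(this.address, totalWei);
    if (!crypto.verify("sha256", message, this.senderPublicKey, signature)) {
      throw new Error("invalid signature");
    }
    // The sender may have signed for more than the deposit: cap the payout.
    const toRecipient = totalWei < this.balance ? totalWei : this.balance;
    const toSender = this.balance - toRecipient;
    this.balance = 0n;
    this.closed = true;
    return { toRecipient, toSender };
  }
}

const channel = new ChannelModel("0xChannelA", sender.publicKey, "recipient", 100n);
const signature = signPayment(sender.privateKey, "0xChannelA", 30n);
console.log(channel.close("recipient", 30n, signature)); // { toRecipient: 30n, toSender: 70n }
```

Because the channel address is part of the signed message, the same signature fails verification on any other channel opened by the same sender.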
Forcefully closing the channel by the sender to recover the deposit if the recipient never cashes out
This implementation can be modified to exchange ERC20 tokens instead of ETH, thus opening the door to token payment channels. Instead of making an initial deposit of ETH, the sender must transfer ERC20 tokens to the channel contract as a deposit. These tokens are then transferred again once the channel is closed. Refer to TokenPaymentChannel.sol in the code samples for an implementation.
Building a Payments App
We will now use our contract to build a simple application, where a sender can set up a payment channel contract with a recipient, send multiple micropayments via a direct off-chain connection, and eventually settle. As in previous chapters, we will use create-react-app for boilerplate.
To keep the application simple, we will establish a connection between two browser windows opened on the same app on the same computer, one of them acting as a sender and the other as receiver. We will use broadcast channels5 to pass messages between the two browser windows. In a real app, you will want to use a different method, such as WebRTC data channels,6 along with a server to manage discovery among your users.
We will make another simplification: instead of using Metamask, we will manage the accounts directly from the web application. This is to avoid difficulties with simulating two different accounts interacting with the same app on the same computer. We will call directly into ganache for sending transactions from both the sender and recipient accounts and for signing messages when needed.
Our application will be built out of two main views: one for the sender and one for the recipient, both set up by a root App component. It will be the App’s responsibility to set up the web3 object and inject the sender and recipient addresses into the components. Refer to src/App.js in the code samples for its implementation.
Sender component function to deploy the payment channel contract, fund it, and notify the recipient of its deployment. The App component passes the web3 instance and the sender and recipient addresses as props. Here, PaymentChannel is a function that returns a new web3 contract instance, and BN is a BigNumber constructor
Sender function to send a micropayment to the recipient. Each message carries the total amount of ETH to be paid out. The component needs to keep track of the total sent so far, so the micropayment amount chosen by the user is added to that value before being signed and sent
Signing each micropayment message using web38
Note
For the sake of brevity, we will skip the implementation of the forceClose call by the sender in this example.
Initializing a new broadcast channel for receiving messages from the sender and adding an event handler. This code is part of the Recipient component constructor
Delegating to different handler functions, discriminating on the message action
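One way to picture this dispatching is a small routing function that looks up a handler by the action field of each incoming message. The sketch below is only illustrative: the action names, payload fields, and the makeDispatcher helper are assumptions, and the messages are passed in directly instead of arriving over a broadcast channel.

```javascript
// Builds an onmessage-style handler that routes each message by its `action`
// field. The action names used below are hypothetical.
function makeDispatcher(handlers) {
  return function onMessage(event) {
    const { action, ...payload } = event.data;
    if (!Object.prototype.hasOwnProperty.call(handlers, action)) {
      return false; // ignore unknown actions
    }
    handlers[action](payload);
    return true;
  };
}

const received = [];
const onMessage = makeDispatcher({
  CHANNEL_DEPLOYED: ({ address }) => received.push(`deployed at ${address}`),
  PAYMENT: ({ total }) => received.push(`total is now ${total}`),
});

// In the browser, this function would be assigned to the onmessage property
// of a BroadcastChannel; here we call it directly with event-like objects.
onMessage({ data: { action: "CHANNEL_DEPLOYED", address: "0xChannelA" } });
onMessage({ data: { action: "PAYMENT", total: "30" } });
console.log(received);
```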
Recipient reacts to a new channel deployed by validating it, retrieving its deposit, and updating its own state. Here, PaymentChannel is a function that returns a web3 contract instance at the specified address
Checking the bytecode deployed at an address against the one in the compiled contract artifact. Note that we check against the deployedBytecode of the contract, not the bytecode, since the latter includes the constructor code that is not saved in the blockchain
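The comparison itself can be sketched as a pure function. In the application, the on-chain code would typically be fetched with web3.eth.getCode(address); here it is passed in as a string so that the example runs on its own. The helper names and the shortened bytecode strings are hypothetical.

```javascript
// Normalizes a hex string so that case and a missing "0x" prefix do not matter.
function normalizeHex(hex) {
  const lower = hex.toLowerCase();
  return lower.startsWith("0x") ? lower.slice(2) : lower;
}

// Compares the code found at an address with the artifact's deployedBytecode.
function isExpectedContract(onChainCode, artifact) {
  const deployed = normalizeHex(onChainCode);
  // An address with no contract returns empty code ("0x").
  if (deployed.length === 0) return false;
  return deployed === normalizeHex(artifact.deployedBytecode);
}

// Hypothetical artifact with shortened bytecode strings.
const artifact = {
  bytecode: "0x608060405260aa6000f3fe6080", // constructor code + runtime code
  deployedBytecode: "0x6080",               // runtime code only
};

console.log(isExpectedContract("0x6080", artifact));          // true
console.log(isExpectedContract(artifact.bytecode, artifact)); // false
console.log(isExpectedContract("0x", artifact));              // false
```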
Recipient reacts to a payment message, updating the total ETH received and the corresponding signature. Validating each message implies recovering the signing address and checking it against the sender, since an invalid signature would yield a different signer address. We also discard any messages with a total payment less than the latest total
Helper function to recover the signer of a payment message. Note that the hash over which the signature is recovered is calculated exactly like in the signPayment method
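The recipient's bookkeeping can be modeled independently of web3. In the sketch below, signature recovery is injected as a function so that the validation rules stand out: the recovered signer must be the sender, and a total lower than the latest one is discarded. The makeRecipient and fakeRecover names, and the toy signature format, are assumptions made for illustration only.

```javascript
// Recipient-side bookkeeping. `recoverSigner(total, signature)` is injected:
// in the real app it would rebuild the hash used when signing the payment
// and recover the signing address from it.
function makeRecipient(senderAddress, recoverSigner) {
  const state = { total: 0n, signature: null };

  function onPayment({ total, signature }) {
    const amount = BigInt(total);
    // An invalid or foreign signature recovers to a different address.
    if (recoverSigner(amount, signature) !== senderAddress) return false;
    // Totals are cumulative, so a lower total is a stale message.
    if (amount < state.total) return false;
    state.total = amount;
    state.signature = signature;
    return true;
  }

  return { state, onPayment };
}

// Toy stand-in for signature recovery: the "signature" is "<signer>:<total>".
function fakeRecover(total, signature) {
  const [signer, signedTotal] = signature.split(":");
  return signedTotal === total.toString() ? signer : "0xUnknown";
}

const recipient = makeRecipient("0xSender", fakeRecover);
console.log(recipient.onPayment({ total: "10", signature: "0xSender:10" }));  // true
console.log(recipient.onPayment({ total: "30", signature: "0xSender:30" }));  // true
console.log(recipient.onPayment({ total: "20", signature: "0xSender:20" }));  // false (stale)
console.log(recipient.onPayment({ total: "50", signature: "0xMallory:50" })); // false (wrong signer)
console.log(recipient.state.total); // 30n
```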
Closing the channel by the recipient using the latest value and signature sent by the sender
After this method is called, the channel contract should be destroyed, and the recipient should have received the sum of all micropayments made by the sender.
Now that we have a working application built on top of unidirectional payment channels, it’s time to move into more interesting flavors of channels.
Bidirectional Payment Channels
Another scenario for payment channels is that of two equal parties exchanging funds between each other. Instead of having a distinguished sender and a recipient, both participants in the channel can send and receive funds.
This symmetry in the participants’ roles makes the channel implementation more difficult. Which of the two participants should be allowed to close the channel in this model? Since they are equals, they both should be allowed to do so, but this introduces a problem.
Challenge Periods
This situation is solved by adding a challenge period to the closure of the channel. When Bob requests the channel to be closed, it goes into a “closing” state for a fixed period of time. During this period, Alice can either confirm the closure of the channel, or she can submit a more recent state and start a new closing period (Figure 8-6).
If she fails to send any transaction to the channel during the challenge period, then the channel is closed and the payouts executed according to the state submitted by Bob. This last case is the equivalent of the recipient not submitting a message and having the sender run a forced close. So, adding a challenge period removes the need for having a predefined end time for the channel.
Note
Challenge periods are a very common mechanism in layer 2 solutions, not just channels, as we will see later in this chapter. These allow an action to be carried out unilaterally without having to collect confirmations from every other party, but still let them watch for unlawful behavior and act upon it.
This mechanism requires the smart contract to recognize when a message is more recent than another. In other words, it requires adding a notion of ordering to the messages exchanged by Alice and Bob. In the previous example, this allows the channel to verify that Alice’s message is more recent than Bob’s and hence discard Bob’s in favor of Alice’s.
These challenge mechanics have an important drawback: parties in the channel cannot be offline for any longer than the length of the challenge period. In our Alice and Bob example, if the channel has a challenge period of a few hours, Bob could just submit the state that is convenient for him while Alice is offline, so when she comes back online, the channel would already be closed. On the other hand, extremely long challenge periods can lead to locked deposits for long periods of time, if Bob attempts to rightfully close the channel and Alice never accepts the closure. Choosing the correct challenge period will depend heavily on the use case where the channel is deployed.
Note
As a complement to channels, there are watchtower services that can monitor a channel on behalf of the user in case they go offline and their counterpart attempts to unlawfully close the channel. These providers may demand a fee in exchange for their services proportional to the value locked in the channel.
A Sample Exchange
Alice sends 0.4 ETH by signing balances (0.6, 1.4)9 with nonce 1.
Bob sends 0.3 ETH by signing balances (0.9, 1.1) with nonce 2.
Alice sends 0.1 ETH by signing balances (0.8, 1.2) with nonce 3.
Alice sends 0.1 ETH by signing balances (0.7, 1.3) with nonce 4.
When the exchange ends, Bob rightfully picks the latest message signed by Alice and uses it to close the channel. Note that it never makes sense for him to pick nonce 3 over 4, since 4 has a balance more beneficial to him, as it corresponds to a payment made by Alice.
Bob never signs another message and never closes the channel. In response, Alice uploads the last message that was beneficial to her: the last one signed by Bob (nonce 2). She has no reason to submit any of the more recent ones where she performs additional payments. Alternatively, she could also attempt to close the channel as if no messages were exchanged.
Alice maliciously uploads the message with nonce 2 to try to close the channel. Bob should immediately submit a more recent message, preferably the one with nonce 4.
Bob maliciously attempts to close the channel with the message with nonce 1, where the balance was most in his favor. Then, Alice should submit the message with nonce 2 in response, as it is a more beneficial situation to her. This takes us to the previous scenario, where Bob should submit nonce 4.
As we can see, by signing a message with increasing nonce with each micropayment, participants are then incentivized to always submit the latest message signed by their counterparty. This leads to the most recent message being used to close the channel.
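The interplay between nonces and the challenge period can be simulated with a small off-chain model. This is only a sketch under stated assumptions: signatures are abstracted into a signedBy field, explicit confirmation by the counterparty is left out, the one-hour challenge period is arbitrary, and the class and method names are hypothetical. Balances are expressed in tenths of an ETH as [Alice, Bob].

```javascript
// Off-chain model of a two-party channel with a challenge period.
// Signatures are abstracted away: each state records who signed it.
const CHALLENGE_PERIOD = 3600; // seconds, hypothetical

class BidirectionalChannelModel {
  constructor() {
    this.pending = null;  // state submitted for closure, if any
    this.deadline = null; // time at which the pending closure becomes final
  }

  isClosed(now) {
    return this.pending !== null && now >= this.deadline;
  }

  // A participant submits a state signed by the *other* participant.
  closeWithState(caller, state, now) {
    if (this.isClosed(now)) throw new Error("channel already closed");
    if (state.signedBy === caller) {
      throw new Error("state must be signed by the counterparty");
    }
    if (this.pending !== null && state.nonce <= this.pending.nonce) {
      throw new Error("a more recent state was already submitted");
    }
    this.pending = state;
    this.deadline = now + CHALLENGE_PERIOD; // each submission restarts the clock
  }

  // Final balances once the challenge period has elapsed.
  payouts(now) {
    if (!this.isClosed(now)) throw new Error("challenge period still running");
    return this.pending.balances;
  }
}

// The four messages from the sample exchange.
const states = [
  { nonce: 1, balances: [6, 14], signedBy: "alice" },
  { nonce: 2, balances: [9, 11], signedBy: "bob" },
  { nonce: 3, balances: [8, 12], signedBy: "alice" },
  { nonce: 4, balances: [7, 13], signedBy: "alice" },
];

const channel = new BidirectionalChannelModel();
channel.closeWithState("alice", states[1], 0); // Alice submits the stale nonce 2
channel.closeWithState("bob", states[3], 100); // Bob answers with nonce 4
console.log(channel.payouts(100 + CHALLENGE_PERIOD)); // [ 7, 13 ]
```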
Implementing a Bidirectional Channel
We will now implement a sample bidirectional state channel (Listing 8-14). In our implementation, any user can request the channel’s closure by providing a message signed by the other user. The payouts can then be executed either when the other party confirms it or when the challenge period ends.
Contract variables and initialization functions of our bidirectional state channel implementation. We are using the same ECDSA library as in the unidirectional implementation
Closing function for the bidirectional state channel. It can be called by any of the participants, as long as they submit a message signed by the other user, with a more recent nonce than the last one submitted (if there was one)
Effectively closing a bidirectional payment channel and executing the payouts
Note
We are using send instead of transfer in confirmClose to protect against an attack. If user2 is a contract account instead of an EOA, it can be coded to revert on every incoming transaction. This would make it impossible for user1 to ever close the channel and recover the initial deposit, since the transfer call to user2 would fail, and revert the entire transaction. By using send, the sending of ETH may fail, but the close is allowed to succeed.
Starting a channel closure from the initial state
Note that any user could still call closeWithState after close, in case a party maliciously attempted to close the channel on the initial state.
Optimizations and Extensions
One possible optimization is having a single contract for managing all channels. Instead of deploying one contract per channel, each channel is actually a struct stored in a single payment channel contract. This greatly reduces the cost of creating a new channel, but at the expense of added complexity. Furthermore, it centralizes the funds of all participants in a single contract, opening the door for bugs that could let an attacker drain the funds from all channels simultaneously.
Channels could also be modified to be reused. In our implementations, we required the channel to be closed to execute the payouts, but we could leave the channel open after a payout is executed. This allows for part of the funds to be withdrawn to be used in other applications, without having to destroy the channel contract.
In the case of bidirectional channels, channel closure can be optimized by adding a special message, signed by both users, that signals the agreed finalization at a certain state. This message can be uploaded by any participant and does not require either a second transaction to confirm the closure or going through the challenge period.
An interesting extension to channels is to increase the number of participants. While in all of our examples we explore peer-to-peer channels with two members, a larger number of users can be involved. Coordination may become more complex as the number of users increases, since messages may be required to be signed by several participants in the channel to be considered valid.
State Channels
Payment channels can be seen as a specific case of a more general class of channels called state channels. Instead of having two parties exchanging signed messages regarding the balances to be paid out, state channels allow users to exchange messages regarding any state.
As an example, a simple game could be carried over a state channel. The players can sign messages on the state of the game, such as the placement of the pieces on a board. Moving a piece is done by sending signed messages with the move or the new configuration of the board.
State channels typically involve an initial deposit and a payout, much like payment channels. The conditions that rule the payout are defined by a game and can be enforced on-chain by a smart contract.
Coding a Game into a State Channel
Turn-based games are an excellent use case for a state channel, since they have some useful properties. For one, all of the game state can be safely exchanged between messages and processed on-chain if needed. Also, at any point in time, it is well-defined which player should play next, and there can be no disputes regarding who made a move first. Furthermore, the game’s state is all that is needed to resolve a challenge, as it does not depend on any external state.11
Let’s use the tic-tac-toe game as an example. The state can be defined as a 3x3 matrix with three possible values per cell: circle, cross, or empty. The rules of the game are easy to encode, as well as the winning (or draw) conditions.
Alice plays X in the center, so she signs a message with only an X in the center and nonce 1, and sends it to Bob.
Bob plays O in mid-right, signs with nonce 2, and sends it to Alice.
Alice plays X in top-right, signs with nonce 3, and sends it to Bob.
Bob plays O in mid-left, signs with nonce 4, and sends it to Alice.
Alice wins playing X in bottom-left, signs it with nonce 5, and sends it to Bob.
   |    | X3 |
---|---|---|
O4 | X1 | O2 |
X5 |    |    |
After the last step, Bob should sign the message received from Alice and send it back to her, so she can upload it on-chain and claim her prize. However, if Bob were a sore loser, he could refuse to do so. In that case, Alice must be able to upload the last state signed by Bob (4) on-chain, along with her winning move, and have the state channel verify that she has won. Note that this requires the state channel to be able to verify that her move is indeed a valid and winning move.
Bob could also simply stall the game. For instance, when he receives message 3 from Alice, as he notices that he is going to lose the game, he could choose to stop playing. In this scenario, Alice must be able to go on-chain with the last state signed by Bob (2), along with her following move (3), and challenge Bob to move. If he does not respond on-chain within an allotted time, then the state channel should declare Alice winner. Here, we are using challenge periods not just for closures but also for enforcing moves.
X3 | ||
---|---|---|
X1 | O2 | |
O4 |
If the state channel contract were not able to verify that his move is invalid (he changed the location of X3), then Alice would be forced to respond on-chain with a move on top of this invalid board.
Note
An alternative to having the state channel contract validate every transition is to have it accept all transitions by default, but accept proofs that a certain move was invalid. In some cases, verifying a proof that a transition is invalid can be much easier than verifying the transition itself. Chess is a good example: verifying checkmate can be prohibitively expensive in terms of gas usage. A way around this is allowing any player to claim checkmate and have the opponent prove that it is not the case by submitting any valid move.12 This pattern is simply another form of challenge-response and follows the motto “verify, don’t compute” of smart contracts.
State channels are then inherently more complex than regular payment channels, since they require the logic of the game being played to verify state transitions.
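To give an idea of what that game logic involves for tic-tac-toe, here is a minimal sketch of a transition check and a winner check. It is written in JavaScript for readability, whereas a state channel contract would need equivalent logic on-chain. The board encoding (an array of nine cells) and the function names are assumptions, as is the convention that X moves on odd nonces and O on even ones, which matches the exchange above.

```javascript
// Boards are arrays of 9 cells, row by row: "X", "O", or "" for empty.
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

function winner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] !== "" && board[a] === board[b] && board[b] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// X moves on odd nonces and O on even ones.
function playerFor(nonce) {
  return nonce % 2 === 1 ? "X" : "O";
}

// A transition is valid if the game was still open and exactly one empty
// cell was filled with the mark of the player whose turn it is.
function isValidTransition(prev, next, nonce) {
  if (prev.length !== 9 || next.length !== 9) return false;
  if (winner(prev) !== null) return false;
  let placed = 0;
  for (let i = 0; i < 9; i++) {
    if (prev[i] === next[i]) continue;
    if (prev[i] !== "" || next[i] !== playerFor(nonce)) return false;
    placed++;
  }
  return placed === 1;
}

const afterMove4 = ["", "", "X",
                    "O", "X", "O",
                    "", "", ""];
const afterMove5 = ["", "", "X",
                    "O", "X", "O",
                    "X", "", ""];
console.log(isValidTransition(afterMove4, afterMove5, 5)); // true
console.log(winner(afterMove5));                           // X
```

A move that relocates an existing piece changes more than one cell, or changes a cell that was not empty, so it fails the check.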
Note
Most of the optimizations described for payment channels also apply to state channels. For instance, it could be possible to run multiple instances of a game between two participants over a state channel, without having to close and open a new channel every time a rematch is desired. Also, a state channel could be set up so its deposits are ERC20 or even non-fungible ERC721 tokens - imagine representing a trophy as a digital collectible!
Generalized State Channels
As we have seen, a state channel for a given game has two main responsibilities: managing the channel itself and validating the game’s transitions. This makes implementations more convoluted, as the logic for both responsibilities is intertwined. It also makes state channel contracts more expensive, as they need to include the logic on both the channel and the game.
Generalized state channels move all of the on-chain stateful components for blockchain applications off-chain. Rather than require each application developer to build an entire state channel architecture from scratch, a generalized state channel generalized framework is one where state is deposited once and then be used by any application or set of applications afterwards.
—Jeff Coleman, Liam Horne, and Li Xuanji, “Counterfactual: Generalized State Channels”13
This opens the door to a new level of channel reuse. Users can now play multiple instances of a game and even play multiple different games over the same channel, thus creating several subchannels within a single channel. Furthermore, we can build dependencies between these subchannels, such as triggering a payment channel only upon the resolution of a set of game subchannels.
Generalized state channel solutions are under heavy development by different teams, though there is work toward a common standard to provide some degree of interoperability among them. Several of these implementations, such as the one from the Counterfactual team, rely on the concept of counterfactual actions. Here, the term counterfactual refers to an action that any participant in the channel could take on-chain but has not, and that participants nevertheless treat as if it had actually happened.14 Let’s see what this means.
In our tic-tac-toe state channel, we could say that Alice has counterfactually won if there is a state signed by both Alice and Bob with her winning the game. Both players know that any of them can submit the winning state to the smart contract on-chain to trigger the payouts at any time. However, they can also decide to keep playing a second match, knowing that Alice has already won the first and that it can be taken on-chain whenever needed. In other words, players are dealing with counterfactual state.
Applications in a generalized state channel can be counterfactually installed: if all participants play nicely and a dispute never arises, then the application contract never needs to be actually created, and the entire game can be resolved off-chain. This is also known as counterfactual instantiation of a contract: a contract that could be deployed, but is not.
The fact that any player can go on-chain and enforce a certain action is enough to promote good off-chain behavior – as long as it is complemented with a set of penalizations for malicious players who force their opponents to waste gas going on-chain.
All in all, counterfactual generalized state channels provide an interesting framework that minimizes the number of on-chain actions and thus reduces the latency and gas fees that are incurred every time an action must be carried out on the Ethereum network. As an additional benefit, they also provide a layer of privacy over the participants’ actions: if no transactions except for the deposit and payouts are taken on-chain, then only the participants know what messages were exchanged via the state channel.
Channel Networks
Channels are useful solutions when it comes to settling payments or state within a fixed small set of participants (typically two). However, the solution falls short when we want to connect a dynamic set of members: for each user we want to transact with, we would need to go on-chain and open a new channel.
To solve this problem, there are protocols for establishing virtual channels between two peers, which leverage a path of channels that goes through multiple intermediaries. This effectively creates a network built from point-to-point connections, where any participant can connect to another as long as there is a valid path between the two, much like the Internet itself.
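Whether two participants can reach each other then becomes a graph problem. As a minimal sketch, the code below treats each open channel as an undirected edge and runs a breadth-first search for a path. The participant names and the list of channels are made up for illustration, and the sketch ignores what a real network must also track, such as the funds available on each hop.

```javascript
// Open channels as undirected edges between participants (hypothetical names).
const openChannels = [
  ["alice", "ingrid"],
  ["ingrid", "bob"],
  ["bob", "carol"],
  ["dave", "erin"],
];

function buildGraph(channels) {
  const graph = new Map();
  for (const [a, b] of channels) {
    if (!graph.has(a)) graph.set(a, []);
    if (!graph.has(b)) graph.set(b, []);
    graph.get(a).push(b);
    graph.get(b).push(a);
  }
  return graph;
}

// Breadth-first search: returns the shortest list of hops, or null if none.
function findPath(channels, from, to) {
  const graph = buildGraph(channels);
  if (!graph.has(from) || !graph.has(to)) return null;
  const previous = new Map([[from, null]]);
  const queue = [from];
  while (queue.length > 0) {
    const node = queue.shift();
    if (node === to) {
      const path = [];
      for (let n = to; n !== null; n = previous.get(n)) path.unshift(n);
      return path;
    }
    for (const peer of graph.get(node)) {
      if (!previous.has(peer)) {
        previous.set(peer, node);
        queue.push(peer);
      }
    }
  }
  return null;
}

console.log(findPath(openChannels, "alice", "carol")); // [ 'alice', 'ingrid', 'bob', 'carol' ]
console.log(findPath(openChannels, "alice", "erin"));  // null
```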
Caution
This section covers topics currently under heavy research and in ongoing development. Use it as a starting point to run your own up-to-date research if you are considering building on top of a state channel network.
The simplest construction for networks, multi-hop payment channels, comes once again from Bitcoin. This method allows payments to be securely routed through one or more intermediaries. For instance, Alice could send payments to an intermediary Ingrid with whom she has a payment channel set up and have Ingrid relay them to Bob (assuming that Ingrid had a channel with Bob as well). Due to how these channels are set up,15 Ingrid has no way to take these funds for herself.
This type of channel leads directly to a simple and effective network layout, a hub-and-spoke channel network, where multiple clients connect to a single hub that acts as an intermediary for all of them. This way, setting up a single channel with the hub allows a user to transact with anyone else on the network. On the other hand, it has the downside of being centralized and requiring the hub to be available.
There are also several projects working on more interesting network layouts, such as the Raiden Network,16 inspired by Bitcoin’s Lightning Network. The ultimate goal for these networks is to allow for any two participants to establish a shared channel, usually called a virtual channel or metachannel, in a trustless manner. These need not be just payment channels, but can be full generalized state channels, and can potentially run without active participation of the intermediaries on every exchange.
Sidechains
In its most basic version, a sidechain is a parallel Ethereum network that potentially runs a different consensus algorithm, such as proof-of-authority, and is connected to the main network by a bridge. Operating at a smaller scale allows sidechains to achieve much higher throughputs than the main Ethereum network.
Note
While it is technically possible for a sidechain to use proof-of-work, this is highly insecure. Remember that proof-of-work relies on an attacker being unable to produce more computing power than the rest of the network. Since sidechains tend to be small compared to the main network, their difficulty is also comparatively low, which makes it easier for an attacker to mount an attack on them. This is why most sidechains work with a closed set of miners or validators.
Proof of Authority
In proof-of-authority, or PoA for short, there is a predefined set of nodes that act as validators for the blocks being added to the network. Validators are the PoA equivalent of miners in PoW, in that they add new blocks. Every certain number of seconds, each of these validators, taking turns, proposes a new block. These blocks are broadcast to the other validators and need to be approved by a majority of them to be added to the blockchain. The set of validators can be changed over time, with some validators being voted out and new ones allowed into the set.
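The turn-taking and majority rules can be sketched in a few lines. This is a simplified illustration of the scheme just described, not the rule set of any specific consensus algorithm; the validator addresses and function names are hypothetical.

```javascript
// Hypothetical validator set.
const validators = ["0xValidatorA", "0xValidatorB", "0xValidatorC"];

// Validators take turns: block n is proposed by validator n mod N.
function proposerFor(blockNumber, validatorSet) {
  return validatorSet[blockNumber % validatorSet.length];
}

// A block is accepted once a strict majority of the validator set approves it.
// Approvals from addresses outside the set, and duplicates, are ignored.
function isAccepted(approvals, validatorSet) {
  const valid = new Set(approvals.filter((a) => validatorSet.includes(a)));
  return valid.size * 2 > validatorSet.length;
}

console.log(proposerFor(4, validators));                               // 0xValidatorB
console.log(isAccepted(["0xValidatorA", "0xValidatorC"], validators)); // true
console.log(isAccepted(["0xValidatorA", "0xMallory"], validators));    // false
```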
How the blocks are broadcast, approved, and agreed upon depends on the specific consensus algorithm being used. Though there are many different consensus algorithms, such as Clique,17 Aura,18 Raft,19 or Istanbul BFT,20 they all share the same basic scheme outlined previously. Different algorithms may offer different guarantees against malicious actors or nodes dropping from the network, as well as different performance.
Another component of a sidechain is the connection to a main network, often called a bridge. A bridge is a mechanism for users to move their assets between the main chain and the sidechain. As an example, a simple bridge could allow users to move their assets in a specific ERC20 from the main network to a sidechain by having the users lock their funds in a specific mainnet contract. Sidechain validators watch this contract and create the corresponding funds in the sidechain whenever they register a user locking funds on the main chain. We will implement this mechanism later in this chapter.
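Before getting to a real implementation, the lock-and-mint flow can be pictured with a toy in-memory model. This sketch makes several simplifying assumptions: both chains are plain JavaScript objects, a single relay call stands in for the validators' agreement, and the class names are hypothetical.

```javascript
// Main chain side: users lock tokens in the bridge contract, which records events.
class MainnetBridgeModel {
  constructor() {
    this.locked = 0n;
    this.events = [];
  }
  lock(user, amount) {
    this.locked += amount;
    this.events.push({ id: this.events.length, user, amount });
  }
}

// Sidechain side: validators process each lock event exactly once and
// credit the user with the same amount.
class SidechainModel {
  constructor() {
    this.balances = new Map();
    this.processed = new Set();
  }
  relay(event) {
    if (this.processed.has(event.id)) return false; // never mint twice
    this.processed.add(event.id);
    const current = this.balances.get(event.user) || 0n;
    this.balances.set(event.user, current + event.amount);
    return true;
  }
}

const mainnet = new MainnetBridgeModel();
const sidechain = new SidechainModel();
mainnet.lock("alice", 50n);
mainnet.lock("alice", 25n);
for (const event of mainnet.events) sidechain.relay(event);
sidechain.relay(mainnet.events[0]); // a replayed event is ignored

console.log(sidechain.balances.get("alice")); // 75n
console.log(mainnet.locked);                  // 75n
```

The invariant to keep in mind is that the funds created on the sidechain never exceed the funds locked on the main chain.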
Security and Trust
The security of a vanilla PoA sidechain depends entirely on its validators. If a majority of them collude, they can effectively steal all user funds locked in the sidechain. Because of this, it is critical that the set of validators is composed of multiple different parties and is not controlled by a single organization. A user should trust a PoA network only if they trust a majority of the validator nodes.
To disincentivize malicious behavior like this, some networks rely on proof-of-stake instead of proof-of-authority. In this scheme, the validator nodes are required to deposit (stake) a large amount of funds. If it is proven that they acted maliciously, then their stake is slashed as a penalty, though how this slashing is executed is another matter.
To keep the incentives in line, the value a validator could gain by attacking the network must be less than what they would lose in stake. There are currently different approaches to proof-of-stake. However, from a user’s perspective, the experience is very similar to operating in a proof-of-authority network.
Note
Given that malicious validators may steal funds from users, the security guarantees of a sidechain are weaker than those of the main chain. For this reason, sidechains are not considered layer 2 solutions under certain definitions. Nevertheless, there are constructs (such as Plasma, which we will see later) that allow users to safeguard their assets by reporting unlawful validator behavior to a main network contract.
Deploying Our Own Chain
To illustrate how a PoA network works,21 we will manually set one up using the Geth Ethereum node client.22 While Geth can run in the proof-of-work main Ethereum network, it can also be configured to run in PoA networks that use the Clique consensus algorithm (such as the Rinkeby testnet) and act as a validator node.
You should get three different addresses, which we will set up as the validators of this network (make sure to note them down). We will now create a genesis for our network. The genesis is the configuration for the network and includes the set of authorized validators, the consensus engine, the initial balances, the block gas limit, and so on. In Geth, this information is compiled into a JSON configuration file which is used to bootstrap each node.
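To get a feel for what such a genesis file contains, the following Python sketch assembles a Clique-style genesis and prints it as JSON. It is an approximation under stated assumptions: the clique_genesis helper and the three signer addresses are hypothetical placeholders, the gas limit and fork block values are illustrative, and the exact set of fields depends on the Geth version, so the file produced by the wizard remains the reference.

```python
import json


def strip0x(address: str) -> str:
    """Return the lowercase hex digits of an address without the 0x prefix."""
    address = address.lower()
    return address[2:] if address.startswith("0x") else address


def clique_genesis(chain_id, signers, period=5, balances=None):
    """Build a Clique-style genesis dict (hypothetical helper, illustrative values)."""
    # Clique stores the initial signers in extraData:
    # 32 bytes of vanity + concatenated signer addresses + 65 bytes for the seal.
    signer_hex = "".join(sorted(strip0x(s) for s in signers))
    extra_data = "0x" + "00" * 32 + signer_hex + "00" * 65
    return {
        "config": {
            "chainId": chain_id,
            "homesteadBlock": 0,
            "eip150Block": 0,
            "eip155Block": 0,
            "eip158Block": 0,
            "byzantiumBlock": 0,
            "clique": {"period": period, "epoch": 30000},
        },
        "difficulty": "0x1",
        "gasLimit": "0x7a1200",  # 8,000,000 gas, an illustrative choice
        "extraData": extra_data,
        "alloc": {
            strip0x(addr): {"balance": hex(wei)}
            for addr, wei in (balances or {}).items()
        },
    }


# Placeholder addresses; use the three accounts created earlier instead.
SIGNERS = ["0x" + "11" * 20, "0x" + "22" * 20, "0x" + "33" * 20]
genesis = clique_genesis(1212, SIGNERS, balances={SIGNERS[0]: 10**18})
print(json.dumps(genesis, indent=2))
```

The period value is the target number of seconds between blocks, and the signer list embedded in extraData is what makes the three accounts the initial validators.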
Fragment of the Geth puppeth configuration wizard to set up a PoA network with id 1212. You should have a new mysidechain.json genesis file after going through it
Before starting our Geth nodes, we will set up a bootnode. A bootnode is a node in the network whose sole purpose is to aid in the discovery of other nodes. We will use it to simplify the communication between our miner nodes.
Generating a boot key and running a bootnode
Note
We did not need to use the genesis configuration file for the bootnode, since the bootnode does not need any information on whether the network is running proof-of-work or proof-of-authority, or on who the validators are. It only needs to know where the nodes are to share this information with the network.
Starting the Geth validator nodes using the genesis configuration and bootnode ID generated earlier. Run this command three times in different terminals, one for each validator, changing the unlocked address, datadir, port, and rpcport
Our network should now be running, sealing a new block every 5 seconds. Take a look at the logs from the three validators to see how they progress.
Spinning up a new node to join the network, with a console enabled. Note that we are not authenticating this new node in any way, since the network is public for anyone to join
Let’s now connect this network with an existing one, such as Rinkeby.
Building a Bridge
1. The user transfers funds to the bridge contract on mainnet.
2. The bridge contract retains the funds and emits an event.
3. The validators note the event, and each of them calls into the bridge contract on the sidechain side requesting to unlock the same amount of funds.
4. The user gets their funds in the sidechain and uses them to operate there.
5. Once the user wants to exit the sidechain, the same process is repeated by transferring the sidechain funds to the sidechain bridge contract and having the validators unlock them on the mainnet bridge contract.
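The five steps above can be walked through with a small in-memory simulation. This is a Python model for illustration only, not the Solidity contract built in this chapter: the Bridge class, the relay helper, and the account names are hypothetical, and the sketch tallies approvals per (id, recipient, amount) tuple as a simplification of its own.

```python
from collections import defaultdict


class Bridge:
    """One end of the bridge (toy model); one instance lives on each chain."""

    def __init__(self, validators, required, liquidity=0):
        self.validators = set(validators)
        self.required = required            # approvals needed to unlock
        self.balance = liquidity            # funds held by the contract
        self.events = []                    # emitted lock events
        self.approvals = defaultdict(set)   # (id, recipient, amount) -> validators
        self.done = set()                   # lock ids already paid out
        self.paid = defaultdict(int)        # recipient -> amount received

    def lock(self, recipient, amount):
        # Steps 1 and 2: retain the funds and emit an event.
        self.balance += amount
        lock_id = len(self.events)
        self.events.append((lock_id, recipient, amount))
        return lock_id

    def unlock(self, validator, lock_id, recipient, amount):
        # Step 3: each validator requests the unlock on the other chain.
        if validator not in self.validators or lock_id in self.done:
            return False
        key = (lock_id, recipient, amount)
        self.approvals[key].add(validator)
        if len(self.approvals[key]) < self.required:
            return False
        if amount > self.balance:
            raise ValueError("bridge has insufficient liquidity")
        # Step 4: enough approvals, so the recipient gets the funds.
        self.done.add(lock_id)
        self.balance -= amount
        self.paid[recipient] += amount
        return True


def relay(source, target, validators):
    """What each validator's watcher does: mirror lock events as unlock requests."""
    for event in source.events:
        for validator in validators:
            target.unlock(validator, *event)


validators = ["v1", "v2", "v3"]
mainnet = Bridge(validators, required=2)
sidechain = Bridge(validators, required=2, liquidity=100)

mainnet.lock("alice", 10)                # alice enters the sidechain
relay(mainnet, sidechain, validators)
sidechain.lock("alice", 4)               # step 5: alice exits with part of it
relay(sidechain, mainnet, validators)
print(mainnet.balance, sidechain.balance, dict(mainnet.paid), dict(sidechain.paid))
```

Note that the sidechain end starts with some liquidity of its own, since it must be able to pay out funds that were locked on the other chain.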
We will begin by building the bridge contract. This contract will have two main responsibilities: (1) accepting and locking user funds and (2) unlocking them at the request of the validators. Note that we will deploy two instances of the contract, one in each chain. The locking function on one chain will have its counterpart on the unlocking function of the other chain and vice versa.
Note
In this example, we are building a bridge that accepts ETH and dispenses the native currency of the sidechain on the other end. However, we could also build bridges that accept a certain ERC20 token on the main chain or even non-fungible ERC721 assets.
Definition of the bridge contract, which is initialized with the validators’ addresses
Note that we are making the constructor payable. When deploying the contract in the sidechain, we need to seed this contract with the maximum amount of ETH we want to allow our users to transfer from the main network (Rinkeby in this case) to our sidechain, so the contract can unlock those funds when prompted by the validators.
Lock function of the bridge contract
Unlocking function of the token bridge. The first time a validator requests an unlock, we will create a new unlock request and then log an approval every time it is called again. When the required number of approvals is reached, the funds are unlocked
Caution
This implementation allows a malicious validator to prevent a user’s funds from being unlocked. The validator could spam the bridge contract with spurious unlock requests for upcoming request IDs. This way, when the honest validators actually try to honor the unlock request, the parameters (such as amount or recipient) will not match and the operation will fail. We will ignore this attack, since the purpose of this bridge is just to illustrate basic usage. Nevertheless, this serves as a reminder that even the simplest implementations may be hiding security issues, and you should always work with reviewed and audited contracts.
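The attack is easier to see in a model where the first unlock request for an ID fixes its parameters. The following Python sketch mimics that behavior as described in the caution. It is not the Solidity contract itself: the NaiveUnlocks class and all names in it are hypothetical.

```python
class NaiveUnlocks:
    """Toy model: the first request for an id fixes its recipient and amount."""

    def __init__(self, validators, required):
        self.validators = set(validators)
        self.required = required
        self.requests = {}   # request id -> request data
        self.paid = {}       # recipient -> amount received

    def unlock(self, validator, request_id, recipient, amount):
        if validator not in self.validators:
            raise PermissionError("caller is not a validator")
        request = self.requests.get(request_id)
        if request is None:
            # First call for this id creates the request.
            request = {"recipient": recipient, "amount": amount,
                       "approvals": set(), "paid": False}
            self.requests[request_id] = request
        elif (request["recipient"], request["amount"]) != (recipient, amount):
            # Later calls must match the parameters already on record.
            raise ValueError("parameters do not match the existing request")
        request["approvals"].add(validator)
        if not request["paid"] and len(request["approvals"]) >= self.required:
            request["paid"] = True
            self.paid[recipient] = self.paid.get(recipient, 0) + amount
        return request["paid"]


bridge = NaiveUnlocks(["v1", "v2", "mallory"], required=2)

# Mallory squats on an upcoming request id with bogus parameters.
bridge.unlock("mallory", 7, "mallory", 1)

# The honest validators can no longer honor the real request with id 7.
for honest in ("v1", "v2"):
    try:
        bridge.unlock(honest, 7, "alice", 10)
    except ValueError as error:
        print(honest, "blocked:", error)
```

A single dishonest validator cannot steal funds this way, since the bogus request never gathers enough approvals, but it can keep the legitimate unlock from ever going through.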
Deployment script for the bridge contract. Run this twice with different PROVIDER_URLs: one for the Rinkeby Ethereum network and the other for the sidechain
Watcher script to be run on each validator. Note that we create two web3 instances: one connecting to the main network, where we listen for events on the remote end of the bridge, and the other to the local network, where we execute the unlock operations. We then do the converse, allowing funds to go from the sidechain back to the main network
We can now run this script on each of the validator nodes (or at least on two of them). Once it is running, try calling the lock function in the Rinkeby end of the bridge. A few seconds later, you should have your funds ready to use on the chosen address in your sidechain.
In an actual application, you need to decide how much of this complexity you want to expose to your users. As we have seen in Chapter 7, onboarding is already troublesome enough, and adding another step that requires users to send funds from one network to another is not a good idea.
However, you can actually leverage an application-specific sidechain for improving onboarding. You can directly fund your users’ initial accounts on your sidechain or build a viral inviting scheme where existing users can invite new ones directly on the cheaper and faster sidechain. The bridge is then only used for advanced users who want to transfer value from or to the main network, but to an Ethereum neophyte the application simply runs smoothly and fast, without knowing how it is backed.
A good example of this, already mentioned in the previous chapter, is the Burner Wallet.24 This wallet operates on a proof-of-authority sidechain25 with four different teams acting as validators. Users are quickly onboarded by receiving a link with a pre-funded account on the sidechain and can easily transact with others thanks to low gas costs and 5-second blocks. For the most advanced users, there is an option to move the funds onto the main Ethereum network or even seed their burner wallets from their mainnet accounts.
Plasma Chains
The state of the art in layer 2 solutions at the time of this writing is plasma chains,26 originally designed by Vitalik Buterin and Joseph Poon in 2017.
Plasma chains are different from sidechains in that their security can be enforced by the main chain (or parent chain, in plasma terminology) and thus does not depend exclusively on the consensus mechanism of the sidechain. This means that if the set of validators on the child chain (called plasma operators) misbehaves, any user can build a cryptographic proof and take it to a smart contract on the main chain (called root contract). If there is no foul play, transactions on the child chain occur with the reduced gas cost and latency typical of a sidechain.
However, this additional security comes at a cost. Whenever a user wants to exit the child chain (i.e., transfer their assets back to the main chain), they must go through a challenge period, similar to the one we saw in state channels. If a malicious validator creates a fake block in which they steal a user’s assets and uses it to take over those assets in the main chain, the user can submit a fraud proof during this challenge period and regain their assets. As such, exiting a plasma chain is not instant and requires a user to wait a certain period of time.27
Also, since a smart contract needs to be able to verify whether a set of transactions on the child chain was legitimate, the operations allowed on the child chain cannot be overly complex. In particular, no plasma implementations at the time of this writing support arbitrary smart contracts; they only provide the means for exchanging assets between users. Research on this topic goes under the name of generalized plasma.
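The exit rules can be illustrated with a toy root contract. This Python sketch rests on strong simplifying assumptions: the one-week CHALLENGE_PERIOD, the RootContract class, and its method names are all hypothetical, and the verification of a fraud proof, which is the hard part of any plasma design, is reduced to a boolean flag.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD = 7 * 24 * 3600  # hypothetical value: one week, in seconds


@dataclass
class Exit:
    owner: str
    amount: int
    started_at: int
    challenged: bool = False
    finalized: bool = False


class RootContract:
    """Toy model of the exit logic on the parent chain."""

    def __init__(self):
        self.exits = []

    def start_exit(self, owner, amount, now):
        self.exits.append(Exit(owner, amount, now))
        return len(self.exits) - 1

    def challenge(self, exit_id, fraud_proof_valid, now):
        pending = self.exits[exit_id]
        if pending.finalized or now >= pending.started_at + CHALLENGE_PERIOD:
            return False  # too late to challenge
        if fraud_proof_valid:  # real proof verification is abstracted away
            pending.challenged = True
        return pending.challenged

    def finalize(self, exit_id, now):
        pending = self.exits[exit_id]
        waiting = now < pending.started_at + CHALLENGE_PERIOD
        if pending.challenged or pending.finalized or waiting:
            return 0  # nothing is paid out
        pending.finalized = True
        return pending.amount


root = RootContract()
honest = root.start_exit("alice", 10, now=0)
fraudulent = root.start_exit("mallory", 10, now=0)

root.challenge(fraudulent, fraud_proof_valid=True, now=3600)
print(root.finalize(honest, now=3600))                  # still in challenge period
print(root.finalize(honest, now=CHALLENGE_PERIOD))      # honest exit pays out
print(root.finalize(fraudulent, now=CHALLENGE_PERIOD))  # challenged exit does not
```

The waiting time is the price of this security model: funds leave the child chain only after everyone has had a chance to prove the exit fraudulent.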
On the flip side, plasma chains are designed for having a tree-like structure. The parent chain of a plasma chain can be another plasma chain, allowing for massive scalability by simply composing plasma chains within others. This means that if an application-specific plasma chain becomes overcrowded, it can simply spawn new children and move clusters of users to them.
It is worth mentioning that plasma itself is not a specification but a framework for building scalable layer 2 infrastructure. This has led to the development of many different flavors of plasma by different teams, such as minimal viable plasma, plasma cash, plasma debit, or plasma prime.28 It is most likely that by the time this book reaches your hands, there will be new major developments on this front.
Summary
Public blockchains like Bitcoin or Ethereum have traditionally sacrificed performance for trustlessness and security. While there are multiple efforts toward building Ethereum 2.0, which includes sharding mechanics that help the network scale, it is interesting to see many solutions sprouting that build on top of the existing infrastructure to solve the scalability problem.
Some of the solutions presented in this chapter, such as early versions of plasma or state channels, are ready to use and in production today, helping real-life applications scale beyond the limits of the main Ethereum network.
These solutions not only allow your application to achieve a higher transaction throughput, but they can also be used to provide a better user experience overall. As long as both parties behave appropriately, channels provide instant finality to peer-to-peer transactions, with no need to wait for a dozen confirmations. And sidechains can provide reliable block times much lower than those of the main network, with considerably lower gas fees.
These techniques can even be combined: you can set up state channels between parties in a sidechain or even use channels as the actual asset being traded in a plasma chain.29 The sky is the limit here.
How you leverage these solutions and present them to (or hide them from) your users will depend on what you are building. Remember what your users need from your application, and use the building blocks available to you to create the best possible experience.
Happy coding!