After a not-so-brief interlude on writing smart contracts, we will review the different ways to connect to the Ethereum network to retrieve data. We will cover different connection methods, as well as patterns for listening to changes, and put it all together in a sample application for monitoring transfers of an ERC20 token.
Connecting to the Network
The first step in retrieving data from the network is to actually connect to an Ethereum node. Since web applications do not connect directly to the network, they depend on a node to answer any queries on the blockchain state. We will start by reviewing node types, connection methods, and the provider object.
About Full and Light Nodes
A typical Ethereum node is a Geth or Parity instance that has its own copy (partial or full) of the blockchain, can answer queries from clients (such as a DApp), and relays transactions (more on this in the next chapter). A node with a full copy of the blockchain is called a full node. These nodes either have or can recompute any data from the blockchain history. Most clients run in this mode by default.
Full nodes may also store all historical data. These nodes are called archive nodes, and they are much less common, due to the large amount of disk space needed to support them: nearly 2TB at the time of this writing. They are required if you want to query particular information from older blocks, such as the state of a contract or the balance of an account from a year ago.
As an alternative to full nodes, some nodes may run in light client mode. These nodes keep only the block headers, and request information from the network as needed. They are much lighter to run than full nodes, which makes them suitable for mobile devices, but they make a poor choice for the back end of a DApp, since queries take longer to resolve.
Infura and Public Nodes
The next question about nodes is which ones are available for our applications. In an ideal decentralized scenario, every user should be running their own full Ethereum node, in order to validate all transactions themselves, and avoid trusting a third party. Users on mobile or IoT devices may choose to run light nodes instead, which would trust other nodes to relay the information but nevertheless verify it.
In the current landscape, only a small fraction of our users will actually be running an Ethereum node. Most of them will be just learning what Ethereum is about, and wondering how to buy their first ETH to pay for the gas to fuel their initial transactions. Having them run their own nodes is still out of the question.
As such, and in order to make the Ethereum adoption process easier, there are a number of public nodes available. An Ethereum node is said to be a public node when it holds no private keys, is available to the public, and is used to answer blockchain queries and relay pre-signed transactions.
In particular, Infura (Japanese for “infrastructure”) is a service that provides HTTP and websocket endpoints to public full nodes for the Ethereum Mainnet, as well as for the Kovan, Ropsten, and Rinkeby testnets. Due to its reliability, and to the fact that it is free to use, it is widely used by many decentralized apps and wallets.
The JSON-RPC Interface
All Ethereum nodes, regardless of the particular implementation, expose a set of well-known methods, which compose the JSON-RPC interface. As the name implies, this is a JSON-based API for executing remote procedure calls, and constitutes the low-level interface for a client to interact with a node. Common methods include call, sendTransaction, getBlockByNumber, accounts, or getBalance. There are even methods for querying the state of the node itself, such as whether it is syncing or how many peers it is connected to.
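To make the wire format concrete, the following is a minimal sketch of how a JSON-RPC request body is assembled. On the wire, each method carries a namespace prefix, so getBalance is sent as eth_getBalance and the syncing query as eth_syncing. The helper name buildRpcRequest and the placeholder address are illustrative assumptions, not part of any library.

```javascript
// Hypothetical helper: builds the body of a JSON-RPC 2.0 request.
// Each request carries an id so the response can be matched to it.
let nextId = 1;
function buildRpcRequest(method, params = []) {
  return { jsonrpc: '2.0', id: nextId++, method, params };
}

// Balance of an address at the latest block (the address is a placeholder).
const balanceRequest = buildRpcRequest('eth_getBalance', [
  '0x0000000000000000000000000000000000000000',
  'latest',
]);

// Queries about the node itself take no parameters.
const syncingRequest = buildRpcRequest('eth_syncing');

console.log(JSON.stringify(balanceRequest));
console.log(JSON.stringify(syncingRequest));
```

A body like this would typically be sent as the payload of an HTTP POST to the node, or as a message over a websocket or IPC connection.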
Note
Given that it is a low-level interface, you will rarely find yourself building JSON-RPC calls manually. Most libraries (such as web3.js or ethers.js) will take care of generating the calls on your behalf and provide you with the responses. Nevertheless, it is always useful to understand what is going on under the hood in case you stumble upon a dreaded leaky abstraction.
It is worth mentioning that certain nodes may not implement all methods. For instance, the Infura HTTP endpoint does not offer costly operations such as newFilter (more on filters later in this chapter). This will be important to keep in mind when we discuss how to connect our app to the Ethereum network.
Connection Protocols
Three different protocols can be used as a transport for exchanging JSON-RPC messages: IPC, HTTP, and websockets. Nodes can be configured to handle any of them.
Alternative APIs
As an alternative to establishing a connection to the JSON-RPC interface of a node, you may opt to query blockchain data from a different source.
Example of executing a getTransactionCount call to the Etherscan API (preceding) vs. the standard JSON-RPC call (following). Both return the same JSON object as a response
Certain JavaScript libraries, such as ethers.js, even include provider objects that abstract a connection to the Etherscan API, so it can be used seamlessly as any other standard JSON-RPC connection. Let’s now go into the role of the provider.
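As a rough illustration of the two request shapes, the sketch below builds the Etherscan query URL first and the equivalent JSON-RPC body second. The Etherscan endpoint layout (the proxy module, with action, address, tag, and apikey parameters) is assumed from its public API documentation at the time of writing and may have changed since; the function names and the API key are placeholders.

```javascript
// Hypothetical helper: URL for a getTransactionCount query via Etherscan.
// The endpoint and parameter names are assumptions based on Etherscan's docs.
function etherscanTransactionCountUrl(address, apiKey) {
  const query = new URLSearchParams({
    module: 'proxy',
    action: 'eth_getTransactionCount',
    address,
    tag: 'latest',
    apikey: apiKey,
  });
  return `https://api.etherscan.io/api?${query.toString()}`;
}

// Hypothetical helper: the same query as a standard JSON-RPC request body.
function rpcTransactionCountBody(address) {
  return {
    jsonrpc: '2.0',
    id: 1,
    method: 'eth_getTransactionCount',
    params: [address, 'latest'],
  };
}

console.log(etherscanTransactionCountUrl('0xabc', 'YOUR_API_KEY'));
console.log(JSON.stringify(rpcTransactionCountBody('0xabc')));
```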
Note
We are not delving into domain-specific APIs at this point. A project may decide to offer an API to query relevant data from its domain. You may also choose to set up a centralized server that aggregates blockchain data from your protocol, and relays it to client-side apps.
The Provider Object
As we briefly saw in Chapter 2 while building our first sample DApp, the connection to a node is managed by a provider JavaScript object. It is the provider’s responsibility to abstract the connection protocol being used and offer a minimal interface for sending JSON-RPC messages and subscribing to notifications.
Note
At the time of this writing, providers from different libraries have slightly different APIs. There is an effort to standardize the minimal provider as EIP 1193, but it is still a draft.
Example web3.js code for creating a provider and initializing a web3 instance
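A minimal sketch of this setup, assuming the web3.js 1.x API, where provider classes are exposed under Web3.providers. The helper name createWeb3 and the local node URL are illustrative; the Web3 constructor is passed in as an argument so the snippet does not depend on how the library is loaded.

```javascript
// Hypothetical helper: picks a web3.js provider class from the URL scheme
// and wraps it in a web3 instance. `Web3` is the library's constructor.
function createWeb3(Web3, url) {
  const Provider = /^wss?:\/\//.test(url)
    ? Web3.providers.WebsocketProvider // two-way connection
    : Web3.providers.HttpProvider;     // plain request/response
  return new Web3(new Provider(url));
}

// Usage, assuming the web3 package is installed and a node listens locally:
//   const Web3 = require('web3');
//   const web3 = createWeb3(Web3, 'http://localhost:8545');
```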
You will only need to create a provider instance if you have to manually set up a connection to a node. In most scenarios, you will actually delegate this responsibility to the user’s web3-enabled browser.
Metamask and Web3-enabled Browsers
After Chapter 2, you should now be familiar with Metamask, the browser extension that acts as a bridge between a web application and the Ethereum network. There are other options as well, such as the Cipher or the Opera browsers for Android, though we will focus on Metamask throughout the book, as it is the most widespread tool at the moment.
Web3-enabled browsers work by injecting a provider instance in the global scope. How this provider works or how it is backed should not be of importance for your DApp. The DApp should be able to query whichever information it needs and let the provider resolve it.
Snippet for instantiating a web3 object using a provider injected by Metamask
Subproviders
Certain web3 providers may also be composed of subproviders. A subprovider is a non-standard object that intercepts calls made via the provider. Among other uses, subproviders help provide a common interface by filling in any gaps in the feature set of the Ethereum node being used. In this sense, subproviders act as polyfills hidden within the provider.
As an example, a provider that connects to a node that does not offer the filters API (used for polling for specific changes) may include a filter subprovider that emulates that feature client-side. Such is the case with the web3 provider injected by Metamask: since Infura does not offer the filters API, Metamask adds that feature at the provider level via a custom subprovider. This way, you as a developer do not need to worry about which APIs are supported, and are given a consistent interface regardless of the node answering your queries.
We will revisit subproviders in the next chapter, where we discuss providers and signers, since Metamask implements its signer as another subprovider.
Choosing the Right Connection
Up to this point, we have reviewed different kinds of nodes (full and light, public and private), as well as different connection protocols (IPC, HTTP, and websockets). We have also learned how to set up a provider object and how to enable the one injected by a web3-enabled browser. Given all these options, this raises the question of which connection we should choose for querying information from a DApp.
Respecting the Choice of the User
First and foremost, if our user is using a web3-enabled browser, our DApp should rely on the provider injected by it. A web3-enabled browser means the user is already part of the Ethereum ecosystem, and could potentially be running a node of their own. As such, we need to provide them with the means to choose which node they want to use when browsing our DApp.
While we could reimplement Metamask’s interface for choosing a network connection, it makes little sense to do so. A user who wishes to connect to an alternative node will already be running Metamask or another web3-enabled browser, and have already preconfigured their own nodes. Therefore, an injected web3 provider should always be our first choice for connecting to the network.
Keep in mind that providers need to be enabled in order to access the list of accounts of the user. Nevertheless, if the application does not need this information, this step can be skipped.
Using a Public Node
The next option is simple: connect to a public node. You can either set up your own for your DApp or use one from Infura. Going with your own node has all the benefits and drawbacks of rolling out your own infrastructure: you do not depend on a third party, but you need to watch out for the health of your nodes. Remember that nothing prevents an arbitrary number of users from connecting to your node, so you should be prepared for surges in traffic. Because of this, it may be easier to just rely on an external infrastructure provider.
As an alternative to Infura, you can also rely on a public API such as that of Etherscan. Ethers.js, an alternative to web3.js, connects by default to Infura, and falls back to Etherscan if the connection fails.
Note that in all cases where your DApp relies on a third party, it is relying on a foreign centralized service for fetching data from the blockchain. Since one of the strong points of DApps is precisely decentralization, adding a component that needs to be trusted may be a step backward in this direction. It is up to you to decide on the trade-off between convenience and decentralization for the users of your DApp. As such, a good rule of thumb is to use an injected provider if found, and fall back to a centralized service otherwise.
Putting it all Together
Code snippet for initializing a web3 connection for a DApp, based on the code provided by metamask.io
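The rule of thumb above (use an injected provider if found, fall back to a public node otherwise) can be sketched roughly as follows. This is not the original snippet from metamask.io; the function name resolveProvider, the scope argument (standing in for the browser's window), and the fallback URL are assumptions made for illustration.

```javascript
// Sketch: prefer the provider injected by a web3-enabled browser and fall
// back to a public node otherwise. `scope` stands for the browser `window`.
function resolveProvider(scope, Web3, fallbackUrl) {
  if (scope.ethereum) {
    return scope.ethereum; // provider injected by modern browsers
  }
  if (scope.web3 && scope.web3.currentProvider) {
    return scope.web3.currentProvider; // provider from a legacy injected web3
  }
  return new Web3.providers.WebsocketProvider(fallbackUrl); // public node
}

// Usage in a browser, with a placeholder Infura project ID:
//   const provider = resolveProvider(
//     window, Web3, 'wss://mainnet.infura.io/ws/v3/PROJECTID');
//   const web3 = new Web3(provider);
```

If the application needs the user's accounts, the injected provider still has to be enabled (for example via ethereum.enable()) before those accounts become accessible.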
Retrieving Data
Now that we know how to connect to the network, we can start actually retrieving data. We will review how to access network information, account balances, perform static calls, and subscribe to events. As before, we will be using web3.js as a library to interact with the Ethereum network, but other libraries should provide similar features.
Network Information
Networks are identified by a numeric identifier. Mainnet is 1, Ropsten is 3, Rinkeby is 4, and Kovan is 42. Ephemeral development networks are typically set up with higher IDs.
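As a small sketch, the identifiers listed above can be mapped to readable names once retrieved from the node. The example assumes the web3.js 1.x method web3.eth.net.getId(); the helper names networkName and getNetworkName are illustrative.

```javascript
// Network identifiers mentioned in the text, mapped to readable names.
const NETWORK_NAMES = { 1: 'mainnet', 3: 'ropsten', 4: 'rinkeby', 42: 'kovan' };

// Hypothetical helper: turns a numeric network ID into a label.
function networkName(id) {
  return NETWORK_NAMES[id] || `unknown (${id})`;
}

// Hypothetical helper: asks the node for its network ID and labels it.
// `web3` is assumed to be an initialized web3.js 1.x instance.
async function getNetworkName(web3) {
  const id = await web3.eth.net.getId();
  return networkName(id);
}
```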
Note
Like most requests to an external data source in JavaScript, calls to the Ethereum network are asynchronous operations. Different libraries may have different ways to handle this, either by using callbacks or returning promises. In particular, web3.js supports traditional error-first callbacks as well as promi-events. Promi-events are promise objects that double as event emitters, allowing you to listen to different stages of the asynchronous operation. They will become more relevant in the next chapter. For now, we will simply use the async-await syntax for working with promises.
There is more information you can query from a node. Make sure to check out the web3.js reference for additional methods.
Account Balances and Code
Querying the balance of an address from ten blocks ago
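A query of this kind might look roughly like the sketch below, assuming the web3.js 1.x methods getBlockNumber and getBalance, the latter accepting a block number as an optional second argument. The helper name getPastBalance is illustrative.

```javascript
// Hypothetical helper: balance of `address` a number of blocks in the past.
// `web3` is assumed to be an initialized web3.js 1.x instance.
async function getPastBalance(web3, address, blocksAgo = 10) {
  const latest = await web3.eth.getBlockNumber();
  // The balance is expected back as a decimal string, denominated in wei.
  return web3.eth.getBalance(address, latest - blocksAgo);
}
```

Going further back than the node's retained state would require an archive node, as discussed earlier in the chapter.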
Using web3.utils.fromWei for converting from Wei to ETH. The reverse method is toWei
This decision is specific to the web3.js library. Other libraries rely on JavaScript bignumber implementations, such as bignumber.js or bn.js. It is most likely that once support for native bignumbers is stabilized in the language, libraries will switch to it. Either way, what is important is that you keep in mind that most numbers in Ethereum cannot be handled using regular JavaScript numbers, or you risk losing precision.
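The precision problem is easy to reproduce. The sketch below uses the native BigInt type, which has since become part of standard JavaScript; the amount is an arbitrary example of one ether plus one wei.

```javascript
// 1 ETH plus 1 wei, as the decimal string a library would hand back.
const wei = '1000000000000000001';

// A regular number cannot represent this value, so the last digit is lost.
const asNumber = Number(wei);
console.log(String(asNumber)); // "1000000000000000000"

// A BigInt keeps every digit, and arithmetic on it stays exact.
const asBigInt = BigInt(wei);
console.log((asBigInt + 1n).toString()); // "1000000000000000002"
```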
Keep in mind that this method for checking whether an account is a contract or not is far from robust. If you get no code from an address, it does not necessarily mean it is externally owned: a contract may be deployed to that address later, or a contract may have been deployed there but was eventually self-destructed. All in all, you should avoid relying on whether an arbitrary address is externally owned or not for particularly sensitive operations.
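For reference, a check of this kind might be sketched as follows, assuming the web3.js 1.x method web3.eth.getCode, which returns the code stored at an address as a hex string. The helper name hasCode is illustrative, and the caveats above apply in full: a false result only says there is no code at the address right now.

```javascript
// Hypothetical helper: true if there is contract code at `address` now.
// `web3` is assumed to be an initialized web3.js 1.x instance.
async function hasCode(web3, address) {
  const code = await web3.eth.getCode(address);
  // Nodes report an address without code as an empty hex string.
  return code !== '0x' && code !== '0x0';
}
```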
Calling into a Contract
As we saw in Chapter 2, you can call into a contract to query information from it by issuing a JSON-RPC call to its address. Most contracts expose getter functions that return information on their current state or perform pure calculations; these functions can be identified as they are tagged with the view or pure modifiers in Solidity. Like all the functions listed in this chapter, calling into them does not cost any gas, since the call can be answered by any node in the network, and does not need to introduce a change on the blockchain.
Accessing the same token’s total supply via the web3 Contract object. Note how the output is formatted based on its type instead of returned as a raw hexadecimal value
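A minimal sketch of such a call, assuming the web3.js 1.x Contract API (new web3.eth.Contract(abi, address) and methods.name().call()). Only the ABI entry for totalSupply is included, and the helper name getTotalSupply is illustrative.

```javascript
// ABI fragment describing only the totalSupply getter of an ERC20 token.
const TOTAL_SUPPLY_ABI = [{
  constant: true,
  inputs: [],
  name: 'totalSupply',
  outputs: [{ name: '', type: 'uint256' }],
  payable: false,
  stateMutability: 'view',
  type: 'function',
}];

// Hypothetical helper: reads the total supply of the token at `tokenAddress`.
// `web3` is assumed to be an initialized web3.js 1.x instance.
async function getTotalSupply(web3, tokenAddress) {
  const token = new web3.eth.Contract(TOTAL_SUPPLY_ABI, tokenAddress);
  // call() runs the function on the node without sending a transaction.
  return token.methods.totalSupply().call();
}
```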
Note
Like getBalance, all calls to a contract can also include an optional block parameter, in case you want to query a contract’s state at a previous point in time. Remember that requesting data from a block far back in the chain requires a connection to an archive node, which is not always available. Also keep in mind that, depending on your use case, it may be prudent to only display information from a dozen blocks ago, to shield yourself against possible chain reorgs. Data this recent is usually available, regardless of whether the node keeps an archive.
Obtaining the transfer events on the BAT token on mainnet that occurred in the past 100 blocks. In this example, the address starting with 0xAAAAA6 transferred 1.9e21 tokens to address 0x664753
Each log object informs of the block and the transaction where it occurred, as well as the name of the event (in this case, Transfer), and includes the parameters with which it was emitted.
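A query like this could be sketched as follows, assuming the web3.js 1.x method getPastEvents on a Contract instance. The helper name getRecentTransfers is illustrative, and the parameter names from, to, and value are an assumption: they depend on how the Transfer event is declared in the ABI being used.

```javascript
// Hypothetical helper: Transfer events emitted in the last `blocks` blocks.
// `token` is assumed to be a web3.js 1.x Contract instance for an ERC20.
async function getRecentTransfers(web3, token, blocks = 100) {
  const latest = await web3.eth.getBlockNumber();
  const events = await token.getPastEvents('Transfer', {
    fromBlock: latest - blocks,
    toBlock: latest,
  });
  // Keep only the fields discussed above; parameter names come from the ABI.
  return events.map((event) => ({
    blockNumber: event.blockNumber,
    transactionHash: event.transactionHash,
    from: event.returnValues.from,
    to: event.returnValues.to,
    value: event.returnValues.value,
  }));
}
```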
Detecting Changes
We will now go deeper into events. Even though we now know how to query past events, listening to new events is a useful method to detect changes to a contract in our application in real time. We will see three different ways for monitoring changes.
Polling for New Blocks
Polling for new blocks to update the totalSupply of an ERC20 contract. Though we could directly poll for the total supply, this approach is more efficient if there is more data that we need to update on every block
Whenever a new block is spotted, you can query the contract your app is interacting with to retrieve its latest state and update your app accordingly if there were any changes. An alternative would be to run getPastEvents on the new block and only react if there were any events that affect your contract.
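The polling pattern described above can be sketched as a small function that remembers the last block seen and only invokes a callback when the block number changes. The names createBlockPoller and onNewBlock, as well as the polling interval in the usage comment, are illustrative choices rather than part of web3.js.

```javascript
// Hypothetical helper: returns a function that checks for a new block and
// runs `onNewBlock` only when the block number has changed since last time.
function createBlockPoller(web3, onNewBlock) {
  let lastSeen = null;
  return async function poll() {
    const latest = await web3.eth.getBlockNumber();
    if (latest === lastSeen) return false; // nothing new, skip the refresh
    lastSeen = latest;
    await onNewBlock(latest);
    return true;
  };
}

// Usage, re-reading the total supply of an ERC20 on every new block:
//   const poll = createBlockPoller(web3, async () => {
//     const supply = await erc20.methods.totalSupply().call();
//     console.log(supply);
//   });
//   setInterval(poll, 1000);
```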
Installing Event Filters
newFilter to install a new event filter on a node, which returns a filter ID
getFilterChanges that returns all new logs for a given filter ID since the last time this method was called
uninstallFilter to remove a filter given its ID
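The life cycle of a filter can be illustrated with the raw JSON-RPC request bodies for these three methods, which carry the eth_ prefix on the wire. The sketch below filters for ERC20 Transfer events by their event selector topic; the helper names and the fixed request id are illustrative.

```javascript
// Event selector of Transfer(address,address,uint256), used as first topic.
const TRANSFER_TOPIC =
  '0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef';

// Hypothetical helper: wraps a method and its params in a JSON-RPC body.
function rpc(method, params) {
  return { jsonrpc: '2.0', id: 1, method, params };
}

// 1. Install a filter for Transfer events of one token; returns a filter ID.
function newTransferFilter(tokenAddress) {
  return rpc('eth_newFilter', [{ address: tokenAddress, topics: [TRANSFER_TOPIC] }]);
}

// 2. Ask for the logs that matched since the previous call.
function getFilterChanges(filterId) {
  return rpc('eth_getFilterChanges', [filterId]);
}

// 3. Remove the filter once it is no longer needed.
function uninstallFilter(filterId) {
  return rpc('eth_uninstallFilter', [filterId]);
}
```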
Event filters still rely on polling a node for new changes, but they are more convenient to use, since it is now the node that keeps track of exactly what new events need to be sent to the client. This saves the client from needing to issue regular getPastLogs calls to check for new events and allows the node to precalculate the data to send if needed. It is also possible to install filters for new blocks and pending transactions that are sent to the node.
Warning
Some public nodes, such as the ones offered by Infura, may not support installing event filters. To work around this, Metamask ships with a web3 subprovider to fake the behavior of filters completely on the client side. This allows you to code your application using event filters without needing to worry about whether the node you are connecting to actually supports them. However, keep in mind that the performance gain you could get by using filters is completely lost in this scenario.
Block ranges can be used to specify which blocks to monitor for events. By default, filters are created to monitor the latest block mined.
One or more addresses where the logs originate from. Retrieving events from a web3 Contract object will automatically restrict the logs to the address of the contract instance.
The topics used to filter the events. Remember from Chapter 3 that EVM logs can have up to four indexed topics – these are used for filtering them during queries. The first topic is always the event selector, while the remaining topics are the indexed arguments from Solidity. A filter can impose restrictions on any of the topics, requesting a topic to optionally match a set of values.
The web3 library has no support for event filters. Instead, monitoring for events is done via the third and last mechanism for listening to changes: subscriptions.
Creating Subscriptions
A more advanced option to monitor events is to create a subscription. Event subscriptions work similarly to event filters in that they are created in a node from a set of filters (block range, addresses, and topics) to indicate which events are of interest to the client. However, subscriptions do not require the client to poll for changes, but rely on two-way connections to directly push new events to the client. For this reason, subscriptions are only available on websockets or IPC connections and not on HTTP ones.
Note
Unlike event filters, websocket connections are supported by Infura, via the URL wss://mainnet.infura.io/ws/v3/PROJECTID. Still, in the event that the user chooses a custom node via a regular HTTP connection, Metamask also ships with a subprovider to fake subscriptions client-side by relying on polling. Again, this allows you to transparently use event subscriptions in your app, having a subprovider polyfill the feature if the connection or node does not support it.
Setting up a subscription to monitor Transfer events on an ERC20 token contract. The `data` handler fires on every new event, while `error` fires upon an error in the subscription. Events removed from the chain due to a reorganization are fired in `changed`.
Subscriptions are automatically cleared when the connection to the server is closed. Alternatively, they can be removed via the unsubscribe() method on the subscription object, or by using web3.eth.clearSubscriptions(), which removes all active subscriptions.
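A subscription along these lines might be set up as in the sketch below, assuming the web3.js 1.x events API on a Contract instance, whose return value is an event emitter. The helper name watchTransfers and the handler names are illustrative.

```javascript
// Hypothetical helper: subscribes to Transfer events from `fromBlock` on.
// `token` is assumed to be a web3.js 1.x Contract instance for an ERC20.
function watchTransfers(token, fromBlock, handlers) {
  return token.events.Transfer({ fromBlock })
    .on('data', handlers.onTransfer)   // a new Transfer event arrived
    .on('changed', handlers.onRemoved) // an event was dropped by a reorg
    .on('error', handlers.onError);    // the subscription failed
}

// Usage, keeping the returned object around to stop listening later:
//   const subscription = watchTransfers(erc20, blockNumber + 1, {
//     onTransfer: (event) => console.log(event.returnValues),
//     onRemoved: (event) => console.log('removed', event.transactionHash),
//     onError: (error) => console.error(error),
//   });
//   subscription.unsubscribe();
```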
As with event filters, it is possible to set up subscriptions for events from multiple addresses, as well as for new pending transactions or new blocks. Using the latter, a similar pattern to polling can be implemented, in which a subscription is installed to monitor for new blocks, and upon every block the state of the contract is re-read. Nevertheless, if the contract emits events for all state changes, monitoring them is much more efficient.
Example Application
We will now put together everything we learned in this chapter and build a web application for monitoring transfers on an ERC20 token. This application will just retrieve data from the token and not provide any interface for actually sending transactions.
Setup
Dependencies
As before, try running npm start to make sure that the sample react-app runs successfully. We can now start coding.
Initializing Web3
We will create a network.js file as before to manage a web3 object to connect to the network. We will rely on the injected web3 provider, falling back to a websocket connection to Infura. Note that since we will not request access to user accounts, we can skip the ethereum.enable() call.
Note
Make sure to use a valid URL for the fallback provider, by filling in your Infura token, in case the user browsing the site does not have Metamask or a web3-compatible browser. Note that it is websocket-based, since we will be using event subscriptions later in the app.
The ERC20 Contract
We retrieve the contract ABI from OpenZeppelin, while we’ll leave the address as a parameter for now. Now that we have all basic components set up, we can get started with the application views.
Building the Application
We will start with a main App component that will initialize the connection and set up the ERC20 contract instance that we will be monitoring. Once we have the contract instance ready, we will begin by retrieving some information from it.
Root Component
The App component will be the root of our component tree (Listing 4-11). This component will manage the connection to the network and the ERC20 contract instance, both of which will be set up on the componentDidMount lifecycle method.
Initial version of the root App component for our application. Note that the highlighted line in componentDidMount that is setting up the ERC20 contract is using the factory method we built earlier
Since the address in the preceding example refers to a contract on mainnet, the app will only work if we have a connection to the Ethereum main network. If we are connecting to another network, the contract will probably not exist at the address listed, causing the app to fail.
Note
If you are curious about the ERC20 address chosen, it is the Augur REP token. Augur is a decentralized oracle and prediction market, and its main token is used for reporting and disputing the outcome of events. If you want to experiment with other ERC20s, Etherscan provides a handy list of top tokens at etherscan.io/tokens.
Checking that we are currently connected to mainnet. Note that the contract is only instantiated if we are on the correct network
Updated componentDidMount method to add error handling. Note that the render method also needs to be updated accordingly to display the error message if it is present in the component state
Updated render helper method to display an ERC20 component, which expects the erc20 contract instance as a property
ERC20 Component
This component shall receive the contract instance, knowing that all connection details have been settled by the parent component, and display it. We will start by retrieving some static information, such as the name, symbol, and number of decimals (Listing 4-15). These are optional attributes according to the ERC20 standard, since they are never used as part of the contract’s logic; nevertheless, most tokens do implement them.
We will also retrieve the token’s total supply. This is the total number of tokens created for this contract. Unlike the name, symbol, and decimals, there is no guarantee that this value will stay constant: some tokens have a continuous issuance model, which causes the total supply to increase on every block, while others may have a deflationary model where certain events actually burn tokens.
React component for displaying information on the ERC20 token. Once again, we rely on the componentDidMount method to load async information to populate the state. By using Promise.all(), we fire all four requests simultaneously and only set the return values once we have obtained all of them
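Leaving the React wiring aside, the data-loading step described in this caption can be sketched as a standalone function. It assumes a web3.js 1.x Contract instance whose ABI includes the name, symbol, decimals, and totalSupply getters; the helper name loadTokenInfo is illustrative.

```javascript
// Hypothetical helper: loads the four token attributes in parallel.
// `erc20` is assumed to be a web3.js 1.x Contract instance.
async function loadTokenInfo(erc20) {
  // All requests are fired at once; the await resolves when all are done.
  const [name, symbol, decimals, totalSupply] = await Promise.all([
    erc20.methods.name().call(),
    erc20.methods.symbol().call(),
    erc20.methods.decimals().call(),
    erc20.methods.totalSupply().call(),
  ]);
  return { name, symbol, decimals, totalSupply };
}

// Inside a component, the result could be stored with something like:
//   this.setState(await loadTokenInfo(this.props.contract));
```

Keep in mind that Promise.all rejects as soon as any of the calls fails, so a token that does not implement one of the optional getters would need separate handling.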
Render method to display the static information of an ERC20 token. Note that the totalSupply is adjusted by the decimals of the token
Auxiliary function to format token amounts based on the decimals property of the contract
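One possible shape for such a function is sketched below. It is not the original listing: it uses the native BigInt type and plain string manipulation to avoid any loss of precision, and the name formatTokenAmount is illustrative. Amounts are assumed to be non-negative, as token balances and supplies are.

```javascript
// Hypothetical helper: formats a raw token amount using the token decimals.
// `amount` is a decimal string or BigInt; `decimals` a number or string.
function formatTokenAmount(amount, decimals) {
  const places = Number(decimals);
  // Left-pad so there is always at least one digit before the decimal point.
  const digits = BigInt(amount).toString().padStart(places + 1, '0');
  const cut = digits.length - places;
  const whole = digits.slice(0, cut);
  // Drop trailing zeros from the fractional part.
  const fraction = digits.slice(cut).replace(/0+$/, '');
  return fraction ? `${whole}.${fraction}` : whole;
}

console.log(formatTokenAmount('1500000000000000000', 18)); // "1.5"
console.log(formatTokenAmount('1', 18)); // "0.000000000000000001"
```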
Displaying Transfer Events
We will now add the last component of our application that will display the latest transfers of the token and listen for any new ones in real time.
Loading Past Transfers
Updated render method from the ERC20 component to include the new Transfers component
Note
For the sake of this application, we are just loading the events from an arbitrary number of blocks ago, in order to seed the component with initial data as it loads. Depending on your use case, you may want to add an option to load more events (for instance, when the user scrolls to the end of the list) by firing subsequent getPastEvents calls.
Displaying each transfer in the collection by using a pure component that simply displays the data received
Simple function for generating a unique identifier for a log. Note that web3.js already assigns an ID to a log entry, calculated as a hash over the same parameters. However, the hash is then truncated to 4 bytes, which may yield collisions if we are dealing with a large number of events
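A function of this kind could be as simple as the sketch below. It assumes that the parameters in question are the block hash, transaction hash, and log index of each entry, which together identify a log; the name logId and the separator are illustrative.

```javascript
// Hypothetical helper: builds an untruncated identifier for a log entry.
// Assumes blockHash, transactionHash and logIndex identify the log.
function logId(log) {
  return `${log.blockHash}.${log.transactionHash}.${log.logIndex}`;
}

// Usage as a React key when rendering a list of transfers:
//   transfers.map((t) => <Transfer key={logId(t)} transfer={t} />)
```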
Transfer component for displaying a single Transfer event loaded from the ERC20 token contract
With this code, every time we reload the page, we will see the transfers from the last 1000 blocks for the token. We can now add support for listening to new transfers as they occur.
Monitoring New Transfers
Subscription function to listen for new transfer events, to be called from componentDidMount, using erc20 and blockNumber+1 as arguments. Note that we are storing the subscription object in state to be able to unsubscribe later
Code to stop listening for events when the component is to be unmounted from the tree. Even though we will never unmount the component in this particular application, it is a good practice to always remove the subscriptions when they are no longer used
Adding handlers for the changed and error events of the subscription. The former fires whenever an event is removed from the blockchain, so we remove it from our state, while the latter fires upon an error, which we add to our state to be displayed to the user
Awaiting Confirmations
As the last step in the application, we will avoid displaying unconfirmed transfers to the user. Instead of showing a transfer event as soon as we receive it, we will instead wait for a certain number of blocks to be added to the chain before rendering it in our list.
Updated section of componentDidMount to set the initial block number in the component’s state, and add a subscription to update it as new blocks are received
Updated render method to show only transfers with at least 12 confirmations
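The filtering step itself can be sketched as a pure function, independent of the render method. Here a confirmation is counted as one block mined on top of the block containing the event, which is one of several possible conventions; the names confirmedTransfers and REQUIRED_CONFIRMATIONS are illustrative.

```javascript
// Number of blocks that must be mined on top of an event before showing it.
const REQUIRED_CONFIRMATIONS = 12;

// Hypothetical helper: keeps only transfers with enough confirmations.
// `currentBlock` is the latest block number known to the application.
function confirmedTransfers(transfers, currentBlock, required = REQUIRED_CONFIRMATIONS) {
  return transfers.filter(
    (transfer) => currentBlock - transfer.blockNumber >= required
  );
}

// With the latest block at 112, only the event from block 100 is shown.
const visible = confirmedTransfers(
  [{ blockNumber: 100 }, { blockNumber: 105 }],
  112
);
console.log(visible.length); // 1
```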
Note
Different applications will have different requirements for the number of confirmations, some of them going up to hundreds of blocks. This will depend strictly on your use case.
Summary
In this chapter, we have gone in-depth into how to extract data from the Ethereum network and feed it into our app. We started out by reviewing how connections to nodes work, listing the protocols available for the JSON-RPC interface, and looking into the Provider object used by web3 and other libraries to manage the underlying connection. We also learned that there are different types of nodes available and how a Provider, such as the one injected by web3-enabled browsers, may abstract away some of these differences via the usage of subproviders.
Using web3.js as a sample library, we studied what kind of queries we could issue to the blockchain: general network information, address-specific data such as balance and code, and calls to existing contracts. When connecting to an archive node, these queries can be issued to any block in the past, not just the most recent ones.
We also studied different ways for monitoring changes to contracts in real time in our applications. While polling is a classic method that is always available, event filters or subscriptions may be more interesting options due to better performance or faster notification times.
To wrap up the chapter, we built an application for retrieving information from an ERC20 token contract and monitoring all its transfer events using subscriptions. In the next chapter, we will learn how to make changes to the blockchain, going into all the details involved in sending a new transaction to the network.