Chapter 1

The cryptocurrency enigma

Preston Miller    Consultant at an international cybersecurity and forensics firm, USA

Abstract

Bitcoin has become a fixture in modern society, a source of innovation and mystery, and has begun to change the way we think about currency. Because of Bitcoin’s success, many modern cryptocurrencies are simply variations on the Bitcoin framework. These Bitcoin derivatives are referred to as Altcoins. By understanding the cryptocurrency framework, through analysis of Bitcoin, examiners will be capable of understanding artifacts in a wide range of cryptocurrencies. In this chapter, we learn Bitcoin through example by first understanding the underlying theory and then seeing it in action. By using the MultiBit HD client, we observe transactions coming across the wire and look for their network and disk-based artifacts. This approach accomplishes two things: the reader obtains a technical understanding of how cryptocurrencies work and learns how to forensically analyze and evaluate cryptocurrency artifacts.

Keywords

Bitcoin
blockchain
cryptocurrencies
asymmetric encryption
network and disk-based artifacts
MultiBit HD
digital signature
decentralized
proof of work
mining

Purpose

The goal of this chapter is to serve as a primer for cryptocurrency conventions and artifacts. By focusing on the technical aspects of Bitcoin, the most popular cryptocurrency, and examples from wallet software, the examiner should be more knowledgeable and comfortable while investigating crimes involving cryptocurrency.

Introduction

Bitcoin has become a fixture in modern society, a source of innovation and mystery, and has begun to change the way we think about currency. In this chapter, we are going to explore the context, underlying framework and protocol, and artifacts related to Bitcoin. This chapter will focus exclusively on Bitcoin. However, because of Bitcoin’s success, many modern cryptocurrencies are simply variations on the Bitcoin framework. These Bitcoin derivatives are referred to as Altcoins. By understanding the cryptocurrency framework, through analysis of Bitcoin, examiners will be capable of understanding artifacts in a wide range of cryptocurrencies.
Often synonymous with providing anonymity for the acquisition of nefarious goods online, cryptocurrencies, such as Bitcoin, have been brought to the forefront after gaining traction due to recent media attention. Increased exposure has publicized the utility of cryptocurrencies and spurred the production of new currencies at a rate previously unseen. As more consumers depend on cryptocurrencies to purchase both legal and illegal goods, understanding cryptocurrencies and the channels through which they travel is vital.
As of August 2015, there are 678 cryptocurrencies on the market that come in varying types. A list of these cryptocurrencies and their prices can be found at coinmarketcap.com. Cryptocurrencies are often perceived as an unknown quantity; however, for the most part they are extremely well documented for those who want to read the technical details. For most of these cryptocurrencies a white paper is available that explains the underlying framework of the currency.

What makes a currency?

When talking about cryptocurrencies, it is not uncommon to be asked why they have a value at all. These currencies have a cost associated with them, despite lacking a physical component, and for some this is a tough concept to come to terms with. However, nonphysical currencies have value for the same reason physical ones do.
People’s beliefs and economics, another social construct, are the reason anything has value. The worth of an object is entirely dependent on the estimated value held by society and their rules governing commerce. Gold, for example, has long been considered very valuable in our society. The easily recognizable sheen and color of the material is often used in media to illustrate great wealth. But why is this the case?
Outside of modern electronics, gold has very little use. In spite of this gold has been featured as a currency even in ancient civilizations. As a metal gold is quite malleable and would not have been suitable for most tasks. However, it has persisted into modern times as an object of great worth because society still holds it in high regard.
People’s beliefs and economics can quickly change the worth of something. After World War I, Germany printed currency at a very high rate to pay for war reparations and, by doing so, hyperinflated its currency. Because of simple economics, as more currency was printed the value of each note decreased. During this period, it became more economical to burn the relatively worthless money to heat a house than to actually buy wood.
These examples should illustrate that, physical or not, our thoughts and the rules governing commerce are the biggest factors in establishing the value of an object. The fact that Bitcoin and most cryptocurrencies do not have a physical backing, such as gold once provided for the USD, has little relevance as gold itself has no intrinsic value associated with it.

Cryptocurrency

Cryptocurrencies differ from the currencies we normally deal with. Two main differences are decentralization and, as the name suggests, cryptography. When a currency is decentralized, it is not controlled by one entity. Bitcoins, for example, are released as a byproduct when a batch of transactions is verified. Verification occurs with the computing power of certain users on the network. This, and other features, creates a currency that is shared among its users and is one reason why Bitcoin has many passionate followers.
These features make cryptocurrencies harder to be accepted by a government as control is out of their hands. As seen in the news with trail-blazing Bitcoin, these cryptocurrencies must follow strict regulation to be accepted. While this may be a sticking point for governments, it has proved to be a popular feature for consumers.
Another aspect of cryptocurrencies is the built-in implementation of cryptography in their design. For Bitcoin, this manifests in the form of public-key cryptography also known as asymmetric encryption. Asymmetric encryption employs a private and public key, allowing the creation of a digital “signature” that can be used to authorize transactions.

Public key encryption basics

You are probably most familiar with encryption as a means of obfuscating data from prying eyes. With Bitcoin, encryption is used as a means of digitally signing a transaction like you would sign a check. Without this signature it is not possible to verify valid exchanges.
Asymmetric key cryptography uses a pair of private and public keys to create and verify a signature, respectively. These keys are generated through a computationally hard algorithm, making it all but impossible to determine a private key from knowing the public key. Bitcoin uses an elliptic curve digital signature algorithm (ECDSA) to generate its private and public keys. The technical aspects of ECDSA employed by Bitcoin will be discussed in a later section.
With Bitcoin, the individual that controls the private key controls the bitcoins from the address associated with that key. Therefore, keeping the private key secret and secure is a vital part of preventing “theft” of funds.

Forensic relevance

There are many reasons that an examiner might be required to examine Bitcoin forensic artifacts. Bitcoin does have legitimate purposes; however, it will always be associated with nefarious deeds. Bitcoin has been used for money laundering and purchasing black market goods among others. There are a few reasons Bitcoin is ideal for these illegal purposes.
Third-party “mixers” exist for the express purpose of exchanging your bitcoins with those of other users. Bitcoin is only pseudo-anonymous after all, and by using these mixers, if properly configured, one can throw investigators off the proverbial money trail.
There are a variety of mixers, such as Bitcoin Fog, SharedCoin, and Bitmixer. Each mixer has different ways of achieving the same goal, truly anonymous transactions. They typically work by combining bitcoins from various users, mixing them together, and sending the mixed currency to the appropriate recipient. This makes it exceedingly difficult to determine the original sender for a given transaction as it becomes hard to separate the original sender from the noise. To compound this issue, the mixer might be configured to split a transaction into smaller components from multiple addresses before sending it along. In essence, I might send 10 BTC to Bob using a mixing service and he might receive three transactions of 3, 5, and 2 BTC from other users who put BTC into the mixing service.
Bitcoin is often thought of as digital cash, which naturally makes it an ideal candidate for purchasing illicit goods online. The most notorious example of this is using bitcoins to purchase illegal goods from Silkroad on the darknet. Silkroad, which along with its successors has since been shut down, was an online black market that sold guns, drugs, illegal services, and more with Bitcoin as the medium of exchange.
Beyond Bitcoin as a currency, some see Bitcoin as an opportunity to make money. In fact, since 2012, Bitcoin has been the fastest growing area for venture capitalist funding (Tomtunguz, 2015). Beyond the traditional type of investment, mining, which will be discussed later, is a process that allows users to “mint” new bitcoins and get rewarded for doing so. As a profit is at stake, it is not unheard of for bad actors to make use of a botnet to mine bitcoins.

Bitcoin

Bitcoin, considered the first decentralized cryptocurrency, is the most polarizing and popular cryptocurrency on the market, and therefore worth looking at in greater detail. By understanding Bitcoin, examiners will be equipped to approach and understand the findings in any investigation involving cryptocurrencies. Many cryptocurrencies share a similar backbone to Bitcoin and therefore the theory and expected artifacts are equivalent.

History and current context

Bitcoin is a very volatile currency, perhaps because, unlike a currency backed by a commodity such as gold, Bitcoin and other cryptocurrencies do not have a physical counterpart. Especially in late 2013 and early 2014, Bitcoin often saw aggressive swings in price based on public perception more than any other currency. But before all of that, Bitcoin owes much of its growth to Silkroad, the illegal Tor-based black market.
Bitcoin has long been associated with Silkroad, the well-known online black marketplace created by Ross Ulbricht in January 2011 (Wired, 2015). Silkroad primarily offered drugs and other illicit services in exchange for bitcoins and was one of the first of its kind to experience a great level of success. The FBI arrested Ross “The Dread Pirate Roberts” Ulbricht in a San Francisco library in October 2013. Soon after, Silkroad 2.0 and other marketplaces using bitcoin as a means of payment popped up on the darknet.
More recently, a DEA agent responsible for helping take down Ross Ulbricht has pled guilty to taking roughly $200,000 to divulge information pertaining to the investigation (Forbes, 2015). Carl Force was reportedly “seduced by the perceived anonymity of virtual currency.” Unfortunately, Bitcoin will have a hard time distancing itself from bad actors and from its use as a means of purchasing illegal goods.
Bitcoin would experience a steady climb from its inception until an unprecedented explosion in November 2013 after sustained high demand from China and positive reception by other governments (Forbes, 2013). The U.S. Congress, for example, held a hearing in late 2013 that was surprisingly positive toward the cryptocurrency (Washingtonpost, 2013). While Congress did not outright legitimize the currency, clearly the news was well accepted in the community as prices soared to highs of $1242.
Soon after, Bitcoin suffered a large loss when the mismanaged exchange Mt. Gox was allegedly hacked to the tune of 450 million USD in February 2014 (Wired, 2014). Up to that point, Mt. Gox was estimated to have managed around 70% of all Bitcoin traffic. Mt. Gox declared bankruptcy as individuals in the bitcoin community began to question if a hack had really occurred or if it was simply fraud. This was a huge blow and many believed this spelled the beginning of the end for the cryptocurrency. And while Bitcoin prices did take a hit, the cryptocurrency has shown no signs of going away.
Bitcoin has also been used as a means of payment for ransomware, like CryptoLocker. These formative events have played a part in shaping the path and social narrative of the cryptocurrency. It is important to keep these events in mind when talking about the cryptocurrency as most have a biased perception of Bitcoin as a vehicle for malicious behavior. Despite this, Bitcoin has continued to see acceptance among various markets and, perhaps ironically, by financial institutions.

Bitcoin framework

Bitcoin’s creation is credited to the alias Satoshi Nakamoto, whose identity is still unknown. The idea behind Bitcoin was initially proposed in October 2008 as a “purely peer-to-peer version of electronic cash” that would cut out the middle man, that is, financial institutions (Nakamoto, Satoshi, 2008). Satoshi’s idea would circumvent the financial institution’s trust-based model in favor of a cryptographic proof of work model. The benefits of this model are threefold:
Double spending protection for sellers
Signed transactions protecting buyers
Able to exchange internationally

Blockchain

One issue that needed to be dealt with was double spending. In the trust model, a trusted third party verifies that the money has not already been spent before allowing a transaction. This obstacle led to Bitcoin’s greatest innovation, the blockchain. The blockchain is essentially a public ledger that is made up of blocks containing all previous transactions of the currency (Fig. 1.1).
image
Figure 1.1 The Blockchain Is Made Up of Blocks
Each block is made up of hundreds of transactions. The first block ever created in the blockchain is referred to as the Genesis Block. Blocks are linked together by the previous block hash in their header. There is only one main chain; however, forks can be created. Blocks in these forks that are not part of the main chain are referred to as orphan blocks.
There are many “blockchain explorers” online that can be used to view the public ledger. In this chapter, we will use blockchain.info due to its simple interface and powerful API. The blockchain.info website has great documentation on its public API and can be used to create custom solutions for bitcoin-related investigations.
Fig. 1.1 is a simplification of the blockchain and its hierarchical structure. Each block contains more data than just an array of transactions (Bitcoin (hashing), n.d.). A block consists of an 80 byte header field followed by the array of transactions stored in the block (Table 1.1). Most importantly, the Merkle root, a SHA256-based hash that summarizes the entire array of transactions, is calculated and can be used to verify that a transaction belongs in the block. Transactions that do not belong in the block will not produce the same Merkle root. Additionally, the SHA256 hash of the previous block header stored in the header effectively creates a linked list of blocks. The time value refers to the time that the block was verified.

Table 1.1

Structure of a Block

Byte Offset Item
0–3 Version
4–35 Previous Block Header Hash
36–67 Merkle Root
68–71 Time
72–75 Difficulty
76–79 Nonce
80– Array of Transactions

Note that the length of the block depends on the size of the transaction array which can contain anywhere from a few hundred to a few thousand transactions.
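The role of the Merkle root can be illustrated with a short Python sketch. It assumes the construction commonly documented for Bitcoin: transaction hashes are paired and hashed with double SHA256, level by level, and the last hash is duplicated when a level has an odd count. The transaction hashes used here are made up for illustration and do not come from a real block.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA256 applied twice, the hash construction used throughout Bitcoin."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids_hex):
    """Return the Merkle root (explorer-style hex) for a list of transaction hashes."""
    if not txids_hex:
        raise ValueError("a block contains at least one transaction")
    # Explorers display hashes byte-reversed; hashing uses the internal byte order.
    level = [bytes.fromhex(txid)[::-1] for txid in txids_hex]
    while len(level) > 1:
        if len(level) % 2:  # odd number of hashes: duplicate the last one
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0][::-1].hex()


# Hypothetical transaction hashes, generated only for this example.
txids = [hashlib.sha256(bytes([i])).hexdigest() for i in range(3)]
root = merkle_root(txids)

# Swapping in a transaction that does not belong changes the root.
foreign_txid = hashlib.sha256(b"not in this block").hexdigest()
tampered_root = merkle_root([txids[0], txids[1], foreign_txid])
print(root != tampered_root)
```

With a single transaction, as in the genesis block, the Merkle root is simply that transaction's hash.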

The first block, the genesis block, was created in 2009 and released the first 50 BTC onto the market. We can take a closer look at the contents of any block by visiting the following URL and substituting the block’s hash for %block-hash%: https://blockchain.info/rawblock/%block-hash% (Blockchain_api, n.d.). The hash of the genesis block is 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f. Visiting the raw block we can visualize the header structure.
image
The hash we used to look up the block represents the SHA256 hash of the header itself. Naturally, the SHA256 hash of the previous block is represented by all zeroes as no block comes before the genesis block. In all blocks following the genesis block, the previous block value is the header hash of the block that came before it. For example, the block that follows the genesis block would have a prev_block value of 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f. The Merkle Root of the genesis block is the hash of all transactions in the block. The time represents the unix time that the block was mined. The bits, or difficulty, and nonce values will be explained in the mining section and are involved in the verification process of the block.
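As a sketch of how the 80 byte header in Table 1.1 can be dissected, the following Python unpacks the six fields and recomputes the block hash. The Merkle root, bits, and nonce values below are the widely published genesis block values rather than figures quoted in this chapter, and the double SHA256 with byte reversal reflects the convention explorers use when displaying hashes; treat both as assumptions to verify against a raw block.

```python
import hashlib
import struct
from datetime import datetime, timezone

# Little-endian layout from Table 1.1: version, previous block hash,
# Merkle root, time, difficulty bits, nonce (4 + 32 + 32 + 4 + 4 + 4 = 80 bytes).
HEADER = struct.Struct("<I32s32sIII")


def parse_header(raw: bytes) -> dict:
    """Split an 80 byte block header into its named fields."""
    if len(raw) != HEADER.size:
        raise ValueError("a block header is exactly 80 bytes")
    version, prev_hash, merkle, timestamp, bits, nonce = HEADER.unpack(raw)
    return {
        "version": version,
        "prev_block": prev_hash[::-1].hex(),  # shown byte-reversed, as on explorers
        "merkle_root": merkle[::-1].hex(),
        "time_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat(),
        "bits": hex(bits),
        "nonce": nonce,
    }


def header_hash(raw: bytes) -> str:
    """Double SHA256 of the header, byte-reversed for display."""
    digest = hashlib.sha256(hashlib.sha256(raw).digest()).digest()
    return digest[::-1].hex()


# Genesis header rebuilt from its published field values (assumed, see lead-in).
genesis_header = HEADER.pack(
    1,
    bytes(32),  # no previous block: all zeroes
    bytes.fromhex(
        "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b")[::-1],
    1231006505,
    0x1D00FFFF,
    2083236893,
)

fields = parse_header(genesis_header)
print(fields)
print(header_hash(genesis_header))
```

The hash printed for the header is the value that the next block stores in its own previous block field, which is what links the chain together.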

Wallets and addresses

Bitcoins are stored in web, mobile, or desktop wallets and, within a wallet, are assigned to certain user-created addresses. Wallets come in a variety of different flavors and offer their own advantages. A list of popular wallets and their features can be found on https://bitcoin.org/en/choose-your-wallet. While wallets are necessary, they come with their own risks, which can be an advantage for investigations. Most adept Bitcoin users will maintain a backup of their wallet if it is stored locally, such as on a hard drive or a phone’s flash memory. If a hardware failure were to occur, the bitcoins associated with the addresses could be forfeit if the private keys are lost. Most wallet software comes with an option for creating a backup, or a backup can be created in the traditional sense.
DarkWallet is a Bitcoin wallet that has interesting forensic implications. It is a wallet plus mixer hybrid and implements a modified version of CoinJoin. We will not be examining this wallet because it is in alpha and will go through many revisions before release. That said, it is worth keeping an eye on as the Bitcoin community has received it well. Possession of this wallet is not in itself an indicator of guilt, but much like having CCleaner or SDelete installed on a suspect’s system, does warrant further examination.
Regardless of the wallet chosen, it will have at least one address to send bitcoins from and receive them at. These addresses are equivalent to a mailing address. While requesting or sending a bitcoin transaction the user will specify the address to be used.
Now, as we know bitcoins are not real, and so they are not really stored in the desktop wallet. Instead, they all “exist” in the blockchain, the public ledger, and trading bitcoins just means transferring your bitcoins in the ledger to another address in the ledger. In Bitcoin, an address is what the bitcoins are associated with, not the wallet. This is an important distinction as we will see later. You can have much more than one address. In fact, it is recommended to create a new address for each transaction to ensure security and anonymity.
As discussed previously, Bitcoin employs asymmetric private and public key encryption. Each address is connected to a pair of mathematically linked private and public keys. No two addresses should share the same private and public key. The private key is used to create a digital “signature” for transactions. During the verification process, the address’ public key is used to confirm the signature and validate the transaction.
Let us discuss how these keys are made. Knowing the composition of these keys comes into play when examining evidence that stores the address in a format that must be processed before looking up. The private key is just a 256-bit random integer. In wallets, you will often see the private key in its base58 encoded format (Bitcoin (Private_key), n.d.). The public key is the result of scalar multiplication of the private key and a “base point” (Coindesk, n.d.). The public key is then the 256-bit x and y values representing a specific point on the elliptic curve (Bitcoin (Elliptic_Curve), n.d.). A prefix of 0x04 is added to the public key in its uncompressed form. The specific elliptic curve Bitcoin uses is referred to as Secp256k1 and has the equation (Bitcoin (Secp256k1), n.d.):

$y^2 = x^3 + 7$ (Fig. 1.2)

image
image
Figure 1.2 A simple graphical representation of the curve used by Bitcoin to generate public and private keys.
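To make the relationship between the private key, the base point, and the public key concrete, here is a minimal Python sketch of scalar multiplication on this curve. The field prime, group order, and base point coordinates are the published Secp256k1 parameters rather than values given in this chapter, and the double-and-add routine is a textbook implementation written for readability. It is not constant-time and should not be used to handle real keys.

```python
import secrets

# Published Secp256k1 domain parameters (assumed from the curve specification).
P = 2**256 - 2**32 - 977  # prime that defines the finite field
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def on_curve(point) -> bool:
    """Check y^2 = x^3 + 7 over the finite field."""
    x, y = point
    return (y * y - (x * x * x + 7)) % P == 0


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k: int, point):
    """Double-and-add: compute k * point."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def uncompressed_public_key(private_key: int) -> bytes:
    """0x04 prefix followed by the 256-bit x and y coordinates."""
    x, y = scalar_mult(private_key, G)
    return b"\x04" + x.to_bytes(32, "big") + y.to_bytes(32, "big")


private_key = secrets.randbelow(N - 1) + 1  # random integer in [1, N - 1]
public_key = uncompressed_public_key(private_key)
print(public_key.hex())
```

Going from the private key to the public key takes a few hundred point operations, whereas recovering the private key from the public key would require solving the elliptic curve discrete logarithm problem.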
The address itself is directly related to the public key. After the public key is created from ECDSA, it is processed through a series of RIPEMD-160 and SHA256 hashes and finally base58 encoded to arrive at the address. Every time an address is created, the process begins again with a new private key generated. When we examine some of the common artifacts we will use this knowledge to take an intermediate form of the public key and convert it into an address. The process of converting a private key into an address proceeds in a one-way direction. It has not been demonstrated as possible to take an address and reverse engineer its private key. During a transaction, the sender signs with his or her private key.
Similarly to the creation of the public key, signing with the private key involves scalar multiplication. Scalar multiplication is used in ECDSA as a means of creating a one-way calculation. It is computationally simple to proceed in one direction but nearly impossible to do the reverse. In this manner, the algorithm ensures that knowing the public key or knowing the signature for the transaction will not lead a bad actor to the private key. And since the private key, public key, and signature are related, the sender’s public key and signature are used during the verification process to ensure the transaction is valid.
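The final encoding step can be sketched in Python. This example assumes the commonly documented Base58Check layout: a one-byte version (0x00 for ordinary mainnet addresses), the 20 byte RIPEMD-160 result, and a four-byte checksum taken from a double SHA256. It deliberately starts from the 20 byte hash, because RIPEMD-160 is not available in every Python build of hashlib. To avoid inventing data, the usage decodes the genesis block address quoted later in this chapter and encodes it again.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(data: bytes) -> bytes:
    """First four bytes of the double SHA256 of the data."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version: int, payload: bytes) -> str:
    """Encode version byte + payload + checksum as a base58 string."""
    data = bytes([version]) + payload
    data += checksum(data)
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as the character "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def base58check_decode(address: str):
    """Return (version, payload); raise ValueError if the checksum is wrong."""
    number = 0
    for character in address:
        number = number * 58 + ALPHABET.index(character)
    leading_ones = len(address) - len(address.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    data = b"\x00" * leading_ones + body
    content, check = data[:-4], data[-4:]
    if checksum(content) != check:
        raise ValueError("checksum mismatch: not a valid address")
    return content[0], content[1:]


genesis_address = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
version, hash160 = base58check_decode(genesis_address)
print(version, hash160.hex())
print(base58check_encode(version, hash160) == genesis_address)
```

Decoding is useful in an examination when an artifact stores the 20 byte hash instead of the printable address, since the same routine converts between the two forms.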
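The sign-then-verify relationship can be demonstrated with a compact, textbook ECDSA sketch in Python. The curve parameters are the published Secp256k1 values, and the message is hashed with a single SHA256 purely for illustration; real Bitcoin software signs a specific serialization of the transaction, so this is an assumption-laden teaching aid and not wallet-grade code.

```python
import hashlib
import secrets

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Double-and-add scalar multiplication."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def digest_as_int(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key: int, message: bytes):
    """Produce an (r, s) signature; only the private key holder can do this."""
    z = digest_as_int(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret value for every signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r != 0 and s != 0:
            return r, s


def verify(public_key, message: bytes, signature) -> bool:
    """Check a signature using only the public key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = digest_as_int(message)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G),
                      scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = secrets.randbelow(N - 1) + 1
public_key = scalar_mult(private_key, G)
message = b"send 1 BTC to Bob"  # hypothetical stand-in for transaction data
signature = sign(private_key, message)
print(verify(public_key, message, signature))
```

Changing the message or checking against a different public key makes verification fail, which is why a signed transaction cannot be altered or reassigned by a third party.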

Transaction

What is a transaction in Bitcoin? It is fairly straightforward and similar to other transactions we are familiar with. A sender must have a Bitcoin address that has bitcoins associated with it. Additionally, the address of the recipient is needed to tell Bitcoin where to send the money. When the “send” button is selected, the software signs the transaction with the sender’s private key and broadcasts the transaction to the network. The transaction is identified by a SHA-256 hash of the transaction data referred to as the transaction hash. The transaction is picked up and entered into the verification phase. Approximately 10 min later the recipient will receive their bitcoins. Note that with other services, such as mixers, it is possible to set a time delay to distance oneself from the transaction in addition to the other obfuscation features that a mixer provides. The blockchain and miners are responsible for confirming and validating the transaction.

Verification

The verification process is designed to take 10 min on average to complete and is composed of these basic steps:
1. Assign a transaction to a nonverified block
2. Confirm coins were signed by the sender
3. Confirm coins have not already been spent
4. Repeat for all transactions in the block
5. Calculate a difficult SHA256 hash of the block plus nonce
6. Add the block to the blockchain
The first four steps of the verification process are trivial. As mentioned in the previous section, the signature is checked against the sender’s public key to verify the coins belong to the sender. Miners then check the blockchain to prevent double spending by determining that the coins have not already been spent.
The majority of the 10 minutes is spent computing an arbitrarily difficult SHA256 hash of the block. Bitcoin uses the hashcash proof of work function for this step (Bitcoin (Hashcash), n.d.). A SHA256 hash of the block must be calculated that begins with a certain number of leading zeroes. As more hashing power joins the network (i.e., more miners), blocks are found faster and the required number of leading zeroes is raised, effectively increasing the difficulty of the calculation. After every 2016 verified blocks, the system automatically adjusts the difficulty of the hash if the average time of the verification process has deviated from 10 min.
Computers compete by trying different values of the “nonce,” an arbitrary number stored in the block header, and rehashing the block until a hash that meets the difficulty requirement is obtained. Once such a hash is obtained, the nonce used is broadcast to the network, where each node hashes the block with that nonce to check the result. If a majority of the nodes agree, then the block, and all of its transactions, is added to the chain.
All the transactions in a given block are considered verified once that block is followed by at least six blocks. A block that instead ends up outside the main chain is considered an “orphan.” Orphan blocks are not uncommon and can be a byproduct of multiple miners “finding” the right nonce at the same time.
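The difficulty mechanism can be sketched in Python under a few assumptions. The compact "bits" encoding (one exponent byte and a three-byte mantissa) and the rule that a single adjustment is limited to a factor of four come from the reference client and the protocol documentation, not from this chapter, and the function names are illustrative. A hash is acceptable when, read as a number, it is below the target, which is what a required count of leading zeroes amounts to.

```python
TARGET_BLOCK_SECONDS = 600    # 10 min per block
RETARGET_INTERVAL = 2016      # blocks between adjustments
EXPECTED_TIMESPAN = RETARGET_INTERVAL * TARGET_BLOCK_SECONDS


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' header field into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


MAX_TARGET = bits_to_target(0x1D00FFFF)  # easiest allowed target (genesis block bits)


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took."""
    # A single adjustment is limited to a factor of four in either direction.
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4), EXPECTED_TIMESPAN * 4)
    new_target = old_target * clamped // EXPECTED_TIMESPAN
    return min(new_target, MAX_TARGET)


def difficulty(target: int) -> float:
    """Difficulty relative to the easiest target."""
    return MAX_TARGET / target


old_target = MAX_TARGET // 8
# Blocks arrived twice as fast as intended, so the target halves
# and the difficulty doubles.
new_target = retarget(old_target, EXPECTED_TIMESPAN // 2)
print(difficulty(old_target), difficulty(new_target))
```

A smaller target means fewer acceptable hashes, so more attempts are needed on average before a block is found.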
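The search for a nonce can be illustrated with a toy miner in Python. The 76 bytes of header data are made up, the target is set very low in difficulty (12 leading zero bits) so the loop finishes in a few thousand attempts, and the function name is hypothetical. Real mining performs the same loop against a far smaller target.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_without_nonce: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the block hash falls below the target."""
    for nonce in range(max_nonce):
        candidate = header_without_nonce + struct.pack("<I", nonce)
        digest = double_sha256(candidate)
        # The digest is read as a little-endian number; reversing the bytes
        # gives the familiar display form with leading zeroes.
        if int.from_bytes(digest, "little") < target:
            return nonce, digest[::-1].hex()
    return None  # nonce space exhausted without success


# Made-up header contents: the first 76 bytes of an 80 byte header.
toy_header = b"toy block header".ljust(76, b"\x00")
toy_target = 1 << (256 - 12)  # require 12 leading zero bits

nonce, block_hash = mine(toy_header, toy_target)
print(nonce, block_hash)


def check(header_without_nonce: bytes, nonce: int, target: int) -> bool:
    """What every other node does: hash once with the announced nonce."""
    digest = double_sha256(header_without_nonce + struct.pack("<I", nonce))
    return int.from_bytes(digest, "little") < target


print(check(toy_header, nonce, toy_target))
```

The asymmetry is the point of proof of work: finding the nonce takes many attempts, while any node can confirm it with a single hash.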

Mining

Miners contribute their computing power to verify transactions by calculating difficult hashes. There are two types of miners, those that mine by themselves and those that work in “pools” with other miners. Regardless, miners seeking to make a profit use specialized hardware designed for this purpose. They are incentivized to verify transactions with Bitcoin rewards. More miners increase the integrity of Bitcoin by making it more difficult for a bad actor to have control of a majority of the verification process.
There are three ways of obtaining Bitcoins: purchase from an exchange, transfer among individuals, and mining. Mining has two main functions:
Verify and add blocks to the blockchain
Release new Bitcoins to the market
Once the verification processes are completed, the computer that calculated the correct hash is rewarded with freshly “minted” Bitcoins called the coinbase. The current reward is 12.5 Bitcoins per block. Unlike some cryptocurrencies, Bitcoin operates on reward-halving of the coinbase. This means that there are discrete phases of rewards. The next phase will be half of the current reward (6.25 BTC).
Users typically pay fees while making transactions. These fees are also awarded to the miner to incentivize them to include their transactions in the block. These fees will be increasingly important to miners as the coinbase decreases over time.
The speed of a miner is measured in gigahashes per second (GH/s). A miner’s speed is important, especially for solo miners, because having a greater hashing rate gives you a better chance of finding the nonce. Because of the difficulty of mining, individuals will devote specialized hardware to exclusively mine. While it is possible to mine with typical computers, especially those in a botnet, it is not nearly as efficient. In the early days of Bitcoin, and still the case with younger cryptocurrencies that are easier to mine, CPUs and GPUs were used. However, this has since become antiquated as it is not possible to generate enough GH/s to even cover electrical costs.
Application-specific integrated circuits (ASICs) are regarded as the most efficient miners. These ASIC devices are specifically designed to generate the high GH/s rate and have no purpose outside of mining. Field-programmable gate arrays (FPGAs) are another mining alternative that are relatively inexpensive compared with ASICs and still viable under the current mining conditions.
Miners must weigh the speed of their system and decide whether they can remain profitable as a solo miner or whether they should work in a pool. Solo miners might not “mint” many blocks, but receive the full coinbase and fees for their hard work. In pools, the odds of finding the nonce are higher from the combined computing power, but the reward is split among the members. Oftentimes, miners work in pools, as the likelihood of calculating the hash on their own is slim. When a member of a pool calculates the winning hash the reward is split among the group based on each individual’s contributed computation power. There are many mining pools that miners can sign up for. The most popular mining pools are F2Pool, BitFury, and AntPool.
To create scarcity, Bitcoin has a finite amount of coins that can ever be created, 21 million Bitcoins. There are currently over 13 million coins in circulation. As more coins are mined the reward decreases (Table 1.2).
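A proportional split of a block reward can be sketched as follows. Amounts are kept in satoshis (1 BTC is 100,000,000 satoshis) so the arithmetic stays exact. The member names and share counts are hypothetical, and real pools use a variety of payout schemes and charge fees, so this is only the simplest possible model.

```python
SATOSHI_PER_BTC = 100_000_000


def split_reward(reward_satoshi: int, shares: dict) -> dict:
    """Divide a reward in proportion to each member's contributed work."""
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("no work was contributed")
    # Integer division rounds down; any leftover satoshis are simply not
    # assigned in this sketch (a real pool decides what happens to them).
    return {member: reward_satoshi * work // total_shares
            for member, work in shares.items()}


# Hypothetical pool: a 12.5 BTC coinbase and three members.
reward = 1_250_000_000
contributed_work = {"alice": 50, "bob": 30, "carol": 20}
payouts = split_reward(reward, contributed_work)

for member, amount in payouts.items():
    print(member, amount / SATOSHI_PER_BTC, "BTC")
```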

Table 1.2

Depicts Diminishing Returns on Bitcoins per Block as More Bitcoins Are in Circulation

Block     Reward   Projected Year   BTC Mined   Percentage Increase   Percentage of Supply Limit
0         50.00    2009             2,625,000   Infinite              12.50
52,500    50.00    2010             2,625,000   100.00                25.00
105,000   50.00    2011             2,625,000   50.00                 37.50
157,500   50.00    2012             2,625,000   33.33                 50.00
210,000   25.00    2013             1,312,500   12.50                 56.25
262,500   25.00    2014             1,312,500   11.11                 62.50
315,000   25.00    2015             1,312,500   10.00                 68.75
367,500   25.00    2016             1,312,500   9.09                  75.00
420,000   12.50    2017             656,250     4.17                  78.125

Each decrease in reward is referred to as a block reward-halving event. The full table of the year-by-year projections for Bitcoin can be found at en.bitcoin.it/wiki/Controlled_supply.
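The schedule in Table 1.2 follows from a simple rule: the reward starts at 50 BTC and halves every 210,000 blocks. A short Python sketch reproduces the table's reward column and shows why the supply stays just under 21 million. Working in satoshis with an integer right shift is an assumption borrowed from how the reference client is commonly described.

```python
SATOSHI_PER_BTC = 100_000_000
INITIAL_REWARD = 50 * SATOSHI_PER_BTC
HALVING_INTERVAL = 210_000  # blocks between reward-halving events


def block_reward(height: int) -> int:
    """Coinbase reward in satoshis for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_REWARD >> halvings  # integer halving, rounding down


def total_supply() -> int:
    """Sum every phase of rewards until the reward rounds down to zero."""
    total, halvings = 0, 0
    while (INITIAL_REWARD >> halvings) > 0:
        total += HALVING_INTERVAL * (INITIAL_REWARD >> halvings)
        halvings += 1
    return total


for height in (0, 210_000, 420_000):
    print(height, block_reward(height) / SATOSHI_PER_BTC)

print(total_supply() / SATOSHI_PER_BTC)
```

Because each phase mints half as many coins as the one before, the total converges toward, but never reaches, 21 million BTC.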

Orphan blocks, which have been mentioned previously, go through a slightly different process during mining. The transactions in the orphan block must be added to another block and go through the verification process again to join the longer chain. Additionally, miners receive no reward for orphan blocks. Transactions will not be verified until six other blocks follow a block. This way, modification of the blockchain is impractical due to the need to regenerate the hash of all the blocks above it. However, if a single bad actor or a group of bad actors ever had enough hashing power, it would be possible for them to force blocks of their choosing onto the blockchain.
A well-known flaw of the blockchain is that if 51% or greater of the hashing rate is ever contributed by any one or group of individuals the system breaks. This is due to the fact that transactions are verified based on majority vote. With the majority vote, these bad actors can double spend and prevent other transactions from being confirmed. However, this inherent flaw is only a concern with large mining pools as an individual is unlikely to have the resources to possess 51% of the hashing rate at a given time.
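The effect of the attacker's share of the hashing rate can be put into numbers with a small Python sketch. It uses the simplified gambler's-ruin result from the Bitcoin white paper (Nakamoto, Satoshi, 2008): an attacker with share q who is z blocks behind ever catches up with probability (q/p)^z, where p = 1 - q, and with certainty once q is at least p. The fuller calculation in that paper also accounts for the attacker's progress while the honest blocks are being found, so these figures are an approximation.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance that an attacker ever overtakes a chain that is z blocks ahead."""
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    honest_share = 1.0 - attacker_share
    if attacker_share >= honest_share:
        return 1.0  # with half or more of the hashing rate, success is certain
    return (attacker_share / honest_share) ** blocks_behind


for share in (0.10, 0.30, 0.45, 0.51):
    print(share, catch_up_probability(share, 6))
```

With six blocks to make up, a 10% attacker has only a tiny chance of success, and the chance rises steeply as the share approaches one half, which is why six confirmations and a widely distributed hashing rate go hand in hand.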

Blockchain explorers

We have talked quite extensively about the intricacies of Bitcoin and its blockchain format. However, let us actually look at an example of what such a transaction looks like using a blockchain explorer. There are plenty of blockchain explorers; however, we will use the blockchain.info website (blockchain, n.d.). We can search for a variety of Bitcoin objects, such as:
Block Height or Hash
Bitcoin Address
Transaction Hash
The block height is a block’s index or the distance from the genesis block where the genesis block is zero. Let us examine the genesis block once again as it is the simplest example. You can type “0” for the block height or use the block hash “000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f.”
When viewing the details of the block, notice that the metadata from the raw block header is also reported here. There is only one transaction in the genesis block, and it released the first 50 BTC “mined.” This block was verified on 01/03/2009 18:15:05 UTC. In later blocks, additional entries exist such as who “relayed” the block. Essentially, this will tell you who or what pool “mined” the block.
For each transaction, we can click on or search the particular transaction hash to view the relevant details. Here, we can see the address that sent the transaction (no one in the case of the genesis block) and the recipient(s). In addition, the time of the transaction in UTC is recorded as well as how much was sent. Note that the relayed by IP is the IP of the node on the network that first saw the transaction. This is not the IP address of the sender or the recipient. The Visualize tab is especially useful and allows an examiner to view the flow of bitcoins from one address to another. However, as mentioned previously, this flow can easily be obfuscated and made difficult to follow.
Last, we could click on or search the address in the only transaction in the block, 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa. Here, we can see every transaction sent and received by the address including the amounts and time. There are also overall statistics for the account such as total received and the final balance in the account. Blockchain.info offers the option to filter, view data in charts, and export the list of transactions to a CSV or XLS file.

Bitcoin protocol

The Bitcoin protocol refers to the network component of the Bitcoin framework. Knowledge of this protocol can be leveraged to listen for transactions coming across the wire on port 8333. Note it is possible to change this port. Altcoins, like Litecoin, use a similar protocol but on different ports. The protocol traffic can be captured with any packet sniffer such as Wireshark. Incidentally in Wireshark, Bitcoin traffic will be identified as such under the “Protocol” column. That may not be the case with other Altcoins and so it is good to understand how to manually dissect the packets by knowing the appropriate byte offsets.
Keep in mind the whole Bitcoin network is populated by nodes. Each node is a client that is connected to the Bitcoin network. As part of the Bitcoin framework, these nodes interact with each other such as when making transactions. Typically, these interactions occur between the host node and a “full” node on the network. Full nodes maintain a complete copy of the blockchain and play an important role in maintaining the integrity of the network. The Bitcoin protocol uses many message types to denote interactions between nodes (Bitcoin (Protocol_documentation), n.d.).
When examining network traffic, the most important message types for investigators are inv and tx. These messages appear when a transaction is announced and then delivered, with a getdata message between them. Like all message types, they have a well-defined packet structure (Table 1.3).

Table 1.3

The Packet Structure for the inv Message

Byte Offset Item
0–53 Packet Headers
54– Data
54–57 Packet Magic
58–69 Command Name (0x696e76)
70–73 Payload Length
74–77 Payload Checksum
78 Count
79–82 Type (01 – tx & 02 – Block)
83–114 Data Hash
The inv message is typically the first sign of a transaction and an indication that the transaction has initially been accepted by the network. The command name is simply the hex of the characters inv (0x696e76). An inv message represents a node on the network advertising knowledge of either a transaction (type 1) or a block (type 2). Because the checksum is computed over the payload, every node announcing only this specific object will use the same Payload Checksum. For a type 1 inv, the transaction hash is carried in the Data Hash field. You will need to convert the endianness of the transaction hash before it can be used to look up a transaction on a blockchain explorer (Table 1.4).

Table 1.4

The Packet Structure for the Getdata Message

Byte Offset Item
0–53 Packet Headers
54– Data
54–57 Packet Magic
58–69 Command Name (0x67657464617461)
70–73 Payload Length
74–77 Payload Checksum
78 Count
79–82 Type (01 – tx & 02 – Block)
83–114 Data Hash
The inv message can be followed by a getdata message, which contains the same Payload Checksum as its parent inv and the same transaction hash. The command name represents the hex of the getdata characters. The getdata message is the host's response to the node, acknowledging the announcement and requesting the transaction itself (Table 1.5).

Table 1.5

The Packet Structure for the tx Message

Byte Offset Item
0–53 Packet Headers
54–57 Packet Magic
58–69 Command Name (0x7478)
70–73 Payload Length
74–77 Payload Checksum
78–81 Transaction Version
82 Input Count
83–379* Transaction Input *Variable Length
83–114 Previous Output – Transaction Hash
115–118 Previous Output – Index
120–121 Previous Output – Script Length
122–375 Previous Output – Signature Script
376–379 Previous Output – Sequence
380 Output Count
381–412* Transaction Output *Variable Length
381–388 Value
389 Script Length
390–412 Script
413–416 Block lock time or Block ID
The node then sends a tx message containing the transaction data for the transaction hash advertised in the preceding inv and getdata messages. The Payload Checksum of the tx message matches the first 4 bytes of the Data Hash value in the inv and getdata messages. The tx message contains most of the information publicly available on the blockchain, which makes it one way for investigators to associate a bitcoin transaction with a specific computer. The tx packet structure has two main sections, input and output. Each can vary in length and has a well-defined structure.
The Input and Output Count fields indicate the number of addresses involved. For each input, the Previous Output transaction hash identifies the earlier transaction that supplied the coins; like the inv Data Hash, its endianness must be converted before it can be looked up on a blockchain explorer. Each output carries the amount sent and the destination address in the Value and Script fields, respectively. Note that the value observed in Wireshark is expressed in units of 1 × 10⁻⁸ BTC, so it must be multiplied by 1 × 10⁻⁸ to obtain the amount actually sent. Additionally, the Script value must first be processed to arrive at the address as seen on the blockchain. This involves taking the 25-byte Script value, removing the first 3 bytes and the last 2 bytes, and running the remaining 20 bytes through a series of hashes. The 5 bytes removed are simply Bitcoin protocol artifacts that are not necessary to decode the address (Fig. 1.3).
image
Figure 1.3 Starting with the generated ECDSA public key, which is made up of its (x,y) position on the curve, a series of hashes are used to generate the Bitcoin Address.
The Script value from Wireshark already contains the 20-byte product of the SHA-256 and RIPEMD-160 hashes of the public key. After removing the first 3 bytes and last 2 bytes from the Wireshark Script value, the remaining 20 bytes must be processed. First, a checksum is calculated by applying SHA-256 twice in sequence to the 20-byte value with a prepended 00. The first 4 bytes of this result are appended to the original 20 bytes and the whole is base58 encoded. Then a Network ID must be prepended to arrive at the actual address. For Bitcoin, the Network ID is often 1 or 3.

Forensic artifacts

Up to this point we have focused on understanding the underlying framework and protocol that Bitcoin depends on. Understanding these concepts is very important. It would not be feasible to detail the artifacts for every single bitcoin wallet or environment. However, by understanding these concepts you will be aware of what to look for and understand the importance of each artifact for your case. This represents one of the few scenarios in our field where we know exactly what we are looking for when we start our examination.

Multibit HD

Let us go beyond theory and take a look at a popular wallet and its associated artifacts. Multibit HD is the successor to Multibit, now Multibit Classic, and both store a great many artifacts. Multibit HD v.0.1.3 was analyzed by sending and receiving transactions with other addresses on the network. While there are plenty of artifacts, we are going to examine only a few in depth, as they hold the bulk of the information an examiner will need.
We are going to analyze these artifacts using the following transaction as an example. The following data are the result of a transaction hash query on blockchain.info. In this transaction, I sent 0.015 BTC to my MultiBit HD wallet.

Blockchain transaction details

Transaction Hash 52ae5a795999a7c3dcf241a48ad0e4d634cc760f67069670cb9f6abc7897f1f3

Input: 39CzWavjtUPaa46FXuKBoRngdPXWpUVtci
Output: 3HJ3QFAaLHFuh2SM3Cby3fBc94UTmPrKEi (0.0049 BTC)
Output: 136QgW9w8QnMoMywh8aaM7jmbhidXrnacd (0.015 BTC)

image
Size 373 bytes
Received Time 2015-09-18 04:31:45
Total Input 0.0199
Estimated BTC Transacted 0.0049
Fees 0.00004 BTC
One thing to note before we continue: when I sent this transaction, I only sent 0.015 BTC to the second address. Why then are we seeing a second address in the mix? In the Wallets and addresses section, I mentioned that bitcoins are associated with an address, not a wallet. This meant that when I sent 0.015 bitcoins from my wallet, which only had one address with 0.02 bitcoins, it had to send the entire address’ worth of bitcoins and return the remainder to my wallet. This is equivalent to paying for a $1.50 item with a $5.00 bill. In this case, you would receive $3.50 back into your wallet.
Be careful not to misinterpret this as a transaction where the user is sending to two separate individuals. Blockchain.info has an Inputs and Outputs section with an “Estimated BTC Transacted” field that attempts to determine how many bitcoins were actually exchanged. For this transaction it estimated 0.0049 BTC, which is incorrect, although on other occasions it produced the correct figure. This field can be used as a guide but should not be relied upon exclusively.

Multibit log

First, let us examine a very important artifact that is stored on the local machine. This is the multibit-hd.txt file, which is a debug log for the application. This debug log is stored in the C:\Users\%User%\AppData\Roaming\MultiBitHD\logs folder. In this directory, there might also be zipped archives with a multibit-hd-YYYY-mm-dd.log naming convention. Multibit HD creates an archive of that day's log at midnight local time. This is great news, as we will rarely have to worry about losing entries because the log has grown too large.
As a debug log, the majority of information in this log is related to application-specific details. However, we can also find information on the user's activity in the application and on transactions. The timestamps in the log use 24-hour time. Specifically, when the user opens the Multibit HD application, multiple entries are recorded in the log, starting with “LoggingFactory bootstrap completed” followed by other startup messages. When a user closes Multibit HD, the log ends with an “Issuing system exit” message preceded by other shutdown messages. With this information we can get an accurate picture of exactly when, and for how long, the application was running. Even better, by using the archived logs we can view historical activity as well. In this respect, the log is more useful for tracking application use than Prefetch and UserAssist artifacts.
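The startup and shutdown markers described above lend themselves to a simple scan. The following is a minimal sketch only: the two marker strings come from the text, but the function name find_sessions and the sample log lines (including their timestamp layout) are invented for illustration and may not match a real multibit-hd log.

```python
from typing import Iterable, List, Tuple

START_MARKER = "LoggingFactory bootstrap completed"  # marker quoted in the text
STOP_MARKER = "Issuing system exit"                  # marker quoted in the text


def find_sessions(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each startup line with the next shutdown line.

    A session with no shutdown entry (crash, power loss, or still running)
    is reported with an empty string as its end line.
    """
    sessions = []
    start = None
    for line in lines:
        line = line.rstrip("\n")
        if START_MARKER in line:
            if start is not None:
                # Previous session never logged a shutdown.
                sessions.append((start, ""))
            start = line
        elif STOP_MARKER in line and start is not None:
            sessions.append((start, line))
            start = None
    if start is not None:
        sessions.append((start, ""))
    return sessions


# Hypothetical sample lines; the real log layout may differ.
sample = [
    "[2015-09-18 00:10:02] LoggingFactory bootstrap completed",
    "[2015-09-18 00:31:45] some other entry",
    "[2015-09-18 01:02:13] Issuing system exit",
    "[2015-09-18 09:00:00] LoggingFactory bootstrap completed",
]

for begin, end in find_sessions(sample):
    print(begin, "->", end or "(no shutdown entry found)")
```

Returning the raw lines rather than parsed timestamps keeps the sketch independent of the exact timestamp format, which should be confirmed against the log under examination.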
Additionally, transactions are stored in the debug log. Again, because of the archived logs we can view a great deal of historical transactions that are conveniently separated by the day. Below is an excerpt from the log that captures the content of our transaction of interest. Some content of this particular message was removed to make the entry more manageable.
image
The timestamp for this transaction agrees with the recorded time on the blockchain. The message begins with “Received transaction” followed by the transaction hash. Unlike the transaction hashes we will see in network packets, this transaction hash can be used immediately with a block explorer to view the transaction details. Later in the message we can see the two address outputs in their 160-bit hash form, which is the result of the RIPEMD-160 hash of the public key. These addresses are in square brackets and preceded by “HASH160 PUSHDATA(20).” Additionally, we can see the amounts the user sent, which were 0.0049 and 0.015 BTC.
image
An entry one line later in the log stores the total USD value of the transaction at the time it occurred. Note that the total value of this transaction, roughly $3.50, is based on the Bitstamp exchange rate of $233.00/BTC.
In my experience, it is not uncommon for desktop wallet software to store logs of transactions among other application data. There are plenty of other Multibit HD artifacts, as this was not an in-depth analysis of the wallet software itself. Instead, the goal is to become comfortable with the inner workings of Bitcoin itself and how these artifacts manifest themselves. And while the way in which that information is stored is application dependent, the network artifacts observed are not.

The bitcoin protocol in action

The data below are the raw packets and parsed packets as displayed in Wireshark v1.12.7. The purpose of this exercise is to explain how to interpret the contents of the packets and what the values mean. Any application running on the Bitcoin network, or on similar altcoin networks, should exhibit the same or similar protocol traffic. For example, one can use the Bitcoin protocol structure to parse raw Litecoin packets.
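When a dissector does not label the traffic, messages can still be located by searching a TCP payload for the Packet Magic and reading the fixed header that follows it. The sketch below assumes the header layout used throughout this chapter (4-byte magic, 12-byte command name, 4-byte little-endian payload length, 4-byte checksum). The name iter_messages is hypothetical, and the magic value is the one seen in this chapter's Bitcoin captures; another coin would typically need its own value supplied.

```python
import struct
from typing import Iterator, Tuple

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # Packet Magic from this chapter's captures
HEADER_LEN = 24  # magic (4) + command (12) + length (4) + checksum (4)


def iter_messages(stream: bytes,
                  magic: bytes = MAINNET_MAGIC) -> Iterator[Tuple[str, bytes, bytes]]:
    """Yield (command, checksum, payload) for each message found in stream."""
    pos = stream.find(magic)
    while pos != -1:
        if pos + HEADER_LEN > len(stream):
            break  # not enough bytes left for a complete header
        (length,) = struct.unpack_from("<I", stream, pos + 16)
        end = pos + HEADER_LEN + length
        if end > len(stream):
            # False positive or truncated capture: resume just past this match.
            pos = stream.find(magic, pos + 1)
            continue
        command = stream[pos + 4:pos + 16].rstrip(b"\x00").decode("ascii", errors="replace")
        checksum = stream[pos + 20:pos + 24]
        yield command, checksum, stream[pos + HEADER_LEN:end]
        pos = stream.find(magic, end)


# TCP payload of the inv packet examined in this chapter (frame offsets 54-114).
inv_message = bytes.fromhex(
    "f9beb4d9"                      # magic
    "696e76000000000000000000"      # "inv" padded with NULs to 12 bytes
    "25000000"                      # payload length (37)
    "b5d48293"                      # payload checksum
    "01" "01000000"                 # count, type
    "f3f19778bc6a9fcb709606670f76cc34d6e4d08aa441f2dcc3a79959795aae52"
)

for command, checksum, payload in iter_messages(inv_message):
    print(command, checksum.hex(), len(payload))
```

Passing the magic as a parameter is a deliberate choice: the scan itself does not depend on the port in use, only on the value that prefixes each message.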

Inv packet

The transaction was sent on September 18, 2015 at 12:31 AM EDT (04:31 UTC). The time at which Wireshark captured the packet agrees closely with the received time reported by Blockchain.info. In my experience, there is rarely more than a few seconds of delay between pressing send and the recipient seeing the transaction. Of course, if the recipient of the transaction was offline, they would not receive any Bitcoin packets until the next time they connected and synced their client.
The inv packet is a notification from a node on the network (81.159.199.0) to the host system (192.168.1.5) of an incoming item. In this case, that item is a transaction because the “Type” is set to 1. The payload checksum for inv and getdata messages will always be 0xb5d48293 when talking about this particular transaction. A different item gets a different payload checksum. The data hash is the most important component of the inv packet. This is the transaction hash after switching endianness.
Packet Time Source Destination Protocol Length Info
322 2015-09-18 00:31:38 81.159.199.0 192.168.1.5 Bitcoin 115 inv
Offset Hex Text
000 d4 3d 7e df 30 47 18 62 2c a8 e5 da 08 00 45 00 .=~.0G.b,.....E.
016 00 65 4e e0 40 00 72 06 df 65 51 9f c7 00 c0 a8 .eN.@.r..eQ.....
032 01 05 20 8d f8 f4 62 a1 13 ae e5 d6 9d 38 50 18 .. ...b......8P.
048 01 03 4e f0 00 00 f9 be b4 d9 69 6e 76 00 00 00 ..N.......inv...
064 00 00 00 00 00 00 25 00 00 00 b5 d4 82 93 01 01 ......%.........
080 00 00 00 f3 f1 97 78 bc 6a 9f cb 70 96 06 67 0f ......x.j..p..g.
096 76 cc 34 d6 e4 d0 8a a4 41 f2 dc c3 a7 99 59 79 v.4.....A.....Yy
112 5a ae 52 Z.R

Inv packet data

Offset Item Value
54–57 Packet Magic 0xf9 be b4 d9
58–69 Command Name inv (0x69 6e 76)
70–73 Payload Length 37 (0x25)
74–77 Payload Checksum 0xb5 d4 82 93
78 Count 1 (0x01)
79–82 Type (01 – tx, 02 – Block) 1 (0x01)
83–114 Data Hash 0xf3 f1 97 78 bc 6a 9f cb 70 96 06 67 0f 76 cc 34 d6 e4 d0 8a a4 41 f2 dc c3 a7 99 59 79 5a ae 52
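The fields tabulated above can be pulled out of the inv payload with a few lines of code, including the byte reversal needed before the hash can be searched on a blockchain explorer. This is a minimal sketch: the function name parse_inv_payload is hypothetical, and it assumes the Count fits in a single byte, as it does in this capture.

```python
import struct
from typing import List, Tuple

INV_TYPES = {1: "tx", 2: "block"}  # type codes listed in Table 1.3


def parse_inv_payload(payload: bytes) -> List[Tuple[str, str]]:
    """Return (type, explorer-order hash) for each entry in an inv payload."""
    count = payload[0]
    if count >= 0xFD:
        raise ValueError("multi-byte counts are not handled in this sketch")
    entries = []
    offset = 1
    for _ in range(count):
        (type_code,) = struct.unpack_from("<I", payload, offset)
        wire_hash = payload[offset + 4:offset + 36]
        # Reverse the byte order to obtain the hash as shown by block explorers.
        explorer_hash = wire_hash[::-1].hex()
        entries.append((INV_TYPES.get(type_code, str(type_code)), explorer_hash))
        offset += 36
    return entries


# Frame offsets 78-114 of the inv packet: Count, Type, Data Hash.
payload = bytes.fromhex(
    "01" "01000000"
    "f3f19778bc6a9fcb709606670f76cc34d6e4d08aa441f2dcc3a79959795aae52"
)

for kind, tx_hash in parse_inv_payload(payload):
    print(kind, tx_hash)
```

The reversed value is the transaction hash listed in the blockchain transaction details earlier in the chapter, which is what makes the inv packet attributable to a specific transaction.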

Getdata packet

Once an inv packet is received, the host will send a getdata request back to the same node in order to receive the details of the item. Again, the payload checksum and data hash will match that found in the inv packet.
Packet Time Source Destination Protocol Length Info
323 2015-09-18 00:31:38 192.168.1.5 81.159.199.0 Bitcoin 115 getdata
Offset Hex Text
000 18 62 2c a8 e5 da d4 3d 7e df 30 47 08 00 45 00 .b,....=~.0G..E.
016 00 65 79 0c 40 00 80 06 a7 39 c0 a8 01 05 51 9f .ey.@....9....Q.
032 c7 00 f8 f4 20 8d e5 d6 9d 38 62 a1 13 eb 50 18 .... ....8b...P.
048 01 00 8f e6 00 00 f9 be b4 d9 67 65 74 64 61 74 ..........getdat
064 61 00 00 00 00 00 25 00 00 00 b5 d4 82 93 01 01 a.....%.........
080 00 00 00 f3 f1 97 78 bc 6a 9f cb 70 96 06 67 0f ......x.j..p..g.
096 76 cc 34 d6 e4 d0 8a a4 41 f2 dc c3 a7 99 59 79 v.4.....A.....Yy
112 5a ae 52 Z.R

Getdata packet data

Offset Item Value
54–57 Packet Magic 0xf9 be b4 d9
58–69 Command Name (0x67657464617461) getdata (0x67 65 74 64 61 74 61)
70–73 Payload Length 37 (0x25)
74–77 Payload Checksum 0xb5 d4 82 93
78 Count 1 (0x01)
79–82 Type (01 – tx & 02 – Block) 1 (0x01)
83–114 Data Hash 0xf3 f1 97 78 bc 6a 9f cb 70 96 06 67 0f 76 cc 34 d6 e4 d0 8a a4 41 f2 dc c3 a7 99 59 79 5a ae 52
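In the Bitcoin protocol, the Payload Checksum is the first four bytes of a double SHA-256 hash of the payload. That is why the inv and getdata messages above, whose payloads are byte-for-byte identical, carry the same checksum. A minimal sketch for recomputing the value follows; the helper name payload_checksum is an assumption for illustration.

```python
import hashlib


def payload_checksum(payload: bytes) -> bytes:
    """Return the first four bytes of SHA-256(SHA-256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


# Payload shared by the inv and getdata packets (frame offsets 78-114):
# Count, Type, then the 32-byte Data Hash.
payload = bytes.fromhex(
    "01" "01000000"
    "f3f19778bc6a9fcb709606670f76cc34d6e4d08aa441f2dcc3a79959795aae52"
)
observed = bytes.fromhex("b5d48293")  # Payload Checksum field in both packets

computed = payload_checksum(payload)
print("computed:", computed.hex())
print("matches capture:", computed == observed)
```

Applied to a tx payload, the same function yields the first four bytes of the transaction hash in its on-the-wire byte order, which is the relationship between the tx Payload Checksum and the Data Hash described in this chapter.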

TX packet

When the node receives the successful getdata packet it will return the transaction details to the client. The tx packet is the most complicated of the packets but has the most forensically relevant artifacts. For the tx packet, its payload checksum will not match the payload checksum found in the previous two packets. Instead, its payload checksum is the first four bytes of the data hash value.
The input and output counts will indicate how many addresses are involved in the input (1) and output (2) respectively. The transaction input substructure can be used to tell the examiner where the input came from. In this case, the previous hash is, after switching the endianness, the transaction hash where these coins are coming from. In this example, the previous transaction is 058792da1b008fe6d0d001334dea6b57465dfee7659fe730d0191bf3d7b8ce0f and shows how the input address obtained its 0.02 bitcoins.
In each transaction output substructure we can see the value sent. Make sure to multiply this value by 1 × 10⁻⁸; otherwise, you will be dealing with a far higher value than was actually sent. Last, the 160-bit hash of the recipient’s public key is embedded within the script value.
image
Below is some example Python code that can be used to perform this task. Be aware that the base58 library is a dependency and not part of the standard library. As a side note, if you remove the first 3 and last 2 bytes of the script value, you can search for the result on Blockchain.info and it will handle the conversion for you.
First, we must remove the first 3 and last 2 bytes of the raw script value if our script is 25 bytes in length. If it is 23 bytes, a different slice is used to arrive at the correct result. Either way, this yields the 20-byte hash of the public key. Next, we generate the checksum, of which we use only the first 4 bytes, and append it to the 20-byte hash. Finally, we base58 encode the 24-byte value and add the appropriate Network ID. On Bitcoin, this Network ID is often a 1 or 3. Be aware that for other altcoins we can often use this same process to reverse engineer their addresses as well, although their Network ID will be different.
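A minimal sketch of these steps follows. It is not the chapter's original listing: it implements base58 inline instead of using the third-party base58 package, and it uses the standard Base58Check form, in which a version byte (0x00 for 25-byte scripts, 0x05 for 23-byte scripts) is included in the checksummed and encoded bytes so that the leading 1 or 3 falls out of the encoding. The function names are hypothetical.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58; each leading zero byte becomes a leading '1'."""
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + encoded


def hash160_to_address(hash160: bytes, version: int) -> str:
    """Base58Check: version byte + 20-byte hash + 4-byte checksum."""
    versioned = bytes([version]) + hash160
    checksum = hashlib.sha256(hashlib.sha256(versioned).digest()).digest()[:4]
    return base58_encode(versioned + checksum)


def script_to_address(script: bytes) -> str:
    """Convert a 25-byte or 23-byte output script to its address."""
    if len(script) == 25 and script[:3] == bytes.fromhex("76a914") \
            and script[-2:] == bytes.fromhex("88ac"):
        return hash160_to_address(script[3:23], 0x00)   # drop first 3, last 2
    if len(script) == 23 and script[:2] == bytes.fromhex("a914") \
            and script[-1:] == bytes.fromhex("87"):
        return hash160_to_address(script[2:22], 0x05)   # drop first 2, last 1
    raise ValueError("unrecognised script layout")


# The two output scripts from the tx packet in this chapter.
output_1 = bytes.fromhex("a914ab29adca136975702d7cffc86c19c15d585bbd4787")
output_2 = bytes.fromhex("76a91416f6196ecabe5ae47bb825e05321384b55c7503188ac")

print(script_to_address(output_1))
print(script_to_address(output_2))
```

The printed addresses can be compared against the two outputs shown in the blockchain transaction details for this transaction.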
Packet Time Source Destination Protocol Length Info
324 2015-09-18 00:31:38 81.159.199.0 192.168.1.5 Bitcoin 451 tx
Offset Hex Text
000 d4 3d 7e df 30 47 18 62 2c a8 e5 da 08 00 45 00 .=~.0G.b,.....E.
016 01 b5 4e e1 40 00 72 06 de 14 51 9f c7 00 c0 a8 ..N.@.r...Q.....
032 01 05 20 8d f8 f4 62 a1 13 eb e5 d6 9d 75 50 18 .. ...b......uP.
048 01 02 4d fd 00 00 f9 be b4 d9 74 78 00 00 00 00 ..M.......tx....
064 00 00 00 00 00 00 75 01 00 00 f3 f1 97 78 01 00 ......u......x..
080 00 00 01 0f ce b8 d7 f3 1b 19 d0 30 e7 9f 65 e7 ...........0..e.
096 fe 5d 46 57 6b ea 4d 33 01 d0 d0 e6 8f 00 1b da .]FWk.M3........
112 92 87 05 00 00 00 00 fd fe 00 00 48 30 45 02 21 ...........H0E.!
128 00 b6 1f d0 d2 86 9e 0a 09 90 c4 cb 3e c6 c6 03 ............>...
144 20 c4 fd 29 20 b7 e5 60 72 0b 44 38 35 a7 49 2f  ..) ..`r.D85.I/
160 73 02 20 0e 59 07 e9 73 dc 14 34 81 49 84 af f4 s. .Y..s..4.I...
176 3f 1a a9 2b 94 73 12 49 e4 1d 83 d4 e2 f1 4f cb ?..+.s.I......O.
192 d9 6a 94 01 48 30 45 02 21 00 c8 82 da b8 36 2b .j..H0E.!.....6+
208 81 e7 80 42 a1 a5 b4 7f 50 26 72 a3 77 6f da ef ...B....P&r.wo..
224 2d 65 bc 89 28 d5 d5 e6 06 14 02 20 78 e6 6c e0 -e..(...... x.l.
240 4b 74 82 f1 6c fb 78 8c 70 0e 38 b1 79 18 a5 25 Kt..l.x.p.8.y..%
256 d2 e1 a8 33 cb 30 44 91 dc af 4a b7 01 4c 69 52 ...3.0D...J..LiR
272 21 03 73 ab 9a fa d3 cf 1f 03 15 85 77 3e 1e 71 !.s.........w>.q
288 d4 33 9d 27 2a 96 f1 c5 c2 fb 58 c4 8b d0 8e 34 .3.'*.....X....4
304 59 7a 21 03 03 93 33 61 c8 60 c7 a6 67 f6 40 cb Yz!...3a.`..g.@.
320 dc 75 b3 4e c5 92 7e c8 ce 2f 02 a9 90 02 1c 7f .u.N..~../......
336 9c ef 05 6d 21 02 db 82 a3 1d b3 e1 af 54 cf 3e ...m!........T.>
352 dd ee 91 db 09 1c e3 ae df 30 d5 cf 65 ae 4a 82 .........0..e.J.
368 60 95 a1 ad 13 ab 53 ae ff ff ff ff 02 58 7f 07 `.....S......X..
384 00 00 00 00 00 17 a9 14 ab 29 ad ca 13 69 75 70 .........)...iup
400 2d 7c ff c8 6c 19 c1 5d 58 5b bd 47 87 60 e3 16 -|..l..]X[.G.`..
416 00 00 00 00 00 19 76 a9 14 16 f6 19 6e ca be 5a ......v.....n..Z
432 e4 7b b8 25 e0 53 21 38 4b 55 c7 50 31 88 ac 00 .{.%.S!8KU.P1...
448 00 00 00 ...

TX packet data

Byte Offset Item Value
54–57 Packet Magic 0xf9 be b4 d9
58–69 Command Name (0x7478) tx (0x74 78)
70–73 Payload Length 373 (0x75 01)
74–77 Payload Checksum 0xf3 f1 97 78
78–81 Transaction Version 1 (0x01)
82 Input Count 1 (0x01)
83–379* Transaction Input *Variable length (297 bytes)
83–114 Previous Output – Transaction Hash 0x0f ce b8 d7 f3 1b 19 d0 30 e7 9f 65 e7 fe 5d 46 57 6b ea 4d 33 01 d0 d0 e6 8f 00 1b da 92 87 05
115–118 Previous Output – Index 0
120–121 Previous Output – Script Length 254 (0xfe)
122–375 Previous Output – Signature Script

0x00 48 30 45 02 21 00 b6 1f d0 d2 86 9e 0a 09 90 c4 cb 3e c6 c6 03 20 c4 fd 29 20 b7 e5 60 72 0b 44 38 35 a7 49 2f

dd ee 91 db 09 1c e3 ae df 30 d5 cf 65 ae 4a 82 60 95 a1 ad 13 ab 53 ae

376–379 Previous Output – Sequence 4294967295 (0xff ff ff ff)
380 Output Count 2 (0x02)
381–412 Transaction Output #1 (32 bytes)
381–388 Value 491352 (0x58 7f 07)
389 Script Length 23 (0x17)
390–412 Script 0xa9 14 ab 29 ad ca 13 69 75 70 2d 7c ff c8 6c 19 c1 5d 58 5b bd 47 87
413–446 Transaction Output #2 (34 bytes)
413–420 Value 1500000 (0x60 e3 16)
421 Script Length 25 (0x19)
422–446 Script 0x76 a9 14 16 f6 19 6e ca be 5a e4 7b b8 25 e0 53 21 38 4b 55 c7 50 31 88 ac
447–450 Block lock time or Block ID 0 (0x00)
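The output portion of the table can be reproduced programmatically, including the conversion of each Value to BTC. The sketch below starts at the Output Count (frame offset 380) and assumes that the count and script lengths each fit in one byte, as they do here. The name parse_outputs is hypothetical, and Decimal is used to avoid floating-point rounding when scaling by 1 × 10⁻⁸.

```python
import struct
from decimal import Decimal
from typing import List, Tuple

SATOSHI = Decimal("0.00000001")  # 1 x 10^-8 BTC per unit of Value


def parse_outputs(data: bytes) -> Tuple[List[Tuple[Decimal, bytes]], int]:
    """Parse from Output Count onward; return ([(btc, script), ...], lock_time)."""
    count = data[0]
    offset = 1
    outputs = []
    for _ in range(count):
        (value,) = struct.unpack_from("<Q", data, offset)  # 8-byte little-endian
        script_len = data[offset + 8]
        script = data[offset + 9:offset + 9 + script_len]
        outputs.append((value * SATOSHI, script))
        offset += 9 + script_len
    (lock_time,) = struct.unpack_from("<I", data, offset)
    return outputs, lock_time


# Frame offsets 380-450 of the tx packet shown above.
tail = bytes.fromhex(
    "02"
    "587f070000000000" "17"
    "a914ab29adca136975702d7cffc86c19c15d585bbd4787"
    "60e3160000000000" "19"
    "76a91416f6196ecabe5ae47bb825e05321384b55c7503188ac"
    "00000000"
)

outputs, lock_time = parse_outputs(tail)
for btc, script in outputs:
    print(btc, "BTC", len(script), "byte script", script.hex())
print("lock time:", lock_time)
```

The full eight-byte Value field is read even though the table abbreviates it to its three non-zero bytes, since larger amounts occupy more of the field.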

Summary

In this chapter, we learned how cryptocurrencies work at a technical level by using Bitcoin as an example. Not every aspect of the currency was examined and some simplifications were made. Armed with this knowledge, examiners should now be able to confidently discuss their findings based on factual background and understand relevant artifacts.
When encountering a new cryptocurrency, look for the white paper on that particular currency. The white paper will often go into great detail about technical aspects of the currency. Most cryptocurrencies are supported by passionate investors, who often post a deluge of information on the specifics of the currency in various forums or wikis.
Above all, becoming comfortable with the blockchain can greatly help speed up the process of these types of investigations. Websites, such as Blockchain.info, provide a variety of tools and built-in APIs that can be leveraged to assist an examiner. While out of the scope of this chapter, it is not difficult to script using these APIs to automate some of these processes. For example, one could write a script to parse the relevant log for transaction hashes and then query the blockchain for other relevant information before writing out the data to a CSV file.
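As a rough illustration of the first half of such a script, the sketch below pulls transaction hashes out of log lines and writes them to CSV; the blockchain query step is left out. Only the “Received transaction” prefix comes from this chapter. The function names, the CSV column name, and the sample log line (including its timestamp layout and trailing text) are assumptions for illustration.

```python
import csv
import io
import re
from typing import Iterable, List, TextIO

# "Received transaction" followed, somewhere on the line, by a 64-hex-digit hash.
TX_PATTERN = re.compile(r"Received transaction.*?\b([0-9a-fA-F]{64})\b")


def extract_tx_hashes(lines: Iterable[str]) -> List[str]:
    """Return unique transaction hashes in the order they first appear."""
    seen: List[str] = []
    for line in lines:
        match = TX_PATTERN.search(line)
        if match:
            tx_hash = match.group(1).lower()
            if tx_hash not in seen:
                seen.append(tx_hash)
    return seen


def write_csv(hashes: Iterable[str], handle: TextIO) -> None:
    """Write one hash per row under a single header column."""
    writer = csv.writer(handle)
    writer.writerow(["transaction_hash"])
    for tx_hash in hashes:
        writer.writerow([tx_hash])


# Hypothetical log lines; the real log layout may differ.
sample = [
    "[2015-09-18 00:31:45] Received transaction "
    "52ae5a795999a7c3dcf241a48ad0e4d634cc760f67069670cb9f6abc7897f1f3 (details removed)",
    "[2015-09-18 00:31:46] unrelated entry",
]

buffer = io.StringIO()
write_csv(extract_tx_hashes(sample), buffer)
print(buffer.getvalue())
```

In practice the in-memory buffer would be replaced by a file opened with newline="" so that the csv module controls line endings.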