Understanding Blockchain - Build a blockchain application from scratch in Python
I am not a blockchain expert. I only studied blockchain for a semester. But I know the most effective way to learn something is to build it from scratch. I will have chances to deal with the issues during building it. I am happy to share my view on the process of building a fully functional application: A Simple Blockchain-based Voting System (video demo). Open source is available here.

What is “blockchain”?
We can simply understand blockchain is a type of decentralized database used to store data, any type of data. The key difference is that data is stored in the form of blocks, a chain of blocks using a hash function. For Bitcoin, the data can be the transactions (logs of transfers of Bitcoin from one account to another). The chain of blocks is linked by hashes where the hash of the current block includes the hash of the previous block.

Blockchain has the following characteristics:
-
Decentralization: Any information stored in the blockchain acts as a unit of the whole network. There is no centralized authority; all the information is shared among nodes.
-
Immutability: The data of blockchain is append-only. No delete. You have to add new annotated data to fix something in the past. It makes data on the blockchain traceable.
-
Distributed ledgers: Blockchain can be public or private. But it always belongs to a group, not to anyone. It is typically managed by a peer-to-peer network collectively adhering to a protocol for inter-node communication and validating new blocks.
The Core Idea Behind Blockchain
A little bit of history, in 2008, to implement a system where document timestamps could not be tampered without requiring a trusted party, Nakamoto improved the design of Merkle trees by using a Hashcash-like method to timestamp blocks. Hashcash is a proof-of-work system that was first used to limit email spam and denial-of-service attacks. Hashcash algorithm requires the sender to add an encoding of hashcash stamp to the header of the email. The header requires the recipient’s email address, the date of the message, the counter and the important condition, the first 20 bits of the 160-bit SHA-1 hash of the header are all zeros. The sender has to find the appropriate counter to satisfy the above condition. The only known way to do that is brute force. The recipient can easily check the condition of the email. If it satisfies, the recipient knows that the sender has lost a certain amount of computations to send an email, so they can believe that the sender cannot send this mail to everyone.
The Hashcash-like method in blockchain works with the same mechanism. The data is append-only and organized as blocks. A block consists of the data, previous hash, and nonce number. The nonce number plays a role as the counter and the current hash as the hashcash in the email example.
current_hash = hash(previous_hash + data + nonce)
The number of first zeros in the current hash is called the difficulty parameter. The greater the difficulty, the lower the likelihood of finding the appropriate nonce. In order to append some data, you have to create a new block and find a nonce that makes the current hash representation has the number of first zeros enough. It means you have to trade a certain amount of computational power to add a new block. This algorithm is called proof of work and this process is called mining.
Since the hash of a block will be in the header of the next block, you have to recalculate the subsequent blocks if you want to change data in the past. It also means that it is possible to have many different branches and many different data versions. In that case, blockchain maintains a protocol guaranteeing that the longest chain will be treated as the main chain. This protocol is called consensus.
Nowadays, there are many types of blockchain with different consensus algorithms. We will not discuss them here.
Why are “hashcash”, “chain”, and “longest-chain”?
In order to comprehend, we consider a scenario called a 51% attack, which is posed when Bitcoin was born. Let imagine that you join Bitcoin and use Bitcoin to buy a car. No trusted party in Bitcoin. It has to validate your transaction by itself ( checking if there is enough money in your account, this process will be validated by other people in the Bitcoin network ). Suppose you are a hacker and you want to cheat that transaction. Bitcoin does not store your balance (Bitcoin trace all of your transactions in the past to get your balance). The only thing you can do is to make the previous transaction disappear from the Bitcoin database. How?
You can clone blockchain to your machine, remove your transaction, and continue sharing in the network and validating other transactions. You will get your own branch. In this branch, you got a car without money.
The problem is other people in the blockchain network only accept the longest chain. You want to make other people believe that your chain is the real one. Thus, you have to make your chain longer than others. But when you mine your block, others are also mining in the main chain. As a consequence, you must have a computer with greater computing power than all others combined. In other words, your computational power must be at least 51% of the computing power of the entire network to catch up with the main chain. With a public network like Bitcoin, it is possible to consider it as impossible.
Blockchain in Python
In this section, we implement a simple blockchain from scratch with Python. This piece is a summarization of the fantastic article. This post aims to help you easily follow source code; you should run it yourself and print all things you are confused about. If you already know what it is, check part 2 to build a fully functional Blockchain-based Voting System.
First, you need a class to represent a block:
class Block:
def __init__(self, index, transactions, timestamp, previous_hash, nonce = 0):
self.index = index
self.transactions = transactions # data
self.timestamp = timestamp
self.previous_hash = previous_hash
self.nonce = nonce
def compute_hash(self):
block_string = json.dumps(self.__dict__, sort_keys=True)
return sha256(block_string.encode()).hexdigest()
and a class to represent a blockchain:
class Blockchain:
# difficulty of our PoW algorithm
difficulty = 2
def __init__(self):
self.unconfirmed_transactions = []
self.chain = [] #store chain of Block
@property
def last_block(self):
return self.chain[-1]
transactions
are the data that are stored in JSON format. For example, in Bitcoin, it will look like:
{
"index": "an unique id",
"sender": "private key of sender",
"receiver": "public key of receiver",
"amount": 100,
"timestamp": "The time at which the transaction was created"
}
unconfimed_transactions
is a pool that stores unvalidated transactions. Miner will pick transactions from the pool to validate.
The first block in the blockchain is always constructed with previous_hash is 0. In order to add a new block to the blockchain, you must implement the proof of work algorithm which will find appropriate nonce by brute-force and mine function which will get unconfirmed_transactions
and try to validate those transactions and run proof_of_work
and add a new block to the chain.
class Blockchain:
"""
previous code...
"""
def proof_of_work(self, block):
"""
Function that tries different values of nonce to get a hash
that satisfies our difficulty criteria.
"""
block.nonce = 0
computed_hash = block.compute_hash()
while not computed_hash.startswith('0' * Blockchain.difficulty):
block.nonce += 1
computed_hash = block.compute_hash()
return computed_hash
def mine(self):
"""
This function serves as an interface to add the pending
transactions to the blockchain by adding them to the block
and figuring out Proof Of Work.
"""
if not self.unconfirmed_transactions:
return False
last_block = self.last_block
new_block = Block(index=last_block.index + 1,
transactions=self.unconfirmed_transactions,
timestamp=time.time(),
previous_hash=last_block.hash)
proof = self.proof_of_work(new_block)
self.add_block(new_block, proof)
self.unconfirmed_transactions = []
return new_block.index
add_block
should be easily appending a new block. But the problem occurs when there are many people mining at the same time. They might get a new block before us, then the chain we use is no longer the longest chain. So we must check the validity of our chain before adding.
class Block:
"""
previous code...
"""
def add_block(self, block, proof):
"""
A function that adds the block to the chain after verification.
Verification includes:
* Checking if the proof is valid.
* The previous_hash referred in the block and the hash of latest block
in the chain match.
"""
previous_hash = self.last_block.hash
if previous_hash != block.previous_hash:
return False
if not self.is_valid_proof(block, proof):
return False
block.hash = proof
self.chain.append(block)
return True
def is_valid_proof(self, block, block_hash):
"""
Check if block_hash is valid hash of block and satisfies
the difficulty criteria.
"""
return (block_hash.startswith('0' * Blockchain.difficulty) and
block_hash == block.compute_hash())
We almost had a blockchain that can run on a single computer. The next thing we should do is create an application using this blockchain. We leverage the Flask framework to build a RESTful application that interacts with our blockchain database.
from flask import Flask, request
import requests
# Initialize flask application
app = Flask(__name__)
# Initialize a blockchain object.
blockchain = Blockchain()
# Flask's way of declaring end-points
@app.route('/new_transaction', methods=['POST'])
def new_transaction():
"""
API to post a transaction to Blockchain
"""
#...blockchain.unconfirmed_transaction.append(tx_data)
return "Success", 201
@app.route('/chain', methods=['GET'])
def get_chain():
"""
API to get chain of block, in JSON format
"""
# ...data = block.chain.__dict__()
return jsonify(data)
@app.route('/mine', methods=['GET'])
def mine_unconfirmed_transactions():
"""
API to call mine function in blockchain
get transactions in unconfirmed_transactions to validate and add new block
"""
blockchain.mine()
return "Success", 2001
### And write some endpoints that you need in your application
Finally, to join a blockchain network, you have to adhere to the protocol for inter-node communication, which is the consensus algorithm.
def consensus():
"""
Our simple consensus algorithm. If a longer valid chain is
found, our chain is replaced with it.
"""
global blockchain
longest_chain = None
current_len = len(blockchain.chain)
### get blockchain from other nodes, check if your blockchain is the longest one, otherwise, pull the longest one.
if longest_chain:
blockchain = longest_chain
return True
return False
Up to this point, we have understood what blockchain is, and how to implement a simple one. In the next post, I want to share with you how to implement a fully functional blockchain-based application with different architecture. I also want to introduce a new concept in blockchain, smart-contract, and how to handle it.
References:
- https://developer.ibm.com/technologies/blockchain/tutorials/develop-a-blockchain-application-from-scratch-in-python/
- https://en.wikipedia.org/wiki/Blockchain
- https://en.wikipedia.org/wiki/Hashcash
- https://www.coindesk.com/moscow-blockchain-voting-system-completely-insecure-says-researcher
- https://steemit.com/blockchain/@abdulganiy/understanding-decentralised-and-distributed-databases
- https://blog.goodaudience.com/what-is-a-51-attack-or-double-spend-attack-aa108db63474