1. What is Bitcoin?

Bitcoin is an electronic currency (digital currency) based on cryptography. On 1st November 2008, Satoshi Nakamoto (whose identity remains unknown—whether they are a person, an AI, an organisation, or an individual is uncertain) proposed the concept of Bitcoin. They published a paper on Bitcoin, which has since become the white paper of Bitcoin.

In this white paper, a decentralised electronic bookkeeping system was proposed. Traditional electronic cash systems use banks to keep accounts, supported by the country’s credit, whereas the decentralised electronic bookkeeping system is shared among participants. People obtain bitcoins through mining and complete payments through public accounting.

2. The principle of blockchain

(1) Merkle Tree

A hash tree, or Merkle tree, is a tree in which each leaf node is marked with the cryptographic hash of a data block, while each non-leaf node is labelled with the cryptographic hash of its child nodes. Hash trees allow efficient and secure verification of the contents of large data structures. A hash tree is a generalisation of hash lists and hash chains. In order to prove that a leaf node is part of a given binary hash tree, it is necessary to compute a number of hashes proportional to the logarithm of the number of leaf nodes. This is in contrast to hash lists, where the number of operations required is directly proportional to the number of leaf nodes.

This is an example of a binary hash tree. Hash 0-0 and 0-1 are the hash values of data blocks L1 and L2, respectively, and hash 0 is the concatenation of hashes 0-0 and 0-1.

(2) Block and blockchain

A blockchain is composed of many blocks, linked together using encryption. Each block contains a block header and transaction information, often represented by a Merkle tree. The header information includes the hash sequence and the timestamp. By design, the blockchain is resistant to modification of its data, as any changes to a given block require altering all subsequent blocks.

Conceptually, the blockchain can be seen as consisting of five layers:

  1. Infrastructure (hardware)
  2. Network (node discovery, information dissemination, and verification)
  3. Consensus (proof of work, proof of stake)
  4. Data (blocks, transactions)
  5. Applications (dApps)

Because the data in any given block cannot be modified without altering all subsequent blocks, blockchain technology ensures security and integrity.

The figure above shows the formation of a blockchain. The main chain (in black) consists of the longest sequence of blocks from the genesis block (in green) to the current block. There are also isolated blocks (in purple) outside the main chain.

Blockchain allows participants to independently verify and audit transactions at a relatively low cost. It uses peer-to-peer networks and distributed timestamp servers to independently manage the blockchain database. This ensures security through large-scale collaboration and collective self-interest. Blockchain is often described as a value exchange protocol, eliminating the infinite reproducibility of digital assets and addressing the double-spending problem. It can also maintain ownership records, detailing offers and acceptances when set up correctly.

An example: Imagine a transaction initiated between four individuals, A, B, C, and D. A transfers 100 bitcoins to B. Since this is a decentralised accounting method, each of these four participants records the transaction in their own ledger, noting the transfer of 100 bitcoins from A to B. This transaction information is stored in a block. Each block is 1MB in size and can store approximately 4,000 pieces of information. Why is A chosen as the reference point to notify others? How do we determine who should keep the records in ordinary transactions? What incentives exist for keeping records? These are key questions encountered in blockchain design.

3. Reasons for using blockchain for accounting

To incentivise participation, Satoshi Nakamoto introduced a reward mechanism in their blockchain design. There are two types of rewards for maintaining the blockchain: transaction fees and block rewards.

Each initiated transaction includes a small fee, which is rewarded to the users who record and verify the transaction. In addition to transaction fees, the system rewards the user who successfully adds a new block to the blockchain. According to Satoshi Nakamoto’s design, a new block is generated every ten minutes, and a certain amount of bitcoin is rewarded to the miner. Initially, this reward was set to 50 bitcoins, and it halves approximately every four years. Consequently, over time, the reward decreases, and the total number of bitcoins will approach 21 million, making it increasingly difficult to mine new bitcoins.

4. Who is the “centre” in each block?

(1) Obtaining the power to package through the proof-of-work mechanism

Satoshi Nakamoto also described how each user or node competes for the power of bookkeeping through a proof-of-work mechanism. The first user who successfully solves the required calculations for each block is allowed to produce the next block. The proof-of-work mechanism involves solving a cryptographic puzzle based on the SHA256 algorithm, requiring users to try solutions one by one. The first to solve the puzzle gains the right to package the block—this process is known as mining.

(2) SHA256 algorithm

The Secure Hash Algorithm (SHA) is a series of cryptographic hash functions defined by the U.S. National Institute of Standards and Technology. SHA-1 and SHA-2 are different versions of this algorithm, varying in the bit length of their outputs and internal structure. SHA-2, regarded as an improvement over SHA-1, includes different variants, the most popular being the 256-bit version (SHA256). The final output of SHA256 is a 256-bit binary number. The following is the pseudocode of the SHA256 algorithm:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
Initialize hash values:
(first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19):
h0 := 0x6a09e667
h1 := 0xbb67ae85
h2 := 0x3c6ef372
h3 := 0xa54ff53a
h4 := 0x510e527f
h5 := 0x9b05688c
h6 := 0x1f83d9ab
h7 := 0x5be0cd19

Initialize array of round constants:
(first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311):
k[0..63] :=
0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2

Pre-processing (Padding):
begin with the original message of length L bits
append a single '1' bit
append K '0' bits, where K is the minimum number >= 0 such that L + 1 + K + 64 is a multiple of 512
append L as a 64-bit big-endian integer, making the total post-processed length a multiple of 512 bits
such that the bits in the message are L 1 00..<K 0's>..00 <L as 64 bit integer> = k*512 total bits

Process the message in successive 512-bit chunks:
break message into 512-bit chunks
for each chunk
create a 64-entry message schedule array w[0..63] of 32-bit words
(The initial values in w[0..63] don't matter, so many implementations zero them here)
copy chunk into first 16 words w[0..15] of the message schedule array

Extend the first 16 words into the remaining 48 words w[16..63] of the message schedule array:
for i from 16 to 63
s0 := (w[i-15] rightrotate 7) xor (w[i-15] rightrotate 18) xor (w[i-15] rightshift 3)
s1 := (w[i- 2] rightrotate 17) xor (w[i- 2] rightrotate 19) xor (w[i- 2] rightshift 10)
w[i] := w[i-16] + s0 + w[i-7] + s1

Initialize working variables to current hash value:
a := h0
b := h1
c := h2
d := h3
e := h4
f := h5
g := h6
h := h7

Compression function main loop:
for i from 0 to 63
S1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25)
ch := (e and f) xor ((not e) and g)
temp1 := h + S1 + ch + k[i] + w[i]
S0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22)
maj := (a and b) xor (a and c) xor (b and c)
temp2 := S0 + maj

h := g
g := f
f := e
e := d + temp1
d := c
c := b
b := a
a := temp1 + temp2

Add the compressed chunk to the current hash value:
h0 := h0 + a
h1 := h1 + b
h2 := h2 + c
h3 := h3 + d
h4 := h4 + e
h5 := h5 + f
h6 := h6 + g
h7 := h7 + h

Produce the final hash value (big-endian):
digest := hash := h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7

(3). SHA256 core algorithm code formed by C++ language

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
#pragma once

#ifndef SHA256_H
#define SHA256_H
#include <string>

class SHA256
{
protected:
typedef unsigned char uint8;
typedef unsigned int uint32;
typedef unsigned long long uint64;

const static uint32 sha256_k[];
static const unsigned int SHA224_256_BLOCK_SIZE = (512 / 8);
public:
void init();
void update(const unsigned char* message, unsigned int len);
void final(unsigned char* digest);
static const unsigned int DIGEST_SIZE = (256 / 8);

protected:
void transform(const unsigned char* message, unsigned int block_nb);
unsigned int m_tot_len;
unsigned int m_len;
unsigned char m_block[2 * SHA224_256_BLOCK_SIZE];
uint32 m_h[8];
};

std::string sha256(std::string input);

#define SHA2_SHFR(x, n) (x >> n)
#define SHA2_ROTR(x, n) ((x >> n) | (x << ((sizeof(x) << 3) - n)))
#define SHA2_ROTL(x, n) ((x << n) | (x >> ((sizeof(x) << 3) - n)))
#define SHA2_CH(x, y, z) ((x & y) ^ (~x & z))
#define SHA2_MAJ(x, y, z) ((x & y) ^ (x & z) ^ (y & z))
#define SHA256_F1(x) (SHA2_ROTR(x, 2) ^ SHA2_ROTR(x, 13) ^ SHA2_ROTR(x, 22))
#define SHA256_F2(x) (SHA2_ROTR(x, 6) ^ SHA2_ROTR(x, 11) ^ SHA2_ROTR(x, 25))
#define SHA256_F3(x) (SHA2_ROTR(x, 7) ^ SHA2_ROTR(x, 18) ^ SHA2_SHFR(x, 3))
#define SHA256_F4(x) (SHA2_ROTR(x, 17) ^ SHA2_ROTR(x, 19) ^ SHA2_SHFR(x, 10))
#define SHA2_UNPACK32(x, str) \
{ \
*((str) + 3) = (uint8) ((x) ); \
*((str) + 2) = (uint8) ((x) >> 8); \
*((str) + 1) = (uint8) ((x) >> 16); \
*((str) + 0) = (uint8) ((x) >> 24); \
}
#define SHA2_PACK32(str, x) \
{ \
*(x) = ((uint32) *((str) + 3) ) \
| ((uint32) *((str) + 2) << 8) \
| ((uint32) *((str) + 1) << 16) \
| ((uint32) *((str) + 0) << 24); \
}
#endif

5. Blockchain and Bitcoin Security

Are blockchain and Bitcoin secure? How do they prevent counterfeiting, tampering, and double spending? In everyday life, identity authentication can involve facial recognition, signatures, or fingerprints. However, once digitised, these can be forged by copying. To address this, Bitcoin uses digital signatures. Digital signature technology primarily relies on asymmetric encryption. Bitcoin first generates a random number to create a private key (known only to the user). This private key is used to encrypt data, and a public key is then generated from it, which is used for decryption. Finally, a public address is generated, allowing users to conduct anonymous transactions. A common encryption algorithm used for such purposes is RSA, while Bitcoin employs an elliptic curve cryptography algorithm.

Conversion process

  1. First, use a random number generator to produce a private key, which is a 256-bit binary number. The private key is akin to a bank card PIN and must remain confidential.

  2. The private key is then used to generate a public key using the SECP256K1 algorithm, an elliptic curve cryptographic method similar in function to RSA. A public key can be generated from a known private key, but it is computationally infeasible to deduce the private key from the public key.

  3. Like the SHA256 algorithm, RIPEMD160 is also a hash function, producing a hash value of the public key. However, it is impossible to derive the original public key from this hash value.

  4. A one-byte version number is added to the start of the public key hash, followed by two rounds of the SHA256 hash function. The first 4 bytes of the result are used as a checksum and appended to the end.

  5. The result is then encoded using BASE58 to obtain the wallet address (akin to a bank account), such as A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa.

(1) Resolving Forged Transaction Records

The figure above shows an event where A transfers 1 bitcoin to B. After computing a digest using a hash function, the digest is encrypted using A’s private key, resulting in a signature. Due to the uniqueness of the private key, the digest is also unique. A broadcasts the following messages: “A transfers 1 bitcoin to B”, A’s public key, and A’s signature.

Suppose the message “A transfers 1 bitcoin to B” is false, resulting in Digest 1 based on the hash operation. If A’s public key is used to decrypt the signature and yields Digest 2, any discrepancy between Digest 1 and Digest 2 would indicate that “A transfers 1 bitcoin to B” is a forged message.

(2) Preventing Tampering with Transaction Records

To prevent tampering with transaction records, the blockchain relies on the longest chain principle:

When the blockchain branches—i.e., when more than one person mines the next block at almost the same time—the chain direction diverges. Generally, the longest chain principle is used to determine which chain to continue. Suppose group A chooses to mine on-chain while group B chooses to mine off-chain. If group A successfully mines the next block first, adding it to their chain, group B will abandon their chain and switch to the new block. Typically, the shorter chain is discarded.

According to the longest chain principle, if someone attempts to tamper with a block, they must create a new chain starting from that block and make it longer than the original chain. In other words, they would need computational power greater than that of all other mining nodes combined to achieve this—a scenario with extremely low probability. For instance, if a person controls 90% of the world’s mining machines, they could theoretically tamper with the transaction history of a chain, but it would be far more profitable to use that mining power to mine legitimately.

(3) Preventing Double Spending

In the event of double spending, for example, A having only 100 bitcoins but simultaneously broadcasting the messages “A to B: 100 bitcoins” (recorded as message b) and “A to C: 100 bitcoins” (recorded as message c), group D will reject message c if it receives message b first, and group E will do the opposite. The validity of each message depends on which group (D or E) manages to solve the proof-of-work first. The first to successfully mine will write the confirmed transaction in the new block, while the other transaction is deemed invalid.

Copyright Notice
This article, except for the referenced content below, is the original work of Junhao. The author retains the exclusive rights to its final interpretation. If there are any issues regarding copyright infringement, please contact me for removal. Reproduction or distribution of this content without my explicit permission is prohibited.

6. References

[1]. BMoney http://www.weidai.com/bmoney.txt

[2]. Wikipedia https://en.wikipedia.org/wiki/Bitcoin

[3]. Wikipedia https://en.wikipedia.org/wiki/SHA-2

[4]. Wikipedia https://en.wikipedia.org/wiki/Blockchain

[5]. Youtube https://www.youtube.com/watch?v=g_fSistU3MQ&t=11s

[6]. BitcoinOrg https://bitcoin.org/bitcoin.pdf