Understanding Ethereum Concepts

Required reading:

Block, transaction, account state objects and Ethereum tries

  • The world state trie contains the mapping between addresses and account states. The hash of the root node of the world state trie is included in a block (in the stateRoot field) to represent the current state when that block was created. We only have one world state trie.
  • The account storage trie contains the data associated with a smart contract. The hash of the root node of the account storage trie is included in the account state (in the storageRoot field). We have one account storage trie for each account.
  • The transaction trie contains all the transactions included in a block. The hash of the root node of the Transaction trie is included in the block header (in the transactionsRoot field). We have one transaction trie per block.
  • The transaction receipt trie contains all the transaction receipts for the transactions included in a block. The hash of the root node of the transaction receipts trie is also included in the block header (in the receiptsRoot field). We have one transaction receipts trie per block.

Source: Merkle Tree and Ethereum Objects - Ethereum Yellow Paper Walkthrough (2/7)
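
The relationship between the four tries and the fields that hold their roots can be sketched as follows. This is a toy model under loud assumptions: `toy_root` is a hypothetical stand-in that hashes sorted key/value pairs with SHA-256, not a real Merkle Patricia trie with Keccak-256, and the `Account`, `BlockHeader` and `build_header` names are made up for illustration.

```python
import hashlib
from dataclasses import dataclass


def toy_root(items: dict) -> bytes:
    """Stand-in for a trie root: hash of the sorted key/value pairs (NOT a real MPT)."""
    digest = hashlib.sha256()
    for key in sorted(items):
        digest.update(key + b"=" + items[key] + b";")
    return digest.digest()


@dataclass
class Account:
    nonce: int
    balance: int
    storage: dict  # slot -> value; only meaningful for contract accounts

    @property
    def storage_root(self) -> bytes:
        # One storage trie per account.
        return toy_root(self.storage)

    def encode(self) -> bytes:
        return b"%d|%d|%s" % (self.nonce, self.balance, self.storage_root.hex().encode())


@dataclass
class BlockHeader:
    state_root: bytes         # root of the single world state trie
    transactions_root: bytes  # root of this block's transaction trie
    receipts_root: bytes      # root of this block's receipt trie


def build_header(world_state: dict, transactions: list, receipts: list) -> BlockHeader:
    state = {address: account.encode() for address, account in world_state.items()}
    txs = {str(i).encode(): tx for i, tx in enumerate(transactions)}
    rcpts = {str(i).encode(): receipt for i, receipt in enumerate(receipts)}
    return BlockHeader(toy_root(state), toy_root(txs), toy_root(rcpts))


world = {
    b"\x01" * 20: Account(nonce=0, balance=10, storage={}),
    b"\x02" * 20: Account(nonce=1, balance=0, storage={b"\x00": b"\x2a"}),
}
before = build_header(world, [b"tx0"], [b"receipt0"])
world[b"\x02" * 20].storage[b"\x00"] = b"\x2b"  # a storage write changes storageRoot...
after = build_header(world, [b"tx0"], [b"receipt0"])  # ...and therefore stateRoot
```

A single storage write propagates upward: the account's storage root changes, so the account's encoding changes, so the state root in the header changes, while the transaction and receipt roots stay the same.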

Externally owned account (EOA) vs. contract account

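Both account types use 20-byte addresses. An externally owned account (EOA) is controlled by a user through a public/private key pair; it is created by an Ethereum client, and its address is derived from the key pair. A contract account is created when an EOA sends a contract-creation transaction, and it holds the contract code. Its address is obtained by RLP-encoding the creator's address together with the creator's nonce (a value that increases with each transaction the EOA sends, which prevents replay attacks), taking the Keccak (KEC) hash of the result, and keeping the rightmost 160 bits. Only an EOA can initiate a transaction. The recipient can be another EOA (an ordinary transfer, where the transaction's data field is empty) or a contract account (a smart contract call, which may itself trigger cross-contract calls from one contract account to another; here the data field carries the binary encoding of the called method and its arguments). The third case is contract deployment, where the to field is empty (often described as sending to the zero address) and the data field holds the contract's initialization code.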
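
As a sketch of the contract address derivation described above (RLP-encode the creator's address and nonce, hash with Keccak-256, keep the last 20 bytes), the following is self-contained Python. A compact pure-Python Keccak-256 is included because `hashlib.sha3_256` is the standardized SHA-3, which pads differently and gives different digests. The function names and the example sender address are illustrative assumptions, and the RLP encoder only covers what this example needs.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(state: bytearray) -> bytearray:
    # lanes[x][y] is the little-endian 64-bit word at byte offset 8 * (x + 5 * y).
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated by an LFSR)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes) -> bytes:
    rate = 136  # bytes absorbed per permutation for a 256-bit digest
    padded = bytearray(data) + b"\x01"  # Keccak domain suffix (SHA-3 uses 0x06)
    padded += bytes(-len(padded) % rate)
    padded[-1] |= 0x80
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        state = _keccak_f1600(state)
    return bytes(state[:32])


def _length_prefix(length: int, offset: int) -> bytes:
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Minimal RLP: integers, byte strings and lists of those."""
    if isinstance(item, int):
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")  # 0 -> b""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(element) for element in item)
    return _length_prefix(len(payload), 0xC0) + payload


def contract_address(sender: bytes, nonce: int) -> bytes:
    # Rightmost 160 bits (20 bytes) of keccak256(rlp([sender, nonce])).
    return keccak256(rlp_encode([sender, nonce]))[12:]


sender = bytes.fromhex("6ac7ea33f8831ea9dcc53393aaa88b25a785dbf0")  # example address
first = contract_address(sender, 0)
second = contract_address(sender, 1)
```

Because the nonce is part of the hashed input, the same creator gets a different contract address for every deployment, and anyone can compute the address before the transaction is mined.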

Field Meaning
nonce A scalar value equal to the number of transactions sent from this address or, in the case of accounts with associated code, the number of contract-creations made by this account.
balance A scalar value equal to the number of Wei owned by this address.
storageRoot A 256-bit hash of the root node of a Merkle Patricia tree that encodes the storage contents of the account (a mapping between 256-bit integer values), encoded into the trie as a mapping from the Keccak 256-bit hash of the 256-bit integer keys to the RLP-encoded 256-bit integer values.
codeHash The hash of the EVM code of this account—this is the code that gets executed should this address receive a message call; it is immutable and thus, unlike all other fields, cannot be changed after construction. All such code fragments are contained in the state database under their corresponding hashes for later retrieval.
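
A minimal sketch of the four account state fields as a Python record. The class and method names are illustrative assumptions. The two constants are the Keccak-256 hash of the empty byte string and the root hash of an empty trie, which are the values an account without code or storage carries in the classic account model.

```python
from dataclasses import dataclass

# Keccak-256 of the empty byte string: the codeHash of an account without code.
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
)
# Root hash of an empty trie: the storageRoot of an account without storage.
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"
)


@dataclass(frozen=True)
class AccountState:
    nonce: int           # transactions sent, or contracts created for contract accounts
    balance: int         # in Wei
    storage_root: bytes  # 32-byte root hash of the account's storage trie
    code_hash: bytes     # 32-byte hash of the account's EVM code

    def is_contract(self) -> bool:
        # An account with no code keeps the hash of the empty string here.
        return self.code_hash != EMPTY_CODE_HASH


eoa = AccountState(nonce=3, balance=10**18,
                   storage_root=EMPTY_TRIE_ROOT, code_hash=EMPTY_CODE_HASH)
contract = AccountState(nonce=1, balance=0,
                        storage_root=b"\x22" * 32, code_hash=b"\x11" * 32)  # placeholder hashes
```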

The code in the data field of a contract-creation transaction is not the code that is executed on subsequent transactions sent to the contract. That code is returned by the initialization code. Essentially, the code in the data field is a program that writes the program that gets deployed as the smart contract (deployment bytecode + contract bytecode + AUXDATA + Swarm hash). The standard initialization code generated by the Solidity compiler does the following:

  1. Runs the code in the contract’s constructor, setting storage values, etc.
  2. Copies the code for the rest of the contract into memory and returns it.

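Once the newly deployed contract code has been included in a block on chain, every running node executes the deployment code in the data field (except that some nodes store only part of the data on demand). The contract therefore exists in each node's own EVM under the same address, so address and contract correspond one to one, which is what makes later calls to the contract work. The contract code is kept in the EVM's virtual ROM; codeHash is the hash of that code, and storageRoot holds the root hash of the account's storage MPT. In short, deploying a contract takes three steps: 1. create the contract account; 2. run the contract's initialization; 3. copy the code into the virtual ROM.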
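
The three deployment steps can be imitated with a toy model in which "code" is a Python callable rather than EVM bytecode. Everything here (`ContractAccount`, `deploy`, `counter_init`, the address string) is a hypothetical illustration of the idea that initialization code runs once and returns the runtime code; it is not how a client implements deployment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ContractAccount:
    storage: Dict[str, int] = field(default_factory=dict)
    code: Optional[Callable] = None  # runtime code, filled in by deployment


def deploy(world: Dict[str, ContractAccount], address: str,
           init_code: Callable) -> ContractAccount:
    # 1. Create the contract account.
    account = ContractAccount()
    # 2. Run the initialization code (the constructor); it may write storage
    #    and it returns the runtime code.
    runtime_code = init_code(account.storage)
    # 3. Keep only the returned runtime code; later calls execute this,
    #    never the initialization code.
    account.code = runtime_code
    world[address] = account
    return account


def counter_init(storage: Dict[str, int]) -> Callable:
    storage["count"] = 5  # constructor logic, runs exactly once

    def runtime(storage: Dict[str, int], method: str) -> int:
        if method == "increment":
            storage["count"] += 1
        return storage["count"]

    return runtime


world: Dict[str, ContractAccount] = {}
account = deploy(world, "0xC0FFEE", counter_init)  # placeholder address
result = account.code(account.storage, "increment")
```

After deployment the account holds the value written by the constructor and the returned `runtime` function, and calling it with `"increment"` moves the counter from 5 to 6.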

How Ethereum Transactions Work
How Smart Contract Deployment Works

MPT

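The MPT (Merkle Patricia Trie) is the core data structure Ethereum uses to store block data. It is a tree structure that combines a Merkle tree with a Patricia tree. The classic diagram from the official documentation shows its structure, and detailed code walkthroughs are easy to find online.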

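Ethereum uses a modified Merkle tree because the data of two consecutive blocks is mostly identical, so there is no need to store two full copies. The tree of a new block can reference the tree of the previous block and add only the accounts modified by the transactions in the new block.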
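
The sharing between consecutive blocks can be illustrated with a small persistent trie that copies only the nodes on the modified path. This is a simplified sketch, not the MPT itself: there is no hashing, no extension or leaf node compression, and keys are short hex strings chosen for the example. In the real structure nodes are referenced by hash, so an unchanged subtree is shared simply because its hash is unchanged.

```python
from typing import Optional, Tuple


class Node:
    """Immutable 16-way trie node keyed by hex nibbles."""
    __slots__ = ("children", "value")

    def __init__(self, children: Tuple[Optional["Node"], ...] = (None,) * 16, value=None):
        self.children = children
        self.value = value


def insert(node: Optional[Node], nibbles: str, value) -> Node:
    """Return a new root; only the nodes along `nibbles` are copied."""
    node = node or Node()
    if not nibbles:
        return Node(node.children, value)
    index = int(nibbles[0], 16)
    children = list(node.children)
    children[index] = insert(node.children[index], nibbles[1:], value)
    return Node(tuple(children), node.value)


def lookup(node: Optional[Node], nibbles: str):
    for nibble in nibbles:
        if node is None:
            return None
        node = node.children[int(nibble, 16)]
    return node.value if node else None


# State after block 1: two accounts with hypothetical keys "a1" and "b2".
block_1 = insert(insert(None, "a1", 100), "b2", 50)
# Block 2 only touches account "a1"; the "b2" subtree is reused as is.
block_2 = insert(block_1, "a1", 70)

shared = block_2.children[0xB] is block_1.children[0xB]   # untouched subtree
copied = block_2.children[0xA] is not block_1.children[0xA]  # modified path
```

Both roots stay usable: looking up `"a1"` gives 100 through `block_1` and 70 through `block_2`, which is what allows a node to keep several recent states without duplicating the whole tree.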

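Ethereum Source Code Analysis: MPT Tree (以太坊源码分析—MPT树)
Ethereum Source Code Analysis (3): Trie Source Code Analysis, Part 1 (Ethereum以太坊源码分析(三)Trie树源码分析(上))
Ethereum Source Code Analysis (3): Trie Source Code Analysis, Part 2 (Ethereum以太坊源码分析(三)Trie树源码分析(下))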

Event

In Ethereum, when a transaction is mined, smart contracts can emit events and write logs to the blockchain that the frontend can then process. Events have three main uses in Ethereum:

  1. smart contract return values for the user interface
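    Sending a transaction that calls a contract returns the transaction hash, not the contract's return value (the contract executes only when the transaction is included in a block). The result produced by the execution can be written to the block as an event, and the caller can then retrieve it using the transaction hash.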
  2. asynchronous triggers with data
    Asynchronous event triggering, similar to the observer pattern.
  3. a cheaper form of storage
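    Known as logs (written with the LOG EVM opcodes). Logs cost far less gas than storage and can serve as a place to keep data, but they cannot be read from inside a contract. They suit historical data, for example, which a frontend can retrieve by scanning blocks.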
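
The first two uses can be imitated with a toy chain in plain Python. `ToyChain`, `Log` and `transfer` are hypothetical names, the transaction hash is a SHA-256 of a counter, and contracts are ordinary functions that receive an `emit` callback; the sketch shows only the flow of information, not a client API.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Log:
    event: str
    data: dict


@dataclass
class ToyChain:
    pending: Dict[str, Callable] = field(default_factory=dict)
    receipts: Dict[str, List[Log]] = field(default_factory=dict)
    subscribers: List[Callable[[Log], None]] = field(default_factory=list)
    sent: int = 0

    def send_transaction(self, call: Callable) -> str:
        # The caller gets back only a transaction hash, not a return value.
        tx_hash = hashlib.sha256(str(self.sent).encode()).hexdigest()
        self.sent += 1
        self.pending[tx_hash] = call
        return tx_hash

    def mine(self) -> None:
        # Contract code runs only when the transaction is included in a block.
        for tx_hash, call in self.pending.items():
            logs: List[Log] = []
            call(lambda event, **data: logs.append(Log(event, data)))
            self.receipts[tx_hash] = logs  # retrievable later by transaction hash
            for log in logs:
                for notify in self.subscribers:
                    notify(log)  # observer-style trigger
        self.pending.clear()


def transfer(emit: Callable) -> None:
    emit("Transfer", sender="alice", to="bob", amount=3)


chain = ToyChain()
seen: List[Log] = []
chain.subscribers.append(seen.append)
tx_hash = chain.send_transaction(transfer)
known_before_mining = tx_hash in chain.receipts
chain.mine()
amount = chain.receipts[tx_hash][0].data["amount"]
```

Before `mine()` the caller holds only `tx_hash`; afterwards the emitted `Transfer` log is available both through the receipt lookup and through the subscriber callback.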

Technical Introduction to Events and Logs in Ethereum

GHOST

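The GHOST protocol (Greedy Heaviest Observed Subtree protocol). Both Bitcoin and Ethereum use PoW here, but Bitcoin has a block time of about 10 minutes and picks the longest chain. Ethereum's block time is much shorter (roughly 12 to 15 s), too short for a block to propagate across the whole network, so a longest-chain rule would give well-connected regions a clear advantage. Ethereum therefore picks the heaviest chain: among sibling blocks, the one whose subtree contains the most blocks becomes part of the main chain. For an example, see 以太坊Ghost协议和叔块 (The Ethereum GHOST Protocol and Uncle Blocks). GHOST mainly addresses two problems:

  1. Blocks are produced quickly, so several nodes can produce a block at around the same time, and a node may receive another node's block only after producing its own, which wastes hash power.
  2. More seriously, fast block production creates a geographic advantage: a node that receives a valid block earlier is more likely to produce the next one, which is a centralization risk.

Counting uncle blocks when selecting the heaviest chain, that is, the chain with the most accumulated work, reduces the geographic advantage. Uncle blocks are also rewarded, which gives nodes an incentive to keep producing blocks.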

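Official documentation - Modified GHOST Implementation: explains in detail how GHOST dilutes these risks, along with the selection criteria and reward scheme for uncle blocks.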
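
The difference between the longest-chain rule and the heaviest-subtree rule can be seen on a small block tree. This is a sketch of the generic GHOST rule with every block counting as one unit of weight; the tree and block names are made up, and Ethereum's modified version (limited uncle depth, weighting by difficulty) is not modelled.

```python
from typing import Dict, List

# Hypothetical block tree: parent -> children. "G" is the genesis block.
TREE: Dict[str, List[str]] = {
    "G": ["A1", "B1"],
    "A1": ["A2"], "A2": ["A3"], "A3": ["A4"],  # one long, thin branch
    "B1": ["B2a", "B2b", "B2c"],                # a shorter branch with sibling blocks
    "B2a": ["B3"],
}


def subtree_size(block: str) -> int:
    """Number of blocks in the subtree rooted at `block`, including itself."""
    return 1 + sum(subtree_size(child) for child in TREE.get(block, []))


def height(block: str) -> int:
    """Length of the longest chain starting at `block`."""
    return 1 + max((height(child) for child in TREE.get(block, [])), default=0)


def ghost_head(root: str = "G") -> str:
    # At every fork, follow the child with the heaviest subtree.
    block = root
    while TREE.get(block):
        block = max(TREE[block], key=subtree_size)
    return block


def longest_chain_head(root: str = "G") -> str:
    # At every fork, follow the child that leads to the longest chain.
    block = root
    while TREE.get(block):
        block = max(TREE[block], key=height)
    return block


ghost = ghost_head()
longest = longest_chain_head()
```

The A branch is longer (4 blocks against a depth of 3), so the longest-chain rule ends at `A4`. The B branch contains more blocks in total (5 against 4) because the sibling blocks `B2b` and `B2c` count toward its weight, so GHOST ends at `B3`. The work spent on sibling blocks is therefore not wasted when choosing the main chain.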

Transaction Structure

Field Meaning
nonce A scalar value equal to the number of transactions sent by the sender
gasPrice A scalar value equal to the number of Wei to be paid per unit of gas for all computation costs incurred as a result of the execution of this transaction
gasLimit A scalar value equal to the maximum amount of gas that should be used in executing this transaction. This is paid up-front, before any computation is done and may not be increased later
to The 160-bit address of the message call’s recipient or, for a contract creation transaction, ∅, used here to denote the only member of B0
value A scalar value equal to the number of Wei to be transferred to the message call’s recipient or, in the case of contract creation, as an endowment to the newly created account
v, r, s Values corresponding to the signature of the transaction and used to determine the sender of the transaction.
For signing, Ethereum uses the same elliptic curve as Bitcoin, secp256k1.
The values are used to recover the ECDSA public key from the signature, which in turn gives the sender's (from) address.
Ref: ECDSA: (v, r, s), what is v?
data An unlimited size byte array specifying the input data of the message
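
How gasPrice, gasLimit and value interact can be worked through with a small record type. This sketch models the legacy transaction fields listed above; the class and function names are illustrative, and the signature values v, r, s are left out. The 21,000 figure is the gas charged for a plain value transfer with empty data.

```python
from dataclasses import dataclass
from typing import Optional

TRANSFER_GAS = 21_000  # gas charged for a plain value transfer with empty data


@dataclass
class Transaction:
    nonce: int
    gas_price: int       # Wei per unit of gas
    gas_limit: int       # maximum gas the transaction may consume
    to: Optional[bytes]  # None models the empty `to` of a contract creation
    value: int           # Wei transferred to the recipient
    data: bytes = b""

    def upfront_cost(self) -> int:
        # The sender must be able to cover the whole gas limit plus the value.
        return self.gas_limit * self.gas_price + self.value

    def is_contract_creation(self) -> bool:
        return self.to is None


def refund(tx: Transaction, gas_used: int) -> int:
    # Gas that was paid for but not used is returned at the same price.
    return (tx.gas_limit - gas_used) * tx.gas_price


tx = Transaction(nonce=0, gas_price=20 * 10**9, gas_limit=50_000,
                 to=b"\x11" * 20, value=10**18)
cost = tx.upfront_cost()
returned = refund(tx, TRANSFER_GAS)
```

With a gas price of 20 Gwei and a limit of 50,000, the sender needs 10^15 Wei for gas on top of the 10^18 Wei being transferred. A plain transfer uses 21,000 gas, so the remaining 29,000 units (5.8 × 10^14 Wei) come back to the sender.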

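FISCO BCOS 2.0+ builds its transaction structure on the original Ethereum one, adding some fields and removing others. The most important added field is blockLimit, which defines the transaction's lifetime: the latest block height at which the transaction can still be processed.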

Geth Sync Modes

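  • Archive nodes: store all historical block data from the genesis block onward, including the state at every block. Suited to fast lookups of historical state.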
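  • Full sync mode (Full nodes): downloads every block from the genesis block and re-executes its transactions to verify each block. Checkpoints are created periodically. Only the state tries of the most recent 128 blocks are kept; older state tries are pruned.
  • Snap sync mode (Full nodes): similar to full sync mode, except that it starts from a relatively recent checkpoint instead of the genesis block and follows these steps:
    1. download and verify headers
    2. download block bodies and receipts. In parallel, download raw state data and build state trie
    3. heal state trie to account for newly arriving data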
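  • Light nodes: download and verify block headers only. They depend heavily on altruistic full nodes to serve data: a light node receives proofs from full nodes and checks them against its local block headers. Commonly used for submitting transactions.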
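
The way a light node checks data from a full node against a header it already trusts can be sketched with a Merkle proof. This is a simplified stand-in: it uses a binary Merkle tree with SHA-256, whereas Ethereum proofs are over Merkle Patricia tries with Keccak-256, and all function names are illustrative.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root over a non-empty list of leaves (what the header would commit to)."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: List[bytes], index: int) -> List[Tuple[bytes, bool]]:
    """Built by the full node: sibling hashes from leaf to root.
    The flag is True when the sibling sits on the left."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: List[Tuple[bytes, bool]], trusted_root: bytes) -> bool:
    """Run by the light node, which holds only `trusted_root` from a header."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == trusted_root


leaves = [b"tx0", b"tx1", b"tx2"]
root = merkle_root(leaves)        # known to the light node via the block header
proof = merkle_proof(leaves, 2)   # supplied by a full node
accepted = verify(b"tx2", proof, root)
rejected = verify(b"forged", proof, root)
```

The light node never sees the other leaves. It recomputes the path to the root from the claimed leaf and the sibling hashes, and accepts the data only if the result equals the root in its local header.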

    Geth official documentation - Sync Modes