Interpreting BitVM: How to verify fraud proof on the BTC chain? (Execute the opcode of EVM or other VM)
Author: Wuyue & Faust, geekweb3
Advisor: Kevin He, BitVM ChineseCommunityInitiator, ex Web3 Tech Head@Huobi
Introduction: Currently, Bitcoin Layer2 has become a trend, and there are dozens of projects that position themselves as "Bitcoin Layer2" on the market. Among them, many Bitcoin Layer2s that call themselves "Rollup" claim to have adopted the solution proposed in the BitVM white paper, making BitVM a prominent subject in the Bitcoin ecosystem.
Unfortunately, most of the current written materials about BitVM fail to explain its principles in a simple way.
This article is a simple summary of what we have read after reading the 8-page white paper of BitVM and reviewing materials related to Taproot, MAST tree, and Bitcoin Script. To facilitate the understanding of readers, some of the expressions are different from those in the BitVM white paper.Xiaobai NavigationDifferent from this, we assume that readers have some knowledge of Layer 2 and can understand the simple idea of “fraud proof”.
A few words to summarizeBitVM’s idea: Data that does not need to be on-chain is published and stored off-chain first, and only Commitment is stored on-chain.
When a challenge/fraud proof occurs, we only put the data that needs to be on the chain on chian to prove that it is associated with the commitment on the chain. After that, the BTC main network will verify whether there are any problems with the on-chain data and whether the data producer (the node that processes the transaction) has committed any malicious acts.All of this follows the principle of Occam's razor - "Do not add entities unless necessary" (If you can have less on chain, then have less on chain).
Text: The so-called BitVM-based BTC on-chain fraud proof verification scheme, a simple summary:
1.The core idea of BitVM
First of all, a computer/processor is an input-output system composed of a large number of logic gate circuits.One of the core ideas of BitVM is to use Bitcoin Script to simulate the input-output effect of logic gate circuits..
As long as you can simulate the logic gate circuit, you can theoretically realize the Turing machine and complete all computable tasks. In other words, as long as you have a lot of people and money, you can gather a group of engineers to help you use the simple Bitcoin Script code to simulate the logic gate circuit first, and then use a huge number of logic gate circuits to realize the functions of EVM or WASM.
(This screenshot comes from a teaching game: "Turing Complete", the core content of which is to use logic gate circuits, especially NAND gates, to build a complete CPU processor)
Someone once compared the idea of BitVM to using redstone circuits to make an M1 processor in Minecraft. In other words, it is equivalent to using building blocks to build the Empire State Building in New York.
(It is said that this is a "processor" that someone spent a year building in "Minecraft")
2. Why do we have to use Bitcoin Script to simulate EVM or WASM?
Isn't this very troublesome?Most Bitcoin Layer2s tend to choose to support high-level languages such as Solidity or Move. Currently, the only language that can run directly on the Bitcoin chain is Bitcoin Script, a simple, non-Turing-complete programming language consisting of a bunch of unique opcodes..
(A Bitcoin Script code example)
If Bitcoin Layer2 intends to verify fraud proofs on Layer1 like Ethereum Layer2 such as Arbitrum, it will inherit BTC to a great extent.SafetyIn order to verify the "controversial transaction" or "controversial operation code" directly on the BTC chain, the operation code corresponding to the Solidity language/EVM adopted by Layer2 must be re-run on the Bitcoin chain. The problem boils down to:
Use Bitcoin Script, a simple programming language native to Bitcoin, to achieve the effects of EVM or other virtual machines.
Therefore, to understand the BitVM solution from the perspective of compiler theory, it translates EVM/WASM/Javascript opcodes into Bitcoin Script opcodes, and the logic gate circuit serves as an intermediate form (IR) between "EVM opcode -> Bitcoin Script opcode".
(The BitVM white paper discusses the general idea of executing certain “controversial instructions” on the Bitcoin chain)
Anyway, the final simulated effect is,Put the instructions that can only be processed on EVM/WASM into the Bitcoin chain for direct processing Although this solution is feasible, the difficulty lies in how to use a large number of logic gate circuits as an intermediate form to express all EVM/WASM operation codes. Moreover, using a combination of logic gate circuits to directly express some extremely complex transaction processing processes may generate a huge workload.
3. “Interactive Fraud Proof” Highly Similar to Arbitrum
Next, let’s talk about another core mentioned in the BitVM white paper, which is the “interactive fraud proof” that is highly similar to Arbitrum.
Interactive fraud proofs involve a word called assertion. Generally speaking, the proposer of Layer2 (often a sorter) will issue an assertion on Layer1, declaring that certain transaction data and state transition results are valid and correct.
If someone thinks that the assertion submitted by the Proposer is problematic (the associated data is incorrect), a dispute will occur.At this point, the Proposer and Challenger will exchange information in turns and perform a binary search on the disputed data to quickly locate a very fine-grained operation instruction and its associated data fragments.
This controversial operation instruction (OP Code) needs to be directly executed on Layer 1 along with its input parameters, and the output result needs to be verified (the Layer 1 node will compare the output result calculated by itself with the output result previously released by the Proposer). In Arbitrum, this is called a "single-step fraud proof".
(In Arbitrum's interactive fraud proof protocol, the data released by the Proposer will be retrieved through binary search to locate the controversial instruction and execution result as soon as possible, and finally send the single-step fraud proof to Layer1 for final verification)
References:Former Arbitrum Technical Ambassador explains Arbitrum's component structure (Part 1)
(Arbitrum’s interactive fraud proof flowchart, the description is relatively rough)
At this point, the idea of a single-step fraud proof is easy to understand: most transaction instructions that occur on Layer 2 do not need to be re-verified on the BTC chain. However, a controversial data fragment/operation code must be replayed on Layer 1 when challenged.
If the test conclusion is:
-
If there is a problem with the data previously published by the Proposer, the assets pledged by the Proposer will be slashed;
-
If there is a problem with the Challenger, the Challenger’s pledged assets will be slashed.
-
If the Prover does not respond to the challenge for a long time, it can also be Slashed.
Arbitrum is built on EthereumcontractTo achieve the above effects, BitVM needs to use Bitcoin Script to implement time lock, multi-signature and other functions.
4.MAST Tree and Merkle Proof
After briefly talking about "interactive fraud proof" and "single-step fraud proof", we will talk about MAST tree and Merkle Proof.
As mentioned earlier, in the BitVM solution, the large amount of transaction data/huge logic gate circuits involved in Layer 2 processing off-chain will not be directly on-chain. Only very small amounts of data/logic gate circuits will be on-chain when necessary.
but weWe need a way to prove that the data that was originally off-chain and is now on-chain is not fabricated. This is what is often referred to as Commitment in cryptography.. Merkle Proof is a type of Commitment.
First, let's sayMAST TreeThe full name of MAST tree is Merkelized Abstract Syntax Trees.It converts the AST tree involved in the compilation principle into the form of Merkle Tree.
So, what is an AST tree? Its Chinese name is "Abstract Syntax Tree". Simply put, it is to break down a complex instruction into a bunch of basic operation units through lexical analysis, and then organize it into a tree-like data structure.
(A simple example of an AST tree. This AST tree breaks down simple operations such as x=2 and y=x*3 into underlying opcodes + data)
The MAST tree is the Merkle-ized AST tree to support Merkle Proof. One advantage of the Merkle tree is that it can achieve efficient "data compression". For example, if you want to publish a piece of data on the Merkle tree to the BTC chain when necessary, but you also want to convince the outside world that this piece of data does exist on the Merkle tree, and is not something you "randomly picked up". What should you do?
You just need to record the Root of the Merkle tree on the chain in advance, and then present the Merkle Proof in the future to prove that a certain piece of data exists in the Merkle tree corresponding to the Root.
(Relationship between Merkle Proof/Branch and Root)
Therefore, there is no need to store the complete MAST tree on the BTC chain. You only need to disclose its Root in advance as a Commitment and present data fragments + Merkle Proof /Branch when necessary. This can greatly compress the amount of on-chain data and ensure that the on-chain data really exists on the MAST tree. Moreover, only disclosing a small part of the data fragments + Merkle Proof on the BTC chain, rather than disclosing all the data, can play a good role in privacy protection.
References:Data Withholding and Fraud Proofs: Why Plasma Doesn’t Support Smart Contracts
(MAST tree example)
The BitVM solution attempts to express all logic gate circuits using Bitcoin scripts, and then organize them into a huge MAST tree. The leaf at the bottom of the tree (Content in the figure) corresponds to the logic gate circuit implemented using Bitcoin scripts.
Layer2 Proposers frequently publish the root of the MAST tree on the BTC chain. Each MAST tree is associated with a transaction, involving all its input parameters/operation codes/logic gate circuits. To some extent, this is similar to Arbitrum's Proposer publishing Rollup Block on the Ethereum chain.
When a dispute occurs, the challenger declares on the BTC chain that he wants to challenge the Root published by the Proposer, and then asks the Proposer to reveal a piece of data corresponding to the Root. After that, the Proposer presents the Merkle proof and repeatedly discloses small pieces of data from the MAST tree on the chain until he and the challenger locate the disputed logic gate circuit. Then, Slash can be executed.
5. Last
At this point, the most important part of the entire BitVM solution has been basically explained. Although some of the details are still a bit obscure, I believe that readers can already get the essence and key points of BitVM.
As mentioned in its white paperBit value commitment is to prevent the Proposer from assigning both 0 and 1 to the input value of the logic gate when being challenged and forced to verify the logic gate circuit on the chain, causing ambiguity and confusion..
Summarize
The BitVM solution first uses Bitcoin script to express logic gate circuits, then uses logic gate circuits to express the operation codes of EVM/other VMs, and then uses the operation codes to express the processing flow of any transaction instruction, and finally organizes them into a merkle tree/MAST tree.
Such a tree can easily exceed 100 million leaves if the transaction processing flow it expresses is complex, so we should try to reduce the block space occupied by the Commitment and the scope of the fraud proof.
Although a single-step fraud proof only requires a very small piece of data and logic gate script on the onchain, the complete Merkle Tree must be stored off-chain for a long time so that when someone challenges it, the data on the onchain tree can be accessed at any time.
Every transaction that occurs in Layer2 will generate a large Merkle Tree. The computing and storage pressure on the nodes can be imagined. Most people may not want to run nodes (but these historical data can be expired and eliminated, and B^2 network specifically introduces zk storage proofs similar to Filecoin to incentivize storage nodes to preserve historical data for a long time)
However, the optimistic Rollup based on fraud proof does not need too many nodes, because its trust model is 1/N. As long as one of the N nodes is honest and can initiate a fraud proof at a critical moment, the Layer2 network isSafetyof.
but,There are still many challenges in the design of Layer2 solutions based on BitVM,for example:
1)Theoretically, in order to further compress data, it is not necessary to verify the opcode directly in Layer 1. The processing flow of the opcode can be compressed into a zk proof again, allowing the challenger to challenge the verification steps of the zk proof. This can greatly compress the amount of data on the chain. However, the specific development details will be very complicated.
2)Proposers and Challengers need to interact repeatedly off-chain. How to design the protocol, the commitment and challenge process, and how to further optimize the processing flow require a lot of brain cells.
The article comes from the Internet:Interpreting BitVM: How to verify fraud proof on the BTC chain? (Execute the opcode of EVM or other VM)
Related recommendation: Learn about Zest: 100% capital-efficient stablecoin on Blast
How to better attract and utilize the current liquidity of more than one billion US dollars on the Blast chain has become a major issue for all crypto This is a question that ambitious development teams around the world must think about, and Zest has given them the answer. Written by: Go2Mars Research Foreword With Blast announcing the launch of the test network, 50%…