Discussion of the principles and technical details of the Ordinal inscription protocol
Author: @hicaptainz
Over the past two weeks, while studying the BTC ecosystem and various inscription projects, I found very few articles that clearly explain the principles and technical details. When an inscription is minted, how is the transaction initiated? How are the sats in a UTXO tracked? Where in the script is the inscribed content placed? Why does a BRC20 transfer require two operations? Without understanding these details, it is difficult to understand the differences between protocols such as BRC20, BRC420, Atomicals, Stamps, and Runes. This article goes back to the basics of the BTC blockchain and tries to answer the above questions.
BTC block structure
In essence, a blockchain is a multi-user accounting technology. In computer science terms, it is a distributed database. The records (accounts) from each period of time form a block, and the ledger is extended block by block in chronological order.
We use an Excel workbook to illustrate how a blockchain works. One Excel file represents a blockchain, each sheet represents a block, and the blocks are arranged in chronological order from 560331 and 560332 up to the latest, 560336, which packages the most recent transactions. The main body of a block follows double-entry bookkeeping, the most common method in accounting. The address on one side is recorded as the debit, that is, the inputs (from), and the address on the other side is recorded as the credit, that is, the outputs (to). Value is the amount of BTC at the corresponding address. The total input amount is greater than the total output amount; the difference is the transfer fee paid by the user, which is also the handling fee earned by the miner (the bookkeeper). In this illustration, the block header records the height of the previous block, the hash of the previous block, the creation time (timestamp) of this block, and a random number (nonce). So, as a decentralized accounting technology, who wins the accounting rights for the next block? That depends on this random number and its corresponding hash. Miners with computing power repeatedly hash the block data with different random numbers. The miner who first obtains a hash that meets the conditions wins the accounting rights for the next block, along with the block reward and the transfer fees. Finally, there is the script area, which can be used for some extended applications. For example, the script op_return can be used as a memo field. Note that in an actual block, the script area is attached to the input and output information rather than being a separate area.
For example, the script attached to an input is an unlocking script (ScriptSig), which requires the wallet address to authorize the transfer with a private key signature, and the script attached to an output is a locking script (ScriptPubKey), which sets the conditions for unlocking the received BTC (generally the condition is "only those with the corresponding private key can spend").
The above two pictures are the original input and output data structure tables. At the execution level, the script is carried as an accompanying parameter of the transaction information. Because the unlocking script (ScriptSig) carries the private key authorization (the signature), its data is also called "witness data".
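The nonce search described above can be sketched as a toy proof-of-work loop. This is only an illustration under simplifying assumptions: the header is an arbitrary byte string rather than Bitcoin's real 80-byte header, the difficulty is expressed as a number of leading zero bits, and the names `mine` and `difficulty_bits` are hypothetical.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Bitcoin hashes block headers with SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Try nonces until the hash, read as an integer, falls below the target.

    Toy model: the target is simply 2**(256 - difficulty_bits), and the
    header is whatever bytes the caller passes in.
    """
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = double_sha256(header_prefix + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no valid nonce in the searched range


result = mine(b"prev-hash|tx-summary|timestamp", difficulty_bits=12)
print(result)
```

Raising `difficulty_bits` by one roughly doubles the expected number of attempts, which is why the first miner to find a valid nonce is the one with the most luck or the most computing power.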
Segwit and Taproot
Although the Bitcoin network has been running for over 10 years without any notable incidents, there have been times when transaction costs have spiked to highs that were no longer feasible. As a result, Bitcoin’s developers have been discussing how best to scale the network to handle growing transaction volumes in the future.
In 2017, this debate reached a climax and the Bitcoin development community split into two factions. One faction supported using a soft fork to implement a feature called SegWit, and the other was the "big block" faction that supported directly increasing the block size.
We mentioned above that the unlocking script uses private key authorization to generate "witness data". Can this witness data be separated from the block, thereby indirectly increasing the number of transactions each block can hold? Segregated Witness was officially activated in August 2017. Its implementation divides all transaction data into two parts: the basic information of the transaction (Transaction Data) and the signature information of the transaction (Witness Data). The signature information is saved in a new data structure called the "segregated witness" (witness) and is transmitted separately from the original transaction data.
Technically, SegWit means that witness data no longer occupies the 1MB of space originally allocated to a Bitcoin block. Instead, a separate space is created for witness data. It can carry arbitrary data and is counted at a discounted "block weight", which cleverly keeps large amounts of data within Bitcoin's block size limit and avoids the need for a hard fork. In this way, the upper limit on transaction data is raised while the fee for signature data is reduced. Before the SegWit upgrade, Bitcoin's block capacity was capped at 1MB. After SegWit, the cap on non-witness transaction data is still 1MB, while a block that consists mostly of witness data can approach 4MB.
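The witness discount can be made concrete with the weight rule from BIP141, where each non-witness byte counts as 4 weight units and each witness byte counts as 1, with a block limit of 4,000,000 weight units. The sketch below applies that rule; the function names and the sample byte counts are made up for illustration.

```python
import math

MAX_BLOCK_WEIGHT = 4_000_000  # consensus limit introduced by SegWit (BIP141)


def tx_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    """Non-witness bytes count 4 weight units each, witness bytes count 1."""
    return 4 * non_witness_bytes + witness_bytes


def virtual_size(non_witness_bytes: int, witness_bytes: int) -> int:
    """Fees are usually quoted per virtual byte: weight divided by 4, rounded up."""
    return math.ceil(tx_weight(non_witness_bytes, witness_bytes) / 4)


# Hypothetical transaction: 200 bytes of ordinary data, 400,000 bytes of witness data.
weight = tx_weight(200, 400_000)
vsize = virtual_size(200, 400_000)
print(weight, vsize)  # 400800 100200
```

The same 400,200 bytes placed outside the witness would weigh 1,600,800 units, which is why large inscription data is cheaper to store in the witness.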
Taproot was implemented in November 2021 and consists of 3 different Bitcoin Improvement Proposals (BIPs), including: Taproot, Tapscript and its new digital signature scheme called "Schnorr Signature". Taproot is designed to bring many benefits to Bitcoin users, such as increased transaction privacy and lower transaction fees. It will also allow Bitcoin to perform more complex transactions, thereby broadening application scenarios (some new opcodes have been added).
These updates are a key enabler for Ordinals NFT, which stores NFT data in a spend script in the Taproot script path (witness data space). This upgrade makes it easier to structure and store arbitrary witness data, laying the foundation for the "ord" standard. As data requirements are relaxed, it is assumed that a transaction can fill an entire block with its transaction and witness data — reaching the 4MB block size (witness data space) limit — greatly expanding the types of media that can be placed on the chain.
Someone may ask, since some strings are put in the script, are there no restrictions on these strings? What if these scripts are actually executed? If you put the content casually, will there be an error code that refuses to produce a block? This brings up the OP_FALSE instruction. OP_FALSE (also represented as "0" in Bitcoin Script) ensures that the execution path in the scripting language never enters the OP_IF branch and remains unexecuted. It acts as a placeholder or no operation (No Operation) in the script, similar to a "comment" in a high-level language, to ensure that subsequent code is not executed.
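As an illustration of how the OP_FALSE OP_IF ... OP_ENDIF "envelope" wraps content, here is a minimal sketch that serializes one as raw script bytes. It follows the envelope layout described in the ord documentation (the "ord" marker, a tag 1 for the content type, an empty push as the body separator, and body chunks of at most 520 bytes), but it is simplified: a real reveal script also contains a public key and signature check before the envelope, and the helper names here are hypothetical.

```python
OP_FALSE = b"\x00"
OP_IF = b"\x63"
OP_ENDIF = b"\x68"
MAX_PUSH = 520  # largest single data push allowed in Bitcoin Script


def push_data(data: bytes) -> bytes:
    """Serialize a data push with the shortest length prefix that fits."""
    n = len(data)
    if n < 0x4C:
        return bytes([n]) + data                      # direct push
    if n <= 0xFF:
        return b"\x4c" + bytes([n]) + data            # OP_PUSHDATA1
    if n <= MAX_PUSH:
        return b"\x4d" + n.to_bytes(2, "little") + data  # OP_PUSHDATA2
    raise ValueError("push larger than 520 bytes")


def build_envelope(content_type: bytes, body: bytes) -> bytes:
    """Wrap content in a branch that is never executed."""
    script = OP_FALSE + OP_IF
    script += push_data(b"ord")            # protocol marker
    script += push_data(b"\x01")           # tag 1: the next push is the content type
    script += push_data(content_type)
    script += OP_FALSE                     # empty push: the body follows
    for i in range(0, len(body), MAX_PUSH):
        script += push_data(body[i:i + MAX_PUSH])
    return script + OP_ENDIF


envelope = build_envelope(b"text/plain", b"hi")
print(envelope.hex())
```

Because OP_FALSE leaves a false value on the stack, OP_IF skips everything up to OP_ENDIF, so the pushed bytes are stored on chain but never affect whether the script succeeds.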
UTXO transfer model
The above are the basic principles of studying BTC from the perspective of computer data structure. Let's discuss the UTXO model from the perspective of financial model.
UTXO is the abbreviation of Unspent Transaction Output. It can be understood as the funds left unspent after a transfer. So why does Bitcoin use such a concept? To answer that, we need to compare two accounting methods: the account balance model and the transaction model.
Because we have lived with centralized systems for so long, we are very accustomed to the account balance model. When user A transfers 100 yuan to user B, the bank first checks whether A's bank account holds 100 yuan. If so, it deducts 100 yuan from A's account and adds 100 yuan to B's account, and the transfer is complete.
However, there is no concept of balance in Bitcoin's accounting. The blockchain's distributed ledger records only transactions and does not directly record the current balance of an account (recording balances generally requires a dedicated server node, which is centralized). Assume that user A's current balance is 1,000 yuan. If user A transfers 100 yuan to user B, the transfer is recorded as:
Transaction 1 User A transfers 100 yuan to user B
Transaction 2 User A transfers 900 yuan (UTXO) to User A himself
Although transaction 2 here is a transaction, functionally it serves as the account balance, indicating that after completing the transfer of 100 yuan, there is still 900 yuan left in A's account.
So the question is, why do we have to create such a UTXO? Because only transactions can be recorded on the BTC blockchain, account balances cannot be recorded. Without this UTXO, calculating the balance requires adding up all the incoming and outgoing transactions of an account, which is very time-consuming and computationally resource-consuming. The emergence of UTXO cleverly avoids the pain point of backtracking all transactions when calculating balances.
One characteristic of UTXO is that, like coins, it cannot be broken up and used. So how do you get the input amount together during the transaction, and how do you get change? We can make an analogy with coins (in fact, it is better to automatically translate it to "coin" every time you see the word UTXO).
Xiao Ming transfers 1 Bitcoin to Xiao Gang. The whole process goes like this. Xiao Ming needs to collect enough inputs. Among the previous transactions for Xiao Ming's address, he finds a UTXO with a face value of 0.9, which is not enough for 1 Bitcoin. Fortunately, a transaction may have multiple inputs, so Xiao Ming finds another UTXO with a face value of 0.2, and the transfer transaction has two inputs. It also has two outputs: one points to Xiao Gang's address, with a face value of 1 Bitcoin, and the other points to Xiao Ming's own address, with a face value of 0.1 Bitcoin. This second output is the change (the transaction fee is ignored in this example).
In other words, Xiao Ming has two coins in his pocket, one with a face value of 0.9 and the other with a face value of 0.2. To pay 1, he hands over both coins at the same time; Xiao Gang receives 1 and the change of 0.1 is returned to Xiao Ming. The essence of this accounting model is therefore to avoid "calculating balances" through the action of "making change".
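The Xiao Ming example can be sketched as a small coin-selection routine. This is a toy model under stated assumptions: amounts are integer satoshis, the fee is ignored as in the text, coins are picked in list order, and the `UTXO` class and `build_transfer` function are hypothetical names rather than any wallet's real API.

```python
from dataclasses import dataclass

SATS_PER_BTC = 100_000_000


@dataclass(frozen=True)
class UTXO:
    owner: str
    value: int  # amount in satoshis


def build_transfer(utxos, sender, recipient, amount):
    """Collect the sender's coins until they cover the amount, then make change."""
    selected, total = [], 0
    for utxo in utxos:
        if utxo.owner != sender:
            continue
        selected.append(utxo)
        total += utxo.value
        if total >= amount:
            break
    if total < amount:
        raise ValueError("insufficient funds")
    outputs = [UTXO(recipient, amount)]
    change = total - amount
    if change > 0:
        outputs.append(UTXO(sender, change))  # change goes back to the sender
    return selected, outputs


wallet = [UTXO("Xiao Ming", 90_000_000), UTXO("Xiao Ming", 20_000_000)]
inputs, outputs = build_transfer(wallet, "Xiao Ming", "Xiao Gang", 1 * SATS_PER_BTC)
print(inputs)
print(outputs)
```

Both input coins are consumed whole, and the only way Xiao Ming keeps his remaining 0.1 is as a brand-new output addressed to himself.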
Ordering system of Ordinal protocol
The Ordinal protocol can be said to be the source of the current boom in the BTC ecosystem. It breaks fungible BTC down into its smallest unit, the sat, and then marks each sat with a serial number. How is that done?
We know that the total supply of BTC is 21 million, and one BTC can be divided into 100 million parts (sats), so the smallest unit of BTC is the sat. Whether we look at BTC or at sats, they are all typical fungible tokens (FT). We now try to assign an ordinal number to these sats.
When discussing the block data structure earlier, we mentioned that transaction information must indicate the address and amount of each input and each output. Each block contains two kinds of transactions: the block reward transaction and ordinary transfer transactions. Transfer transactions must have inputs and outputs, but because the block reward is BTC generated out of thin air, it has no input address, so its "input from" field is blank. This transaction is called the "coinbase transaction". All 21 million BTC originate from coinbase transactions, and the coinbase transaction is always ranked first in a block's transaction list.
The Ordinal protocol stipulates as follows:
- Numbering: Each sat is numbered in the order in which it was mined.
- Transfer: Sats are transferred from the inputs to the outputs of a transaction according to the first-in-first-out rule.
The first rule is relatively simple: numbers can only be generated by the coinbase transaction that pays the mining reward. For example, if the mining reward of the first block is 50 BTC, the first block is allocated sats in the range [0; 1; 2; …; 4,999,999,999]; when the second block's reward is also 50 BTC, the second block is allocated sats in the range [5,000,000,000; 5,000,000,001; …; 9,999,999,999].
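The numbering rule can be sketched by combining it with Bitcoin's reward schedule (50 BTC at the start, halving every 210,000 blocks). The sketch assumes every block is credited with its full scheduled reward and counts heights from 0, so the "first block" in the text is height 0; the function names are hypothetical.

```python
COIN = 100_000_000           # sats per BTC
HALVING_INTERVAL = 210_000   # blocks between reward halvings


def subsidy(height: int) -> int:
    """Scheduled block reward in sats at a given block height."""
    halvings = height // HALVING_INTERVAL
    return 0 if halvings >= 64 else (50 * COIN) >> halvings


def first_sat(height: int) -> int:
    """Number of the first sat created in the block at this height."""
    full_epochs, blocks_into_epoch = divmod(height, HALVING_INTERVAL)
    start = 0
    for epoch in range(full_epochs):
        start += HALVING_INTERVAL * subsidy(epoch * HALVING_INTERVAL)
    return start + blocks_into_epoch * subsidy(height)


def sat_range(height: int):
    """Inclusive (first, last) sat numbers assigned to a block's reward."""
    start = first_sat(height)
    return start, start + subsidy(height) - 1


print(sat_range(0))  # (0, 4999999999)
print(sat_range(1))  # (5000000000, 9999999999)
```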
The difficult part to understand here is that since UTXO actually contains many satoshis, each satoshi in this UTXO looks the same. How to sort them? This is actually determined by the second rule. Let’s give a simple example:
First assume that the minimum unit of BTC is 1, that a total of 10 blocks have been produced, and that the block reward of each block is 10 BTC, so the total supply is 100. We can directly assign serial numbers (0-99) to these 100 BTC. If there are no transfers, we only know that the 10 BTC in the first block are numbered (0-9), the 10 BTC in the second block are numbered (10-19), and so on until the 10 BTC in the tenth block, which are numbered (90-99). Because nothing has been spent and no new outputs exist, we can only assign a number range to each batch of 10 BTC.
Now assume that the 10 BTC of the second block are spent into two outputs: one of 3 BTC and one of 7 BTC in "change", which corresponds to transferring 3 BTC to someone else and returning 7 BTC to yourself. In the transaction's output list, assume that the 7 BTC of change are ranked first (numbers 10-16) and the 3 BTC sent to the other person are ranked second (numbers 17-19). In this way, the ordered set of sats contained in a given UTXO is determined by the order of the outputs.
Note that a sat is not a UTXO! Since a UTXO is the smallest transaction unit and cannot be subdivided, sats can only exist inside UTXOs. Each UTXO contains a certain range of sats, and that range of sat numbers can only be split by spending the UTXO to generate new outputs.
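The first-in-first-out rule can be sketched as a function that walks the input sat ranges in order and hands them to the outputs in order. Ranges here are half-open `(start, end)` pairs, so the text's (10-19) is written `(10, 20)`; the function name is hypothetical and anything left over is treated as the fee.

```python
def assign_sats(input_ranges, output_values):
    """Distribute input sat ranges to outputs in first-in-first-out order.

    input_ranges: list of (start, end) half-open ranges, in input order.
    output_values: list of output amounts, in output order.
    Returns (ranges per output, leftover ranges that go to the fee).
    """
    queue = list(input_ranges)
    outputs = []
    for value in output_values:
        ranges, needed = [], value
        while needed > 0:
            if not queue:
                raise ValueError("outputs exceed inputs")
            start, end = queue.pop(0)
            take = min(needed, end - start)
            ranges.append((start, start + take))
            needed -= take
            if start + take < end:
                # Put the unused tail of this range back at the front.
                queue.insert(0, (start + take, end))
        outputs.append(ranges)
    return outputs, queue


# The example from the text: sats 10-19 are split into 7 (change) and 3.
assigned, fee = assign_sats([(10, 20)], [7, 3])
print(assigned)  # [[(10, 17)], [(17, 20)]]
```

With several inputs, a single output can end up holding more than one range, which is why an indexer has to track ranges rather than single numbers.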
As for how to express this "number", Ordinal supports multiple forms, such as the "integer method" mentioned above, and others include decimal method, degree method, percentage method, and pure letter nomenclature.
Once sats have a unified serial number, inscription becomes possible. As mentioned above, any type of file, whether text, pictures, or video, can be uploaded into the witness data area, up to roughly 4MB. After uploading, the file is converted to hexadecimal and stored in the Taproot script area. So one UTXO corresponds to one Taproot script area, and that UTXO contains many sats at once (taken as a whole, a sequence of sats; to prevent dust attacks, a single UTXO must hold no fewer than 546 satoshis). For ease of recording, the Ordinal protocol stipulates that the first sat number of the sequence represents the binding relationship (in the white paper's original words, "the number of the first satoshi of the first output"). For a set such as (17-19), sat number 17 stands in for the whole set and is bound to the inscribed content.
Minting and transfer of Ordinal assets
As we have seen, an Ordinal NFT uploads a file into a script in the segregated witness area and binds a sequence of sats to it, thereby issuing an NFT asset on the BTC chain. But there is another problem here. Scripts include both the input's unlocking script and the output's locking script. So which script should the content be placed in? The correct answer is both. Here I have to mention the commit-reveal mechanism in blockchain technology.
The Commit-Reveal mechanism in the blockchain is a protocol used to ensure fair and transparent processing of information. This mechanism is often used in scenarios where hidden information needs to be submitted (such as a vote or bid) and then revealed at a later point in time. The Commit-Reveal mechanism is divided into two phases: Commit phase and Reveal phase.
1. Commit phase: In this phase, users submit their information (such as voting choices or bid prices), but this information is encrypted. Typically, the user generates a hash of this message (i.e., a cryptographic digest of the message) and then sends this hash to the blockchain. Due to the properties of hash functions, they can generate a unique output (hash value) that is irreversible from the original message. This means that the original information cannot be inferred from the hash value. This process ensures the confidentiality of information at the time of submission.
2. Reveal Phase: At a predetermined later time, users must reveal their original information and prove that it matches a previously submitted hash. This is usually done by submitting the original information along with any additional data (such as a nonce or "salt") used to generate the hash value. The network then verifies that the hash of this original message is the same as the previously submitted hash. If there is a match, the original message is accepted as valid.
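The two phases above can be sketched with a salted SHA-256 hash. This shows the general commit-reveal pattern only, not the exact hashing Taproot uses; the function names and the sample message are made up for illustration.

```python
import hashlib
import secrets


def commit(message: bytes, salt: bytes) -> str:
    """Commit phase: publish only the hash of salt + message."""
    return hashlib.sha256(salt + message).hexdigest()


def reveal_is_valid(commitment: str, message: bytes, salt: bytes) -> bool:
    """Reveal phase: anyone can recompute the hash and compare."""
    return commit(message, salt) == commitment


salt = secrets.token_bytes(16)            # random salt kept secret until reveal
commitment = commit(b"bid: 42", salt)     # this is all observers see at first

print(reveal_is_valid(commitment, b"bid: 42", salt))  # True
print(reveal_is_valid(commitment, b"bid: 43", salt))  # False
```

The salt prevents observers from guessing a short message by hashing every candidate, and the comparison in the reveal phase prevents the committer from changing the message afterwards.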
As we said before, the inscribed content needs to be bound to the sequence of sats contained in a UTXO. A UTXO is an output in a block, so it would seem that the content must be attached to the output's locking script. However, BTC full nodes need to maintain and transmit the complete UTXO set of the entire network locally. Imagine 10,000 video files of 4MB each uploaded directly into 10,000 UTXO locking scripts: every full node would need enormous storage and extremely fast network speeds, and the entire chain could collapse. Therefore, the only solution is to put the content in the unlocking script of an input and then let this content "point" to another output.
Therefore, the minting of Ordinal assets is divided into two steps (wallets combine the two steps: when constructing the transaction, the parent-child commit-reveal transactions are built at the same time, so the user experiences a single step and saves on fees).
In the minting phase, the user first places the hash of the file in the locking script of a UTXO in the commit transaction (the user's address A transfers to the user's own address B). Because it is only a hash, it does not occupy much space in the full nodes' UTXO database. Next, the user constructs a new transaction (address B transfers back to address A), called the reveal transaction. Its input must spend the UTXO from the commit transaction that contains the file hash, and the input's unlocking script must contain the original inscribed file. In the words of the white paper: "First, in the commit transaction, a taproot output committing to a script containing the inscription content is created. Second, in the reveal transaction, the output created by the commit transaction is spent, revealing the inscription content on-chain."
In the transfer stage, Ordinal NFTs differ slightly from BRC20. An Ordinal NFT is transferred as a whole: you simply send the UTXO that the NFT is bound to directly to the recipient, as in an ordinary BTC transfer. BRC20, however, involves transferring a custom amount, so its transfer is also divided into two steps. The first step is to inscribe a "TRANSFER" inscription, and the second step is to transfer that "TRANSFER" inscription. The inscription step is similar to minting an Ordinal NFT and implies a commit-reveal parent-child transaction pair. The second step is similar to an ordinary Ordinal NFT transfer, sending the BRC20 assets bound to a UTXO directly to the recipient. Some wallets construct these three transactions (parent, child, and grandchild) at the same time to save time and fees.
In summary, the commit transaction binds the inscribed content (the hash of the original content) to the numbered sats (the UTXO), and the reveal transaction displays the content (the original content). This parent-child transaction pair jointly completes the minting of the NFT.
P2TR with an example
The technical discussion of minting is not over yet, because some readers may be curious: how does the reveal transaction verify the inscription information in the commit transaction? Why do our two addresses A and B need to transfer funds to each other when constructing the transactions? After all, nobody has to prepare two wallets when making an inscription. Here we need to talk about one of Taproot's major upgrades, P2TR.
P2TR (Pay-to-Taproot) is a new type of Bitcoin transaction introduced by the Taproot upgrade. P2TR lets users spend Bitcoin using either a single public key or a more complex script (such as a multi-signature wallet or smart contract), enabling greater privacy and flexibility. This is achieved through Merkleized Abstract Syntax Trees (MAST) and Schnorr signatures, techniques that make it possible to efficiently encode multiple spending conditions in a single transaction.
- Create spending conditions
To create a P2TR transaction, the user first defines a spending condition, such as a single public key or a more complex script that specifies the requirements for spending the Bitcoin (e.g., a multi-signature wallet or smart contract).
- Generate Taproot output
The user then generates a Taproot output that contains a single public key (the public key represents the spending condition). This public key is derived from a combination of the user's public key and the script's hash, using a process called "tweaking". This ensures that the output looks like a standard public key, making it indistinguishable from other outputs on the blockchain.
- Spend Bitcoin
When a user wants to spend the Bitcoin, they can either sign with their single key (if that spending condition applies), or reveal the original script and provide the signatures or data needed to satisfy it. This is accomplished using Tapscript, which allows for more efficient and flexible execution of spending conditions.
- Verify transaction
Miners and nodes then verify the transaction by checking the provided Schnorr signature and data against the spending conditions. If the conditions are met, the transaction is considered valid and the Bitcoin can be spent.
- Enhanced privacy and flexibility
Because P2TR transactions reveal only the spending condition actually used when the Bitcoin is spent, they maintain a high level of privacy. Additionally, MAST and Schnorr signatures enable efficient encoding of multiple spending conditions, allowing for more complex and flexible transactions without increasing the overall size of the transaction.
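The "tweaking" step can be sketched in pure Python following the formulas in BIP340 and BIP341: the script is hashed into a TapLeaf hash, that hash is combined with the internal public key into a tweak t, and the output key is Q = P + t·G. This is an unoptimized illustration, not production cryptography; it assumes a script tree with a single leaf, and the internal key and sample script bytes are placeholders chosen for the example.

```python
import hashlib

FIELD_P = 2**256 - 2**32 - 977  # secp256k1 field prime
ORDER_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP340 tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()


def compact_size(n: int) -> bytes:
    """Bitcoin's variable-length integer encoding (up to 4-byte lengths)."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    return b"\xfe" + n.to_bytes(4, "little")


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and a[1] != b[1]:
        return None
    if a == b:
        lam = (3 * a[0] * a[0] * pow(2 * a[1], FIELD_P - 2, FIELD_P)) % FIELD_P
    else:
        lam = ((b[1] - a[1]) * pow((b[0] - a[0]) % FIELD_P, FIELD_P - 2, FIELD_P)) % FIELD_P
    x3 = (lam * lam - a[0] - b[0]) % FIELD_P
    return (x3, (lam * (a[0] - x3) - a[1]) % FIELD_P)


def point_mul(pt, k: int):
    """Double-and-add scalar multiplication."""
    result = None
    for i in range(256):
        if (k >> i) & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
    return result


def lift_x(x: int):
    """Return the curve point with this x coordinate and an even y, if any."""
    if x >= FIELD_P:
        return None
    y_sq = (pow(x, 3, FIELD_P) + 7) % FIELD_P
    y = pow(y_sq, (FIELD_P + 1) // 4, FIELD_P)
    if pow(y, 2, FIELD_P) != y_sq:
        return None
    return (x, y if y % 2 == 0 else FIELD_P - y)


def tapleaf_hash(script: bytes, leaf_version: int = 0xC0) -> bytes:
    """Hash of one script leaf; with a single leaf this is also the Merkle root."""
    return tagged_hash("TapLeaf", bytes([leaf_version]) + compact_size(len(script)) + script)


def tweak_pubkey(internal_key: bytes, merkle_root: bytes) -> bytes:
    """Compute the x-only output key Q = P + t*G, where t commits to the scripts."""
    t = int.from_bytes(tagged_hash("TapTweak", internal_key + merkle_root), "big")
    if t >= ORDER_N:
        raise ValueError("tweak out of range")
    internal_point = lift_x(int.from_bytes(internal_key, "big"))
    if internal_point is None:
        raise ValueError("internal key is not on the curve")
    q = point_add(internal_point, point_mul(G, t))
    if q is None:
        raise ValueError("tweaked key is the point at infinity")
    return q[0].to_bytes(32, "big")


# Placeholder inputs: the internal key is G's x coordinate (private key 1),
# and the script is a tiny OP_FALSE OP_IF "ord" OP_ENDIF fragment.
internal_key = G[0].to_bytes(32, "big")
script = bytes.fromhex("0063036f726468")
output_key = tweak_pubkey(internal_key, tapleaf_hash(script))
print(output_key.hex())
```

The resulting 32-byte key is what appears in the commit output. Changing a single byte of the script changes the output key, which is how the output commits to the script without showing it.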
The above is how the commit-reveal mechanism is applied in P2TR. We will illustrate it with a practical case.
Using a blockchain explorer (https://www.blockchain.com/), let's study the minting process of an Ordinal image NFT, including the commit and reveal stages described above.
First, the hash ID of the commit transaction is (2ddf90ddf7c929c8038888fc2b7591fb999c3ba3c3c7b49d54d01f8db4af585c). Note that the output of this transaction does not contain the inscription data (it actually contains the hash of the hexadecimal image file), and the web page shows no inscription information. The output address (bc1p4mtc…..) is actually a temporary address generated through the "tweaking" process (the public key representing the script's unlocking conditions), and it shares a private key with the Taproot main address (bc1pg2mp…). The second UTXO in this transaction is the returned "change". In this way, the inscription content is bound to the sats contained in the first UTXO.
Next, we check the record of the reveal transaction, whose hash ID is (e7454db518ca3910d2f17f41c7b215d6cba00f29bd186ae77d4fcd7f0ba7c0e1). Here we can see the Ordinals inscription information. The input address of this transaction is the temporary output address (bc1p4mtc…) generated by the previous transaction. The input's unlocking script contains the hexadecimal file of the original image, and the output of 0.00000546 BTC (546 satoshis) sends this NFT to the user's own Taproot main address (bc1pg2mp…). Based on the first-in-first-out principle and the rule that "the number of the first satoshi of the first output" is bound, the bound sat serial number stays the same even though the number of sats in the two UTXOs differs. So we can find the satoshi that carries this inscription at (sat 1893640468329373).
(https://ordinals.com/sat/1893640468329373)
These two transactions (a parent-child pair) are submitted to the mempool by the wallet at the same time when minting, so the user pays the fee in a single step, and there is a high probability that both will be recorded by miners in the same block (in the example above, both transactions are in block 790468). Miners and nodes then verify the reveal transaction by checking the Schnorr signature and the hexadecimal image data provided in its input against the hash committed in the output locking script of the commit transaction. If they match, the transaction is considered valid and the Bitcoin UTXO can be spent. The two transactions are then permanently recorded in the BTC blockchain database, and the NFT image is saved and can be displayed. If the hashes do not match, the reveal transaction is rejected and the inscription fails.
BRC20 protocol and indexer
Under the Ordinal protocol, if we inscribe a piece of text, it is a text NFT (corresponding to Loot on Ethereum); if we inscribe a picture, it is a picture NFT (corresponding to PFPs on Ethereum); and if we inscribe a piece of music, it is an audio NFT. So what if we inscribe a piece of code, and that code is code for "issuing a fungible token (FT)"?
BRC20 deploys, mints, and transfers tokens by using the Ordinal protocol to inscribe data in JSON format. The JSON contains fields describing the token's properties, such as its supply, maximum amount per mint, and unique ticker. As discussed in a previous article, a BRC20 token is in essence a semi-fungible token (SFT): in some cases it is traded as an NFT, and in other cases as an FT. How is this control over "different situations" achieved? The answer is indexers.
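For reference, the JSON payloads look roughly like the following sketch, which uses the field names from the BRC-20 convention (`p`, `op`, `tick`, `max`, `lim`, `amt`) with numeric values written as strings. The helper function is hypothetical, and the ticker and amounts are only sample values.

```python
import json


def brc20_inscription(op: str, tick: str, **fields: str) -> str:
    """Build the compact JSON text that would be inscribed."""
    payload = {"p": "brc-20", "op": op, "tick": tick, **fields}
    return json.dumps(payload, separators=(",", ":"))


# Sample values for illustration only.
deploy = brc20_inscription("deploy", "ordi", max="21000000", lim="1000")
mint = brc20_inscription("mint", "ordi", amt="1000")
transfer = brc20_inscription("transfer", "ordi", amt="100")

print(deploy)    # {"p":"brc-20","op":"deploy","tick":"ordi","max":"21000000","lim":"1000"}
print(mint)      # {"p":"brc-20","op":"mint","tick":"ordi","amt":"1000"}
print(transfer)  # {"p":"brc-20","op":"transfer","tick":"ordi","amt":"100"}
```

Bitcoin itself treats these strings as opaque inscription content. They only acquire meaning when an off-chain program reads and interprets them.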
The indexer is essentially a bookkeeper that records the information it receives into categories in a database. In the Ordinal protocol, the indexer determines how the numbered sats move between addresses by tracking inputs and outputs. In the BRC-20 protocol, the indexer has an additional function: recording the changes in the token balances described by inscriptions at different addresses.
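A toy version of such a bookkeeper shows why a BRC20 transfer takes two steps: inscribing a transfer moves part of the balance into a "transferable" inscription, and sending that inscription credits the recipient. This is a heavily simplified sketch with a single token and no supply cap or mint limit checks; the class and method names are hypothetical and do not reflect any real indexer's code.

```python
from collections import defaultdict


class ToyBrc20Indexer:
    """Tracks balances for a single token by replaying inscription events."""

    def __init__(self):
        self.available = defaultdict(int)  # address -> spendable balance
        self.transferable = {}             # inscription id -> (owner, amount)

    def mint(self, owner: str, amount: int) -> None:
        self.available[owner] += amount

    def inscribe_transfer(self, inscription_id: str, owner: str, amount: int) -> bool:
        """Step 1: lock part of the balance into a transfer inscription."""
        if self.available[owner] < amount:
            return False  # invalid inscriptions are simply ignored
        self.available[owner] -= amount
        self.transferable[inscription_id] = (owner, amount)
        return True

    def send_inscription(self, inscription_id: str, recipient: str) -> bool:
        """Step 2: moving the inscription's UTXO credits the recipient."""
        if inscription_id not in self.transferable:
            return False
        _owner, amount = self.transferable.pop(inscription_id)
        self.available[recipient] += amount
        return True


indexer = ToyBrc20Indexer()
indexer.mint("alice", 1000)
indexer.inscribe_transfer("inscription-0", "alice", 100)
indexer.send_inscription("inscription-0", "bob")
print(dict(indexer.available))  # {'alice': 900, 'bob': 100}
```

The balances exist only in the indexer's own database, so two indexers that apply different rules to the same on-chain inscriptions would report different balances.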
So from the perspective of the bookkeepers, we can see tokens existing in different forms: BRC20 tokens actually live in a three-layer database. In layer 1, the bookkeeper is the BTC miner, the database type is a "chained database", and the BTC generated is an FT asset. In layer 2, the bookkeeper is the Ordinal indexer, the database type is a "relational database", and the numbered sats generated are NFT assets. In layer 3, the bookkeeper is the BRC20 indexer, the database type is a "relational database", and the BRC20 assets generated are FT assets. When we count BRC20 in "pieces" (whole inscriptions), we take the point of view of the Ordinal indexer (which records them), so they are naturally NFTs; when we think of BRC20 in terms of split amounts (especially after depositing to a centralized exchange), we take the point of view of the BRC20 indexer (recorded by this indexer or by the centralized exchange's server), so they are naturally FTs. From this we can conclude that the semi-fungible token (SFT) exists because of the different levels of bookkeepers.
Isn't the blockchain just a distributed database, with a group of bookkeepers such as miners jointly maintaining this "chained database" (because only a chained database can be truly decentralized)? Yet going round and round, we are back on the old path of centralized "relational databases". This is also the essential reason why, some time ago, the initiator of the Ordinal protocol, the initiator of the BRC20 protocol, and the unisat wallet argued so heatedly about whether to upgrade the indexer: the bookkeepers had different opinions.
However, after more than ten years of development, the industry has accumulated a lot of "decentralization" experience. Can indexers use "chained databases" to replace relational databases? Can fraud proofs or ZKPs be used to guarantee security and decentralization? Will the demand for DA in the Bitcoin ecosystem spill over to other DA layers, thereby promoting the prosperity and integration of multi-chain ecosystems? I seem to see more possibilities.