Ethereum Storage Roadmap: Challenges and Opportunities

The growing storage requirements pose a huge challenge to Ethereum nodes.

Written by: EthStorage

Summary

  • The growing storage requirements pose a huge challenge to Ethereum nodes.

  • Due to storage limitations, some clients have begun to prune historical data, resulting in inconsistent storage behavior between full nodes in the network.

  • To ensure consistency across all clients, historical data pruning is being standardized through EIP-4444 and EIP-4844.

  • As a result, restoring the latest L1 or L2 state by replaying historical data will require centralized, out-of-protocol services, which prompts the exploration of more decentralized, Ethereum-aligned solutions.

  • The Ethereum Portal Network is a lightweight, decentralized P2P network for all types of Ethereum data, including historical data. It is designed for resource-constrained devices and provides Ethereum JSON-RPC services. The history network and beacon chain network are almost ready.

  • The EthStorage network is an incentivized, modular storage network for EIP-4844 BLOB data. To store a BLOB, a user calls the put() method of the L1 storage contract, paying ETH as a storage fee and recording the BLOB hash on-chain. Over time, the storage fee is gradually distributed to storage providers who submit proofs of their off-chain BLOB storage. The EthStorage testnet is running on the Ethereum Sepolia testnet, and multiple community participants have successfully proven their local storage.

  • Future plans include developing a decentralized Ethereum state network, implementing proof of storage for dynamically sized data, and decentralized access directly from the browser.

Acknowledgements: Thanks to Piper Merriam from EF, Karthik Raju from Polychain, and Qiang from EthStorage for their feedback on this post.

Background

On October 22, 2023, Péter Szilágyi, the well-known development lead of Go-Ethereum (Geth), expressed deep concerns on Twitter. He pointed out that while the Geth client retains all historical data, other Ethereum clients such as Nethermind and Besu can be configured to delete certain historical Ethereum data (such as historical blocks and block headers). This makes client behavior inconsistent and, in his view, is unfair to Geth. His post triggered heated discussion and debate about the storage part of the Ethereum roadmap.

Storage Challenges

Why did Nethermind and Besu choose to stop storing historical data? What were the issues behind this decision? From our perspective, there are two main reasons:

  • Ethereum client storage requirements keep growing.

  • There are no in-protocol incentives or penalties for storing Ethereum historical data.

The first reason stems from the ever-increasing storage requirements of running Ethereum clients. To get a deeper understanding of the specific requirements, the pie chart below shows the storage distribution of a new Geth node as of block 18,779,761 on December 13, 2023.

As shown in the figure:

  • Total storage size: 925.39 GB

  • Historical data (blocks/transaction receipts): about 628.69 GB

  • State data in Merkle Patricia Trie (MPT): approximately 269.74 GB

The second reason is the lack of in-protocol incentives or penalties for storing historical blocks. While the protocol requires nodes to store all historical data, it provides no mechanism to reward storage or punish violations. Storing and sharing historical data is therefore purely altruistic, and client operators are free to delete or modify historical data without any penalty. In contrast, validator nodes must maintain and update the complete state locally to avoid being slashed for proposing or voting for invalid blocks.

Therefore, it is not surprising that some node operators choose to remove historical data when storage costs become a significant burden. Without historical data, a node can cut its storage footprint from approximately 1 TB to around 300 GB.
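As a quick check of these numbers, the following sketch recomputes the split from the Geth figures quoted above. The variable names are illustrative only; the three input values are the ones given in the text.

```python
# Figures quoted in the text for a Geth node at block 18,779,761 (in GB).
total_gb = 925.39
history_gb = 628.69   # blocks and transaction receipts
state_gb = 269.74     # state in the Merkle Patricia Trie

# Whatever is neither history nor state (indexes, metadata and so on).
other_gb = total_gb - history_gb - state_gb

# Footprint left after dropping all historical data.
without_history_gb = total_gb - history_gb

# Share of the total disk usage taken by historical data.
history_share = history_gb / total_gb

print(f"other data:        {other_gb:.2f} GB")
print(f"without history:   {without_history_gb:.2f} GB")
print(f"history share:     {history_share:.1%}")
```

Historical data accounts for roughly two thirds of the disk usage, which is why pruning it brings a node down to about 300 GB.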

Image: Nethermind configured to run a node without historical blocks, which currently saves about 460 GB of storage

With the upcoming Ethereum Data Availability (DA) upgrades, storage challenges will intensify. The road to fully scaling Ethereum DA started with EIP-4844 in the Dencun upgrade, which introduced fixed-size binary large objects (BLOBs) and an independent fee model based on blobGasPrice. Each BLOB is 128 KB, and EIP-4844 allows up to 6 BLOBs per block. To expand data throughput, Ethereum plans to adopt 1D Reed-Solomon erasure coding, initially allowing 32 BLOBs per block and reaching 256 BLOBs per block when fully scaled.

If the Ethereum DA runs at full capacity (256 BLOBs per block), the Ethereum DA network is expected to receive approximately 80 TB of DA data per year, a number that far exceeds the storage capacity of most nodes.
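The yearly figure can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes a 12-second block time, 128 KiB per BLOB, and that every block is full; the function name is hypothetical.

```python
BLOB_SIZE_BYTES = 128 * 1024          # one BLOB, assuming 128 KB means 128 KiB
BLOCK_TIME_SECONDS = 12               # assumed slot time
SECONDS_PER_YEAR = 365 * 24 * 3600


def da_bytes_per_year(blobs_per_block: int) -> int:
    """DA data produced in one year if every block carries blobs_per_block BLOBs."""
    blocks_per_year = SECONDS_PER_YEAR // BLOCK_TIME_SECONDS
    return blobs_per_block * BLOB_SIZE_BYTES * blocks_per_year


# The three stages mentioned in the text: EIP-4844, initial scaling, full scaling.
for blobs in (6, 32, 256):
    tib = da_bytes_per_year(blobs) / 2**40
    print(f"{blobs:>3} BLOBs/block -> {tib:.1f} TiB per year")
```

With 256 BLOBs per block this gives about 80.2 TiB per year, which matches the estimate of approximately 80 TB.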

Ethereum Storage Roadmap and Its Consequences

Image: The Ethereum roadmap tweeted by Vitalik, in which the Purge mainly concerns storage

The rising storage costs have attracted the attention of researchers in the Ethereum ecosystem. To address this issue and ensure consistency across all clients, researchers are working on proposals that explicitly allow historical data to be pruned. The two main proposals are:

  1. EIP-4444 (Bound Historical Data in Execution Clients): This proposal allows clients to prune historical blocks older than one year. Assuming an average block size of 100 KB and a block time of 12 seconds, historical block data is capped at about 250 GB (100 KB * (3600 * 24 * 365) / 12).

  2. EIP-4844 (Shard Blob Transactions): EIP-4844 discards BLOBs older than 18 days. This is more aggressive than EIP-4444 and caps historical BLOB data at around 100 GB ((18 * 3600 * 24) * 128 KB * 6 / 12, assuming a block time of 12 seconds).

What are the consequences of every client deleting historical data? One major problem is that a new node can no longer reach the latest state through "full sync" mode, which executes every transaction from the genesis block to the latest block. Instead, we must adopt "snap sync" or "state sync", which synchronizes the latest state directly from Ethereum peers. This method is already implemented in Geth and is its default sync mode.

Likewise, this consequence applies to all L2s: a new L2 node can no longer fully sync by replaying from the L2 genesis on Ethereum to the latest L2 block. In addition, since L1 nodes do not maintain L2 state, an L2 "snap sync" cannot derive the latest L2 state from L1, which breaks an important L2 assumption, namely that L2s inherit Ethereum's security guarantees. The expected workaround relies on third parties such as Infura, Etherscan, or the L2 projects themselves to store historical L2 data or copies of the state. This is a centralized solution that depends on out-of-protocol, indirect incentives.
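The two bounds in the list above can be worked through as follows. This is a sketch under the same assumptions as the text (12-second blocks, 100 KB average block size, 6 BLOBs of 128 KB per block), reading KB and GB as binary units; the helper name is hypothetical.

```python
KIB = 1024
GIB = 1024**3
DAY_SECONDS = 24 * 3600


def retained_bytes(window_seconds: int, bytes_per_block: int, block_time: int = 12) -> int:
    """Upper bound on data kept when only the last window_seconds of blocks are retained."""
    blocks_in_window = window_seconds // block_time
    return blocks_in_window * bytes_per_block


# EIP-4444: one year of blocks at an assumed average of 100 KiB each.
eip4444_bytes = retained_bytes(365 * DAY_SECONDS, 100 * KIB)

# EIP-4844: 18 days of blocks, each carrying the maximum of 6 BLOBs of 128 KiB.
eip4844_bytes = retained_bytes(18 * DAY_SECONDS, 6 * 128 * KIB)

print(f"EIP-4444 bound: {eip4444_bytes / GIB:.1f} GiB")
print(f"EIP-4844 bound: {eip4844_bytes / GIB:.1f} GiB")
```

The results, about 250.6 GiB and 94.9 GiB, agree with the rounded figures of 250 GB and 100 GB.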

The core issues we want to explore are:

  • Can we find better decentralized solutions for storage and access?

  • Is it possible to have a direct incentive mechanism that is aligned with Ethereum (e.g., built on top of an L1 contract)?

  • On top of all this, can we provide a fully decentralized solution for the Ethereum storage roadmap with direct, in-protocol incentives?

Solutions

Solution 1: Ethereum Portal Network

The Ethereum Portal network is a lightweight, decentralized access network for connecting to the Ethereum protocol. It provides Ethereum JSON-RPC interfaces such as eth_call, eth_getBlockByNumber, etc. It converts JSON-RPC requests into P2P requests to the distributed hash table (DHT), similar to the IPFS network. Unlike IPFS, which allows the storage of any data type and is susceptible to junk data, the Portal P2P network specializes in hosting Ethereum data such as historical block headers and block transaction data. This is achieved through the light client verification technology built into the Portal network.
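To make the JSON-RPC side concrete, here is a minimal sketch that builds a standard eth_getBlockByNumber request of the kind a Portal client is described as serving. The helper name is hypothetical, and nothing is sent over the network because the endpoint depends on how a given client is configured.

```python
import json


def make_request(method: str, params: list, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request object (hypothetical helper, not a Portal API)."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


# eth_getBlockByNumber takes a hex-encoded block number and a flag that selects
# full transaction objects (True) or only transaction hashes (False).
block_number = 18_779_761
request = make_request("eth_getBlockByNumber", [hex(block_number), False])

print(json.dumps(request))
```

From the caller's point of view the request is identical to one sent to a full node; the difference described above is that a Portal client answers it by looking the data up in the DHT rather than in a local database.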

An important feature of the Portal Network is its lightweight design and its friendliness to resource-constrained devices. A node can run with a few megabytes of storage and little memory, which promotes decentralization. Even a mobile phone or Raspberry Pi device can potentially join the network and contribute to the availability of Ethereum data.

The Portal Network was developed in line with Ethereum’s client diversity philosophy, with clients written in Rust, JavaScript, and Nim. The Beacon Network and History Network are already available, while the State Network is under active development. It is worth noting that the Portal Network does not provide direct incentives for data storage — all nodes in the network operate in an altruistic manner.

Image: Portal Network Rust client (Trin) in action with 100MB storage limit

Solution 2: EthStorage Network

The EthStorage Network is a decentralized, incentivized storage network dedicated to storing EIP-4844 BLOBs, and it is funded by the Ethereum Foundation's Ecosystem Support Program (ESP). Its main properties are:

  • Trust-minimized: Unlike existing solutions that require a centralized data bridge, EthStorage relies on Ethereum consensus and a 1/m trust model of permissionless EthStorage storage nodes. To store a BLOB, a user signs a transaction carrying the BLOB and calls the storage contract's put(key, blob_idx) method. The storage contract then records the BLOB hash on-chain, and storage providers download and store the BLOB directly from the Ethereum DA network, which bypasses the data bridge problem.

  • Storage costs aligned with incentives: When calling put(), the transaction must send a storage fee (via msg.value), which is deposited into the contract. After an off-chain storage node submits a storage proof and the proof is successfully verified, the fee is gradually distributed to the storage node over time. Compared with the existing Ethereum storage fee model, which pays a one-time fee to the proposer, paying the fee over time follows a discounted cash flow model, on the assumption that storage costs will fall relative to the ETH price over time. This is a major innovation introduced by EthStorage: it keeps fees aligned with the storage contribution of storage nodes.

  • Proof of storage: The storage proof is inspired by data availability sampling, except that EthStorage samples BLOBs that have been stored over a period of time. To verify the sampling efficiently on-chain, EthStorage makes full use of smart contracts and recent SNARK developments.

  • Permissionless operation: Any storage node in EthStorage can get paid as long as it stores data and regularly submits storage proofs on-chain.
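The put-then-pay-over-time flow described in the list above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class, its fields, the exponential discount curve, and the numeric values are all hypothetical and are not EthStorage's actual contract or fee formula. In particular, the toy put() takes the BLOB hash directly instead of a blob_idx, and proof verification is reduced to a boolean.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ToyStorageContract:
    """In-memory stand-in for the L1 storage contract (hypothetical, not the real ABI)."""
    upfront_fee: float      # ETH that put() must receive (assumed constant)
    discount_rate: float    # assumed per-second discount rate
    blob_hashes: dict = field(default_factory=dict)   # key -> recorded BLOB hash
    paid_until: dict = field(default_factory=dict)    # key -> seconds since put() already paid

    def put(self, key: str, blob_hash: str, value: float) -> None:
        """Record the BLOB hash and lock the fee; value plays the role of msg.value."""
        if value < self.upfront_fee:
            raise ValueError("storage fee too low")
        self.blob_hashes[key] = blob_hash
        self.paid_until[key] = 0.0

    def claim(self, key: str, now: float, proof_ok: bool) -> float:
        """Release the fee share for the interval [paid_until, now) if the proof verified."""
        if not proof_ok:
            return 0.0
        start = self.paid_until[key]
        r = self.discount_rate
        # Discounted cash flow: later seconds are worth exponentially less, and the
        # payouts over all time add up to exactly upfront_fee.
        reward = self.upfront_fee * (math.exp(-r * start) - math.exp(-r * now))
        self.paid_until[key] = now
        return reward


contract = ToyStorageContract(upfront_fee=1.0, discount_rate=1e-8)
contract.put("my-file", "0xabc", value=1.0)
first = contract.claim("my-file", now=1e7, proof_ok=True)
second = contract.claim("my-file", now=2e7, proof_ok=True)
print(first, second)
```

In this model two equally long proving periods earn decreasing payouts, and a node that stops submitting valid proofs simply stops being paid, which is the alignment between fees and storage contribution that the text describes.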

From the perspective of a modular blockchain, EthStorage acts as a storage L2 for Ethereum, except that it charges storage fees instead of transaction fees. By indexing BLOB hashes on-chain, EthStorage serves as a modular storage layer for Ethereum that improves storage scalability and reduces costs (the goal is a reduction of roughly 1000x).

On the development side, EthStorage has been integrated with EIP-4844 on the Ethereum Sepolia testnet. We have stress-tested EthStorage on the Sepolia testnet, including writing hundreds of GB of BLOBs to EthStorage. More than 100 community participants have joined the network and successfully proven their local storage.

The main advantage of the EthStorage Network is that it provides decentralized, direct incentives on top of Ethereum, which to our knowledge is a first. However, the network is limited in that it is designed specifically for fixed-size BLOBs.

Image: EthStorage dashboard on the Ethereum Sepolia testnet

Looking to the Future

Although Ethereum storage has not received major attention, it is of great significance in the Ethereum ecosystem. With the rapid growth of the Ethereum network, the storage and accessibility of Ethereum data has become a key challenge. The Portal Network and EthStorage Network are still in their early stages, and there are many important long-term development directions to focus on:

  • Decentralized, low-latency access to Ethereum state data: Accessing the Ethereum state in a decentralized and verifiable way is a critical but challenging task. With the traditional DHT network model, querying account information usually requires multiple queries for internal trie nodes stored on different P2P nodes, which often results in considerable latency. The key is to leverage the structure of the state trie to speed up access. The state network that the Ethereum Portal Network plans to launch is designed to solve this problem.

  • Integration of the Portal Network and the EthStorage Network: The Portal Network can be seamlessly extended to support BLOB data, and the EthStorage team has partially implemented this feature. The next step is to unify these networks into a decentralized JSON-RPC network that offers programmatic access to BLOBs through contracts. By combining application logic in contracts with the scalable BLOB storage provided by EthStorage, we can enable new dApps on Ethereum, such as dynamic decentralized websites (for example, a decentralized Twitter, YouTube, or Wikipedia).

  • Decentralized access from browsers: Similar to the ipfs:// protocol for accessing data in the IPFS network, the web3 industry needs an Ethereum-native access protocol that supports direct browser access and unlocks the potential of Ethereum's rich data. This data covers a wide range of areas, from token ownership and account balances to NFT images and dynamic decentralized websites, all made possible by smart contracts and future Ethereum storage capabilities. In this space, the web3:// protocol defined by ERC-4804/6860 is being actively developed and promoted to achieve this goal.

  • Advanced storage proofs for dynamically sized data: Beyond fixed-size BLOBs, advanced storage proofs are needed to handle dynamically sized data such as historical blocks or even state objects. Developing more sophisticated algorithms can make storage solutions more adaptable.
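The latency concern raised in the first item of the list above can be seen in a toy model where every trie node lives in a content-addressed store and each fetch stands for one network round trip. This is a simplified sketch: it uses SHA-256 over JSON and a plain hex trie with no path compression, whereas Ethereum's Merkle Patricia Trie uses Keccak-256 over RLP and has extension nodes. All function names are hypothetical.

```python
import hashlib
import json


def store(network: dict, node: dict) -> str:
    """Save a node in the simulated DHT under its content hash and return that hash."""
    encoded = json.dumps(node, sort_keys=True).encode()
    node_hash = hashlib.sha256(encoded).hexdigest()
    network[node_hash] = node
    return node_hash


def build_trie(network: dict, items: dict, depth: int = 0) -> str:
    """Build a hex trie over equal-length hex-string keys and return the root hash."""
    first_key = next(iter(items))
    if depth == len(first_key):
        return store(network, {"leaf": items[first_key]})
    children = {}
    for nibble in "0123456789abcdef":
        subset = {k: v for k, v in items.items() if k[depth] == nibble}
        if subset:
            children[nibble] = build_trie(network, subset, depth + 1)
    return store(network, {"children": children})


def lookup(network: dict, root: str, key: str):
    """Walk from the root to the leaf, counting one fetch per node visited."""
    fetches = 0
    node_hash = root
    for nibble in key:
        node = network[node_hash]
        fetches += 1
        node_hash = node["children"][nibble]
    leaf = network[node_hash]
    fetches += 1
    return leaf["leaf"], fetches


network = {}
balances = {"a1b2": 100, "a1c3": 7, "ff00": 42}
root = build_trie(network, balances)
value, fetches = lookup(network, root, "a1b2")
print(value, fetches)
```

Each fetch depends on the hash returned by the previous one, so the round trips cannot be issued in parallel. In a real DHT every one of them may hit a different peer, which is why a state network needs to exploit the trie structure to shorten the path.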

We hope that these efforts can collectively contribute to the Ethereum roadmap and lay the foundation for future decentralized storage solutions in the Ethereum ecosystem.
