📦Data Sampling

The data availability layer receives data from the Execution Layer and acts as the data storage module for the Rollup, storing the original transaction data. ME Network uses Data Sampling[1], which lets Light Nodes (also known as Light Clients) retrieve a subset of a block's data to verify its availability without downloading the entire block. This is crucial for data availability because it spreads the verification work across the network, saves storage and bandwidth, and improves scalability and efficiency.

ME Network employs 2D Reed-Solomon Encoding[2], which combines encoding and sampling to ensure data availability and to provide the foundation for trust-minimized Light Nodes.

This is similar to how a scratched CD can still play a movie: the disc relies on the same kind of error-correcting code.

In 2D Reed-Solomon Encoding, the initial encoding step is part of the erasure coding process itself. Erasure coding is a data protection method in which data is divided into fragments, extended with redundant encoded fragments, and then stored.

In ME Network, data is arranged in a two-dimensional square grid (imagine it like a disc, but not circular), and both rows and columns are erasure-coded with a standard one-dimensional code. The encoding scheme uses a certain number of original data chunks and an equal number of parity chunks. (Think of a basket holding 3 valuable gold bars to which 3 extra bars derived from the originals are added: any 3 of the 6 bars are enough to reconstruct the original 3.) The code rate is therefore 50%, meaning that any 50% of the chunks in a row or column are sufficient to recover the original data of that row or column.
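The "any half is enough" property can be illustrated with a small one-dimensional sketch. It is a toy Reed-Solomon code over a prime field, assuming the k original values are treated as evaluations of a polynomial at positions 0..k-1 and the k parity values are its evaluations at positions k..2k-1. The modulus `P` and the function names are assumptions made for illustration; ME Network's actual field, chunk size, and algorithms are not specified here, and production codes use far faster methods than this quadratic interpolation.

```python
P = 2**31 - 1  # prime field modulus (an assumption for this sketch)


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (xi, yi) points, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def rs_extend(data):
    """Return the k original values followed by k parity values."""
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [lagrange_eval(points, x) for x in range(k, 2 * k)]


def rs_recover(shares, k):
    """Rebuild the k original values from any k surviving shares.
    `shares` maps position (0..2k-1) to value."""
    if len(shares) < k:
        raise ValueError("need at least k shares to recover")
    points = list(shares.items())[:k]
    return [lagrange_eval(points, x) for x in range(k)]


data = [3, 1, 4, 1]
extended = rs_extend(data)
# Lose half of the shares, including three of the four originals.
survivors = {pos: extended[pos] for pos in (1, 4, 6, 7)}
assert rs_recover(survivors, k=4) == data
```

Any other choice of four surviving positions works the same way, because a polynomial of degree below k is fully determined by k of its evaluations.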

This encoding process is crucial for the functionality of Light Nodes, as Light Nodes can determine whether the data in each block is available without needing to download the entire block itself. In ME Network, Light Nodes obtain block data and relevant data for verifying fraud proofs through a process of random sampling.
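To give a feel for why a handful of random samples is enough, the sketch below computes the chance that a Light Node notices withheld data. It relies on a known property of two-dimensional Reed-Solomon squares: if a k x k square is extended to 2k x 2k, the data only becomes unrecoverable when at least (k+1)^2 chunks are withheld. The function name and the example value of k are assumptions for illustration, and the calculation assumes the adversary withholds exactly that minimum and that samples are drawn uniformly without replacement.

```python
def detection_confidence(k, samples):
    """Probability that at least one of `samples` distinct random chunks
    lands on a withheld chunk, when (k+1)^2 of the (2k)^2 chunks are missing."""
    total = (2 * k) ** 2
    withheld = (k + 1) ** 2
    if not 0 <= samples <= total:
        raise ValueError("samples must be between 0 and (2k)^2")
    miss_all = 1.0
    for i in range(samples):
        # Chance that the next draw also misses, given all earlier draws missed.
        miss_all *= max(0.0, 1 - withheld / (total - i))
    return 1 - miss_all


for s in (5, 10, 15, 20):
    print(s, round(detection_confidence(64, s), 4))
```

Under these assumptions and with k = 64, about 15 samples already push a single Light Node's confidence above 98%, and the confidence keeps rising with every additional sample.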

Here is how it works in detail. In ME Network, the data of each block is divided into smaller data chunks, which are arranged in a matrix. 2D Reed-Solomon (RS) erasure coding is then applied to extend this matrix with parity data into a larger matrix. Merkle roots are calculated over the rows and columns of the extended matrix, and these roots are included in the block header as part of the block's data commitment.
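The pipeline described above can be sketched end to end on a tiny matrix. This is a minimal illustration, not ME Network's implementation: it assumes a toy Reed-Solomon code over the prime field `P`, 4-byte big-endian chunk encoding, and a plain SHA-256 binary Merkle tree. All function names are hypothetical, and the real tree construction and hash function are not specified in this section.

```python
import hashlib

P = 2**31 - 1  # prime field modulus (an assumption for this sketch)


def _interpolate(points, x):
    """Lagrange-evaluate the polynomial through `points` at x, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def extend_1d(values):
    """Append k parity values to k original values."""
    k = len(values)
    points = list(enumerate(values))
    return list(values) + [_interpolate(points, x) for x in range(k, 2 * k)]


def extend_2d(square):
    """Extend a k x k matrix to 2k x 2k: rows first, then every column."""
    k = len(square)
    top = [extend_1d(row) for row in square]  # k rows of length 2k
    columns = [extend_1d([top[r][c] for r in range(k)]) for c in range(2 * k)]
    return [[columns[c][r] for c in range(2 * k)] for r in range(2 * k)]


def merkle_root(leaves):
    """Root of a simple binary SHA-256 Merkle tree over byte-string leaves."""
    level = [hashlib.sha256(b"\x00" + leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def row_and_column_roots(extended):
    """One Merkle root per row and one per column of the extended matrix."""
    size = len(extended)
    enc = lambda v: v.to_bytes(4, "big")
    row_roots = [merkle_root([enc(v) for v in row]) for row in extended]
    col_roots = [merkle_root([enc(extended[r][c]) for r in range(size)])
                 for c in range(size)]
    return row_roots, col_roots


extended = extend_2d([[1, 2], [3, 4]])
row_roots, col_roots = row_and_column_roots(extended)
```

Because the code is linear, extending rows first and columns second yields the same matrix as the opposite order, so every row and every column of the result is itself a valid codeword.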

To verify data availability, Light Nodes in ME Network sample chunks of the extended matrix. They randomly select a set of unique coordinates within the matrix and query Full Nodes for the chunks at those coordinates together with their Merkle proofs. If a Light Node receives a valid response to every sampling query, it is highly likely that the data for the entire block is available.
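The sampling loop can be sketched as follows. This is an illustration under several assumptions: `MockFullNode` is a hypothetical in-memory stand-in for a real Full Node, the matrix holds placeholder bytes rather than real Reed-Solomon output, proofs are checked against row roots only, the Merkle tree is a plain SHA-256 binary tree, and the matrix width is a power of two. In a real deployment the roots would come from the block header rather than from the node being queried.

```python
import hashlib
import random


def _h(data):
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all levels of a binary Merkle tree (leaf hashes first, root last).
    Assumes len(leaves) is a power of two."""
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hash at each level, from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify(root, leaf, index, proof):
    """Recompute the root from a leaf and its sibling path."""
    node = _h(b"\x00" + leaf)
    for sibling in proof:
        pair = sibling + node if index & 1 else node + sibling
        node = _h(b"\x01" + pair)
        index //= 2
    return node == root


class MockFullNode:
    """Hypothetical Full Node that serves chunks with row Merkle proofs."""

    def __init__(self, matrix, withheld=()):
        self.matrix = matrix
        self.trees = [build_tree(row) for row in matrix]
        self.withheld = set(withheld)

    def row_roots(self):
        return [tree[-1][0] for tree in self.trees]

    def get_share(self, row, col):
        if (row, col) in self.withheld:
            return None  # models a missing or refused response
        return self.matrix[row][col], prove(self.trees[row], col)


def sample_block(node, row_roots, n_samples, rng=None):
    """Return True only if every sampled chunk arrives with a valid proof."""
    rng = rng or random.SystemRandom()
    size = len(row_roots)
    for pick in rng.sample(range(size * size), n_samples):  # unique coordinates
        row, col = divmod(pick, size)
        response = node.get_share(row, col)
        if response is None:
            return False
        chunk, proof = response
        if not verify(row_roots[row], chunk, col, proof):
            return False
    return True


matrix = [[bytes([r, c]) for c in range(4)] for r in range(4)]
honest = MockFullNode(matrix)
header_roots = honest.row_roots()  # stands in for roots read from the block header
assert sample_block(honest, header_roots, n_samples=8)
```

A single missing or invalid response makes `sample_block` return `False`, which mirrors the rule that the block is treated as available only when every sampling query succeeds.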

This sampling process is critical to ME Network's data availability layer: without it, neither the data availability layer nor the Rollup could function properly. A complete data availability layer relies on repeated verification by Full Nodes, which store all data, and by Light Nodes, and it depends on key technologies such as 2D Reed-Solomon erasure coding and Merkle trees working together.

[1] celestia.org. Data Availability Sampling.

[2] I. S. Reed, G. Solomon. Polynomial Codes over Certain Finite Fields. 1960.
