Cascade Basics
Cascade Storage Protocol begins by accepting any external data object. Cascade then leverages an LT fountain-code algorithm to break each asset into a series of redundant partitions. Each partition contains a random selection of fragments of the combined file, and the partitions are distributed redundantly across the SuperNodes participating in the network.
Two parameters control how the data is encoded into partitions:
The size of each partition
The desired redundancy factor of each partition
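The encoding step can be sketched as follows. This is a simplified illustration, not Cascade's actual implementation: the function name, the fixed block layout, and the way the seed drives block selection are all assumptions made for clarity. Each partition is the XOR of a random subset of source blocks, with the choice of subset fully determined by a per-partition seed.

```python
import os
import random

def make_partitions(data: bytes, partition_size: int, redundancy_factor: int):
    """Illustrative sketch: split `data` into fixed-size source blocks, then
    build LT-style partitions, each the XOR of a random subset of blocks
    chosen by a per-partition seed. The seed is retained so the partition
    can be regenerated later from the original data."""
    blocks = [data[i:i + partition_size].ljust(partition_size, b"\0")
              for i in range(0, len(data), partition_size)]
    n_partitions = len(blocks) * redundancy_factor
    partitions = []
    for _ in range(n_partitions):
        seed = int.from_bytes(os.urandom(8), "big")
        rng = random.Random(seed)
        # Degree = how many source blocks this partition combines.
        degree = rng.randint(1, len(blocks))
        chosen = rng.sample(range(len(blocks)), degree)
        combined = bytearray(partition_size)
        for idx in chosen:
            for j, byte in enumerate(blocks[idx]):
                combined[j] ^= byte
        partitions.append((seed, bytes(combined)))
    return partitions
```

Raising the redundancy factor increases the number of partitions produced, so more SuperNodes can fail before the original data becomes unrecoverable.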
Partition Distribution and Assignment
The sets of partitions are automatically distributed across the network to randomly selected SuperNodes using the Kademlia DHT algorithm. Kademlia provides a useful “distance metric” that can be computed between any two binary strings, which removes the need for any central system to decide which node is responsible for which partitions. It also eliminates the need to iterate through SuperNodes to find the one holding the relevant partitions, and avoids any complicated logic for re-allocating partitions when SuperNodes enter or leave the network.
Each partition is uniquely identified by its SHA3-256 hash. We take the binary representation of each hexadecimal string (both the partition identifier and the SuperNode identifier) and compute the XOR distance between them. The smaller the result, the ‘closer’ the partition is to the SuperNode in the network.
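The XOR distance is simple to compute. A minimal sketch, with illustrative inputs (the function name and example byte strings are assumptions, not part of the protocol):

```python
import hashlib

def xor_distance(hex_a: str, hex_b: str) -> int:
    """Kademlia-style distance: interpret both hex IDs as integers and
    XOR them. A smaller result means the two IDs are 'closer'."""
    return int(hex_a, 16) ^ int(hex_b, 16)

# Both partitions and SuperNodes get 256-bit IDs in the same space.
partition_id = hashlib.sha3_256(b"partition bytes ...").hexdigest()
supernode_id = hashlib.sha3_256(b"supernode identifier ...").hexdigest()
distance = xor_distance(partition_id, supernode_id)
```

Because SHA3-256 output is effectively uniform, partitions spread evenly over the ID space, so no SuperNode is systematically overloaded.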
SuperNodes are responsible for storing the partitions that are ‘closest’ to them in the network. As new partitions are created and as SuperNodes enter and leave the network, the set of partitions nearest to a given SuperNode changes. This yields a completely distributed, deterministic way for the network to self-organize into a topology using random identifiers.
Network Self-Healing
Despite the redundancy and self-balancing introduced above, whereby a replacement SuperNode is automatically found when an old SuperNode leaves the network, it is still conceivable that a particular chunk could be lost forever. A very large and sudden drop in the number of available SuperNodes, whether from market forces or an attack on the network, could wipe out all the SuperNodes hosting a chunk before new SuperNodes can take it over.
However, if this event were to occur, there is a solution. Each chunk is uniquely determined by two items: the original data and a random seed generated when the chunks are first created. The set of these seeds, together with the file hash of each chunk, is recorded in the artwork registration ticket. If SuperNodes on the network determine that a given chunk is no longer available, the highest-ranked SuperNode can retrieve enough LT chunks to reconstruct the original file, then use the seed corresponding to the missing chunk to regenerate it from scratch. The result is easily verified by computing the file hash of the “new” chunk and checking that it matches the hash listed for that chunk in the original artwork registration ticket on the blockchain.
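The regeneration-and-verification loop can be sketched as below. This is a hypothetical illustration: the `regenerate_chunk` function, its seeded block-XOR construction, and the example seed are assumptions standing in for Cascade's actual LT encoder. What matters is that the same (data, seed) pair always reproduces a byte-identical chunk, so its hash can be checked against the on-chain ticket.

```python
import hashlib
import random

def regenerate_chunk(original: bytes, seed: int, chunk_size: int = 32) -> bytes:
    """Hypothetical sketch: deterministically rebuild an LT chunk from the
    reconstructed original data plus the seed recorded in the registration
    ticket. The same (data, seed) pair always yields the same bytes."""
    blocks = [original[i:i + chunk_size].ljust(chunk_size, b"\0")
              for i in range(0, len(original), chunk_size)]
    rng = random.Random(seed)
    chosen = rng.sample(range(len(blocks)), rng.randint(1, len(blocks)))
    out = bytearray(chunk_size)
    for idx in chosen:
        for j, byte in enumerate(blocks[idx]):
            out[j] ^= byte
    return bytes(out)

# At registration time, the chunk's hash and seed go into the ticket.
original = b"the fully reconstructed source file ..."
seed = 3735928559  # example seed; in practice generated at encoding time
ticket_hash = hashlib.sha3_256(regenerate_chunk(original, seed)).hexdigest()

# Later, after the chunk is lost, a SuperNode holding the reconstructed
# file rebuilds the chunk and verifies it against the on-chain hash.
rebuilt = regenerate_chunk(original, seed)
is_valid = hashlib.sha3_256(rebuilt).hexdigest() == ticket_hash
```

Because verification only requires hashing the rebuilt chunk and comparing against the ticket, any node can audit the repair without trusting the SuperNode that performed it.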