Sense Basics
A detailed technical background on how the Sense Protocol works.
Effective “near-duplicate image detection” is an open research problem in computer vision because visual data is extremely high-dimensional. Even a relatively tiny 100 KB JPEG file can easily comprise 500,000 or more pixels, each of which has a red, green, and blue component. Moreover, someone could edit that JPEG in Photoshop in a way that a human observer would immediately recognize as a simple derivative of the original image, yet end up changing every single one of those pixels, perhaps in highly complex ways that leave little of the original structure intact at the level of individual pixels.
Sense solves this problem by dramatically reducing the dimensionality involved while still retaining the high-level structural content of the data. This compressed representation becomes the NFT fingerprint: a list of numbers, rather than the original pixel data, that is robust to various transformations. In this manner, even if a candidate NFT is simply another known NFT with random noise added, its fingerprint will look suspiciously similar to the fingerprint of the original NFT. By quantifying this similarity, we obtain a measure that can be used as a relative rareness score.
By leveraging a variety of well-trained deep neural net models comprising tens of millions of artificial “neurons,” we can achieve exceptional results on very complex data classification tasks. When each model is fed a particular piece of data, it generates a list of N numbers in a particular order, which we refer to as the NFT fingerprint vector for that image and model. An analogy for this vector is as follows: imagine we take a human subject and scan their brain in real time to see exactly which nerve cells are active at any moment and how strongly each one is activated. We then show the subject the candidate NFT and record the resulting activation pattern in their brain as a series of numbers. Similar to how a human brain works, what the model “sees” is not just a mechanical description of the precise pixels, but rather a high-level depiction of the NFT’s contents. This ability to capture the “high-level abstract content” of the NFT is what makes these representations so powerful.
To construct the NFT fingerprint vector, we leverage a number of these well-trained models. Each model requires its own pre-processing pipeline applied to the NFT, such as resizing and changing the pixel representation. We then take the output of each individual model and concatenate the results into a single large fingerprint vector consisting of exactly 10,048 decimal numbers. The output below demonstrates the first 16 numbers of a fingerprint vector for a particular NFT:
Notably, some of the entries in the example above are exactly zero, which shows that there is real structure to each fingerprint: if a certain feature is not detected in the input data, some of the outputs are “switched off.” In any case, we have now transformed the input data into a fingerprint vector, a process that takes less than a few seconds to complete.
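As a hedged illustration of this step, the sketch below computes a fingerprint by running an image through two pretrained models and concatenating their pooled activations. The specific models (EfficientNetB0, ResNet152V2), input sizes, and pooling choices are assumptions for illustration only; the actual Sense models and their exact 10,048-dimensional output are not reproduced here.

```python
# Illustrative sketch only: the models and preprocessing shown here are
# assumptions, not the exact pipeline used by Sense.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import efficientnet, resnet_v2

# Each model gets its own pre-processing pipeline (resize + pixel scaling).
MODELS = [
    (efficientnet.EfficientNetB0(include_top=False, pooling="avg"),
     efficientnet.preprocess_input, (224, 224)),
    (resnet_v2.ResNet152V2(include_top=False, pooling="avg"),
     resnet_v2.preprocess_input, (224, 224)),
]

def compute_fingerprint(image_path: str) -> np.ndarray:
    """Concatenate each model's activation vector into one long fingerprint."""
    parts = []
    for model, preprocess, size in MODELS:
        img = tf.keras.preprocessing.image.load_img(image_path, target_size=size)
        x = preprocess(np.expand_dims(
            tf.keras.preprocessing.image.img_to_array(img), axis=0))
        parts.append(model.predict(x, verbose=0).flatten())
    return np.concatenate(parts)  # one long vector of decimal numbers

fingerprint = compute_fingerprint("candidate_nft.png")  # hypothetical file name
print(fingerprint[:16])  # first 16 entries; some may be exactly zero
```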
Near-Duplicate Detection
Once we have a representation of data that serves as the NFT fingerprint, we are able to more accurately assess the relative rareness of each fingerprint within the dataset. For example, consider the two scenarios below:
Scenario 1: Two Dissimilar Images:
Scenario 2: Similar Images:
For the relative rareness score to be useful, we need a similarity score under which the images in Scenario 1 receive a fairly low score, while the images in Scenario 2 receive a very high score. (Note that even if the subjects are very different, two images made by the same creator using similar techniques might still score higher against each other than against a completely unrelated image.) Also observe how in Scenario 2, every single pixel is changed meaningfully from its original value.
The NFT fingerprint vectors are robust to simple transformations, and we describe similarity with measures that look into the fingerprints at a deeper level. We leverage various tools in the form of different correlation measures and statistical dependency measures. We can compute the correlation between a candidate NFT fingerprint and an entire database of hundreds of thousands or millions of NFTs in just a few seconds. As a result, we take a candidate NFT fingerprint vector and return a list of correlation measures comparing it to the fingerprints of all previously registered NFTs in the system.
To reliably identify near-duplicates with a reasonable confidence interval, we leverage a variety of functions and correlation measures, some of which are fairly advanced and computationally intensive. For example, the system relies on correlation measures that operate on the ranks of the data rather than the data values themselves, and on measures of statistical dependency. Essentially, these measures tell us how “suspiciously similar” two fingerprint vectors are. Put another way, they let us measure how improbable it would be to find such particular patterns between the fingerprints if we were really looking at “random” or unrelated data. We combat the issue of false negatives by employing several differentiated, varied, and powerful similarity measures.
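To make this concrete, the sketch below computes a few value-based and rank-based measures between a candidate fingerprint and each registered fingerprint. The exact set of measures and their weighting used by Sense are not specified here; Pearson, Spearman, and Kendall are shown as representative examples.

```python
# Illustrative similarity measures between fingerprint vectors.
import numpy as np
from scipy import stats

def similarity_measures(candidate: np.ndarray, registered: np.ndarray) -> dict:
    """Compare two fingerprint vectors with value-based and rank-based measures."""
    pearson, _ = stats.pearsonr(candidate, registered)    # linear correlation on values
    spearman, _ = stats.spearmanr(candidate, registered)  # correlation on ranks
    kendall, _ = stats.kendalltau(candidate, registered)  # rank concordance (more expensive)
    return {"pearson": pearson, "spearman": spearman, "kendall": kendall}

def score_against_database(candidate, registered_fingerprints):
    """Return one dictionary of measures per previously registered fingerprint."""
    return [similarity_measures(candidate, r) for r in registered_fingerprints]
```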
Optimization Techniques
We then optimize the performance of the system to minimize false negatives and false positives alike by employing additional techniques.
In one instance, we assess the Pearson correlation scores of all registered NFTs versus the candidate NFT, and then compare the maximum correlation of any registered NFT to the 99.99th-percentile correlation across all registered NFTs. The percentage increase of the maximum correlation over the 99.99th-percentile correlation (the “Pearson Gain”) can convey very useful information if it is large enough. For example, suppose there are 10,000 registered fingerprints, and therefore 10,000 correlation scores, sorted in descending order. We compare the maximum to the 99.99th-percentile score; suppose the top score is 86.00% and the second score is 65.00%, implying a Pearson Gain of 86.00%/65.00% - 1 = 32.3%. This signifies that exactly one registered NFT had a much higher correlation than the rest of the dataset. Extending this across the entire dataset, we can identify correlation across broad clusters of NFT data objects. Implementing this requirement drastically improves the confidence threshold of our system.
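A minimal sketch of this check is shown below. Note that with 10,000 scores, the 99.99th percentile is essentially the second-highest score, which matches the worked example above; the function and variable names are illustrative.

```python
# Sketch of the "Pearson Gain" check described above.
import numpy as np

def pearson_gain(correlations: np.ndarray) -> float:
    """Ratio of the top correlation to the 99.99th-percentile correlation, minus one."""
    top = correlations.max()
    p9999 = np.percentile(correlations, 99.99)
    return top / p9999 - 1.0

# Example from the text: top score 86%, next-highest 65% -> gain of roughly 32.3%.
scores = np.concatenate([np.linspace(0.0, 0.65, 9999), [0.86]])
print(f"Pearson Gain: {pearson_gain(scores):.1%}")
```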
The system as outlined quantifies a similarity score on a spectrum of 0.00% to 100.00%, rather than a binary 0/1, in a way that resembles human intuition. We combine the results of the process described above to generate various “sub-scores,” which are then transformed into a single number between 0.00% and 100.00%. One sub-score sums up the various similarity measures and compares the sum to the “maximum” that would be achieved if the NFTs were the same, essentially averaging the results of the different similarity measures to the extent they are available. We then combine the sub-scores across each methodology to compute the Combined Relative Rareness Score.
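As a rough sketch under stated assumptions, one sub-score can be computed by averaging whatever similarity measures are available (each bounded by 1.0 when the NFTs are identical), and the sub-scores can then be collapsed into a single rareness figure; the actual weighting Sense uses is not specified here.

```python
# Hypothetical combination of sub-scores; the real weighting may differ.
def similarity_subscore(measures: dict) -> float:
    """Average the available similarity measures against the maximum possible value of 1.0."""
    available = [v for v in measures.values() if v is not None]
    return sum(available) / len(available)

def combined_relative_rareness(subscores: list) -> float:
    """Collapse per-methodology sub-scores into one number between 0.0 and 1.0.
    High similarity implies low rareness, and vice versa."""
    similarity = sum(subscores) / len(subscores)
    return max(0.0, min(1.0, 1.0 - similarity))
```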
Machine Learning Makes It Even Better
We employ a parallel approach using machine learning to further optimize the system. We start with a universe of known NFT files, such as open data from OpenSea. We then designate a certain percentage of this data as registered NFTs and compute their NFT fingerprint vectors. We treat the remaining data as held-out true originals; that is, we do not compute their NFT fingerprint vectors, and we know that none of these NFTs are in our database. Finally, we generate a large corpus of artificially created near-duplicate NFTs through transformation techniques, as shown in the examples below:
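The sketch below illustrates the kinds of transformations that can generate such artificial near-duplicates (noise, stretching, contour filtering, flips, rotations). The specific augmentation set used to build the actual training corpus may differ.

```python
# Illustrative near-duplicate generation via simple image transformations.
import numpy as np
from PIL import Image, ImageFilter, ImageOps

def make_near_duplicates(path: str) -> list:
    original = Image.open(path).convert("RGB")
    pixels = np.asarray(original, dtype=np.int16)
    noisy = Image.fromarray(
        np.clip(pixels + np.random.randint(-25, 26, pixels.shape), 0, 255).astype(np.uint8))
    return [
        noisy,                                                          # random pixel noise
        original.resize((original.width * 2, original.height // 2)),   # stretched version
        original.filter(ImageFilter.CONTOUR),                          # contour / edge filter
        ImageOps.mirror(original),                                      # horizontal flip
        original.rotate(15, expand=True),                               # small rotation
    ]
```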
Then, we apply the entire Sense protocol to these transformations. We select a known near-duplicate NFT, compute its fingerprint vector, and apply our funnel of correlation measures to compare it against all registered NFTs. Next, we select an original NFT that we know should not be identified as a near-duplicate of any registered NFT and apply the same process. For each of these, we observe how many registered fingerprints make it to the “last stage” of the funnel. Rather than tracking the Combined Relative Rareness Score, we apply a binary label of 1 to the artificial near-duplicate NFTs and 0 to the true originals. We then model this labeled data against the various similarity measures and sub-scores we compute for each image.
This methodology enables us to make use of machine learning training, i.e., supervised learning. Given a row of data containing the maximum correlation scores of the candidate NFT versus all registered NFT fingerprints, we predict whether the label is a 1 (i.e., duplicate) or a 0 (i.e., original) using various approaches. For example, we build a classifier based on an ensemble of decision trees via XGBoost that predicts the label from the input data, and we also construct a deep neural network classifier using Keras that predicts the label from the input data. Each of the models is nuanced and provides a different degree of gradation. We combine each model’s score to produce a final Overall Average Score, which is far more precise and maps closer to human intuition than any individual score.
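A minimal sketch of this supervised-learning step is shown below, assuming a feature matrix X (one row of similarity measures and sub-scores per candidate) and binary labels y. The hyperparameters and network layout are illustrative assumptions, not the production configuration.

```python
# Hedged sketch: train two classifiers on similarity-measure features and average them.
import numpy as np
import xgboost as xgb
import tensorflow as tf

def train_classifiers(X: np.ndarray, y: np.ndarray):
    # Decision-tree ensemble via XGBoost.
    booster = xgb.XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
    booster.fit(X, y)

    # Small dense neural-network classifier built with Keras.
    nn = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(X.shape[1],)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    nn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    nn.fit(X, y, epochs=20, batch_size=32, verbose=0)
    return booster, nn

def overall_average_score(booster, nn, x_row: np.ndarray) -> float:
    """Average the probability-of-duplicate from each model."""
    p_boost = booster.predict_proba(x_row.reshape(1, -1))[0, 1]
    p_nn = float(nn.predict(x_row.reshape(1, -1), verbose=0)[0, 0])
    return (p_boost + p_nn) / 2.0
```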
Putting it All Together with Examples
The following results demonstrate the actual protocol running on example NFTs from the test corpus.
1. Modification of a known NFT with random noise
First, we modify a known NFT by adding random noise, to an extent that is more than enough to cause Google’s reverse image search to find no matches at all. In this example, the system classifies it accurately, assigning it a fairly low overall average rareness score of ~28.82%. Note that the system was also able to correctly identify the exact registered image that the candidate image was derived from.
2. Modification of a known NFT through stretching
Here is another example of a near-duplicate: a “stretched” version of a registered image. In this case, the overall average rareness score is just 0.087, which is quite low; if we were to use an actually identical image, the score would be closer to 0.0:
Perhaps the most impressive of these examples is applying the system to a near-duplicate that uses a “contour” or edge-detection filter on a registered image; the system gives this an overall average rareness score of just ~0.15 despite it looking dramatically different from the original:
Rareness on the Internet
We leverage existing functionality, such as Google’s Reverse Image Search, to crawl and index websites and assess rareness relative to what is already known to the internet. These tools return results described as “visually similar” data, such as shown below:
The image above has been indexed by various websites, and Google is able to find the exact image. The same applies to an NFT “series,” where there are dozens or even thousands of extremely similar images created as part of a single series. When a user attempts to register a new NFT on Pastel Network, in addition to running the standard Pastel rareness assessment, the system assesses its relative rareness against the internet (note that this is done in a fully decentralized way, with multiple randomly selected Pastel Supernodes conducting the same search and ensuring that their results all match exactly, just as the Pastel rareness score is computed by multiple Supernodes independently and the results checked for consistency):
If an artist has created a genuinely new image and never shared this image before online, then they can first register it on Pastel Network, and they will receive the highest level of “certified rareness” available on the system: the resulting NFT will be rare on Pastel and rare on the internet, and both of these metrics are written into the immutable NFT registration ticket that is part of the Pastel Blockchain. If that image is subsequently shared on social media or other websites, then the rareness scores of the Pastel NFT will not change—the only thing that matters is how rare it was at the time it was registered on Pastel.
Concluding Remarks
If you think about it, this is a much, much stronger concept of what it means for a digital image to be “rare”: not only can we verify the authenticity and provenance using the creator’s digital signatures (like all NFT systems in use), but we can go much further, and actually assess how rare the underlying pixel patterns of the image are, both on Pastel Network itself as well as on the broader internet. If value is largely a function of rareness/scarcity, we believe that this additional layer of authentication of rareness will result in better valuations for NFT creators. After all, if another creator makes a similar NFT in the future, they will still be able to register it on Pastel, but it won’t achieve anything close to the rareness score of the original image.
Furthermore, even if the original creator themselves tries to create another very similar or identical NFT in the future, this subsequent NFT will not have the rareness score of the creator’s first and original NFT. This protects NFT buyers from “inflation” caused by the creator, which is something that an NFT system based only on verifying digital signatures can’t really do, since the second or third “highly similar” or identical NFT would still appear to be totally legitimate because it is correctly signed by the artist, despite the fact that it’s a “knock off” of the original.
How It Works
Each SuperNode on Pastel running Ubuntu 20.04 requires a handful of support directories to exist on the local machine.
The fingerprint database file (which uses SQLite as the database system) is “seeded” with several thousand NFT fingerprints; an example database file containing several thousand image fingerprints can be downloaded here. In addition, there are pre-trained classifier models, such as the XGBoost and Keras models, which are downloaded and loaded by the client.
The client then simply runs the Python file, which loops and monitors the “dupe_detection_input_files” folder for new NFT files. If the user uploads new data into this folder, the process waits for the file to finish uploading and then proceeds with the analysis automatically. First it computes the rareness score, and then it computes the “internet rareness score.”
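A simplified sketch of such a monitoring loop is shown below; the folder name follows the text, while the polling interval and the “finished uploading” heuristic (file size stops changing) are assumptions.

```python
# Hypothetical watch-folder loop; not the actual Sense client code.
import os
import time

INPUT_DIR = "dupe_detection_input_files"

def wait_until_stable(path: str, interval: float = 1.0) -> None:
    """Treat a file as fully uploaded once its size stops changing."""
    last = -1
    while (size := os.path.getsize(path)) != last:
        last = size
        time.sleep(interval)

seen = set()
while True:
    for name in os.listdir(INPUT_DIR):
        path = os.path.join(INPUT_DIR, name)
        if path not in seen and os.path.isfile(path):
            wait_until_stable(path)
            seen.add(path)
            # compute_rareness_score(path)           # placeholder for the steps described above
            # compute_internet_rareness_score(path)  # placeholder
    time.sleep(5)
```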
Finally, the process computes a series of perceptual hashes. The perceptual hashes are used to scale the system when a candidate NFT is a near-exact duplicate of an already registered NFT; in effect, if the hash-based system uncovers a duplicate, we can skip the computationally intensive process and go straight to giving the image a rareness score of zero. We use a variety of image hashing algorithms, including the “pdq” algorithm from Facebook Research and the “NeuralHash” algorithm from Apple, as well as more traditional methods.
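The sketch below shows the shortcut logic using one of the more traditional perceptual hashes (pHash via the imagehash library); the pdq and NeuralHash integrations are not shown, and the distance threshold is an assumption.

```python
# Hash-based shortcut: if a perceptual hash is within a few bits of a registered
# hash, skip the expensive pipeline and treat the image as a duplicate.
import imagehash
from PIL import Image

def is_near_exact_duplicate(candidate_path, registered_hashes, max_distance=4) -> bool:
    candidate_hash = imagehash.phash(Image.open(candidate_path))
    # Subtracting two ImageHash objects gives the Hamming distance between them.
    return any(candidate_hash - h <= max_distance for h in registered_hashes)
```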
When the process finishes, it generates a json output file in the folder “dupe_detection_output_files”, where the file name is the first 10 characters of the SHA3-256 hash of the image file. For example:
The contents of this file are as follows (the final element in the json file, “image_fingerprint_of_candidate_image_file,” has been shortened below for space reasons, since it consists of 10,048 numbers):
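For reference, the output file name can be derived from the image bytes as follows; this is a minimal sketch of the naming rule described above.

```python
# Derive the output file name: first 10 hex characters of the SHA3-256 hash of the image file.
import hashlib

def output_filename(image_path: str) -> str:
    with open(image_path, "rb") as f:
        digest = hashlib.sha3_256(f.read()).hexdigest()
    return digest[:10] + ".json"
```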
Scalability:
The current architecture of Pastel’s Dupe Detection system can scale up to several hundred thousand images. The main limiting factor is that the machine running the dupe detection code must keep a table of all registered images in memory.
However, there is a very straightforward approach to scaling the system up to millions of images, which is essentially to use what is known as “sharding”. The basic idea is as follows:
Each Supernode (“SN”) has an identifier used as its name in the Pastel Network. Using the concept of XOR distance, we can associate each SN with a particular subset of the space of all previously registered images (see the sketch after these steps). These associations are not disjoint: the same images are assigned to at least 3 SNs so that we can compare the results of these machines and verify that they all match.
Each SN is responsible for computing the correlations/dependency scores for the candidate image compared to the subset of all registered images which the SN is responsible for. After this is done, these correlation scores are shared with other SNs in the network, and the results from the SNs that have been assigned the same subset of images are compared to check for consistency.
The verified results from each group of SNs are all sent to the 3 top ranked Supernodes, which combine the results and then finish the computation. This avoids the limiting factor mentioned above, which is the current requirement for the full table of registered fingerprints to reside in memory at once on a single machine. The results of the overall computation from each of the three top ranked SNs are compared by all SNs to check for consistency, and if they match, the results are written to the blockchain as usual.
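The sketch below illustrates the XOR-distance assignment step. The identifier format (SHA3-256 digests) and the replication factor of 3 are assumptions consistent with the description above, not the finalized sharding scheme.

```python
# Hypothetical XOR-distance shard assignment for registered images.
import hashlib

def xor_distance(a: bytes, b: bytes) -> int:
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")

def assign_image_to_supernodes(image_id: bytes, supernode_ids: list, replicas: int = 3):
    """Assign each registered image to the `replicas` closest SNs by XOR distance,
    so their independently computed results can be cross-checked."""
    ranked = sorted(supernode_ids, key=lambda sn_id: xor_distance(image_id, sn_id))
    return ranked[:replicas]

# Example: derive comparable 256-bit identifiers for SNs and images via SHA3-256.
sn_ids = [hashlib.sha3_256(f"supernode-{i}".encode()).digest() for i in range(10)]
img_id = hashlib.sha3_256(b"registered-image-bytes").digest()
responsible_sns = assign_image_to_supernodes(img_id, sn_ids)
```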