Data Validation
SecondBrainDAO Data Validation Process
The SecondBrainDAO validation framework is fully open source, so anyone can browse, review, and fork it. The process ensures data integrity through four key pillars: Proof of Uniqueness, Proof of Authenticity, Proof of Ownership, and Proof of Quality. Below is a detailed breakdown of each component:
1. Proof of Uniqueness (Is the data new?)
To prevent duplication and reward originality, each data submission is tied to a unique wallet address. A uniqueness score is assigned based on the volume of novel information provided. The system evaluates submissions by comparing them against prior entries, specifically:
Browser history: Analyzed for new URLs and visit patterns.
Bookmarks: Checked for distinct entries not previously submitted.
Location timeline data: Assessed for new geographic routes or timestamps.
The higher the proportion of fresh data, the greater the uniqueness score.
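As a rough illustration of the idea, the score can be thought of as the share of submitted items that have not been seen before. The sketch below is an assumption, not the framework's actual formula: the function name `uniqueness_score`, the use of plain strings as items, and the 0.0 to 1.0 scale are all hypothetical.

```python
from typing import Iterable


def uniqueness_score(submitted: Iterable[str], previously_seen: Iterable[str]) -> float:
    """Return the fraction of distinct submitted items not seen in prior entries.

    Hypothetical scoring rule: items could be URLs, bookmark links, or
    serialized location points. Duplicates inside one submission count once.
    """
    distinct = set(submitted)
    if not distinct:
        # An empty submission contributes no novel information.
        return 0.0
    novel = distinct - set(previously_seen)
    return len(novel) / len(distinct)


# Example: three of the four distinct URLs are new to the system.
prior = {"https://example.com/a"}
new_submission = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
]
score = uniqueness_score(new_submission, prior)
```

Under this assumed rule, a submission that only repeats earlier entries scores 0.0, and one made up entirely of new entries scores 1.0.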
2. Proof of Authenticity (Is the data real?)
Submissions must adhere to strict formatting and realism standards to be considered authentic:
Location Timeline Data: Requires valid timestamps and structured route information (e.g., logical sequence of locations).
Browser History: Must originate from Microsoft Edge or, for Brave/Chrome users, be processed via the "Chrome History Analysis to CSV" extension. Time spent per site must fall within a realistic range of 2 seconds to 2 hours.
Bookmarks: Must be submitted in Netscape HTML format to pass validation.
Data failing these criteria is flagged as inauthentic and rejected.
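Two of these checks are simple enough to sketch. The example below assumes that the 2 second and 2 hour limits are inclusive and that a bookmark export can be recognized by the `<!DOCTYPE NETSCAPE-Bookmark-file-1>` declaration that Netscape-format bookmark files start with. The function names are hypothetical, and the real validator may apply stricter parsing.

```python
MIN_SECONDS = 2            # lower bound from the rule above
MAX_SECONDS = 2 * 60 * 60  # 2 hours, expressed in seconds

NETSCAPE_DOCTYPE = "<!DOCTYPE NETSCAPE-BOOKMARK-FILE-1>"


def is_realistic_visit(seconds_spent: float) -> bool:
    """Check that time spent on a site falls within 2 seconds to 2 hours.

    Treating both bounds as inclusive is an assumption.
    """
    return MIN_SECONDS <= seconds_spent <= MAX_SECONDS


def looks_like_netscape_bookmarks(file_text: str) -> bool:
    """Heuristic check: the export begins with the Netscape bookmark doctype."""
    return file_text.lstrip().upper().startswith(NETSCAPE_DOCTYPE)


sample_export = "<!DOCTYPE NETSCAPE-Bookmark-file-1>\n<TITLE>Bookmarks</TITLE>\n<DL><p>\n</DL><p>\n"
accepted = looks_like_netscape_bookmarks(sample_export) and is_realistic_visit(45)
```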
3. Proof of Ownership (Who owns the data?)
Ownership is cryptographically verified to ensure submissions belong to the submitting user:
Data is submitted through the SecondBrainDAO platform, where it is linked to a wallet address.
A provided signature is authenticated against the submission to confirm the user’s rightful ownership.
This secure process prevents unauthorized or fraudulent data uploads.
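The shape of this check can be sketched without committing to a particular signature scheme. In the example below, the cryptographic step is passed in as a callable named `recover_signer`, a hypothetical stand-in for whatever wallet library recovers the signing address from a message and signature. Signing a SHA-256 digest of the file is also an assumption made for illustration.

```python
import hashlib
from typing import Callable


def submission_digest(data: bytes) -> str:
    """Hex digest of the submitted file; assumed here to be the signed message."""
    return hashlib.sha256(data).hexdigest()


def verify_ownership(
    data: bytes,
    signature: str,
    claimed_wallet: str,
    recover_signer: Callable[[str, str], str],
) -> bool:
    """Return True if the signature over the submission maps to the claimed wallet.

    `recover_signer(message, signature)` is a hypothetical hook that returns
    the address that produced the signature. Addresses are compared
    case-insensitively.
    """
    message = submission_digest(data)
    signer = recover_signer(message, signature)
    return signer.lower() == claimed_wallet.lower()


# Stand-in recovery function for demonstration only; it performs no real cryptography.
def fake_recover(message: str, signature: str) -> str:
    return "0xAbC123" if signature == "signed:" + message else "0x000000"


payload = b"url,title,timestamp\n"
good_signature = "signed:" + submission_digest(payload)
is_owner = verify_ownership(payload, good_signature, "0xabc123", fake_recover)
```

Because the digest covers the file contents, a signature produced for one file does not validate a different file under this scheme.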
4. Proof of Quality (How good is the data?)
Data quality is scored based on completeness, accuracy, and richness:
Location Timeline Data: Must include precise timestamps and cover a period of up to 60 days to achieve the highest score.
Browser History: Requires valid URLs, page titles, timestamps, and reasonable time-spent metrics (2 seconds to 2 hours per site).
Bookmarks: Quality increases with the number of unique, well-formed entries submitted in Netscape HTML format.
Higher scores are awarded to datasets that are comprehensive, accurate, and diverse.
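For browser history, one way to picture a completeness-based score is the fraction of rows that carry every required field in a usable form. The sketch below is an assumption rather than the framework's scoring logic: the field names (`url`, `title`, `timestamp`, `seconds_spent`), the ISO 8601 timestamp format, and the 0.0 to 1.0 scale are hypothetical.

```python
from datetime import datetime
from urllib.parse import urlparse


def is_well_formed(entry: dict) -> bool:
    """Check one history row for a valid URL, a title, a timestamp, and sane time spent."""
    url = urlparse(str(entry.get("url", "")))
    if url.scheme not in ("http", "https") or not url.netloc:
        return False
    if not str(entry.get("title", "")).strip():
        return False
    try:
        # Assumes ISO 8601 timestamps such as "2024-05-01T10:00:00".
        datetime.fromisoformat(str(entry.get("timestamp", "")))
    except ValueError:
        return False
    seconds = entry.get("seconds_spent")
    # 2 seconds to 2 hours (7200 seconds), per the rule above.
    return isinstance(seconds, (int, float)) and 2 <= seconds <= 7200


def quality_score(entries: list) -> float:
    """Fraction of rows that pass every check; 0.0 for an empty dataset."""
    if not entries:
        return 0.0
    return sum(is_well_formed(e) for e in entries) / len(entries)


history = [
    {"url": "https://example.com", "title": "Example", "timestamp": "2024-05-01T10:00:00", "seconds_spent": 30},
    {"url": "not-a-url", "title": "", "timestamp": "yesterday", "seconds_spent": 1},
]
history_quality = quality_score(history)
```

A production scorer would likely also weigh diversity and dataset size, which this sketch leaves out.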