
SIFT1B Dataset
Billion-scale benchmark dataset containing one billion 128-dimensional SIFT descriptors extracted from images. A widely used standard for evaluating approximate nearest neighbor (ANN) search algorithms at scale.
About this tool
Overview
SIFT1B (also known as BigANN or ANN_SIFT1B) consists of one billion 128-dimensional SIFT (Scale-Invariant Feature Transform) descriptors extracted from images. Released in September 2010, it remains a fundamental benchmark for large-scale vector search evaluation.
Dataset Characteristics
- Size: 1 billion vectors
- Dimensions: 128-dimensional SIFT descriptors
- Source: Local feature descriptors extracted from images
- Format: High-dimensional vectors suitable for similarity search
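To make the format concrete, here is a minimal sketch of streaming vectors from a `.bvecs` file. It assumes the layout described on the download site rather than anything stated on this page: each record is a 4-byte little-endian integer giving the dimension, followed by that many unsigned-byte components. Under that assumption, a 128-dimensional vector takes 132 bytes, so one billion vectors come to roughly 132 GB uncompressed, which is why the sketch reads one record at a time instead of loading the whole file. The function name `iter_bvecs` is hypothetical.

```python
import io
import struct
from typing import BinaryIO, Iterator, Optional


def iter_bvecs(stream: BinaryIO, limit: Optional[int] = None) -> Iterator[bytes]:
    """Yield vectors from a .bvecs stream, one record at a time.

    Assumed record layout: int32 dimension (little endian), then `dim` uint8 values.
    Each yielded `bytes` object can be indexed to get integer components (0-255).
    """
    count = 0
    while limit is None or count < limit:
        header = stream.read(4)
        if not header:  # clean end of file
            return
        if len(header) != 4:
            raise ValueError("truncated record header")
        (dim,) = struct.unpack("<i", header)
        if dim <= 0:
            raise ValueError(f"invalid dimension {dim}")
        vector = stream.read(dim)
        if len(vector) != dim:
            raise ValueError("truncated vector payload")
        yield vector
        count += 1


# Tiny in-memory demo with two 4-dimensional records (real SIFT1B records use dim=128).
demo = io.BytesIO(
    struct.pack("<i", 4) + bytes([10, 20, 30, 40])
    + struct.pack("<i", 4) + bytes([1, 2, 3, 4])
)
for vec in iter_bvecs(demo):
    print(list(vec))
```

For a real file, the same function can be passed a handle opened with `open(path, "rb")`, using `limit` to sample the first few vectors before committing to a full pass.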
Significance for Evaluation
SIFT1B plays a critical role in evaluating vector search algorithms by providing:
- Consistent, reproducible foundation for comparison
- Billion-scale testing to stress-test ANN algorithms
- Pre-processed data with known characteristics
- Industry-standard benchmark for performance claims
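Performance claims on a benchmark like this are usually expressed as recall against exact nearest neighbors. The sketch below shows one common definition (set overlap between the approximate top-k and the exact top-k), with the exact neighbors computed by brute force under squared Euclidean distance. The function names and the toy data are illustrative assumptions. At billion scale you would compare against precomputed ground truth rather than run the brute-force search shown here.

```python
import heapq
from typing import List, Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_knn(base: Sequence[Sequence[float]], query: Sequence[float], k: int) -> List[int]:
    """Brute-force search: indices of the k base vectors closest to the query."""
    scored = ((squared_l2(vec, query), idx) for idx, vec in enumerate(base))
    return [idx for _, idx in heapq.nsmallest(k, scored)]


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy example: four 2-dimensional vectors and one query.
base = [(0, 0), (1, 0), (0, 3), (5, 5)]
query = (0, 0)
truth = exact_knn(base, query, k=2)

# Pretend an ANN index returned ids 0 and 2 for this query.
print(recall_at_k([0, 2], truth))
```

Averaging this value over all queries gives a single recall figure that can be reported alongside query throughput or latency.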
Related Datasets
- SIFT1M: 1 million SIFT descriptors (smaller version for initial testing)
- GIST1M: 960-dimensional GIST descriptors
- Deep1B: 1 billion deep learning features
Dataset Access
Laurent Amsaleg (CNRS/IRISA) and Hervé Jégou (Facebook AI Research) have waived all copyright and related rights to the datasets, which can be downloaded from http://corpus-texmex.irisa.fr/
Because the BIGANN files are very large, Axel (a command-line download accelerator) is recommended for faster downloads.
Usage in Research
SIFT1B is extensively used in:
- ANN algorithm benchmarking
- Vector database performance evaluation
- Scalability testing
- Algorithm comparison studies