



Open-source project demonstrating GPU-accelerated approximate nearest neighbor search using Inverted File (IVF) indexing on embeddings from a large Wikipedia dataset. It partitions the embeddings into 128 clusters with K-means and supports configurable CUDA kernels for the coarse and fine search stages, making it suitable for efficient vector querying in AI applications.
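To make the two-stage IVF idea concrete, here is a minimal pure-Python sketch (not the project's CUDA implementation): the coarse stage ranks cluster centroids by cosine similarity to the query, and the fine stage exhaustively scores only the vectors in the top `n_probe` clusters. The toy data and function names are illustrative, not from the repository.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def ivf_search(query, centroids, clusters, n_probe=2, k=1):
    # Coarse stage: rank cluster centroids by similarity to the query.
    ranked = sorted(range(len(centroids)),
                    key=lambda i: cosine(query, centroids[i]),
                    reverse=True)
    # Fine stage: exhaustively score vectors in the top n_probe clusters.
    candidates = []
    for ci in ranked[:n_probe]:
        for vec_id, vec in clusters[ci]:
            candidates.append((cosine(query, vec), vec_id))
    candidates.sort(reverse=True)
    return [vid for _, vid in candidates[:k]]

# Toy data: two clusters of 2-D vectors with their centroids.
centroids = [[1.0, 0.0], [0.0, 1.0]]
clusters = [
    [(0, [0.9, 0.1]), (1, [1.0, 0.2])],
    [(2, [0.1, 0.9]), (3, [0.2, 1.0])],
]
print(ivf_search([0.0, 1.0], centroids, clusters, n_probe=1, k=1))  # [2]
```

Raising `n_probe` widens the fine search (more clusters scanned, higher recall, more work), which is exactly the trade-off the `n_probe` argument below controls.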
Demonstrates efficient vector database querying using GPU acceleration on a large-scale Wikipedia dataset (November 2020 plain text from Kaggle).
1. Run `embedding.py` to generate vector embeddings for Wikipedia articles.
2. Run `cluster.py` to group the embeddings into 128 clusters.
3. Convert the `.npy` files to `.bin` format using `convert_npy_bin.py` for C++ compatibility.
4. Use the provided queries in the `queries_data` folder, or generate new ones with `test.py --query "Your query"`.
5. Compile and run `IVF.cpp` for GPU-accelerated search. Parameters are passed as arguments to the executable:
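The `.npy`-to-`.bin` step exists so the C++ code can read the vectors without a NumPy dependency. A minimal sketch of that idea, assuming a raw little-endian float32, row-major layout (the actual `convert_npy_bin.py` format may differ, e.g. by adding a header with the dimensions):

```python
import struct

def write_bin(vectors, path):
    # Write vectors as raw little-endian float32, row-major, so a C++
    # reader can load them with a single fread(). Hypothetical layout;
    # the project's convert_npy_bin.py may store extra metadata.
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(struct.pack("<%df" % len(vec), *vec))

def read_bin(path, dim):
    # Read raw float32 values back and regroup into dim-sized vectors.
    with open(path, "rb") as f:
        data = f.read()
    flat = struct.unpack("<%df" % (len(data) // 4), data)
    return [list(flat[i:i + dim]) for i in range(0, len(flat), dim)]

vecs = [[1.0, 2.0], [3.0, 4.0]]
write_bin(vecs, "embeddings.bin")
print(read_bin("embeddings.bin", 2))  # [[1.0, 2.0], [3.0, 4.0]]
```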
- `n_probe`: 1 to 128 (number of top clusters kept for the fine search)
- `mode`: `"Atomic"` or `"NonAtomic"` (CUDA kernel type)
- `sequential_fine_search`: `true`/`false` (separate kernels per cluster, or one combined kernel)
- `use_cuda_coarse`: `true`/`false` (GPU or CPU for the coarse search)
- `use_cuda_fine`: `true`/`false` (GPU or CPU for the fine search)
- `threadsperBlock`: a multiple of 32 (threads per CUDA block)

On a CUDA-enabled machine (e.g., cuda5.cims.nyu.edu):
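The executable takes these flags in `--key=value` form, as in the run command below. A small Python sketch of that parsing style (the real parsing happens in the C++ code; the default values here are hypothetical):

```python
def parse_flags(argv, defaults):
    # Parse --key=value flags into a config dict, starting from a set of
    # defaults. Booleans and integers are coerced from their string form.
    config = dict(defaults)
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            if value in ("true", "false"):
                config[key] = (value == "true")
            elif value.isdigit():
                config[key] = int(value)
            else:
                config[key] = value
    return config

# Hypothetical defaults, overridden by command-line flags.
defaults = {"n_probe": 1, "mode": "Atomic", "threadsperBlock": 32}
cfg = parse_flags(["--n_probe=30", "--use_cuda_fine=true"], defaults)
print(cfg)
```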
```sh
module load cuda-12.4
nvcc IVF.cpp cosine_similarity.cu -o IVF
./IVF --n_probe=30 --mode=Atomic --sequential_fine_search=false --use_cuda_coarse=true --use_cuda_fine=true --threadsperBlock=128 --print_results=true
```
Use `run_multiple_configs.sh <config_file> [num_runs]` to run configurations multiple times and compute average CPU/GPU times.
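Averaging over repeated runs smooths out per-run noise in the timings. A sketch of the aggregation the script performs, with hypothetical field names (its actual output format is not shown here):

```python
def average_times(runs):
    # runs: list of (cpu_ms, gpu_ms) pairs from repeated executions of
    # one configuration. Returns the mean of each column.
    n = len(runs)
    cpu = sum(r[0] for r in runs) / n
    gpu = sum(r[1] for r in runs) / n
    return {"avg_cpu_ms": cpu, "avg_gpu_ms": gpu}

# Three hypothetical runs of the same configuration.
print(average_times([(120.0, 14.0), (118.0, 16.0), (122.0, 15.0)]))
```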
Free and open-source (GitHub repository).