NVIDIA cuVS

NVIDIA cuVS is a GPU-accelerated approximate nearest neighbor search library that uses CUDA to build and search CAGRA, HNSW, and IVF-PQ indexes on billion-scale datasets. It supports batch queries for high-throughput operation, which suits large-scale similarity search and real-time recommendations. It delivers up to 12x faster index building and up to 8x lower query latency at 95% recall compared to CPU-only implementations.

Overview

cuVS is a GPU-accelerated library for vector search and clustering from NVIDIA RAPIDS. It lets vector databases scale up and out for massive-scale vector search workloads by running index building and search on the GPU.

Performance Benefits

Index Building

Up to 12x faster index building on GPU at 95% recall (vs. CPU)
Massive speedups for large datasets
Parallel processing on GPU architecture

Query Performance

Up to 8x lower search latency at 95% recall (vs. CPU)
Higher throughput at all recall levels
Better latency characteristics
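
The figures above are quoted at 95% recall, so it helps to be precise about what that number measures. Recall@k is commonly defined as the fraction of the true k nearest neighbors (from an exact search) that the approximate search also returned, averaged over all queries. The following is a minimal sketch of that definition in plain Python; the function name `recall_at_k` and the sample result lists are hypothetical and are not part of the cuVS API.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Average fraction of the true top-k neighbors found by the approximate search."""
    if len(approx_ids) != len(exact_ids):
        raise ValueError("need one result list per query in both inputs")
    hits = 0
    for approx, exact in zip(approx_ids, exact_ids):
        # Order within the top-k does not matter, only membership.
        hits += len(set(approx[:k]) & set(exact[:k]))
    return hits / (k * len(exact_ids))


# Hypothetical neighbor ids for two queries, k = 4.
exact = [[7, 2, 9, 4], [1, 3, 5, 8]]
approx = [[7, 2, 4, 6], [1, 3, 5, 8]]

# 3 of 4 found for the first query, 4 of 4 for the second: 7 / 8.
print(recall_at_k(approx, exact, k=4))  # 0.875
```

Speed comparisons between indexes are only meaningful at a matched recall level, which is why the numbers in this section are pinned to 95%.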

Supported Algorithms

IVF-PQ: Inverted File with Product Quantization
IVF-Flat: Inverted File with flat vectors
CAGRA: CUDA ANN Graph-based index
HNSW: Hierarchical Navigable Small World
Brute Force: Exact search option
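
To make the difference between the exact and inverted-file options concrete, here is a small CPU-only sketch in plain Python. It is not the cuVS API and does not use a GPU; the names `brute_force_search`, `build_ivf_flat`, `ivf_flat_search`, `n_lists`, and `n_probes` are chosen for illustration. Brute force ranks every vector, while the IVF-Flat sketch clusters the data with a few k-means iterations, keeps one list of vector ids per cluster, and at query time scans only the lists whose centroids are closest to the query.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(data, query, k):
    """Exact search: rank every vector by distance to the query."""
    return sorted(range(len(data)), key=lambda i: sq_dist(data[i], query))[:k]


def assign(data, centroids):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(data):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
        lists[nearest].append(i)
    return lists


def build_ivf_flat(data, n_lists, n_iters=5, seed=0):
    """Run a few k-means iterations, then build the final inverted lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(data, n_lists)]
    for _ in range(n_iters):
        lists = assign(data, centroids)
        for c, members in enumerate(lists):
            if members:  # an empty cluster keeps its previous centroid
                columns = zip(*(data[i] for i in members))
                centroids[c] = [sum(col) / len(members) for col in columns]
    return centroids, assign(data, centroids)


def ivf_flat_search(index, data, query, k, n_probes):
    """Approximate search: scan only the n_probes closest inverted lists."""
    centroids, lists = index
    probed = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in probed[:n_probes] for i in lists[c]]
    candidates.sort(key=lambda i: sq_dist(data[i], query))
    return candidates[:k]


# Synthetic data: 500 random 8-dimensional vectors and one query.
rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(500)]
query = [rng.random() for _ in range(8)]

index = build_ivf_flat(data, n_lists=10)
exact = brute_force_search(data, query, k=5)
approx = ivf_flat_search(index, data, query, k=5, n_probes=3)
full = ivf_flat_search(index, data, query, k=5, n_probes=10)

print("exact :", exact)
print("approx:", approx)
print("probing every list matches exact search:", full == exact)
```

Probing fewer lists scans fewer vectors and is faster, but a true neighbor that sits in an unprobed list is missed, which is the recall trade-off behind the 95% figure quoted earlier. IVF-PQ follows the same structure but stores compressed codes instead of the full vectors.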

Pricing

Free and open source under the Apache 2.0 license.

Information

Website: rapids.ai

Published: Apr 23, 2026

Tags

#gpu-acceleration #cuda #GPU Support

Similar Products

cuVS

NVIDIA RAPIDS cuVS is a GPU-accelerated library for vector search and clustering with CUDA-optimized HNSW, IVF, CAGRA, and PQ implementations. Supports batch queries for high QPS, suited for large-scale similarity search in GenAI apps. Achieves up to 12x faster indexing and lower latency vs CPU-only alternatives like FAISS CPU.

PilotANN

Memory-bounded GPU-accelerated framework for graph-based ANN vector search using CUDA and LibTorch, optimized for large-scale workloads beyond GPU memory. Features batch processing for high efficiency; outperforms CPU-only ANN in speed for similarity search in vector databases.

RUMMY

GPU-accelerated vector query processing system using CUDA to handle datasets larger than GPU memory via reordered pipelining and cluster-based retrofitting. Supports batch queries with up to 135x speedup over traditional GPU methods and 23x vs CPU-only for large-scale similarity search and MIPS.

FusionANNS

An efficient CPU/GPU cooperative processing architecture for billion-scale approximate nearest neighbor search. FusionANNS achieves up to 13.1× higher QPS compared to SPANN and can handle billion-vector datasets with over 12,000 QPS while maintaining 15ms latency using only one entry-level GPU.

RAFT

RAFT is a suite of GPU-accelerated libraries for data science, including support for vector search and similarity operations, often used in vector database scenarios.

Juno — Optimizing ANNS with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping

ASPLOS 2024 paper introducing Juno, a system that accelerates high-dimensional approximate nearest neighbor search using sparsity-aware algorithms and GPU ray-tracing (RT) core mapping for hardware-level computation acceleration.
