A memory-bounded, GPU-accelerated framework for graph-based approximate nearest neighbor (ANN) vector search, built on CUDA and LibTorch. It targets large-scale workloads whose data does not fit in GPU memory and uses batch processing to keep throughput high. For similarity search in vector databases, it is faster than CPU-only ANN search.
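
The idea of batch processing under a fixed memory budget can be illustrated with a small sketch. This is not the framework's code or API: it is a CPU-only, brute-force stand-in (no graph index, no CUDA, no LibTorch) that only shows how a dataset larger than the available memory can be scanned chunk by chunk while a bounded top-k result is maintained. The names `iter_chunks`, `chunked_top_k`, and `chunk_size` are hypothetical and chosen for illustration.

```python
import heapq
from typing import Iterator, List, Sequence, Tuple

Vector = Sequence[float]


def iter_chunks(vectors: Sequence[Vector], chunk_size: int) -> Iterator[Tuple[int, Sequence[Vector]]]:
    """Yield (start_index, chunk) pairs; each chunk holds at most chunk_size vectors.

    In a GPU setting, a chunk would be the unit copied to device memory
    (assumption: this sketch keeps everything on the CPU).
    """
    for start in range(0, len(vectors), chunk_size):
        yield start, vectors[start:start + chunk_size]


def squared_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def chunked_top_k(query: Vector, vectors: Sequence[Vector], k: int, chunk_size: int) -> List[Tuple[int, float]]:
    """Return up to k (index, squared_distance) pairs, nearest first.

    Only one chunk plus a heap of at most k entries is examined at a time,
    so the working set is bounded by chunk_size + k regardless of dataset size.
    """
    if k <= 0 or chunk_size <= 0:
        raise ValueError("k and chunk_size must be positive")

    # heapq is a min-heap, so distances are stored negated: the heap root
    # is then the current worst (farthest) candidate among the best k.
    heap: List[Tuple[float, int]] = []
    for start, chunk in iter_chunks(vectors, chunk_size):
        for offset, vec in enumerate(chunk):
            item = (-squared_l2(query, vec), start + offset)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                # Closer than the current worst candidate: replace it.
                heapq.heapreplace(heap, item)

    ordered = sorted((-neg_dist, idx) for neg_dist, idx in heap)
    return [(idx, dist) for dist, idx in ordered]


if __name__ == "__main__":
    data = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [0.5, 0.5]]
    print(chunked_top_k([0.0, 0.0], data, k=2, chunk_size=2))
```

Because the running top-k is merged after every chunk, the result does not depend on `chunk_size`; the chunk size only trades peak memory against the number of passes. A graph-based index would visit far fewer vectors per query than this exhaustive scan, but the same bounded-memory merge pattern applies.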