    All product names, logos, and brands are the property of their respective owners. All company, product, and service names used in this repository, related repositories, and associated websites are for identification purposes only. The use of these names, logos, and brands does not imply endorsement, affiliation, or sponsorship. This directory may include content generated by artificial intelligence.
    Copyright © 2025 Awesome Vector Databases. All rights reserved.

    NVIDIA cuDF

    Open-source Python GPU DataFrame library that accelerates popular data engines like Apache Spark, pandas, and Polars on NVIDIA AI infrastructure. Built on Apache Arrow, it utilizes GPU parallelism and memory bandwidth to accelerate data processing and analytics workflows, serving as the data-processing foundation for the Sirius GPU-accelerated database project.

    Information

    Website: github.com
    Published: Apr 4, 2026

    Categories

    Data Processing

    Tags

    #GPU-accelerated #dataframe #Apache Arrow

    Similar Products

    cuVS

    NVIDIA RAPIDS' GPU-accelerated library for vector search and clustering, providing CUDA-optimized implementations of HNSW, IVF, CAGRA, and PQ, enabling billion-scale search on GPU hardware.

    VAST AI OS

    GPU-accelerated platform from VAST Data that includes a native vector database, designed for enterprise AI workloads including multi-agent systems, video-reasoning, and high-volume RAG. It combines vector embeddings with structured data and metadata in unified tables, enabling hybrid queries across modalities without orchestration layers or external indexes.

    VAST CNode-X

    GPU-accelerated server from VAST Data that combines the VAST AI OS with NVIDIA data-processing libraries and onboard NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. Designed for enterprise AI workloads requiring high-throughput vector search, data vectorization, and inference, it leverages the NVIDIA AI Data Platform reference design.

    ruvector-scipix

    Rust OCR engine for scientific documents, extracting text and mathematical equations to LaTeX, MathML, or plain text. Supports batch processing, content detection for equations/tables/diagrams, confidence scoring, and PDF support. Includes TypeScript client (@ruvector/scipix) and CLI (scipix-cli).

    PageIndex

    Open-source tool by VectifyAI for pagewise document indexing that converts PDF pages into image representations for downstream multimodal embedding and retrieval. Designed to support late-interaction-based retrieval approaches like ColPali by preserving original document layout and visual structure.

    SmallPond

    A distributed data processing framework for vector data operations, providing lightweight parallel processing capabilities for embedding pipelines and data preparation workflows.

    Overview

    NVIDIA cuDF is an open-source GPU DataFrame library that provides pandas-like APIs for data processing accelerated by NVIDIA GPUs. Built on Apache Arrow, it utilizes GPU parallelism and memory bandwidth to accelerate data processing and analytics workflows.
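
To illustrate what "pandas-like" means in practice, here is a minimal sketch of a group-by aggregation. The column names, the sample data, and the `total_by_region` helper are hypothetical. The sketch assumes cuDF is installed on a machine with an NVIDIA GPU and falls back to pandas otherwise, so the same code runs in both cases.

```python
# Minimal sketch: the same DataFrame code on GPU (cuDF) or CPU (pandas).
try:
    import cudf as xdf      # GPU DataFrame library; needs an NVIDIA GPU
    ON_GPU = True
except ImportError:
    import pandas as xdf    # CPU fallback so the sketch still runs
    ON_GPU = False


def total_by_region(records):
    """Sum the hypothetical 'amount' column per 'region'."""
    df = xdf.DataFrame(records)
    totals = df.groupby("region")["amount"].sum().sort_index()
    if ON_GPU:
        # Copy the small result back to host memory as a pandas Series.
        totals = totals.to_pandas()
    return totals.to_dict()


sales = {
    "region": ["east", "west", "east", "west"],
    "amount": [10, 20, 30, 40],
}
totals = total_by_region(sales)
print(totals)
```

Only the import differs between the two paths; the DataFrame construction and the `groupby` call are written once.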

    Key Capabilities

    • Accelerates popular data engines including Apache Spark, pandas, and Polars
    • Built on Apache Arrow columnar memory format
    • Utilizes GPU parallelism and memory bandwidth for high-throughput processing
    • Native compatibility with DuckDB
    • Serves as the data-processing foundation of the Sirius GPU-accelerated database
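
For the pandas case, one way this acceleration is commonly enabled is cuDF's pandas accelerator mode, which is switched on before pandas is imported. The sketch below assumes the `cudf.pandas` module is available and otherwise runs as ordinary pandas; the `orders` data and column names are hypothetical.

```python
# Minimal sketch: opt in to cuDF's pandas accelerator mode when available.
try:
    import cudf.pandas
    cudf.pandas.install()   # must run before pandas is imported
    accelerated = True
except ImportError:
    accelerated = False     # plain CPU pandas is used instead

import pandas as pd

# Hypothetical order data; the pandas code itself is unchanged.
orders = pd.DataFrame(
    {
        "customer": ["a", "b", "a", "c"],
        "total": [5.0, 4.0, 2.5, 1.0],
    }
)

# Total spend per customer, largest first.
spend = orders.groupby("customer")["total"].sum().sort_values(ascending=False)
top_customer = spend.index[0]
print(f"accelerated={accelerated}, top customer={top_customer}")
```

An existing script can also be run without edits through `python -m cudf.pandas script.py`. In this mode, operations that cuDF does not support fall back to pandas on the CPU.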

    Use Cases

    • Large-scale data processing and analytics
    • Real-time data transformation pipelines
    • GPU-accelerated SQL query execution
    • Data preprocessing for machine learning and AI workloads
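
As a rough illustration of the preprocessing use case, the sketch below fills missing values and standardizes a numeric column. The `age` and `income` columns and the `preprocess` and `to_list` helpers are hypothetical, and the code again falls back to pandas when cuDF is not installed.

```python
# Minimal sketch: feature preprocessing with a pandas-style API.
try:
    import cudf as xdf      # runs on the GPU when cuDF is available
except ImportError:
    import pandas as xdf    # CPU fallback so the sketch still runs


def preprocess(df):
    """Impute missing ages with the mean and add a z-scored income column."""
    out = df.copy()
    out["age"] = out["age"].fillna(out["age"].mean())
    mean, std = out["income"].mean(), out["income"].std()
    out["income_z"] = (out["income"] - mean) / std
    return out


def to_list(series):
    """Return a Python list from either a cuDF or a pandas Series."""
    if hasattr(series, "to_pandas"):
        series = series.to_pandas()   # cuDF: copy to host memory first
    return series.tolist()


raw = xdf.DataFrame(
    {"age": [20.0, None, 40.0], "income": [30.0, 50.0, 70.0]}
)
clean = preprocess(raw)
print(to_list(clean["age"]), to_list(clean["income_z"]))
```

Keeping the data in a cuDF DataFrame until the final conversion avoids repeated copies between GPU and host memory.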

    Licensing

    Free and open-source, released under the Apache License 2.0.