    PUFFINN

    Parameterless and Universal Fast Finding of Nearest Neighbors - an LSH-based library for approximate nearest neighbor search with probabilistic guarantees. Features a parameterless design requiring only memory budget and result quality specifications.

    About this tool

    Overview

    PUFFINN (Parameterless and Universal Fast FInding of Nearest Neighbors) is a parameterless LSH-based index for solving the k-nearest neighbor problem with probabilistic guarantees. It provides an easily configurable library for finding approximate nearest neighbors of arbitrary points.

    Key Features

    • Parameterless Design: Users only need to specify memory usage and desired result quality (recall), not complex algorithm parameters
    • Probabilistic Guarantees: Each true nearest neighbor is reported with at least the user-specified probability (the recall target), regardless of query difficulty
    • LSH-Based: Uses Locality Sensitive Hashing with an adaptive query mechanism
    • Multiple Similarity Measures: Supports Cosine similarity (SimHash/cross-polytope LSH) and Jaccard similarity (MinHash)
    • Multi-Language Support: Available in both C++ and Python with feature parity

    Technical Approach

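    Under the hood, PUFFINN builds a Locality Sensitive Hashing (LSH) index sized to the memory budget and queries it adaptively: a search keeps examining candidates until the requested recall is met for that particular query. This approach provides:

    • Configurable memory budgets
    • Adjustable recall levels
    • Consistent result quality across easy and hard queries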
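
    The link between a recall target and the amount of work a query needs can be illustrated with the textbook LSH analysis. The sketch below is not PUFFINN's code, and the function names and the bits_per_table parameter are hypothetical. It assumes SimHash, where a point at cosine similarity c matches the query on one bit with probability 1 - arccos(c)/pi, and asks how many independent tables must be probed before the chance of missing that point drops to 1 - recall.

```python
import math


def simhash_collision_probability(cosine_similarity: float) -> float:
    """Probability that two vectors agree on one SimHash bit: 1 - angle/pi."""
    c = max(-1.0, min(1.0, cosine_similarity))
    return 1.0 - math.acos(c) / math.pi


def miss_probability(p: float, bits_per_table: int, tables_probed: int) -> float:
    """Chance that a point never shares a full bucket key with the query.

    A table's key concatenates `bits_per_table` independent bits, so the
    point lands in the query's bucket with probability p ** bits_per_table.
    """
    return (1.0 - p ** bits_per_table) ** tables_probed


def tables_needed(p: float, bits_per_table: int, recall: float) -> int:
    """Smallest number of tables with miss_probability <= 1 - recall."""
    if not 0.0 < recall < 1.0:
        raise ValueError("recall must lie strictly between 0 and 1")
    hit = p ** bits_per_table
    if hit <= 0.0:
        raise ValueError("the point can never collide with the query")
    if hit >= 1.0:
        return 1
    # Solve (1 - hit) ** L <= 1 - recall for the integer L.
    return max(1, math.ceil(math.log(1.0 - recall) / math.log1p(-hit)))


# A neighbor orthogonal to the query (cosine 0) agrees on each bit half the time.
p = simhash_collision_probability(0.0)
needed = tables_needed(p, bits_per_table=4, recall=0.9)
```

    In an adaptive scheme, the similarity plugged into this calculation would be that of the current k-th best candidate, so queries with close neighbors can stop after few probes while harder queries continue.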

    Distance Functions

    Currently Supported:

    • Cosine Similarity: Using SimHash or cross-polytope LSH
    • Jaccard Similarity: Using MinHash
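
    To show why these hash families match these similarity measures, here is a minimal standard-library sketch of SimHash and MinHash. It is an illustration only, not PUFFINN's implementation: all function names are hypothetical, BLAKE2b stands in for whatever hash functions a real index would use, and cross-polytope LSH is left out. The point is that the fraction of matching hash values estimates the similarity: 1 - angle/pi for SimHash and the Jaccard similarity for MinHash.

```python
import hashlib
import math
import random


def random_hyperplanes(dim, count, rng):
    """Draw `count` Gaussian normal vectors, one per SimHash bit."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def simhash(vector, hyperplanes):
    """One bit per hyperplane: 1 when the dot product is non-negative."""
    return [
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0.0 else 0
        for plane in hyperplanes
    ]


def seeded_hash(seed, item):
    """64-bit hash of `item`, salted with `seed` (assumed stand-in hash)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(repr(item).encode("utf-8"), digest_size=8, salt=salt)
    return int.from_bytes(digest.digest(), "big")


def minhash(items, seeds):
    """One value per seed: the smallest salted hash over the set."""
    return [min(seeded_hash(seed, item) for item in items) for seed in seeds]


def agreement(a, b):
    """Fraction of positions where two signatures hold the same value."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# SimHash: vectors 45 degrees apart should agree on about 1 - (pi/4)/pi of the bits.
rng = random.Random(0)
planes = random_hyperplanes(dim=2, count=2000, rng=rng)
u, v = [1.0, 0.0], [1.0, 1.0]
simhash_estimate = agreement(simhash(u, planes), simhash(v, planes))
simhash_expected = 1.0 - (math.pi / 4) / math.pi

# MinHash: signature agreement should approximate |A & B| / |A | B|.
set_a, set_b = set(range(0, 60)), set(range(30, 90))
seeds = range(1000)
minhash_estimate = agreement(minhash(set_a, seeds), minhash(set_b, seeds))
jaccard = len(set_a & set_b) / len(set_a | set_b)
```

    Both estimates are random quantities, so they only approach simhash_expected and jaccard as the number of hash functions grows.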

    Implementation

    • Languages: C++ (primary implementation) with Python bindings
    • License: Open source
    • Platform: Cross-platform support

    Use Cases

    • Applications requiring guaranteed recall rates
    • Similarity search with predictable performance
    • Systems where parameter tuning overhead should be minimized
    • Cosine similarity search (text embeddings, document similarity)
    • Jaccard similarity search (set similarity, recommendation systems)

    Academic Background

    Published at the 27th Annual European Symposium on Algorithms (ESA 2019)

    Authors: Martin Aumüller, Tobias Christiani, Rasmus Pagh, and Michael Vesterli

    Paper: Available on arXiv (arXiv:1906.12211)

    Benchmarks

    PUFFINN has been included in ANN benchmarking efforts such as ANN-Benchmarks, where it is reported to perform competitively while keeping its probabilistic quality guarantees.
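
    Quality in such comparisons is usually expressed as recall: the share of the true k nearest neighbors that the approximate search returns. A minimal sketch of that measurement follows. The function name is hypothetical, and it compares result ids only, which is a simplification of how some benchmark suites treat ties in distance.

```python
def recall_at_k(reported_ids, exact_ids, k):
    """Fraction of the exact top-k ids that appear in the reported top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    found = exact_top.intersection(reported_ids[:k])
    return len(found) / k


# The approximate search found ids 1, 3 and 7 but missed id 5.
example_recall = recall_at_k([3, 1, 7, 9], [1, 3, 5, 7], k=4)
```

    Averaging this value over many queries gives the empirical recall that can be compared against the recall target passed to the index.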

    Resources

    • GitHub: https://github.com/puffinn/puffinn
    • Publication: ESA 2019 (27th Annual European Symposium on Algorithms)
    • arXiv: 1906.12211

    Advantages

    The parameterless design makes PUFFINN particularly attractive for developers who want:

    • Simple configuration (just memory and recall)
    • Probabilistic guarantees on search quality
    • No need for extensive parameter tuning
    • Predictable behavior across different datasets

    Information

    Website: github.com
    Published: Mar 17, 2026

    Categories

    Sdks & Libraries

    Tags

    #Lsh #Ann #Open Source

    Similar Products

    FLANN

    Fast Library for Approximate Nearest Neighbors containing a collection of algorithms optimized for nearest neighbor search in high dimensional spaces with automatic algorithm and parameter selection.

    PageANN

    Disk-based approximate nearest neighbor search framework with page-aligned graph structure. Achieves 1.85x-10.83x higher throughput than state-of-the-art methods through optimized SSD utilization.

    PipeANN

    Low-latency, billion-scale updatable graph-based vector store on SSD. Achieves <1ms search latency with 10x less memory than in-memory indexes through alignment of best-first search with SSD characteristics.

    PyNNDescent

    Python implementation of Nearest Neighbor Descent for k-neighbor-graph construction and ANN search. Targets 80%-100% accuracy with fast performance and supports a wide variety of distance metrics. This is an OSS library.

    Voyager

    Voyager is a Spotify open-source vector search library and service for efficient nearest neighbor search on large-scale vector datasets.

    Annoy

    An open-source library for approximate nearest neighbor search in high-dimensional spaces, often used as a backend for vector databases and search engines.
