    Docling

    Open-source document parsing framework from IBM, reported to reach 97.9% accuracy on complex table extraction with excellent text fidelity. A self-hostable solution for converting PDFs, spreadsheets, and scanned images into structured data for RAG pipelines.

    About this tool

    Overview

    Docling is an open-source framework for parsing documents into structured data, developed by IBM. It excels at complex table extraction and provides self-hosting capabilities for privacy-sensitive applications.

    Features

    • High Accuracy: 97.9% accuracy in complex table extraction
    • Excellent Text Fidelity: Preserves document structure and formatting
    • Self-Hostable: Run entirely on-premises without API charges
    • Multiple Formats: Handles PDFs, spreadsheets, scanned images
    • Complex Layouts: Handles multi-column and nested structures
    • Multimodal: Processes both text and images
    • Open Source: Free to use and modify
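A minimal sketch of the basic conversion flow behind the features above, assuming the `docling` package is installed. `DocumentConverter`, `convert`, and `export_to_markdown` are Docling's documented entry points; the `output_path_for` helper and the `parsed` directory name are hypothetical and only illustrate where results might be written.

```python
from pathlib import Path


def convert_to_markdown(source: str) -> str:
    """Convert a local file path or URL to Markdown using Docling."""
    # Imported lazily so this module still loads where docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()


def output_path_for(source: str, out_dir: str = "parsed") -> Path:
    """Hypothetical helper: map an input file to its Markdown output path."""
    return Path(out_dir) / (Path(source).stem + ".md")
```

Calling `convert_to_markdown("report.pdf")` on a local file (the file name is a placeholder) returns the document as a Markdown string, which can then be written to `output_path_for("report.pdf")`. Everything runs locally, which is what makes the self-hosted setup possible.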

    Performance

    In an evaluation on sustainability reports, Docling was the strongest of the frameworks compared for extracting structured data. It achieved 100% accuracy on simple tables and 97.9% on complex structures. Processing takes 17 seconds or more for complex documents, so it trades speed for accuracy relative to alternatives.
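The page does not say how table accuracy was measured. As an illustration only, the sketch below assumes a simple cell-level exact-match metric: the share of expected cells that appear with the same text at the same row and column. The function name and the sample values are hypothetical, and the evaluation cited above may have used a different definition.

```python
def cell_accuracy(expected: list[list[str]], extracted: list[list[str]]) -> float:
    """Fraction of expected cells reproduced exactly at the same row/column.

    This is an assumed metric for illustration, not the benchmark's own.
    """
    total = sum(len(row) for row in expected)
    if total == 0:
        return 1.0
    correct = 0
    for r, row in enumerate(expected):
        for c, value in enumerate(row):
            # A cell that falls outside the extracted grid counts as a miss.
            if r < len(extracted) and c < len(extracted[r]):
                if extracted[r][c].strip() == value.strip():
                    correct += 1
    return correct / total
```

Running a metric like this on a handful of your own documents is a cheap way to check whether published accuracy figures carry over to your table layouts.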

    Use Cases

    • Enterprise RAG pipelines requiring privacy
    • Complex document processing (financial reports, research papers)
    • Applications with sensitive data
    • On-premises AI deployments
    • Organizations wanting to avoid per-page API charges
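For the RAG use cases above, parsed output usually has to be split into chunks before embedding. The sketch below is a plain-Python stand-in that splits Markdown at headings and caps chunk length; it is not Docling's own chunking, and the `chunk_markdown` name and `max_chars` default are assumptions chosen for illustration.

```python
import re


def chunk_markdown(markdown: str, max_chars: int = 1000) -> list[str]:
    """Split Markdown into chunks at headings, capping each chunk's length."""
    # Split just before every line that starts with 1-6 '#' characters.
    sections = re.split(r"^(?=#{1,6} )", markdown, flags=re.MULTILINE)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        # Sections longer than max_chars are cut into fixed-size windows.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks
```

Splitting at headings keeps each chunk tied to one section of the source document, which tends to matter for financial reports and research papers where a table only makes sense next to its section title.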

    Comparison

    • vs LlamaParse: Slower (17+ seconds on complex documents vs about 6 seconds) but more accurate and self-hostable
    • vs Unstructured: Better on complex tables (97.9% vs 75% accuracy) but slower

    Integration

    Works with LangChain and LlamaIndex, and can be integrated with IBM watsonx.data and Granite models.
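A minimal sketch of the LangChain route, assuming the separate `langchain-docling` package is installed and exposes `DoclingLoader` with a `file_path` argument, as its documentation describes. The `to_records` helper is hypothetical and simply flattens the loaded documents into plain dictionaries for indexing.

```python
def load_with_langchain(source: str):
    """Load a document through LangChain's Docling loader."""
    # Imported lazily so this module still loads without langchain-docling.
    from langchain_docling import DoclingLoader

    return DoclingLoader(file_path=source).load()


def to_records(docs) -> list[dict]:
    """Hypothetical helper: turn LangChain documents into plain dicts."""
    return [
        {"text": doc.page_content, "metadata": dict(doc.metadata)}
        for doc in docs
    ]
```

The loader returns standard LangChain `Document` objects, so the rest of an existing pipeline (splitters, embeddings, vector store) should not need to know that Docling produced them.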

    Pricing

    Free and open-source. No per-page charges.

    Information

    Website: github.com
    Published: Mar 11, 2026

    Categories

    LLM Tools

    Tags

    #Document Parsing #Open Source #RAG

    Similar Products

    ARES

    RAG evaluation framework that trains lightweight judges for retrieval and generation scoring, refining evaluation by training specialized LLM judges on synthetic datasets to provide more reliable, confidence-aware judgments.

    LlamaParse

    High-performance document parsing service by LlamaIndex that consistently processes documents in about 6 seconds regardless of size. Returns rich Markdown and optional HTML tables with wide format support through hosted API.

    Unstructured

    Document parsing platform delivering strong content fidelity and precision with low hallucination rates. Achieves 100% accuracy on simple tables and 75% on complex structures with comprehensive enterprise document support.

    Verba

    Verba is a community-driven, open-source Retrieval-Augmented Generation (RAG) application that provides an end-to-end, user-friendly interface for building RAG workflows on top of a vector database, showcasing practical semantic search and retrieval patterns with Weaviate.

    Embedchain

    Open Source RAG Framework designed to be 'Conventional but Configurable', streamlining the creation of RAG applications with efficient data management, embeddings generation, and vector storage.

    FlashRAG

    Python toolkit for efficient RAG research providing 36 pre-processed benchmark datasets and 23 state-of-the-art RAG algorithms in a unified, modular framework for reproduction and development.

    All product names, logos, and brands are the property of their respective owners. All company, product, and service names used in this repository, related repositories, and associated websites are for identification purposes only. The use of these names, logos, and brands does not imply endorsement, affiliation, or sponsorship. This directory may include content generated by artificial intelligence.
    Copyright © 2025 Awesome Vector Databases. All rights reserved.