Overview
PrivateGPT is a production-ready AI project that lets you ask questions about your documents using Large Language Models (LLMs), even without Internet connectivity. It is 100% private: no data leaves your execution environment at any point.
Core Architecture
PrivateGPT is a service that wraps AI RAG primitives in a comprehensive set of APIs, built on:
- FastAPI: RESTful API framework
- LlamaIndex: RAG orchestration framework
- Local LLMs: Offline language models
- Local Vector Stores: Private embedding storage
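As a rough illustration of how these pieces fit together, the sketch below reduces a "service wrapping RAG primitives" to plain in-memory Python. It is not PrivateGPT's actual code: the class, methods, and the character-frequency "embedding" are all invented stand-ins for the real FastAPI endpoints, LlamaIndex pipeline, and local embedding model.

```python
import math

class ToyRagService:
    """Illustrative stand-in for a service wrapping RAG primitives.

    The real system exposes FastAPI endpoints backed by LlamaIndex and a
    local embedding model; here everything is in-memory Python.
    """

    def __init__(self):
        self.store = []  # list of (text, vector) pairs

    def _embed(self, text):
        # Toy "embedding": normalized character-frequency vector.
        vec = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                vec[ord(ch) - 97] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    def ingest(self, text):
        # Ingestion: embed the text and keep it alongside its vector.
        self.store.append((text, self._embed(text)))

    def query(self, question, top_k=1):
        # Retrieval: rank stored texts by similarity to the question vector.
        qv = self._embed(question)
        scored = sorted(
            self.store,
            key=lambda item: -sum(a * b for a, b in zip(qv, item[1])),
        )
        return [text for text, _ in scored[:top_k]]

service = ToyRagService()
service.ingest("Cats are small domesticated felines.")
service.ingest("Rust is a systems programming language.")
print(service.query("Tell me about programming languages"))
```

In the real service, the retrieved chunks would then be passed to a local LLM as context for answer generation; this sketch stops at retrieval.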
Privacy Guarantees
100% Private
- No Internet Required: Can run completely offline
- No Data Leakage: Data never leaves your infrastructure
- Local Processing: All computation happens locally
- Self-Hosted: Full control over deployment
Data Security
- Documents remain on your servers
- Embeddings stored locally
- No third-party API calls
- Compliance-ready for regulated industries
Local LLM Support
Ollama Integration
- Connect to local Ollama instance
- Simplifies local LLM installation
- Wide model selection
- Easy model management
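In PrivateGPT, the Ollama backend is selected through the settings file. A sketch of what an Ollama-backed configuration looks like (the keys and model names below are illustrative and should be verified against the current settings reference):

```yaml
llm:
  mode: ollama

embedding:
  mode: ollama

ollama:
  llm_model: llama3            # any model pulled into the local Ollama instance
  embedding_model: nomic-embed-text
  api_base: http://localhost:11434
```

Because Ollama serves models over a local HTTP API, PrivateGPT only needs the `api_base` of the running instance; model downloads and updates are handled by Ollama itself.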
LlamaCPP
- Direct LlamaCPP integration
- GGUF model format support
- CPU and GPU acceleration
- Memory-efficient inference
Vector Database Support
All supported vector stores run locally by default:
- Qdrant: High-performance vector search
- Milvus: Scalable vector database
- ChromaDB: Simple embedded vector store
- PostgreSQL: With pgvector extension
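All of these stores expose the same basic contract: add vectors with payloads, then retrieve nearest neighbors for a query vector. A stdlib-only sketch of that contract (purely illustrative; not the API of any client library above):

```python
import math

class InMemoryVectorStore:
    """Minimal stand-in for a vector store: add vectors, query by cosine similarity."""

    def __init__(self):
        self._items = []  # (id, vector, payload) triples

    def add(self, item_id, vector, payload):
        self._items.append((item_id, vector, payload))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def search(self, query_vector, top_k=2):
        # Rank all stored items by cosine similarity to the query vector.
        scored = sorted(
            self._items,
            key=lambda item: self._cosine(query_vector, item[1]),
            reverse=True,
        )
        return [(item_id, payload) for item_id, _, payload in scored[:top_k]]

store = InMemoryVectorStore()
store.add("a", [1.0, 0.0], {"text": "about cats"})
store.add("b", [0.0, 1.0], {"text": "about dogs"})
store.add("c", [0.9, 0.1], {"text": "more cats"})
print(store.search([1.0, 0.05], top_k=2))
```

Production stores like Qdrant or pgvector implement the same idea with approximate indexes (e.g. HNSW) so search stays fast at millions of vectors.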
Document Processing
Chunking Strategy
- Token-based chunks (~500 tokens by default, split via LlamaIndex)
- Overlapping chunks for context preservation
- Configurable chunk size and overlap
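The overlapping-window idea above can be sketched in a few lines. This toy splitter operates on any token list (here, words); the real splitter counts model tokens via LlamaIndex, so the function below is illustrative only:

```python
def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split a token list into overlapping chunks.

    Each chunk repeats the last `overlap` tokens of the previous one,
    so context spanning a chunk boundary is not lost.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each time
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # final chunk reached the end of the document
    return chunks

words = [f"w{i}" for i in range(1200)]
chunks = chunk_tokens(words, chunk_size=500, overlap=50)
print(len(chunks), [len(c) for c in chunks])
```

With 1200 tokens, a 500-token window, and 50-token overlap, the window advances 450 tokens per step, giving chunks of 500, 500, and 300 tokens.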
Embedding Generation
- SentenceTransformers for embeddings
- Local embedding models
- No external API calls
- Multiple embedding model options
Vector Storage
- DuckDB-backed storage in some configurations (e.g. older ChromaDB setups)
- Persistent vector storage
- Fast retrieval
RAG Pipeline
Ingestion Phase
- Document loading and parsing
- Text chunking
- Local embedding generation
- Vector database storage
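The four ingestion steps above can be sketched end-to-end. This is a stdlib-only toy: real ingestion uses LlamaIndex document readers, a model tokenizer, and a local embedding model, and every function name below is invented for illustration. The "embedding" here just hashes words into a fixed-size vector.

```python
import hashlib
import math

def load_document(raw: bytes) -> str:
    # Step 1 - loading/parsing: here just decode bytes; real loaders
    # handle PDF, docx, HTML, and other formats.
    return raw.decode("utf-8")

def chunk(text: str, size: int = 8, overlap: int = 2):
    # Step 2 - chunking on words with overlap (real splitters count
    # model tokens rather than words).
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def embed(text: str, dim: int = 16):
    # Step 3 - toy "local" embedding: hash each word into a bucket,
    # then L2-normalize. A real system runs a local embedding model.
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def ingest(raw: bytes, store: list):
    # Step 4 - storage: append (chunk, vector) pairs to an in-memory
    # stand-in for the vector database.
    for piece in chunk(load_document(raw)):
        store.append((piece, embed(piece)))

store = []
ingest(b"PrivateGPT ingests documents fully offline. "
       b"Chunks are embedded locally and stored in a vector database.", store)
print(len(store))
```

Every step runs locally, which is exactly the property the privacy guarantees above depend on: no network call is needed at any point in the pipeline.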