pgai
Open-source PostgreSQL extension and Python library that automates embedding generation and synchronization for RAG and semantic search applications. Features pgai Vectorizer for declarative embedding pipelines. This is an OSS solution.
About this tool
Overview
pgai is a suite of tools that transforms PostgreSQL into a robust, production-ready retrieval engine for RAG and Agentic AI applications. It automatically creates and synchronizes vector embeddings from PostgreSQL data and S3 documents.
Key Features
- pgai Vectorizer: Declarative approach to embedding generation that treats embeddings like database indexes
- Automatic Synchronization: Keeps embeddings in sync as data changes
- Batch Processing: Efficient embedding generation with built-in handling for failures, rate limits, and latency spikes
- Multi-Model Support: Works with OpenAI, Cohere, and other embedding providers
- S3 Integration: Direct embedding of documents stored in S3
- Reliability: Handles model failures and unreliable endpoints gracefully
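The features above (a queue of stale rows, batched embedding calls, retries on provider failures) can be illustrated with a small in-memory sketch. This is not pgai's implementation or API: `ToyVectorizer`, `embed_batch`, and every parameter name are hypothetical, and the "table" is a plain dictionary standing in for PostgreSQL rows.

```python
import time
from typing import Callable, Dict, List, Sequence

Embedding = List[float]


class ToyVectorizer:
    """Hypothetical in-memory stand-in: source rows, a work queue, an embeddings store."""

    def __init__(self, embed_batch: Callable[[Sequence[str]], List[Embedding]],
                 batch_size: int = 2, max_retries: int = 3, base_delay: float = 0.0):
        self.embed_batch = embed_batch      # assumed provider call: texts -> vectors
        self.batch_size = batch_size
        self.max_retries = max_retries
        self.base_delay = base_delay        # seconds; doubled after each failed attempt
        self.rows: Dict[int, str] = {}      # source "table": id -> text
        self.embeddings: Dict[int, Embedding] = {}
        self.pending: List[int] = []        # ids whose embeddings are missing or stale

    def upsert(self, row_id: int, text: str) -> None:
        """Insert or update a row and mark its embedding as stale."""
        self.rows[row_id] = text
        if row_id not in self.pending:
            self.pending.append(row_id)

    def delete(self, row_id: int) -> None:
        """Remove a row together with its embedding and any queued work."""
        self.rows.pop(row_id, None)
        self.embeddings.pop(row_id, None)
        if row_id in self.pending:
            self.pending.remove(row_id)

    def _embed_with_retry(self, texts: Sequence[str]) -> List[Embedding]:
        """Call the provider, backing off exponentially between failed attempts."""
        delay = self.base_delay
        for attempt in range(1, self.max_retries + 1):
            try:
                return self.embed_batch(texts)
            except Exception:
                if attempt == self.max_retries:
                    raise
                time.sleep(delay)
                delay *= 2
        raise RuntimeError("max_retries must be at least 1")

    def run_once(self) -> int:
        """Process the queue in batches; return the number of rows embedded."""
        done = 0
        while self.pending:
            batch_ids = self.pending[: self.batch_size]
            vectors = self._embed_with_retry([self.rows[i] for i in batch_ids])
            for row_id, vec in zip(batch_ids, vectors):
                self.embeddings[row_id] = vec
            # Dequeue only after the batch succeeded, so a failure leaves work queued.
            self.pending = self.pending[len(batch_ids):]
            done += len(batch_ids)
        return done
```

Application code in this sketch only calls `upsert` and `delete`; embeddings catch up whenever `run_once` is invoked, which mirrors the idea of treating embeddings like an index that is maintained for you rather than written by hand.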
Components
pgai works as part of a three-extension stack on Timescale Cloud:
- pgvector: Provides vector data type and HNSW index
- pgvectorscale: StreamingDiskANN index for performance
- pgai: Embedding generation and AI model integration
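To make the role of the vector indexes concrete, here is a minimal sketch of the exact nearest-neighbor search that indexes such as HNSW and StreamingDiskANN approximate at scale. It uses cosine distance, one of the distance measures pgvector supports. The function names and the dictionary of vectors are illustrative assumptions, not part of any of the three extensions.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: Sequence[float], items: Dict[str, Sequence[float]], k: int = 3) -> List[str]:
    """Brute-force search: score every stored vector and keep the k closest keys."""
    scored = sorted((cosine_distance(query, vec), key) for key, vec in items.items())
    return [key for _, key in scored[:k]]
```

The brute-force version compares the query against every stored vector, so its cost grows linearly with the table. An approximate index trades a small amount of recall for avoiding that full scan.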
Supported Models
- OpenAI: text-embedding-ada-002 and other embedding models
- Cohere: Multiple representation models for English and multilingual text
- Extensible to other providers
Use Cases
- RAG (Retrieval Augmented Generation) applications
- Semantic search over PostgreSQL data
- Automated embedding pipelines
- AI-powered data applications
Installation
Available via:
- PyPI: pip install pgai
- PostgreSQL extension installation
- Pre-configured on Timescale Cloud
Pricing
Free and open source. The code is available on GitHub at github.com/timescale/pgai under the Apache 2.0 license. Timescale Cloud provides managed hosting with usage-based pricing.
Similar Products
PostgreSQL extension for scalable, low-latency vector search written in Rust. Features 20x faster HNSW than pgvector, with support for FP16, INT8, and binary vectors. This is an OSS extension.
Open-source PostgreSQL extension that builds on pgvector with higher-performance embedding search and cost-efficient storage. Features StreamingDiskANN index inspired by Microsoft's DiskANN algorithm. This is an OSS solution under PostgreSQL license.
PostgreSQL extension for scalable, high-performance vector search, successor to pgvecto.rs. Features RaBitQ quantization enabling 6x cost savings vs Pinecone. Fully compatible with pgvector. This is an OSS extension.
First fully reproducible open-source text embedding model with 8,192 context length. v2 introduces Mixture-of-Experts architecture for multilingual embeddings. Outperforms OpenAI models on benchmarks. This is an OSS model under Apache 2.0 license.
Open-source toolkit for developing AI applications using Postgres and pgvector. Provides managed PostgreSQL with built-in vector support, Python client (vecs), and AI features. This is a commercial managed service with OSS components.
Puck is an open-source vector search engine designed for fast similarity search and retrieval of embedding vectors.