



A Python ETL framework for stream processing and real-time analytics with built-in real-time vector indexing. Pathway automatically detects document changes and re-indexes in real time, so AI applications use the latest information rather than stale data.
Pathway is a Python ETL framework that specializes in stream processing, real-time analytics, LLM pipelines, and RAG applications. Its key innovation is built-in real-time vector indexing that automatically stays synchronized with changing data sources.
Pathway provides an in-memory real-time vector index, eliminating the need for a separate vector database for many use cases. The index automatically:
Pathway offers real-time data indexes (vector search, full text search, and more) that effortlessly synchronize with data sources:
Changes are detected and reflected in the index within seconds, not hours or days.
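The synchronization idea can be illustrated with a small, self-contained sketch. This is not Pathway's API or implementation: `LiveIndex`, `embed`, and `cosine` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, and the snapshot dictionary stands in for a watched data source. It only shows the general pattern of detecting changed documents and re-indexing just those.

```python
import hashlib
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: a real pipeline calls a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class LiveIndex:
    """Hypothetical in-memory index that re-embeds only changed documents."""

    def __init__(self):
        self._hashes = {}   # doc_id -> content hash of the indexed version
        self._vectors = {}  # doc_id -> embedding of the indexed version

    def sync(self, snapshot: dict) -> dict:
        """Reconcile the index with the current {doc_id: text} snapshot."""
        stats = {"added": 0, "updated": 0, "removed": 0}
        # Drop documents that no longer exist in the source.
        for doc_id in list(self._hashes.keys() - snapshot.keys()):
            del self._hashes[doc_id]
            del self._vectors[doc_id]
            stats["removed"] += 1
        # Embed only documents that are new or whose content hash changed.
        for doc_id, text in snapshot.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            old = self._hashes.get(doc_id)
            if old == digest:
                continue  # unchanged: no re-embedding needed
            stats["added" if old is None else "updated"] += 1
            self._hashes[doc_id] = digest
            self._vectors[doc_id] = embed(text)
        return stats

    def query(self, text: str, k: int = 1) -> list:
        """Return the ids of the k most similar documents."""
        q = embed(text)
        ranked = sorted(
            self._vectors, key=lambda d: cosine(q, self._vectors[d]), reverse=True
        )
        return ranked[:k]
```

Calling `sync` again after the source changes touches only the added, edited, or deleted documents, and the next `query` already reflects them. A streaming engine would push such changes as they occur instead of comparing snapshots.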
Although pipelines are written in Python for ease of use, Pathway code is executed by a scalable Rust engine based on Differential Dataflow. This architecture enables:
Because the engine builds on Differential Dataflow, Pathway computes incrementally: it recomputes only the results affected by a change rather than reprocessing the entire dataset. This is what keeps real-time updates efficient.
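As a rough illustration of incremental computation, the sketch below maintains word counts by applying signed deltas (`+1` for an inserted line, `-1` for a retracted one) instead of recounting everything. It is a simplified toy, not Differential Dataflow or Pathway's engine; `IncrementalWordCount` and `recompute` are hypothetical names introduced for this example.

```python
from collections import Counter


class IncrementalWordCount:
    """Keeps word counts current by applying (line, diff) deltas."""

    def __init__(self):
        self.counts = Counter()
        self.words_touched = 0  # total work done by all updates so far

    def apply(self, deltas):
        """deltas: iterable of (line, diff); diff is +1 (insert) or -1 (retract)."""
        for line, diff in deltas:
            for word in line.split():
                self.counts[word] += diff
                self.words_touched += 1
                if self.counts[word] == 0:
                    del self.counts[word]  # drop words that no longer occur


def recompute(lines):
    """Batch baseline: recount every line from scratch."""
    return Counter(word for line in lines for word in line.split())


wc = IncrementalWordCount()
wc.apply([("a b", +1), ("b c", +1)])
# Editing the second line is a retraction of the old text plus an insertion.
wc.apply([("b c", -1), ("b d", +1)])
assert wc.counts == recompute(["a b", "b d"])
```

The incremental version does work proportional to the lines that changed, while `recompute` always scans the full input; both arrive at the same counts.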
Build RAG applications that always work with fresh data:
A typical Pathway RAG pipeline:
Pathway integrates with:
Open-source and available on GitHub (pathwaycom/pathway) with extensive documentation and example notebooks for live vector indexing pipelines.