



Open-source evaluation and tracing library for AI agents and RAG systems, combining OpenTelemetry-based tracing with trustworthy evaluations including ground truth metrics and LLM-as-a-Judge feedback for production monitoring.
TruLens is an open-source library for evaluating and tracing AI agents, including RAG systems and other LLM applications. Originally created by TruEra, TruLens is now a community-driven project with active oversight and support from Snowflake, which acquired TruEra.
TruLens combines OpenTelemetry-based tracing with trustworthy evaluations, including both ground truth metrics and reference-free (LLM-as-a-Judge) feedback. Its instrumentation is OpenTelemetry compatible, so traces can interoperate with other observability systems.
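To illustrate the difference between the two evaluation styles: a ground truth metric compares an output against a known reference answer, while a reference-free metric asks a judge model to score the output without one. The sketch below covers only the ground truth side, using a token-overlap F1 score. The helper `token_f1` is a hypothetical name for illustration and is not part of TruLens.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a prediction and a reference answer.

    Illustrative only: a simple ground truth metric, not a TruLens function.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings count as a match; one empty string is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = token_f1("retrieval augmented generation", "retrieval augmented")
```

A metric like this needs a labeled reference for every test question, which is why reference-free feedback is often used where no such labels exist.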
The RAG triad consists of three core evaluations: context relevance, groundedness, and answer relevance.
Satisfactory results on all three provide confidence that the LLM app is free from hallucination.
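One way to act on triad results is to treat the lowest of the three scores as the deciding one, since a single weak evaluation is enough to undermine confidence. The sketch below assumes the triad's usual components (context relevance, groundedness, answer relevance), each scored in the range 0 to 1. The names `TriadScores`, `weakest_evaluation`, `passes_triad`, and the 0.7 threshold are hypothetical choices for illustration, not TruLens APIs or defaults.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TriadScores:
    """Hypothetical container for one record's three triad scores (0 to 1)."""
    context_relevance: float
    groundedness: float
    answer_relevance: float


def weakest_evaluation(scores: TriadScores) -> tuple[str, float]:
    """Return the name and value of the lowest-scoring evaluation."""
    return min(asdict(scores).items(), key=lambda item: item[1])


def passes_triad(scores: TriadScores, threshold: float = 0.7) -> bool:
    """True only if every evaluation meets the (assumed) threshold."""
    _, lowest = weakest_evaluation(scores)
    return lowest >= threshold


scores = TriadScores(context_relevance=0.9, groundedness=0.5, answer_relevance=0.8)
name, value = weakest_evaluation(scores)  # points at the evaluation to investigate
ok = passes_triad(scores)
```

Reporting the weakest evaluation by name helps localize the problem: low context relevance points at retrieval, low groundedness at generation drifting from the retrieved context, and low answer relevance at a response that misses the question.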
Evaluate critical components of your app's execution flow:
TruLens automatically instruments popular frameworks, such as LangChain:
pip install trulens
from trulens.core import TruSession
from trulens.apps.langchain import TruChain
# Initialize session
session = TruSession()
# Wrap your app (`chain` is an existing LangChain chain defined elsewhere)
tru_app = TruChain(
    chain,
    app_name="My RAG App",
    app_version="v1",
)
# Invoke the chain inside the recording context so the call is traced
with tru_app as recording:
    response = chain.invoke("What is RAG?")
# View results in dashboard
session.run_dashboard()
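Conceptually, a wrapper of this kind intercepts each call to the app and stores the input, output, and timing as a record. The sketch below shows that idea with the standard library only. `SimpleRecorder`, `Record`, and `echo_app` are hypothetical stand-ins; this is not how TruLens is implemented and not its API.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Record:
    """One traced call: what went in, what came out, and how long it took."""
    prompt: Any
    output: Any
    latency_s: float


@dataclass
class SimpleRecorder:
    """Hypothetical tracing wrapper around any callable app."""
    app: Callable[[Any], Any]
    app_name: str
    records: list = field(default_factory=list)

    def __call__(self, prompt: Any) -> Any:
        start = time.perf_counter()
        output = self.app(prompt)
        elapsed = time.perf_counter() - start
        # Store the call details so they can be evaluated or displayed later.
        self.records.append(Record(prompt, output, elapsed))
        return output


def echo_app(prompt: str) -> str:
    """Toy app standing in for a real chain."""
    return f"echo: {prompt}"


recorder = SimpleRecorder(echo_app, app_name="demo")
answer = recorder("What is RAG?")
```

The caller still gets the app's normal return value, while the stored records become the raw material for evaluation and dashboards.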
Define custom evaluation criteria:
import numpy as np
from trulens.core import Feedback
from trulens.providers.openai import OpenAI
# Initialize provider
provider = OpenAI()
# Define feedback: score each retrieved context against the input, then average
f_context_relevance = (
    Feedback(provider.context_relevance)
    .on_input()
    .on("context")
    .aggregate(np.mean)
)
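The aggregation step matters because a retrieval call usually returns several context chunks, each of which gets its own score, and the aggregator reduces them to one number per record. The sketch below shows how the choice of aggregator changes the result. The function `aggregate_scores` and the sample scores are hypothetical and only mimic the idea, not the TruLens internals.

```python
from statistics import mean
from typing import Callable, Sequence


def aggregate_scores(
    chunk_scores: Sequence[float],
    aggregator: Callable[[Sequence[float]], float] = mean,
) -> float:
    """Reduce per-chunk scores to a single value for the record."""
    if not chunk_scores:
        raise ValueError("at least one chunk score is required")
    return aggregator(chunk_scores)


# Made-up relevance scores for three retrieved chunks.
chunk_scores = [0.9, 0.4, 0.8]

average_score = aggregate_scores(chunk_scores)       # overall retrieval quality
worst_score = aggregate_scores(chunk_scores, min)    # strictest view: worst chunk
```

Averaging gives a view of overall retrieval quality, while taking the minimum surfaces any single irrelevant chunk; which one fits depends on how sensitive the app is to a bad chunk reaching the prompt.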
Built-in dashboard for:
TruLens is a free, open-source library. The only costs are for the LLM API calls made by feedback functions.