TruLens

Open-source evaluation and tracing library for AI agents and RAG systems, combining OpenTelemetry-based tracing with trustworthy evaluations including ground truth metrics and LLM-as-a-Judge feedback for production monitoring.

Information

Website: www.trulens.org

Published: Mar 14, 2026

Tags

#observability #evaluation #tracing

Similar Products

OpenLLMetry

Open-source observability for GenAI and LLM applications based on OpenTelemetry, providing AI-aware instrumentation for vector databases, LLM frameworks, and model providers.

Arize Phoenix

Open-source LLM tracing and evaluation solution built on OpenTelemetry for RAG evaluation. Provides automated instrumentation which records the execution path of LLM requests through multiple steps.

Galileo

An AI observability and evaluation platform that helps monitor and evaluate LLM outputs, RAG pipelines, and data quality, with tools for detecting hallucinations and measuring retrieval quality.

TruLens

An evaluation framework for LLM applications including RAG systems, providing observability, debugging, and guardrails. TruLens tracks retrieval quality, LLM performance, and hallucinations with detailed tracing.

Promptfoo

Open-source CLI and library for evaluating and red-teaming LLM applications with automated testing, security vulnerability scanning, and CI/CD integration. Recently acquired by OpenAI but remains open-source.

Opik

An open-source LLM observability and evaluation platform that provides comprehensive tracking, monitoring, and evaluation capabilities for large language model applications. Designed for production AI systems with focus on debugging and performance optimization.

Key Features

OpenTelemetry Integration

Combines OpenTelemetry-based tracing with trustworthy evaluations, including both ground truth metrics and reference-free (LLM-as-a-Judge) feedback. TruLens instrumentation is OpenTelemetry compatible, allowing interoperation with other observability systems.
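The difference between a ground truth metric and a reference-free one can be illustrated with a minimal sketch. This is not TruLens's API: `token_f1`, `reference_free_score`, and the `judge` callable are hypothetical names chosen for illustration, and the judge here stands in for whatever LLM call an application would make.

```python
from collections import Counter
from typing import Callable


def token_f1(prediction: str, reference: str) -> float:
    """Ground truth metric: token-level F1 against a known-good answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def reference_free_score(question: str, answer: str, judge: Callable[[str], float]) -> float:
    """Reference-free metric: a judge rates the answer; no expected answer is needed."""
    prompt = (
        "Rate from 0 to 1 how well the answer addresses the question.\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    # Clamp the judge's rating into [0, 1] in case it returns an out-of-range value.
    return min(1.0, max(0.0, judge(prompt)))
```

The first function needs a labelled reference answer, which is why ground truth metrics suit curated test sets, while the second only needs the question and the answer, which is what makes it usable on live traffic.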

RAG Triad Evaluation

The RAG triad consists of three core evaluations:

Context Relevance: Are retrieved contexts relevant to the query?

Groundedness: Is the answer grounded in the retrieved context?

Answer Relevance: Does the answer address the question?

Satisfactory results on all three evaluations provide confidence that the LLM app is free from hallucination.
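One way to read the triad is as three independent checks, where any single low score points at a different part of the pipeline. The sketch below is illustrative only: `TriadScores`, `failing_checks`, and the 0.7 threshold are assumptions made for this example, not names or defaults taken from TruLens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriadScores:
    """Hypothetical container for the three RAG triad scores, each in [0, 1]."""
    context_relevance: float
    groundedness: float
    answer_relevance: float


def failing_checks(scores: TriadScores, threshold: float = 0.7) -> list[str]:
    """Return the names of the triad checks that fall below the threshold."""
    return [name for name, value in vars(scores).items() if value < threshold]


# Relevant context and a relevant answer, but the answer is not supported by the context.
scores = TriadScores(context_relevance=0.9, groundedness=0.4, answer_relevance=0.8)
print(failing_checks(scores))  # ['groundedness']
```

In this example retrieval and answer relevance pass, so the low groundedness score suggests the generation step is adding claims that the retrieved context does not support.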

Evaluation Capabilities

Evaluate critical components of your app's execution flow:

Retrieved context quality

Tool calls

Agent plans

Response generation

End-to-end performance
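Evaluating individual components amounts to scoring each step of a recorded trace with a check suited to that step. The following is a minimal sketch under stated assumptions: the span layout (`kind`, `input`, `output`), the `evaluate_trace` helper, and the toy evaluators are all hypothetical and do not reflect how TruLens represents traces.

```python
from typing import Callable

# Hypothetical span record with the keys "kind", "input", and "output".
Span = dict[str, str]
Evaluator = Callable[[Span], float]


def evaluate_trace(trace: list[Span], evaluators: dict[str, Evaluator]) -> dict[str, list[float]]:
    """Score each span with the evaluator registered for its kind; skip unregistered kinds."""
    results: dict[str, list[float]] = {}
    for span in trace:
        evaluator = evaluators.get(span["kind"])
        if evaluator is not None:
            results.setdefault(span["kind"], []).append(evaluator(span))
    return results


trace = [
    {"kind": "retrieval", "input": "What is RAG?", "output": "RAG combines retrieval with generation."},
    {"kind": "tool_call", "input": "search('RAG')", "output": "error: timeout"},
    {"kind": "generation", "input": "What is RAG?", "output": "RAG is a technique that..."},
]
evaluators = {
    # Toy checks: did retrieval return anything, and did the tool call succeed?
    "retrieval": lambda span: 1.0 if span["output"] else 0.0,
    "tool_call": lambda span: 0.0 if span["output"].startswith("error") else 1.0,
}
print(evaluate_trace(trace, evaluators))  # {'retrieval': [1.0], 'tool_call': [0.0]}
```

Keeping scores grouped by span kind makes it possible to see that a poor end-to-end result came from, say, a failed tool call rather than from retrieval.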

Basic Usage

    from trulens.core import TruSession
    from trulens.apps.langchain import TruChain

    # Initialize session
    session = TruSession()

    # Wrap your app (`chain` is an existing LangChain chain defined elsewhere)
    tru_app = TruChain(
        chain,
        app_name="My RAG App",
        app_version="v1"
    )

    # Use as normal - tracing happens automatically
    response = tru_app("What is RAG?")

    # View results in dashboard
    session.run_dashboard()

Feedback Functions

    import numpy as np
    from trulens.core import Feedback
    from trulens.providers.openai import OpenAI

    # Initialize provider
    provider = OpenAI()

    # Define feedback: score each retrieved context against the input,
    # then average the per-context scores
    f_context_relevance = Feedback(
        provider.context_relevance
    ).on_input().on("context").aggregate(np.mean)
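The `.aggregate(np.mean)` step in the feedback definition above implies that relevance is scored once per retrieved context and the scores are then averaged. A dependency-free sketch of that idea follows; `aggregate_context_relevance` and `score_chunk` are hypothetical names, and the keyword-based scorer is a stand-in for an LLM provider call.

```python
from statistics import fmean
from typing import Callable


def aggregate_context_relevance(
    question: str,
    contexts: list[str],
    score_chunk: Callable[[str, str], float],
) -> float:
    """Score each retrieved chunk against the question, then average the scores."""
    if not contexts:
        raise ValueError("at least one retrieved context is required")
    return fmean([score_chunk(question, chunk) for chunk in contexts])


def keyword_scorer(question: str, chunk: str) -> float:
    """Toy stand-in for an LLM judge: 1.0 if the chunk mentions RAG, else 0.0."""
    return 1.0 if "rag" in chunk.lower() else 0.0


contexts = ["RAG is retrieval-augmented generation.", "The weather is sunny today."]
print(aggregate_context_relevance("What is RAG?", contexts, keyword_scorer))  # 0.5
```

Averaging is only one choice of aggregator. A minimum, for instance, would flag a retrieval as poor whenever any single chunk is irrelevant.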
