Logging Runs

Log Pre-generated Responses in Python

Experiment with Multiple Chain Workflows

Logging and Comparing Against Your Expected Answers

Use Cases

Evaluate and Optimize RAG Applications

Evaluate and Optimize Agents, Chains, or Multi-step Workflows

Prompt Engineering

Evaluate and Optimize Prompts

Experiment with Multiple Prompts

Metrics

Choose Your Guardrail Metrics

Enabling Scorers in Runs

Register Custom Metrics

Customize ChainPoll-powered Metrics

Getting Insights

Understand Your Metric's Values

A/B Compare Prompts

Evaluate with Human Feedback

Identify Hallucinations

Rank Your Runs

Collaboration

Share a Project

Collaborate with Other Personas

Export Your Evaluation Runs

Advanced Features

Add Tags and Metadata to Prompt Runs

Programmatically Fetch Logged Data

Set up Access Controls

Best Practices

Prompt Management & Storage

Create an Evaluation Set