AI Quality Assurance

2 posts

Voxli

Agent Reliability Apr 27, 2026

The multi-turn failures that prompt evals can't see

Most agent failures we see in pilots don't show up on prompt evals.

Voxli

AI Agents Mar 27, 2026

The Risks of Agent Speculation

It’s no surprise that hallucinations are a common, well-known failure in agentic AI testing. The agent starts to overpromise, fabricates answers, and even claims that it…

Voxli