Introducing Enhanced LLM Observability in Middleware

Jul 01, 2026

As AI applications move into production, maintaining visibility into performance, cost, quality, and infrastructure becomes increasingly important.

We’re excited to introduce Enhanced LLM Observability in Middleware, with new capabilities that help you monitor, test, evaluate, and optimize your AI applications from a single platform.

What’s New

LLM Traces
Inspect every LLM request through detailed traces that capture tokens, latency, cost, error rates, stack traces, spans, and execution metadata, so you can troubleshoot issues quickly.
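
To make the trace fields above concrete, here is a minimal sketch of what a single LLM span might record. It is illustrative only and is not Middleware's SDK or data model: the `LLMSpan` class, the `traced_call` helper, the `fake_llm` stub, and the prices in `PRICE_PER_1K` are all hypothetical names and values chosen for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical per-1K-token prices in USD; real prices depend on model and provider.
PRICE_PER_1K = {"prompt": 0.50, "completion": 1.50}


@dataclass
class LLMSpan:
    """One traced LLM request (illustrative shape, not a real SDK type)."""
    name: str
    model: str
    prompt_tokens: int = 0
    completion_tokens: int = 0
    latency_ms: float = 0.0
    error: Optional[str] = None
    metadata: dict = field(default_factory=dict)

    @property
    def cost_usd(self) -> float:
        # Cost is derived from token counts and the assumed price table.
        return (self.prompt_tokens * PRICE_PER_1K["prompt"]
                + self.completion_tokens * PRICE_PER_1K["completion"]) / 1000


def traced_call(name, model, fn, prompt):
    """Run fn(prompt) and return (text, span) with tokens, latency, and any error."""
    span = LLMSpan(name=name, model=model)
    start = time.perf_counter()
    try:
        text, span.prompt_tokens, span.completion_tokens = fn(prompt)
        return text, span
    except Exception as exc:
        span.error = repr(exc)  # keep the failure on the span instead of losing it
        return None, span
    finally:
        span.latency_ms = (time.perf_counter() - start) * 1000


def fake_llm(prompt):
    # Stand-in for a model call: returns text, prompt tokens, completion tokens.
    return "ok", len(prompt.split()), 2


text, span = traced_call("chat", "demo-model", fake_llm, "hello there world")
print(span.model, span.prompt_tokens, span.completion_tokens, span.cost_usd)
```

Recording the error on the span rather than re-raising keeps failed requests visible alongside successful ones, which is what makes per-request error rates possible to compute.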

Playground
Experiment with prompts across multiple LLMs, tools, and output schemas. Compare responses side by side and validate behavior before deploying to production.

LLM Evaluation
Measure response quality using configurable LLM-as-a-Judge evaluations. Define custom rules, choose the evaluation model, and automate response validation at scale.
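
As a rough illustration of the LLM-as-a-Judge pattern described above, the sketch below checks one response against a list of custom rules by asking a judge for a PASS or FAIL verdict per rule. It is not Middleware's evaluation API: `Rule`, `evaluate`, and `stub_judge` are hypothetical names, and the judge is a local stub standing in for a call to whichever evaluation model you choose.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    instruction: str  # what the judge is asked to check


def evaluate(response: str, rules: list[Rule],
             judge: Callable[[str], str]) -> dict[str, bool]:
    """Ask the judge about each rule and map rule name to pass/fail."""
    results = {}
    for rule in rules:
        prompt = (f"Rule: {rule.instruction}\n"
                  f"Response: {response}\n"
                  "Answer PASS or FAIL.")
        verdict = judge(prompt).strip().upper()
        results[rule.name] = verdict.startswith("PASS")
    return results


def stub_judge(prompt: str) -> str:
    # Stand-in for a real judge model: fails any response containing "sorry".
    response = prompt.split("Response: ", 1)[1].rsplit("\n", 1)[0]
    return "FAIL" if "sorry" in response.lower() else "PASS"


rules = [Rule("no_apology", "The response must not apologize.")]
print(evaluate("Sorry, I can't help with that.", rules, stub_judge))
print(evaluate("The capital of France is Paris.", rules, stub_judge))
```

Passing the judge in as a plain function keeps the rules independent of the evaluation model, so swapping models means swapping one callable.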

GPU Monitoring
Monitor GPU hosts and devices with detailed metrics such as utilization, occupancy, memory usage, temperature, power consumption, bandwidth, and running processes.

LLM Dashboards
Track AI application performance with 50+ built-in dashboards covering token usage, latency, costs, LLM calls, evaluations, tool usage, and error rates.

Watch the Product Demo

Learn more about LLM Observability on our product page or get started in minutes with the setup guide.
