Originally published on the Datadog AI Blog. This is a mirror of an article I co-authored there.
AI agents can now produce software faster than any team can verify it. The bottleneck has moved from writing code to trusting what was written.
We have seen this pattern before. Early programmers resisted compilers because they could write better assembly by hand, and often they were right. Compilers earned trust because the languages they translate have precise semantics: the programmer defines what the program does; the compiler has freedom over how it is implemented. Automation has consistently won only when paired with verification.
With AI agents, building trust is harder than it was for compilers. AI agents ingest unrestricted natural language, sometimes from untrusted sources, and translate it into running code. We must find new ways to verify the outputs of these program synthesis engines.