writing · 2025-03-30

Make the model show its work

Reasoning chains aren't just for accuracy. In an enterprise system they're how you earn trust, debug failures, and catch a model doing something it was never asked to do.

Every maths teacher who ever wrote “show your work” was making a point that applies neatly to language models. An answer with no visible reasoning is impossible to trust and impossible to debug. You either believe it or you don’t, and belief is not an engineering strategy.

In the systems I build, I ask the model to lay out its reasoning before it commits to an answer: why this document, why this conclusion. That single change pays off in three different places.
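As a minimal sketch of what "reasoning before the answer" can look like in practice: the prompt asks for two labelled sections, and a small parser refuses any response that skips the reasoning. The `REASONING:` and `ANSWER:` markers, the function names, and the prompt wording are all illustrative assumptions, not a description of any particular system or model API.

```python
from dataclasses import dataclass

# Hypothetical section markers; any unambiguous delimiters would do.
REASONING_MARK = "REASONING:"
ANSWER_MARK = "ANSWER:"


@dataclass
class ReasonedAnswer:
    reasoning: str
    answer: str


def build_prompt(question: str, documents: list[str]) -> str:
    """Number the documents and ask for reasoning first, answer second."""
    numbered = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Use only the documents below.\n"
        f"{numbered}\n\n"
        f"Question: {question}\n\n"
        f"First write {REASONING_MARK} and explain which documents you relied on "
        "and why they support your conclusion.\n"
        f"Then write {ANSWER_MARK} and give the final answer."
    )


def parse_response(text: str) -> ReasonedAnswer:
    """Split a model response into reasoning and answer, or raise ValueError."""
    r = text.find(REASONING_MARK)
    a = text.find(ANSWER_MARK, r + len(REASONING_MARK)) if r != -1 else -1
    if r == -1 or a == -1:
        raise ValueError("response is missing the reasoning or answer section")
    reasoning = text[r + len(REASONING_MARK):a].strip()
    answer = text[a + len(ANSWER_MARK):].strip()
    if not reasoning:
        raise ValueError("reasoning section is empty")
    return ReasonedAnswer(reasoning=reasoning, answer=answer)
```

Rejecting responses without reasoning, rather than passing the bare answer through, keeps the chain available for everything that follows.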

For the user, it turns a black-box verdict into something they can sanity-check. A support engineer who can see why a case was summarized the way it was is more likely to trust the summary, and can correct it when it’s wrong.

For me, it is the best debugging signal there is. When an answer is bad, the reasoning chain usually shows exactly where it went off the rails. Sometimes the retrieval was right and the reasoning was lazy; sometimes the retrieval was wrong and the reasoning was faithfully building on sand. You cannot tell those apart from the final answer alone.
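A rough sketch of that triage, under some loud assumptions: the reasoning cites documents as `[n]`, retrieved documents are numbered the same way, and for the failing case someone has labelled which documents were actually relevant. The function names and the three labels are hypothetical.

```python
import re


def cited_ids(reasoning: str) -> set[int]:
    """Collect document numbers cited as [n] in the reasoning text."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", reasoning)}


def triage(retrieved: set[int], relevant: set[int], reasoning: str) -> str:
    """Guess where a bad answer went wrong (assumed labels, not a standard)."""
    if not (retrieved & relevant):
        # Nothing useful was retrieved: the reasoning was building on sand.
        return "retrieval"
    if not (cited_ids(reasoning) & relevant):
        # The right document was available but the reasoning never used it.
        return "reasoning"
    # Relevant material was retrieved and cited; a human needs to read the chain.
    return "unclear"
```

For example, `triage({1, 3}, {3}, "[1] says so")` returns `"reasoning"`: document 3 was retrieved but the chain leaned on document 1 instead.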

And for safety, the chain is a tripwire. When a prompt injection succeeds, you frequently see it first in the reasoning, where the model “decides” to follow an instruction it should have ignored.
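A deliberately crude sketch of such a tripwire: scan the reasoning for phrases that suggest the model is taking orders from retrieved content. The pattern list is an illustrative assumption, easy to evade and prone to false positives, so it is a signal for review rather than a defence.

```python
import re

# Hypothetical phrases that hint the model is obeying text it merely retrieved.
SUSPICIOUS_PATTERNS = [
    r"ignore (?:all |the )?(?:previous|prior|above) instructions",
    r"the document (?:says|tells|instructs|asks) (?:me|us) to",
    r"as instructed (?:by|in) the (?:document|email|ticket|page)",
    r"new instructions?",
]


def tripwire_hits(reasoning: str) -> list[str]:
    """Return the first matched phrase for each suspicious pattern, if any."""
    hits = []
    for pattern in SUSPICIOUS_PATTERNS:
        match = re.search(pattern, reasoning, flags=re.IGNORECASE)
        if match:
            hits.append(match.group(0))
    return hits
```

A non-empty result is a reason to hold the response and log the full chain for a human to read, not proof that an injection succeeded.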

A model that shows its work is slower and more verbose. It is also the only kind I would put in front of a customer.

#llm #explainability #rag
