"The AI just doesn't work" is not a product complaint. It is a category mistake.
Across 69 demo calls, the same complaint kept appearing about whichever incumbent AI the buyer had tried, sometimes verbatim and sometimes as a variation: "expensive add-on that doesn't work," "delivered poor answers, customer frustration," "requires heavy training, still flawed responses," "great in the demo, useless on real tickets." Twenty-five mentions across six weeks. The buyers were not wrong about what they experienced. They were wrong about what they bought.
They had been told they were buying an AI agent. What they actually bought was an LLM-flavored chatbot: a system that generates a fluent reply but has no memory of the customer or the ticket across sessions, cannot reach into their commerce platform to check the order, cannot issue the refund itself, and does not know when to step out of the way. When that system produced a wrong answer in front of a paying customer, the buyer's conclusion was reasonable: this product is broken. The accurate conclusion is harder to reach: this product was sold under the wrong category.
This is the most expensive avoidable mistake in CX procurement in 2026. The cure is not "evaluate harder." The cure is to learn the five things that separate a chatbot from an agent before the demo starts, so the demo cannot trick you.