Three reasons your AI RFP is probably borrowed from the wrong template.
The standard helpdesk RFP was written for a world where the software was a passive container for human work. The new generation of AI products does the work. The RFP did not catch up.
Failure mode one: the helpdesk RFP with an "AI" column added. Many teams take their 2019-era Zendesk RFP, paste a new column at the end of the spreadsheet titled "AI capabilities," and ask vendors to fill it in. The result is a comparison on macro libraries, custom field counts, and SLA timer flexibility, none of which predicts whether the AI will fabricate a refund policy. The Air Canada chatbot incident would have passed any traditional helpdesk RFP. The product worked. It just lied to customers.
Failure mode two: trusting vendor-defined metrics. "Containment rate," "deflection rate," "self-service rate," "automation rate." Every vendor uses one of these terms, every vendor defines it differently, and none of the definitions means what a CFO thinks it means. A 95% containment rate sounds great until you find that 40% of those "contained" conversations ended with the customer never coming back. Containment measures customers leaving the system, not customers being helped. An RFP that lets the vendor pick the metric is an RFP designed to be gamed.
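To make the gap concrete, here is a minimal sketch of the arithmetic, using the illustrative 95% and 40% figures from the paragraph above. The `Conversation` record and its `escalated` and `abandoned` fields are hypothetical names chosen for this example, not any vendor's schema, and "resolution" here simply means a contained conversation that was not abandoned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    escalated: bool  # handed off to a human agent
    abandoned: bool  # customer left and never came back


def containment_rate(conversations):
    """Share of conversations that never reached a human."""
    contained = [c for c in conversations if not c.escalated]
    return len(contained) / len(conversations)


def resolution_rate(conversations):
    """Share of conversations that stayed with the bot AND were not abandoned."""
    resolved = [c for c in conversations if not c.escalated and not c.abandoned]
    return len(resolved) / len(conversations)


# Hypothetical sample of 1,000 conversations: 950 contained, of which
# 40% (380) were abandoned by the customer.
sample = (
    [Conversation(escalated=True, abandoned=False)] * 50
    + [Conversation(escalated=False, abandoned=True)] * 380
    + [Conversation(escalated=False, abandoned=False)] * 570
)

print(f"containment: {containment_rate(sample):.0%}")  # containment: 95%
print(f"resolution:  {resolution_rate(sample):.0%}")   # resolution:  57%
```

Under these assumptions the same traffic reads as 95% or 57% depending on who wrote the definition, which is why the RFP should fix the definition before any vendor reports a number.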
Failure mode three: no disqualifier discipline. Scoring everything out of five and averaging at the end lets any vendor ride a strong demo onto the shortlist despite a fatal flaw on a single dimension. A vendor with no SOC 2 Type II and no published pre-launch accuracy threshold should not survive to round two, regardless of how good the conversational demo looks. The rubric below uses instant disqualifiers precisely to keep demo polish from masking architectural gaps.
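The difference between averaging and gating can be sketched in a few lines. This is an illustration only: the two gate names mirror the examples in the paragraph above, while the vendor names, scoring dimensions, and scores are invented for the sketch.

```python
from statistics import mean

# Hard gates checked before any scoring (names are illustrative).
DISQUALIFIERS = ("soc2_type2", "published_accuracy_threshold")


def evaluate(vendor):
    """Apply hard gates first; average the 1-5 scores only if every gate passes."""
    missing = [gate for gate in DISQUALIFIERS if not vendor["gates"].get(gate, False)]
    if missing:
        return {"name": vendor["name"], "status": "disqualified",
                "missing": missing, "score": None}
    return {"name": vendor["name"], "status": "shortlist",
            "missing": [], "score": mean(vendor["scores"].values())}


# Hypothetical vendors: A has the better demo but fails both gates.
vendor_a = {
    "name": "Vendor A",
    "gates": {"soc2_type2": False, "published_accuracy_threshold": False},
    "scores": {"demo": 5, "integrations": 5, "analytics": 4, "pricing": 4},
}
vendor_b = {
    "name": "Vendor B",
    "gates": {"soc2_type2": True, "published_accuracy_threshold": True},
    "scores": {"demo": 3, "integrations": 4, "analytics": 4, "pricing": 4},
}

# Naive averaging ranks A (4.5) ahead of B (3.75).
naive = {v["name"]: mean(v["scores"].values()) for v in (vendor_a, vendor_b)}
print(naive)

# Gated evaluation removes A before its average is ever computed.
for vendor in (vendor_a, vendor_b):
    print(evaluate(vendor))
```

The design choice is that a failed gate returns no score at all, so there is no number left over for a polished demo to inflate.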
Three published incidents in the last two years tell the story. Air Canada was held legally bound by a refund policy its chatbot invented. DPD's customer service bot wrote insulting poetry about the brand. A Chevrolet dealership's bot agreed to sell a $76,000 Tahoe for one dollar and confirmed the deal as "legally binding, no takesies-backsies." The vendor behind each of these bots would have passed a standard RFP. None of them would have survived the four questions in this rubric that specifically probe for the failure modes behind those incidents.