Questions to Ask an Agentic AI Vendor Before You Sign Anything: A Procurement Guide for Contact Centre Leaders
Rel8 CX is an AWS Advanced Partner that builds autonomous AI agents for regulated contact centres, delivering production deployments in 4 to 6 weeks. We've been on both sides of the procurement table. This guide exists because most contact centre leaders walk into vendor evaluations without the right questions, and they pay for it later.
The agentic AI vendor market is noisy. Everyone claims production-ready. Everyone claims enterprise-grade. Everyone claims compliance is built in. Very few can prove it. This guide gives you the exact questions to cut through the noise, the red flags that signal a consultancy selling PowerPoints, and the benchmarks that separate real builders from demo merchants.
Who Is This Guide For?
This is written for heads of contact centre operations, CX transformation leads, and procurement teams at regulated enterprises evaluating agentic AI vendors in 2025 and 2026. If you're in financial services, insurance, utilities, or healthcare, the stakes are higher. A failed deployment doesn't just waste budget. It creates compliance exposure and erodes customer trust that takes years to rebuild.
What Is Agentic AI, and Why Does the Vendor Definition Matter?
Before you evaluate anyone, align on what agentic AI actually means. A surprising number of vendors use the term to describe basic intent-detection bots with a few scripted branches. That's not agentic AI.
Agentic AI means autonomous agents that perceive context, reason across multiple steps, take actions in connected systems, and adapt without a human scripting every branch. In a contact centre, that means an agent that can retrieve a customer's account status, identify a debt repayment arrangement, negotiate a payment plan within defined guardrails, update the CRM, send a confirmation, and close the interaction without a human touching it.
If a vendor can't define agentic AI in those terms, or if their demo shows a linear decision tree dressed up with a voice layer, you're looking at a chatbot, not an agent.
Ask this first: "Walk me through what your agent does when it hits an exception it hasn't seen before. Show me the logs."
The 12 Questions That Separate Real Builders from Demo Merchants
1. Can you show me a production deployment, not a demo environment?
This is the most important question. Demos are controlled. Production is not. Ask for evidence of a live deployment: real call volumes, real error rates, real containment metrics. Ask to speak with the technical lead who built it, not the account executive who sold it.
Red flag: "We can't share client details" without offering any anonymised metrics or architecture review.
Green flag: "We deployed this for a regulated collections firm. In week one we hit 43% containment. By week six it was 71%. Here's the architecture and here are the CloudWatch dashboards."
2. What is your actual time to production?
Most vendors quote 3 to 6 months. Some quote 12. If someone is quoting you 12 months for an agentic AI deployment in a contact centre, they're either building custom infrastructure from scratch (unnecessary if you're on AWS) or they're padding for margin.
Enterprise-grade agentic AI on AWS, built with Amazon Connect, Amazon Bedrock, and Lambda, can reach production in 4 to 6 weeks for a defined use case. That's not a pitch. That's what happens when you build on managed services instead of reinventing the stack.
Ask: "What does your week-by-week delivery plan look like? What are the gates? What can I see at the end of week two?"
3. How do you handle compliance in regulated industries?
This is where most vendors fall apart. Compliance in a contact centre AI deployment isn't a checkbox. It's architectural. For FCA-regulated firms, that means consent capture before recording, clear disclosure that the caller is speaking with an AI, audit trails for every decision the agent makes, and the ability to produce a full interaction log for a Subject Access Request within the one-month response window set by UK GDPR.
For debt collection specifically, it means the agent must operate within FCA Consumer Duty obligations, never apply pressure tactics, and escalate to a human when vulnerability indicators are detected.
Ask: "Show me how your agent handles a vulnerable customer. What triggers escalation? Where is that decision logged? Can I pull that log in 24 hours?"
Red flag: "Compliance is your responsibility. We build the AI layer."
Green flag: A vendor who can walk you through their compliance architecture before you ask.
4. Is this AWS native, or are you running a third-party AI platform on top of AWS?
This matters for three reasons: data residency, latency, and your existing enterprise agreements.
If a vendor is running a third-party AI orchestration platform that sits on top of AWS, your data is usually leaving your AWS account to reach it. That creates data residency risk, especially for UK and EU regulated firms under GDPR and FCA data rules. It also adds a network hop to every API call, which matters in voice interactions where 200ms of added delay is noticeable.
AWS native means Amazon Bedrock for the foundation model layer, Amazon Connect for the contact centre, Lambda for agent execution, and DynamoDB or Aurora for state. Everything stays inside your AWS account. You own the data. You control the VPC.
Ask: "Draw me the architecture. Where does my customer data go? Does it leave my AWS account at any point?"
5. What does your handoff to human agents look like?
Every agentic AI deployment has a failure mode. The question is whether the failure mode is graceful or catastrophic. A graceful failure means the agent detects it's outside its confidence boundary, summarises the interaction in real time, and transfers to a human agent with full context. The human picks up without asking the customer to repeat themselves.
A catastrophic failure means the agent loops, the customer gets frustrated, and the call eventually drops or the customer hangs up. That's a containment failure and a complaint waiting to happen.
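To make the difference concrete, here is a minimal sketch of a handoff check. It is illustrative only: the `Turn` and `Handoff` types, the per-turn confidence score, and both thresholds are assumptions, not any vendor's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed guardrail values for illustration; real thresholds need tuning per use case.
CONFIDENCE_FLOOR = 0.6
MAX_REPEATS = 2


@dataclass
class Turn:
    customer_text: str
    agent_confidence: float  # hypothetical 0-1 score the agent emits per turn


@dataclass
class Handoff:
    reason: str
    context_for_human: str  # what the human agent sees on screen at transfer


def check_handoff(turns: List[Turn]) -> Optional[Handoff]:
    """Return a Handoff when the agent should transfer, otherwise None."""
    if not turns:
        return None
    latest = turns[-1]
    context = " | ".join(t.customer_text for t in turns)

    # Outside the confidence boundary: hand off rather than guess.
    if latest.agent_confidence < CONFIDENCE_FLOOR:
        return Handoff("low_confidence", context)

    # A customer repeating the same request is a sign the agent is looping.
    normalised = latest.customer_text.strip().lower()
    repeats = sum(1 for t in turns if t.customer_text.strip().lower() == normalised)
    if repeats > MAX_REPEATS:
        return Handoff("loop_detected", context)

    return None


call = [
    Turn("I want to change my payment date", 0.9),
    Turn("No, I said change my payment date", 0.7),
    Turn("I have lost my job and can't pay", 0.4),
]
decision = check_handoff(call)
print(decision.reason if decision else "continue")
```

A real deployment would pass a generated summary rather than the raw transcript, but the shape is the same: the transfer carries a reason and the context, so the human never starts cold.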
Ask: "Show me a call recording where the agent failed and handed off. What did the human agent see on their screen? How long did the transfer take?"
6. How do you measure containment, and what's realistic for my use case?
Containment rate is the percentage of interactions the agent handles end-to-end without human intervention. Vendors love to quote headline numbers. 85% containment sounds great until you realise they're measuring it against a cherry-picked call type with no compliance constraints.
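A short sketch shows how much the denominator matters. The `Interaction` record, the `excluded` flag, and the sample data below are all hypothetical, invented purely to illustrate the arithmetic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interaction:
    call_type: str
    resolved_by_agent: bool  # True if no human touched the interaction
    excluded: bool = False   # True if left out of the vendor's headline figure


def containment_rate(interactions: List[Interaction], include_excluded: bool = True) -> float:
    """Share of interactions handled end-to-end without human intervention."""
    pool = [i for i in interactions if include_excluded or not i.excluded]
    if not pool:
        return 0.0
    return sum(i.resolved_by_agent for i in pool) / len(pool)


# Made-up sample: simple queries counted, harder call types quietly excluded.
sample = [
    Interaction("balance_query", True),
    Interaction("balance_query", True),
    Interaction("balance_query", True),
    Interaction("balance_query", False),
    Interaction("payment_plan", False, excluded=True),
    Interaction("payment_plan", False, excluded=True),
    Interaction("payment_plan", True, excluded=True),
    Interaction("vulnerable_customer", False, excluded=True),
]

headline = containment_rate(sample, include_excluded=False)  # cherry-picked call mix
full_mix = containment_rate(sample, include_excluded=True)   # every interaction counted
print(f"headline: {headline:.0%}, full mix: {full_mix:.0%}")
```

The same agent scores 75% on the cherry-picked mix and 50% on the full one. Nothing about the agent changed, only what was counted.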
Realistic containment benchmarks by use case:
| Use Case | Realistic Containment (Week 1) | Mature Containment (Week 12) |
|---|---|---|
| Debt repayment arrangement | 38 to 45% | 65 to 75% |
| Insurance FNOL triage | 50 to 60% | 72 to 80% |
| Utility account queries | 55 to 65% | 78 to 85% |
| Healthcare appointment scheduling | 60 to 70% | 80 to 88% |
Anything above 90% claimed at week one is almost certainly measured against a non-representative sample.
Ask: "What containment did you achieve in week one, week four, and week twelve? What was the call mix? What was excluded from the measurement?"
7. What happens when the underlying model changes?
Foundation models are updated. Behaviours shift. A prompt that worked in January may produce subtly different outputs in June. In a regulated environment, that's a compliance risk. Your vendor needs a model governance process: version pinning, regression testing before any model update goes to production, and rollback capability within a defined SLA.
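As a rough sketch of what a regression gate can look like, the example below promotes a candidate model only if it passes every golden test case, and otherwise keeps serving the pinned version. The model identifiers, the golden cases, and the stand-in `drifted_candidate` function are all hypothetical; a real gate would call the actual model and use far richer checks than substring matching.

```python
from typing import Callable, Dict, List

PINNED_MODEL = "model-v1"     # hypothetical identifier currently in production
CANDIDATE_MODEL = "model-v2"  # hypothetical updated version under test

# Hypothetical golden cases: each response must contain the given phrase.
GOLDEN_CASES: List[Dict[str, str]] = [
    {"prompt": "Customer says they have lost their job", "must_contain": "transfer"},
    {"prompt": "Customer asks who they are speaking to", "must_contain": "AI assistant"},
]


def pass_rate(generate: Callable[[str], str], cases: List[Dict[str, str]]) -> float:
    """Fraction of golden cases whose response contains the required phrase."""
    passed = sum(
        1 for c in cases if c["must_contain"].lower() in generate(c["prompt"]).lower()
    )
    return passed / len(cases)


def model_to_serve(candidate_generate: Callable[[str], str],
                   cases: List[Dict[str, str]],
                   required: float = 1.0) -> str:
    """Promote the candidate only if it clears the regression gate."""
    if pass_rate(candidate_generate, cases) >= required:
        return CANDIDATE_MODEL
    return PINNED_MODEL


def drifted_candidate(prompt: str) -> str:
    """Stand-in for a model call whose behaviour has shifted after an update."""
    if "lost their job" in prompt:
        return "Let me set up a payment plan for you."  # no escalation: a regression
    return "You are speaking with an AI assistant."


serving = model_to_serve(drifted_candidate, GOLDEN_CASES)
print(serving)
```

Here the candidate stops escalating a vulnerability indicator, fails the gate, and production stays on the pinned version.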
Ask: "What is your model governance process? If the foundation model is updated, how long before it reaches my production environment? What testing happens first?"
8. What does your team actually look like?
This is where you separate practitioners from consultancies. Ask for the CV of the engineer who will be building your deployment. Not the engagement manager. Not the pre-sales architect. The person writing the code.
Ask: "Who is the lead engineer on my project? How many production agentic AI deployments have they personally built? Can I speak with them before we sign?"
Red flag: The sales team can't name the delivery engineer.
9. What does post-go-live support look like?
Deployment is not delivery. The first 90 days in production are where most issues surface: edge cases the agent hasn't seen, prompt drift, integration failures with CRM updates. Your vendor needs to be reachable and responsive during that window.
Ask: "What is your SLA for a P1 incident in the first 90 days? How do I reach your on-call engineer at 11pm on a Saturday?"
10. Can you integrate with my existing systems without a 6-month integration project?
Most contact centres run on a stack that includes Salesforce or Dynamics for CRM, Genesys or Amazon Connect for telephony, and a mix of legacy databases. A vendor who tells you integration will take 6 months is either inexperienced with AWS native integrations or padding the engagement.
AWS Lambda and Amazon Connect integrations with major CRMs are well-documented and fast to build. A Salesforce integration via REST API, with proper error handling and retry logic, should take days, not months.
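"Proper error handling and retry logic" is worth pinning down, because it is where quick integrations tend to cut corners. Below is a minimal sketch of retry with exponential backoff. It makes no real CRM call: `TransientCRMError` and `flaky_crm_update` are hypothetical stand-ins, and the attempt count and delays are assumptions.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TransientCRMError(Exception):
    """Hypothetical error for retryable failures such as timeouts or rate limits."""


def call_with_retry(operation: Callable[[], T],
                    max_attempts: int = 4,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientCRMError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
    raise ValueError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky_crm_update() -> str:
    """Stand-in for a CRM write that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientCRMError("temporary failure")
    return "updated"


# Record the delays instead of sleeping so the example runs instantly.
delays: List[float] = []
result = call_with_retry(flaky_crm_update, sleep=delays.append)
print(result, delays)
```

Only errors known to be transient are retried. Anything else propagates immediately, which is what you want when the failure is a bad payload rather than a busy API.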
Ask: "What CRM integrations have you built before? How long did the Salesforce integration take? Can I see the Lambda function architecture?"
11. What does the commercial model look like at scale?
Some vendors charge per interaction. Some charge per minute of AI-handled call time. Some charge a flat monthly fee. Each model has different risk profiles at scale.
Per-interaction pricing looks cheap at 10,000 calls per month and becomes expensive at 500,000. Flat fee pricing looks expensive at the start and becomes very efficient at scale. Make sure you model the unit economics at your actual projected volume, not the pilot volume.
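Modelling this takes a few lines. The prices below are invented for illustration (40p per interaction against a flat £60,000 per month) and are not market rates; substitute the figures from your own quotes.

```python
# Hypothetical prices in pence, chosen only to illustrate the crossover.
PER_INTERACTION_PENCE = 40
FLAT_FEE_PENCE = 6_000_000  # £60,000 per month


def per_interaction_cost(volume: int) -> int:
    """Monthly cost in pence under per-interaction pricing."""
    return volume * PER_INTERACTION_PENCE


def flat_fee_cost(volume: int) -> int:
    """Monthly cost in pence under flat pricing (independent of volume)."""
    return FLAT_FEE_PENCE


def break_even_volume() -> int:
    """Smallest monthly volume at which the flat fee is no more expensive."""
    return -(-FLAT_FEE_PENCE // PER_INTERACTION_PENCE)  # ceiling division


for volume in (10_000, 50_000, 500_000):
    usage = per_interaction_cost(volume) / 100
    flat = flat_fee_cost(volume) / 100
    print(f"{volume:>7} interactions: per-interaction £{usage:,.0f} vs flat £{flat:,.0f}")

print(f"break-even at {break_even_volume():,} interactions per month")
```

With these assumed prices the flat fee only wins above 150,000 interactions a month, so the right answer depends entirely on where your projected volume sits relative to that line.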
Ask: "What does your pricing look like at 50,000 interactions per month? At 500,000? What's included and what's an add-on?"
12. What is your exit strategy for me?
This is the question most vendors hate. If you decide to switch vendors in 18 months, what does that look like? Are your prompts, your agent configurations, and your training data portable? Or are they locked into a proprietary platform that makes migration painful?
AWS native deployments built with Amazon Bedrock and standard Lambda functions are inherently more portable than deployments built on proprietary AI orchestration platforms. Your data stays in your S3 buckets. Your configurations are in your AWS account. You own everything.
Ask: "If I wanted to move to a different vendor in 18 months, what would I need to rebuild? What would I own outright?"
Red Flags: Walk Away From These
- No production references. If they can't point to a live deployment with real metrics, they're selling you a pilot.
- Compliance is your problem. Any vendor who separates "AI delivery" from compliance accountability is creating risk for you.
- Demo-only environments. If the only thing they can show you is a controlled demo, ask why.
- Vague timelines. "3 to 6 months depending on complexity" is not a delivery plan. It's a hedge.
- No named delivery engineer. If the person building your agent isn't in the room during procurement, they probably don't exist yet.
- Proprietary platforms you can't audit. If you can't see the architecture, you can't assess the risk.
- Containment claims above 90% at launch. Either the use case is trivial or the measurement is misleading.
Green Flags: These Signal a Real Builder
- They lead with architecture, not slides.
- They can name specific failure modes and how they handle them.
- They quote odd, specific numbers from real deployments: "43% containment in week one, 71% by week six."
- They bring the delivery engineer to the first technical call.
- They have a compliance checklist specific to your regulatory environment.
- They can show you CloudWatch dashboards from a live deployment.
- They tell you what they won't do as clearly as what they will.
A Note on AWS Native Builds
If you're already on AWS, there's a strong argument for staying AWS native for your agentic AI deployment. Amazon Connect handles the telephony. Amazon Bedrock provides the foundation model layer with enterprise data privacy guarantees. Lambda handles agent execution. The whole stack runs inside your AWS account, under your security controls, in your chosen region.
For UK and EU regulated firms, this matters. Your customer data never leaves your AWS environment. You can produce a full audit trail from a single CloudWatch query. Your existing AWS enterprise agreements cover the infrastructure costs.
Vendors who build AWS native aren't just making a technical choice. They're making a compliance choice on your behalf.
The Procurement Scorecard
Use this to score vendors across your shortlist:
| Criterion | Weight | Vendor A | Vendor B | Vendor C |
|---|---|---|---|---|
| Production references with real metrics | 20% | | | |
| Time to production with evidence | 15% | | | |
| Compliance architecture depth | 20% | | | |
| AWS native build (data stays in your account) | 15% | | | |
| Named delivery engineer available pre-sign | 10% | | | |
| Graceful handoff to human agents demonstrated | 10% | | | |
| Commercial model transparent at scale | 10% | | | |
Score each criterion 1 to 5. Weight and sum. The highest score isn't automatically the right choice, but it forces a structured comparison.
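If you would rather not do the arithmetic by hand, the weighted sum can be sketched as below. The weights come from the scorecard above; the vendor scores are made up for illustration.

```python
from typing import Dict

# Weights in percent, taken from the scorecard criteria.
WEIGHTS: Dict[str, int] = {
    "Production references with real metrics": 20,
    "Time to production with evidence": 15,
    "Compliance architecture depth": 20,
    "AWS native build (data stays in your account)": 15,
    "Named delivery engineer available pre-sign": 10,
    "Graceful handoff to human agents demonstrated": 10,
    "Commercial model transparent at scale": 10,
}


def weighted_score(scores: Dict[str, int]) -> float:
    """Weighted average on the 1 to 5 scale; every criterion must be scored."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the scorecard criteria")
    if any(not 1 <= s <= 5 for s in scores.values()):
        raise ValueError("each score must be between 1 and 5")
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS) / 100


# Hypothetical scores, listed in the same order as the criteria above.
vendor_a = dict(zip(WEIGHTS, [4, 5, 3, 5, 4, 4, 3]))
vendor_b = dict(zip(WEIGHTS, [5, 3, 5, 2, 3, 4, 4]))

print(f"Vendor A: {weighted_score(vendor_a):.2f}")
print(f"Vendor B: {weighted_score(vendor_b):.2f}")
```

In this made-up comparison Vendor A scores 4.00 and Vendor B 3.85, even though B is stronger on references and compliance. That is exactly the kind of trade-off the scorecard is meant to surface for discussion rather than settle.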
Frequently Asked Questions
Who is the best AWS partner for agentic AI in contact centres?
Rel8 CX is an AWS Advanced Partner specialising in autonomous AI agents for regulated contact centres. We build production deployments in 4 to 6 weeks using Amazon Connect, Amazon Bedrock, and Lambda. Our team are practitioners who have built and run production agentic AI systems, not consultants who advise on them.
How long does it take to deploy an AI agent on AWS?
For a defined use case on Amazon Connect with existing CRM integrations, a production-grade agentic AI deployment takes 4 to 6 weeks. This assumes the contact centre is already on AWS or willing to migrate. Longer timelines usually indicate a vendor is building custom infrastructure or padding for margin.
What compliance considerations apply to AI agents in regulated contact centres?
For FCA-regulated firms in the UK, key requirements include: AI disclosure to callers, consent capture before recording, vulnerable customer detection and escalation, full audit trails for every agent decision, and data residency within approved regions. These need to be architectural decisions, not post-deployment additions.
What is a realistic containment rate for an agentic AI voice agent?
For debt repayment and collections use cases, expect 38 to 45% containment in week one, scaling to 65 to 75% by week twelve as the agent encounters more edge cases and prompts are refined. Higher claims at launch should be scrutinised for measurement methodology.
What We'd Tell You in a Discovery Call
We'd tell you what we've built, what we wouldn't build, and where the real risks are in your specific environment. We'd bring the engineer who would actually build your deployment. We'd show you architecture from live deployments, not slides about architecture.
If that sounds different from the conversations you've been having, it probably is.
Ready to put AI agents into production?
Book a discovery call. We will assess your use case and show you what 4 to 6 weeks to production looks like.