From the first question to a Stripe payment — one agent runs the whole funnel.
A conversational AI sales agent for an online IT school. It answers course questions from a RAG knowledge base, qualifies leads in real time, books a consultation, and closes with an in-chat Stripe checkout — the entire funnel inside a single conversation, running 24/7.
The impact
Same inquiry, two worlds — a human sales desk versus an agent that never sleeps.
The brief
The problem
The funnel leaks at every handoff.
Every service business hits the same bottleneck: a potential customer reaches out — asks about pricing, compares options, needs help choosing — and waits. A sales rep takes 5–15 minutes to respond during work hours, and not at all on evenings and weekends. By then the lead has moved on to a competitor. Even when a rep does engage, the funnel leaks at every handoff: chat to email, email to a separate payment page, payment back to scheduling. Each transition is a chance to drop off. The result — lost revenue, wasted ad spend, and a team buried in repetitive conversations instead of closing deals.
The solution
One chat that sells, books, and gets paid.
I built a conversational AI sales agent that handles the entire funnel — from the first “What courses do you have?” to a completed Stripe payment — inside a single chat window. The bot runs 24/7, never forgets a price, and qualifies leads automatically.
The core is a RAG pipeline: the school's knowledge base (PDF) is chunked, embedded, and indexed in Qdrant. Every question retrieves the top 9 candidates, then FlashRank reranks them down to the 4 most relevant chunks. Gemini answers strictly from that context: no hallucinated prices, no made-up courses. In parallel, a lead-scoring system tracks intent (info → interest → ready_to_buy) and syncs the lead to Airtable the moment it goes hot.
When all required data is collected — course, tier, contact — the bot offers consultation slots from Google Calendar and generates a Stripe Checkout link, all without the user leaving the chat. After payment, it creates the calendar event and sends confirmation emails automatically.
How it works
Three security layers wrap a RAG core; one output branches into response, CRM, and checkout.
Key features
RAG with reranking
The PDF knowledge base is chunked and indexed in Qdrant. Every question pulls 9 candidates, FlashRank reranks to the top 4, and full catalog pricing is injected into every call — so the bot never invents a price.
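The retrieve-then-rerank step can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the production code: `retrieve` stands in for the Qdrant search, `token_overlap` is a toy scorer standing in for FlashRank, and `Chunk` and `build_context` are hypothetical names. Only the numbers 9 and 4 and the pricing injection come from the system described here.

```python
from dataclasses import dataclass
from typing import Callable, List

RETRIEVE_K = 9  # candidates pulled from the vector index
RERANK_K = 4    # chunks kept after reranking


@dataclass(frozen=True)
class Chunk:
    text: str


def token_overlap(query: str, text: str) -> float:
    """Toy relevance score (stand-in for a real reranker):
    the fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def build_context(
    query: str,
    retrieve: Callable[[str, int], List[Chunk]],
    catalog_pricing: str,
    score: Callable[[str, str], float] = token_overlap,
) -> str:
    """Retrieve RETRIEVE_K candidates, keep the RERANK_K best, prepend pricing."""
    candidates = retrieve(query, RETRIEVE_K)
    ranked = sorted(candidates, key=lambda c: score(query, c.text), reverse=True)
    return "\n\n".join([catalog_pricing] + [c.text for c in ranked[:RERANK_K]])
```

Over-fetching and then reranking lets a cheap vector search cast a wide net while a more precise scorer decides what actually reaches the prompt.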
Real-time lead qualification
Intent is scored on every turn — info → interest → ready_to_buy — moving each lead cold → warm → hot. The moment it goes hot, it syncs to Airtable and a live status dot lights up in the UI.
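One way to model that progression is a small state holder that only ever moves forward. The sketch below is an assumption about the mechanics, not the actual implementation: the one-to-one mapping from intent to temperature, the rule that a lead never cools down, and the names `Temp`, `Lead`, and `on_hot` are all hypothetical. The `on_hot` callback is where an Airtable sync would plug in.

```python
from enum import IntEnum
from typing import Callable


class Temp(IntEnum):
    COLD = 0
    WARM = 1
    HOT = 2


# Assumed mapping from the per-turn intent label to a lead temperature.
INTENT_TO_TEMP = {
    "info": Temp.COLD,
    "interest": Temp.WARM,
    "ready_to_buy": Temp.HOT,
}


class Lead:
    def __init__(self, on_hot: Callable[["Lead"], None]) -> None:
        self.temp = Temp.COLD
        self._on_hot = on_hot  # e.g. a CRM sync, fired once

    def record_turn(self, intent: str) -> Temp:
        """Update the temperature from one turn's intent; never cool down."""
        new_temp = max(self.temp, INTENT_TO_TEMP.get(intent, Temp.COLD))
        became_hot = new_temp == Temp.HOT and self.temp != Temp.HOT
        self.temp = new_temp
        if became_hot:
            self._on_hot(self)
        return self.temp
```

Firing the callback only on the transition into hot keeps the CRM from receiving a duplicate record on every later turn.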
Three-layer security
Aligned with the OWASP Top 10 for LLM Applications: an input guard (41 regex patterns, EN + RU) blocks prompt injection before the message reaches the LLM, system-prompt hardening rejects role overrides, and an output guard sanitizes every response.
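For a sense of what the input guard looks like, here is a minimal sketch with four illustrative patterns. These are hypothetical examples written for this page, not the 41 patterns in the real guard, and `is_suspicious` is an assumed name.

```python
import re

# Illustrative patterns only (two English, two Russian).
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"(reveal|show|print)\s+(your\s+)?system\s+prompt", re.IGNORECASE),
    re.compile(r"игнорируй\s+(все\s+)?предыдущие\s+инструкции", re.IGNORECASE),
    re.compile(r"забудь\s+(все\s+)?(предыдущие\s+)?инструкции", re.IGNORECASE),
]


def is_suspicious(message: str) -> bool:
    """Return True if the message matches any known injection pattern."""
    return any(pattern.search(message) for pattern in INJECTION_PATTERNS)
```

A message that trips the guard can be answered with a canned refusal without ever calling the LLM, which is why this layer sits first.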
In-chat payments
Once course, tier and contact are all collected, the bot generates a Stripe Checkout link inside the conversation. A signed webhook confirms payment and kicks off the post-sale chain.
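The "signed webhook" part deserves a closer look. Stripe signs each webhook with HMAC-SHA256 over `"{timestamp}.{payload}"` and sends the result in the `Stripe-Signature` header as `t=...,v1=...`. In production the official `stripe` library's `Webhook.construct_event` performs this check; the standard-library sketch below only illustrates the idea, and `verify_stripe_signature` is a hypothetical helper name.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(
    payload: bytes,
    header: str,
    secret: str,
    tolerance: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Check a Stripe-Signature header against the raw request body."""
    parts = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k.strip() == "t"), None)
    signatures = [v for k, v in parts if k.strip() == "v1"]
    if timestamp is None or not timestamp.isdigit() or not signatures:
        return False

    # Recompute the signature over "<timestamp>.<raw body>".
    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()

    # Reject stale events to limit replay attacks.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False

    # Constant-time comparison against every v1 signature in the header.
    return any(
        hmac.compare_digest(expected.encode(), sig.encode()) for sig in signatures
    )
```

Only after this check passes should the handler trust the event and start the post-sale chain.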
Consultation scheduling
Reads free 30-minute slots from Google Calendar, offers three to the user, and books the event automatically after payment — then Resend fires the confirmation email.
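Picking the slots comes down to walking a time window in 30-minute steps and skipping anything that overlaps a busy interval. The sketch below assumes the busy intervals have already been fetched from the calendar; `free_slots` and its parameters are hypothetical names, and the fixed-grid walk is one possible approach rather than the confirmed one.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

SLOT = timedelta(minutes=30)


def free_slots(
    window_start: datetime,
    window_end: datetime,
    busy: List[Tuple[datetime, datetime]],
    limit: int = 3,
) -> List[datetime]:
    """Return up to `limit` start times of free 30-minute slots in the window."""
    slots: List[datetime] = []
    cursor = window_start
    while cursor + SLOT <= window_end and len(slots) < limit:
        slot_end = cursor + SLOT
        # Two intervals overlap when each one starts before the other ends.
        overlaps = any(start < slot_end and cursor < end for start, end in busy)
        if not overlaps:
            slots.append(cursor)
        cursor = slot_end
    return slots
```

For example, with a 09:00 to 12:00 window and meetings at 09:00 to 09:30 and 10:00 to 11:00, the function offers 09:30, 11:00, and 11:30.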
SSE streaming
Answers stream token-by-token over Server-Sent Events for a live typewriter feel, with a final sanitized replace pass once the response clears the output guard.
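The stream-then-replace pattern can be sketched as a generator that emits one SSE frame per token and a final frame carrying the sanitized full text. The event names `token` and `replace`, the JSON payload shape, and the `sanitize` callback are assumptions made for this example.

```python
import json
from typing import Callable, Iterable, Iterator


def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame; JSON keeps the data on one line."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


def stream_answer(
    tokens: Iterable[str],
    sanitize: Callable[[str], str],
) -> Iterator[str]:
    """Yield a frame per token, then one frame with the sanitized full answer."""
    buffer = []
    for token in tokens:
        buffer.append(token)
        yield sse_event("token", {"text": token})
    # The client swaps the streamed text for this guarded version.
    yield sse_event("replace", {"text": sanitize("".join(buffer))})
```

The user sees text immediately, and whatever the output guard strips disappears as soon as the last frame arrives.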
Under the hood
- Structured output: Gemini returns JSON (answer, intent, extracted fields), with fallback parsing for malformed responses.
- Session state machine: History, lead data, and payment state tracked per session, 30-min TTL.
- Fuzzy matching: 22 products with aliases, e.g. “ML” → Data Science, “security” → Cybersecurity.
- Webhook post-sale: One Stripe webhook triggers calendar event + email in a single chain.
- Pydantic validation: Max 500 chars, empty/whitespace rejection, 1 MB body limit.
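Two of the items above, fallback parsing and alias matching, fit in a few lines of standard-library Python. This is a sketch under assumptions: the `fields` key, the neutral `info` default, the 0.8 similarity cutoff, and the function names are hypothetical, and only the two aliases quoted above are included.

```python
import difflib
import json
import re
from typing import Optional

# Two of the aliases mentioned above; the real table covers 22 products.
ALIASES = {"ml": "Data Science", "security": "Cybersecurity"}


def parse_model_reply(raw: str) -> dict:
    """Parse the model's JSON reply, tolerating stray text around it."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Fallback 1: pull out the outermost {...} span and retry.
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        try:
            parsed = json.loads(match.group(0)) if match else None
        except json.JSONDecodeError:
            parsed = None
    if isinstance(parsed, dict):
        return parsed
    # Fallback 2: treat the raw text as the answer with a neutral intent.
    return {"answer": raw.strip(), "intent": "info", "fields": {}}


def resolve_course(text: str) -> Optional[str]:
    """Map a user's wording to a canonical course name, allowing small typos."""
    for word in re.findall(r"\w+", text.lower()):
        close = difflib.get_close_matches(word, list(ALIASES), n=1, cutoff=0.8)
        if close:
            return ALIASES[close[0]]
    return None
```

With these, a reply wrapped in extra prose still yields its intent, and a message like "do you teach securty?" still resolves to Cybersecurity.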
Got a funnel that leaks at every handoff?
This one collapses inquiry, qualification, scheduling, and payment into a single chat that runs around the clock. I scope it, build it end to end, and hand back something that closes on its own.