How I Sold a $5,500 Offline RAG Chatbot
Nov 11, 2025
I sold a tutoring chatbot to a university in South Africa for $5,500. It’s a straightforward tool, but the journey to the final product revealed a critical lesson about what businesses actually need from AI.
I tested two approaches for the client: one built with native code and another using n8n. Both worked well. The front-end was simple. The real complexity, the part that delivered the value, was hidden in the backend.
The university’s primary requirement was speed and specificity. They needed to chat with individual, pre-selected data sources without any delay. This meant having each document collection pre-vectorized and stored in its own database, ready to be called upon instantly.
The system had to avoid re-vectorizing and embedding data on the fly. That kind of latency would make the tool unusable for students needing quick answers. I hadn’t built a system with this specific architecture before, but it was the core of the problem.
The goal was to create an experience similar to Google’s NotebookLM, where a user selects their sources and immediately begins a conversation.
We solved it. Now, users can select any combination of sources, and the chatbot responds instantly, drawing only from the specified knowledge base.
During development, we uncovered a second, more practical problem. The university struggled with managing recurring subscription costs for AI models. The process of paying upfront with a personal credit card and claiming the expense back was a significant administrative burden. With a large volume of students and numerous data sources, a pay-per-use cloud model was not just expensive, but operationally painful.
This is a common issue for large organizations. The friction of corporate finance can kill an otherwise great project.
So, we pivoted to an offline-first solution.
The university invested in its own GPU. We deployed an open-source model, GPT-OSS, which is both fast and intelligent enough for the task. The entire system now runs on their local hardware. I can disconnect the machine from the internet, and it works flawlessly.
This move gave them a scalable, effective solution that fit their budget and their internal processes. They now have a fixed asset, not a recurring operational expense.
This project wasn’t just about building a chatbot, anyone can build that... It was about designing a system that solved a real-world operational bottleneck. By moving away from cloud-based subscriptions and solving the data latency issue, we delivered a tool that is both powerful and practical for thousands of students.
This is the essence of moving from Level 1 automation — simply using AI tools — to Level 2, where you strategically automate core processes. The focus shifts from the tool itself to the system that delivers a reliable, scalable outcome.
The setup for the selectable vector databases was tricky. If you are not tech-savvy, I will have the full detailed instructions in our Corporate Automation Library (CAL), which hosts the n8n code and steps required to get this running on your server.
Click Here to gain access to CAL. We have over 60+ high-impact, high-ROI automations, with 2–4 new corporate automations uploaded weekly.
Ritesh Kanjee | Automations Architect & Founder Augmented AI (121K Subscribers | 58K LinkedIn Followers)
From 80-Hour Weeks to 4-Hour Workflows
Get my Corporate Automation Starter Pack and discover how I automated my way from burnout to freedom. Includes the AI maturity audit + ready-to-deploy n8n workflows that save hours every day.
We hate SPAM. We will never sell your information, for any reason.