Case study · 2025
Orlando’s AI Assistant — RAG Demo
A multi-language AI assistant powered by a custom retrieval-augmented generation (RAG) architecture that answers questions about Orlando’s background, skills, and projects with grounded context.
Demonstrates a real RAG pipeline implementation, streaming UI patterns, safety-aware logic, multi-language support, and contemporary frontend engineering.

The Problem
Standard chatbots either hallucinate or refuse to engage. I wanted to demonstrate what a grounded, production-quality AI assistant looks like — one that pulls from a real knowledge base, streams answers in real time, and handles edge cases gracefully across multiple languages.
The Solution
I built a custom RAG pipeline in which documents about my background, skills, and projects are chunked, embedded, and retrieved at query time. The Next.js frontend streams tokens as they arrive from the Gemini API, so the answer appears as it is generated instead of after a full loading wait.
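The chunk, embed, and retrieve steps can be sketched roughly as follows. This is a minimal illustration, not the demo's actual code: `embed` is a toy hashed bag-of-words stand-in for a real embedding model, and every name here (`chunkText`, `retrieve`, `buildPrompt`, the 12-word chunk size, the sample documents) is an assumption made for the example.

```typescript
// One retrievable unit of the knowledge base.
type Chunk = { id: string; text: string; vector: number[] };

const DIMS = 64; // size of the toy embedding vector (assumption)

// Toy stand-in for an embedding model: hashes each word into one of DIMS
// buckets and counts occurrences. A real pipeline would call a model here.
function embed(text: string): number[] {
  const vector: number[] = [];
  for (let i = 0; i < DIMS; i++) vector.push(0);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) % DIMS;
    }
    vector[hash] += 1;
  }
  return vector;
}

// Cosine similarity; returns 0 when either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Split a document into chunks of at most maxWords words and embed each one.
function chunkText(docId: string, text: string, maxWords = 12): Chunk[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: Chunk[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    const piece = words.slice(i, i + maxWords).join(" ");
    chunks.push({ id: `${docId}#${chunks.length}`, text: piece, vector: embed(piece) });
  }
  return chunks;
}

// Rank all chunks against the query and keep the k most similar.
function retrieve(query: string, index: Chunk[], k = 2): Chunk[] {
  const queryVector = embed(query);
  return index
    .map((chunk) => ({ chunk, score: cosine(queryVector, chunk.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Assemble the grounded prompt that would be sent to the language model.
function buildPrompt(query: string, context: Chunk[]): string {
  const contextBlock = context.map((c) => `[${c.id}] ${c.text}`).join("\n");
  return `Answer using only this context:\n${contextBlock}\n\nQuestion: ${query}`;
}

// Usage with made-up sample documents.
const sampleIndex = chunkText("skills", "TypeScript React Next.js")
  .concat(chunkText("projects", "RAG demo streaming"));
console.log(buildPrompt("What is the RAG demo?", retrieve("RAG demo", sampleIndex, 1)));
```

Because retrieval happens at query time, updating the knowledge base only means re-chunking and re-embedding the changed documents; the model itself is untouched.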
Key decisions:
- Custom RAG over fine-tuning — retrieval keeps the knowledge base updatable without model retraining
- 15+ context categories — knowledge is structured by domain (experience, skills, projects, values) so retrieval is precise
- Multi-language support — the system detects query language and responds in kind, tested across 8 locales
- Safety-aware logic — queries outside the knowledge scope return a calibrated refusal, not a hallucination
- Keyboard shortcuts — power users can navigate and submit without reaching for the mouse
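One plausible way to get a calibrated refusal is to gate generation on retrieval scores: if no retrieved chunk is similar enough to the query, the assistant returns a fixed refusal in the user's language instead of calling the model. The sketch below assumes that approach. The `gate` function, the 0.35 threshold, and the two sample locales are hypothetical; the case study does not say how the demo decides a query is out of scope or which 8 locales it covers.

```typescript
// A retrieved chunk with its similarity score and knowledge category.
type ScoredChunk = { category: string; text: string; score: number };

type Decision =
  | { kind: "answer"; context: ScoredChunk[] }
  | { kind: "refuse"; message: string };

// Hypothetical refusal copy; only two locales are shown for illustration.
const DEFAULT_REFUSAL =
  "I can only answer questions about Orlando's background, skills, and projects.";
const REFUSALS: Record<string, string> = {
  en: DEFAULT_REFUSAL,
  es: "Solo puedo responder preguntas sobre la experiencia, las habilidades y los proyectos de Orlando.",
};

// Keep only chunks at or above the threshold; refuse when none remain.
// minScore = 0.35 is an assumed value that would need tuning on real queries.
function gate(results: ScoredChunk[], locale: string, minScore = 0.35): Decision {
  const relevant = results.filter((r) => r.score >= minScore);
  if (relevant.length === 0) {
    return { kind: "refuse", message: REFUSALS[locale] ?? DEFAULT_REFUSAL };
  }
  return { kind: "answer", context: relevant };
}

// Usage: an off-topic query scores low against every chunk, so it is refused.
const offTopic = gate([{ category: "skills", text: "TypeScript, React", score: 0.08 }], "es");
console.log(offTopic.kind);
```

Deciding before the model is called keeps the refusal deterministic, and the threshold becomes the single knob that trades missed answers against ungrounded ones.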
The Outcome
The result is a fully interactive demo, deployed to production, that shows RAG pipeline implementation, streaming UI patterns, multi-language support, and safety-aware prompt design working together. It demonstrates that a grounded AI assistant can feel both capable and trustworthy.