Case study · 2025

Orlando’s AI Assistant — RAG Demo

A multi-language AI assistant powered by custom RAG architecture that answers questions about Orlando’s background, skills, and projects with grounded context.

It demonstrates a working RAG pipeline, streaming UI patterns, safety-aware logic, multi-language support, and contemporary frontend engineering.

Next.js · TypeScript · Tailwind CSS · Custom RAG · Gemini API · Framer Motion · i18n

The Problem

Standard chatbots either hallucinate or refuse to engage. I wanted to demonstrate what a grounded, production-quality AI assistant looks like — one that pulls from a real knowledge base, streams answers in real time, and handles edge cases gracefully across multiple languages.

The Solution

I built a custom RAG pipeline in which documents about my background, skills, and projects are chunked, embedded, and retrieved at query time. The Next.js frontend streams tokens as they arrive from the Gemini API, so the answer starts rendering immediately instead of appearing only after the full response has loaded.
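
The chunk, embed, and retrieve steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: every name here (`Chunk`, `embed`, `chunkDocument`, `retrieve`) is hypothetical, and the hashed bag-of-words `embed` function is a toy stand-in for whatever embedding model the real pipeline calls.

```typescript
// Hypothetical sketch: all names and the toy embedding are assumptions
// made for illustration, not the project's actual implementation.

interface Chunk {
  id: string;
  text: string;
  vector: number[];
}

const DIMENSIONS = 64;

// Toy stand-in for an embedding model: hashes each word into a bucket of a
// fixed-size bag-of-words vector. A real pipeline would call an embedding API.
function embed(text: string): number[] {
  const vector: number[] = new Array<number>(DIMENSIONS).fill(0);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) % DIMENSIONS;
    }
    vector[hash] = (vector[hash] ?? 0) + 1;
  }
  return vector;
}

// Split a document into chunks of at most `maxWords` words and embed each one.
function chunkDocument(docId: string, text: string, maxWords = 40): Chunk[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: Chunk[] = [];
  for (let start = 0; start < words.length; start += maxWords) {
    const piece = words.slice(start, start + maxWords).join(" ");
    chunks.push({ id: `${docId}#${chunks.length}`, text: piece, vector: embed(piece) });
  }
  return chunks;
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Embed the query and return the `topK` most similar chunks, best first.
function retrieve(query: string, index: Chunk[], topK = 3): { chunk: Chunk; score: number }[] {
  const queryVector = embed(query);
  return index
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Example usage with two tiny made-up documents.
const index: Chunk[] = [
  ...chunkDocument("skills", "TypeScript, Next.js, and Tailwind CSS are used for frontend work."),
  ...chunkDocument("projects", "The assistant demo answers questions with retrieved context."),
];
console.log(retrieve("Which frontend skills use TypeScript?", index, 1));
```

The retrieved chunk texts would then be placed into the prompt sent to the model, which is what keeps answers grounded in the knowledge base rather than in the model's general training data.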

Key decisions:

  • Custom RAG over fine-tuning — retrieval keeps the knowledge base updatable without model retraining
  • 15+ context categories — knowledge is structured by domain (experience, skills, projects, values) so retrieval is precise
  • Multi-language support — the system detects query language and responds in kind, tested across 8 locales
  • Safety-aware logic — queries outside the knowledge scope return a calibrated refusal, not a hallucination
  • Keyboard shortcuts — power users can navigate and submit without reaching for the mouse
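
One way to picture how the category-tagged retrieval, the language handling, and the calibrated refusal could fit together is the sketch below. It is an assumption-laden illustration rather than the deployed logic: the `decide` function, the `ScoredChunk` shape, the two-locale `REFUSALS` table, and the `MIN_SCORE` threshold of 0.35 are all hypothetical, and the caller is assumed to have detected the query locale already.

```typescript
// Hypothetical sketch: the names, the threshold value, and the refusal
// wording are illustrative assumptions, not the project's actual code.

type Locale = "en" | "es";

interface ScoredChunk {
  category: string; // e.g. "experience", "skills", "projects", "values"
  text: string;
  score: number; // similarity between the query and this chunk
}

type Decision =
  | { kind: "answer"; prompt: string }
  | { kind: "refuse"; message: string };

// Refusal text per locale, so an out-of-scope question is declined
// in the same language it was asked in.
const REFUSALS: Record<Locale, string> = {
  en: "I can only answer questions about Orlando's background, skills, and projects.",
  es: "Solo puedo responder preguntas sobre la experiencia, las habilidades y los proyectos de Orlando.",
};

// Assumed cut-off: chunks scoring below this are treated as irrelevant.
const MIN_SCORE = 0.35;

function decide(
  question: string,
  locale: Locale,
  retrieved: ScoredChunk[],
  minScore: number = MIN_SCORE,
): Decision {
  const relevant = retrieved.filter((chunk) => chunk.score >= minScore);

  // Nothing in the knowledge base is close enough: refuse instead of
  // letting the model improvise an answer.
  if (relevant.length === 0) {
    return { kind: "refuse", message: REFUSALS[locale] };
  }

  // Otherwise build a prompt that restricts the model to the retrieved context.
  const context = relevant.map((chunk) => `[${chunk.category}] ${chunk.text}`).join("\n");
  const prompt = [
    `Answer in locale "${locale}" using only the context below.`,
    "If the context does not contain the answer, say so.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
  return { kind: "answer", prompt };
}

// Example usage with made-up retrieval scores.
const decision = decide("What frameworks does Orlando use?", "en", [
  { category: "skills", text: "Works with Next.js and TypeScript.", score: 0.62 },
  { category: "values", text: "Cares about accessible interfaces.", score: 0.12 },
]);
console.log(decision.kind);
```

Making the refusal a code path that runs before any model call, rather than leaving it entirely to the prompt, means an out-of-scope question never reaches the model with empty context.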

The Outcome

The result is a fully interactive demo, deployed to production, in which the RAG pipeline, streaming UI patterns, multi-language support, and safety-aware prompt design work together. It demonstrates that grounded AI can feel both capable and trustworthy.

Orlando’s AI Assistant — RAG Demo | Case Study by Orlando Ascanio