Building an AI Chatbot with MCP-Inspired Skills Architecture
Building an intelligent chatbot that can answer questions about your product, documentation, and codebase is easier than you think. In this comprehensive guide, we’ll build a production-ready AI chatbot using an MCP-inspired Skills architecture that’s cost-effective, scalable, and works for modern applications.
Why This Architecture?
Traditional RAG (Retrieval-Augmented Generation) approaches with vector embeddings can be expensive and complex. Our approach uses:
- MCP-Inspired Skills: Modular, reusable components that extend chatbot capabilities
- Markup-First Knowledge: Use existing MDX/Markdown content directly (no embeddings needed)
- 90% Cost Savings: Keyword search instead of expensive vector operations
- Universal Architecture: Consistent API across all platforms
This is the same architecture powering the MyCuppa.io chatbot.
Architecture Overview
┌──────────────────────────────────────────────┐
│              AI Chatbot Service              │
│                                              │
│  ┌──────────────┐   ┌──────────────────────┐ │
│  │    Skill     │   │   Knowledge Index    │ │
│  │ Orchestrator │   │ (MDX/Markdown files) │ │
│  └──────────────┘   └──────────────────────┘ │
│         │                      │             │
│         ▼                      ▼             │
│  ┌────────────────────────────────────────┐  │
│  │     Claude API (Extended Thinking)     │  │
│  └────────────────────────────────────────┘  │
└──────────────────────────────────────────────┘
Step 1: Knowledge Index Structure
First, create a searchable index of your content:
// src/lib/chatbot/knowledge/index.ts
import fs from 'fs/promises'
import path from 'path'
import matter from 'gray-matter'

export interface KnowledgeEntry {
  id: string
  title: string
  description: string
  content: string
  category: string
  tags: string[]
  keywords: string[]
  url: string
  priority: number // 1-10, higher = more important
}

export interface KnowledgeIndex {
  entries: KnowledgeEntry[]
  lastUpdated: string
}

// Generate index from MDX files
export async function generateKnowledgeIndex(): Promise<KnowledgeIndex> {
  const entries: KnowledgeEntry[] = []

  // Read all MDX files
  const contentDirs = ['docs', 'learning', 'components', 'sdks']

  for (const dir of contentDirs) {
    const files = await fs.readdir(path.join(process.cwd(), 'content', dir))

    for (const file of files) {
      if (!file.endsWith('.mdx')) continue

      const filePath = path.join(process.cwd(), 'content', dir, file)
      const fileContent = await fs.readFile(filePath, 'utf-8')
      const { data, content } = matter(fileContent)

      // Extract keywords by term frequency
      const keywords = extractKeywords(content, data.title)

      entries.push({
        id: `${dir}/${file.replace('.mdx', '')}`,
        title: data.title,
        description: data.description || '',
        content: content.substring(0, 2000), // First 2000 chars
        category: dir,
        tags: data.tags || [],
        keywords,
        url: `/${dir}/${file.replace('.mdx', '')}`,
        priority: data.featured ? 10 : 5,
      })
    }
  }

  return {
    entries,
    lastUpdated: new Date().toISOString(),
  }
}

// Simple term-frequency keyword extraction (a lightweight stand-in for full TF-IDF)
function extractKeywords(text: string, title: string): string[] {
  const commonWords = new Set(['the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for'])

  // Tokenize and count
  const words = text.toLowerCase()
    .replace(/[^\w\s]/g, ' ')
    .split(/\s+/)
    .filter(w => w.length > 3 && !commonWords.has(w))

  const wordCount = new Map<string, number>()
  words.forEach(word => {
    wordCount.set(word, (wordCount.get(word) || 0) + 1)
  })

  // Add title words with higher weight
  title.toLowerCase().split(/\s+/).forEach(word => {
    if (word.length > 3) {
      wordCount.set(word, (wordCount.get(word) || 0) + 5)
    }
  })

  // Sort by frequency and return top 20
  return Array.from(wordCount.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, 20)
    .map(([word]) => word)
}
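The skills in the next step call a `searchKnowledge` helper that this guide doesn't define. A minimal sketch of what it might look like, assuming entries are scored by keyword overlap weighted by priority; the `Entry` shape and scoring formula here are illustrative, and the index is passed in explicitly (in the app it would be loaded from the generated index file):

```typescript
// Hypothetical keyword search over the knowledge index (no embeddings).
interface Entry {
  title: string
  url: string
  category: string
  keywords: string[]
  priority: number
}

export function searchKnowledge(
  query: string,
  entries: Entry[],
  opts: { categories?: string[]; limit?: number } = {},
): Entry[] {
  // Split the query into significant terms, mirroring extractKeywords
  const terms = query.toLowerCase().split(/\s+/).filter(t => t.length > 3)

  return entries
    .filter(e => !opts.categories || opts.categories.includes(e.category))
    .map(e => ({
      entry: e,
      // Score = keyword hits, weighted by the entry's editorial priority
      score: terms.filter(t => e.keywords.includes(t)).length * e.priority,
    }))
    .filter(s => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, opts.limit ?? 5)
    .map(s => s.entry)
}
```

Because scoring is just array filtering and sorting, a few thousand entries search in well under a millisecond, which is why no vector database is needed at this scale.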
Step 2: Skills System
Create modular Skills that extend chatbot capabilities:
// src/lib/chatbot/skills/base.ts
// Message and UserProfile are app-level types (chat turns and subscription info)

export interface SkillContext {
  query: string
  conversationHistory: Message[]
  userProfile?: UserProfile
}

export interface SkillResult {
  confidence: number // 0-1
  response?: string
  context?: any
  shouldContinue: boolean
}

export abstract class Skill {
  abstract name: string
  abstract description: string
  abstract keywords: string[]

  // Determine if this skill should handle the query (returns 0-1 confidence)
  abstract canHandle(context: SkillContext): number

  // Execute the skill
  abstract execute(context: SkillContext): Promise<SkillResult>
}
// src/lib/chatbot/skills/FrameworkSkill.ts
import { Skill, SkillContext, SkillResult } from './base'
// searchKnowledge performs keyword search over the index from Step 1

export class FrameworkSkill extends Skill {
  name = 'Framework'
  description = 'Answers questions about Cuppa Framework'
  keywords = ['cuppa', 'framework', 'component', 'sdk', 'install', 'setup', 'docs']

  canHandle(context: SkillContext): number {
    const query = context.query.toLowerCase()
    const matchCount = this.keywords.filter(kw => query.includes(kw)).length
    return Math.min(matchCount / 3, 1) // Normalize to 0-1
  }

  async execute(context: SkillContext): Promise<SkillResult> {
    // Search knowledge base
    const results = await searchKnowledge(context.query, {
      categories: ['docs', 'components', 'sdks'],
      limit: 5,
    })

    if (results.length === 0) {
      return {
        confidence: 0.3,
        response: "I couldn't find specific documentation. Could you rephrase your question?",
        shouldContinue: false,
      }
    }

    // Build context for Claude
    const relevantContext = results.map(r =>
      `# ${r.title}\n${r.content}\nSource: ${r.url}`
    ).join('\n\n---\n\n')

    return {
      confidence: 0.9,
      context: {
        relevantDocs: results,
        contextText: relevantContext,
      },
      shouldContinue: true,
    }
  }
}
// src/lib/chatbot/skills/LearningSkill.ts
import { Skill, SkillContext, SkillResult } from './base'

export class LearningSkill extends Skill {
  name = 'Learning'
  description = 'Helps with tutorials and learning resources'
  keywords = ['tutorial', 'learn', 'guide', 'example', 'how to', 'getting started']

  canHandle(context: SkillContext): number {
    const query = context.query.toLowerCase()
    const matchCount = this.keywords.filter(kw => query.includes(kw)).length
    return Math.min(matchCount / 2, 1)
  }

  async execute(context: SkillContext): Promise<SkillResult> {
    const results = await searchKnowledge(context.query, {
      categories: ['learning'],
      limit: 3,
    })

    // Filter out premium articles the user can't access
    // (`access` is an optional frontmatter field on learning entries)
    const accessibleResults = results.filter(r => {
      if (r.access === 'premium') {
        return context.userProfile?.subscription === 'premium'
      }
      return true
    })

    if (accessibleResults.length === 0) {
      return {
        confidence: 0.6,
        // Only pitch the upgrade if premium matches actually exist
        response: results.length > 0
          ? "I found some premium tutorials that match your question. Upgrade to Premium to access them!"
          : "I couldn't find a tutorial for that. Could you rephrase your question?",
        shouldContinue: false,
      }
    }

    const relevantContext = accessibleResults.map(r =>
      `# ${r.title}\n${r.description}\n\nURL: ${r.url}`
    ).join('\n\n---\n\n')

    return {
      confidence: 0.85,
      context: {
        tutorials: accessibleResults,
        contextText: relevantContext,
      },
      shouldContinue: true,
    }
  }
}
Step 3: Skill Orchestrator
Coordinate multiple skills:
// src/lib/chatbot/orchestrator.ts
import { Skill, SkillContext } from './skills/base'

export class SkillOrchestrator {
  private skills: Skill[] = []

  registerSkill(skill: Skill) {
    this.skills.push(skill)
  }

  async processQuery(context: SkillContext): Promise<{
    selectedSkills: Skill[]
    contexts: any[]
  }> {
    // Score all skills (canHandle is synchronous, so no Promise.all needed here)
    const scored = this.skills.map(skill => ({
      skill,
      confidence: skill.canHandle(context),
    }))

    // Select skills with confidence > 0.5, highest first, max 3
    const selected = scored
      .filter(s => s.confidence > 0.5)
      .sort((a, b) => b.confidence - a.confidence)
      .slice(0, 3)

    // Execute selected skills in parallel
    const results = await Promise.all(
      selected.map(s => s.skill.execute(context))
    )

    return {
      selectedSkills: selected.map(s => s.skill),
      contexts: results
        .filter(r => r.shouldContinue)
        .map(r => r.context),
    }
  }
}
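To see the selection rule in isolation, here is a self-contained toy run of the same scoring logic: keyword overlap normalized to 0-1, keep skills above the 0.5 threshold, highest confidence first, max three. The skill names and keywords below are illustrative, not from the real skill set:

```typescript
// Toy version of the orchestrator's scoring and selection
type ToySkill = { name: string; keywords: string[] }

function score(skill: ToySkill, query: string, divisor = 3): number {
  const q = query.toLowerCase()
  const hits = skill.keywords.filter(kw => q.includes(kw)).length
  return Math.min(hits / divisor, 1) // normalize to 0-1
}

const toySkills: ToySkill[] = [
  { name: 'Framework', keywords: ['cuppa', 'sdk', 'install'] },
  { name: 'Learning', keywords: ['tutorial', 'learn', 'guide'] },
]

const selected = toySkills
  .map(s => ({ name: s.name, confidence: score(s, 'how do I install the cuppa sdk?') }))
  .filter(s => s.confidence > 0.5) // drop weak matches
  .sort((a, b) => b.confidence - a.confidence)
  .slice(0, 3)

console.log(selected) // → [ { name: 'Framework', confidence: 1 } ]
```

All three Framework keywords appear in the query (3/3 → 1.0), while no Learning keyword does, so only the Framework skill is selected.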
Step 4: Chat API Route
Create the API endpoint:
// src/app/api/chat/message/route.ts
import { NextRequest } from 'next/server'
import Anthropic from '@anthropic-ai/sdk'
import { SkillOrchestrator } from '@/lib/chatbot/orchestrator'
import { Skill, SkillContext } from '@/lib/chatbot/skills/base'
import { FrameworkSkill } from '@/lib/chatbot/skills/FrameworkSkill'
import { LearningSkill } from '@/lib/chatbot/skills/LearningSkill'
// GeneralSkill is a catch-all fallback (implementation not shown in this guide)
import { GeneralSkill } from '@/lib/chatbot/skills/GeneralSkill'

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
})

const orchestrator = new SkillOrchestrator()
orchestrator.registerSkill(new FrameworkSkill())
orchestrator.registerSkill(new LearningSkill())
orchestrator.registerSkill(new GeneralSkill())

export async function POST(req: NextRequest) {
  try {
    const { message, conversationId } = await req.json()

    // Get user profile from session (implementation depends on your auth setup)
    const userProfile = await getUserProfile(req)

    // Build context
    const context: SkillContext = {
      query: message,
      conversationHistory: [], // Load from DB if needed
      userProfile,
    }

    // Run skill orchestrator
    const { selectedSkills, contexts } = await orchestrator.processQuery(context)

    // Build system prompt with context
    const systemPrompt = buildSystemPrompt(contexts, selectedSkills)

    // Stream response from Claude with extended thinking
    const stream = await client.messages.create({
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: 16000,
      thinking: {
        type: 'enabled',
        budget_tokens: 10000,
      },
      system: systemPrompt,
      messages: [
        {
          role: 'user',
          content: message,
        },
      ],
      stream: true,
    })

    // Create SSE stream; the text_delta check also filters out thinking deltas,
    // so only visible answer tokens reach the client
    const encoder = new TextEncoder()
    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const event of stream) {
            if (event.type === 'content_block_delta' &&
                event.delta.type === 'text_delta') {
              const data = JSON.stringify({ token: event.delta.text })
              controller.enqueue(encoder.encode(`data: ${data}\n\n`))
            }
          }
          controller.enqueue(encoder.encode('data: [DONE]\n\n'))
          controller.close()
        } catch (error) {
          controller.error(error)
        }
      },
    })

    return new Response(readable, {
      headers: {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
      },
    })
  } catch (error) {
    console.error('Chat error:', error)
    return Response.json({ error: 'Internal server error' }, { status: 500 })
  }
}

function buildSystemPrompt(contexts: any[], skills: Skill[]): string {
  const skillNames = skills.map(s => s.name).join(', ')
  const contextText = contexts
    .map(c => c.contextText)
    .filter(Boolean)
    .join('\n\n---\n\n')

  return `You are an AI assistant for the Cuppa Framework. You help developers understand and use Cuppa for building cross-platform applications.

Active Skills: ${skillNames}

Relevant Documentation:
${contextText}

Guidelines:
- Be concise and helpful
- Reference specific documentation when relevant
- Provide code examples when appropriate
- If you don't know something, say so
- For premium content, mention it's available to Premium subscribers

Answer the user's question using the provided context.`
}
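The route above calls `getUserProfile`, which isn't defined in this guide because it depends entirely on your auth setup. As a hypothetical placeholder, assuming the subscription tier is readable from a `session` cookie (the cookie name and `UserProfile` shape are assumptions):

```typescript
interface UserProfile {
  subscription: 'free' | 'premium'
}

// Hypothetical: read the tier from a `session` cookie value.
// A real implementation would verify a signed session token
// against your auth provider instead of trusting the raw cookie.
export function getUserProfileFromCookie(cookieHeader: string | null): UserProfile {
  const match = cookieHeader?.match(/session=(\w+)/)
  return { subscription: match?.[1] === 'premium' ? 'premium' : 'free' }
}
```

Anything the client can forge should default to `free`; premium access is only granted on a verified session.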
Step 5: React Chat Component
Finally, wire up the streaming chat UI:

// src/components/chat/ChatWidget.tsx
'use client'

import { useState } from 'react'
import { useAuth } from '@/contexts/AuthContext'

interface Message {
  role: 'user' | 'assistant'
  content: string
}

export function ChatWidget() {
  const { profile } = useAuth()
  const [messages, setMessages] = useState<Message[]>([])
  const [input, setInput] = useState('')
  const [isLoading, setIsLoading] = useState(false)

  const sendMessage = async () => {
    if (!input.trim()) return

    const userMessage: Message = { role: 'user', content: input }
    setMessages(prev => [...prev, userMessage])
    setInput('')
    setIsLoading(true)

    try {
      const response = await fetch('/api/chat/message', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: input,
          conversationId: 'temp-' + Date.now(),
        }),
      })

      if (!response.body) throw new Error('No response body')

      const reader = response.body.getReader()
      const decoder = new TextDecoder()
      let assistantMessage = ''

      // Add empty assistant message to stream tokens into
      setMessages(prev => [...prev, { role: 'assistant', content: '' }])

      while (true) {
        const { done, value } = await reader.read()
        if (done) break

        // stream: true keeps multi-byte characters intact across chunk boundaries
        const chunk = decoder.decode(value, { stream: true })
        const lines = chunk.split('\n')

        for (const line of lines) {
          if (line.startsWith('data: ')) {
            const data = line.slice(6)
            if (data === '[DONE]') break

            try {
              const { token } = JSON.parse(data)
              assistantMessage += token

              // Replace the last message immutably (never mutate React state in place)
              setMessages(prev => {
                const updated = [...prev]
                updated[updated.length - 1] = {
                  ...updated[updated.length - 1],
                  content: assistantMessage,
                }
                return updated
              })
            } catch (e) {
              // Skip invalid JSON (e.g. an SSE line split across chunks)
            }
          }
        }
      }
    } catch (error) {
      console.error('Chat error:', error)
      setMessages(prev => [...prev, {
        role: 'assistant',
        content: 'Sorry, I encountered an error. Please try again.',
      }])
    } finally {
      setIsLoading(false)
    }
  }

  return (
    <div className="flex flex-col h-[600px] border rounded-lg">
      {/* Messages */}
      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.map((message, i) => (
          <div
            key={i}
            className={`flex ${message.role === 'user' ? 'justify-end' : 'justify-start'}`}
          >
            <div
              className={`max-w-[80%] rounded-lg p-3 ${
                message.role === 'user'
                  ? 'bg-blue-600 text-white'
                  : 'bg-gray-100 dark:bg-gray-800'
              }`}
            >
              {message.content}
            </div>
          </div>
        ))}
        {isLoading && (
          <div className="flex justify-start">
            <div className="bg-gray-100 dark:bg-gray-800 rounded-lg p-3">
              <div className="flex space-x-2">
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-100" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-200" />
              </div>
            </div>
          </div>
        )}
      </div>

      {/* Input */}
      <div className="border-t p-4">
        <div className="flex gap-2">
          <input
            type="text"
            value={input}
            onChange={(e) => setInput(e.target.value)}
            onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
            placeholder="Ask about Cuppa Framework..."
            className="flex-1 px-4 py-2 border rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500"
            disabled={isLoading}
          />
          <button
            onClick={sendMessage}
            disabled={isLoading || !input.trim()}
            className="px-6 py-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 disabled:opacity-50"
          >
            Send
          </button>
        </div>
      </div>
    </div>
  )
}
Next Steps
This guide covered the core architecture. In the next tutorials, we’ll explore:
- Mobile Integration - iOS and Android implementations
- Advanced Features - Conversation history, context persistence
- Cost Optimization - Caching strategies, response streaming
- Production Deployment - Scaling, monitoring, rate limiting
The complete implementation is available in the MyCuppa chatbot architecture document.
Cost Analysis
Compared to traditional RAG:
- No embedding costs ($0 vs ~$0.02 per 1000 tokens)
- 90% cheaper overall
- Faster response times (no vector search latency)
- Simpler infrastructure (no vector database needed)
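As a back-of-envelope check on the embedding line item, using the per-1000-token rate quoted above and a hypothetical 5M-token documentation corpus (the corpus size is an assumption, not a figure from this guide):

```typescript
const ratePer1kTokens = 0.02   // embedding rate quoted above
const corpusTokens = 5_000_000 // hypothetical documentation corpus size

// One-time cost to embed the whole corpus; re-embedding on
// every content change adds to this over time
const embeddingCost = (corpusTokens / 1000) * ratePer1kTokens
console.log(embeddingCost) // ≈ $100; the keyword index costs nothing to build
```

The keyword index, by contrast, is rebuilt for free at deploy time, so content updates never incur an indexing bill.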
Perfect for startups and cost-conscious projects!