Using n8n to build RAG-powered chatbots
A Retrieval Augmented Generation (RAG) chatbot allows you to create AI-powered assistants that can answer questions using specific, indexed knowledge sources. This tutorial demonstrates building a GitHub API documentation chatbot using n8n's workflow automation.
Retrieval Augmented Generation (RAG) is a technique that enhances AI responses by retrieving relevant information from a knowledge base before generating answers. This solves the problem of AI models having outdated or missing information by grounding their responses in your specific data.
A RAG system has two main phases: 1) Indexing - where you process and store your knowledge in a vector database, and 2) Retrieval - where you search for relevant information and use it to generate responses. We'll build both phases in n8n.
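The two phases can be sketched end to end in plain JavaScript. This is a toy illustration only: the "embedding" below is just the set of words in a text and similarity is word overlap, standing in for real embeddings and a real vector store, which the n8n nodes in this tutorial provide.

```javascript
// Toy sketch of the two RAG phases (illustration only; n8n nodes replace
// every piece of this with real embeddings and a real vector store).

// Fake "embedding": the set of lowercase words in a text.
const embed = (text) => new Set(text.toLowerCase().match(/\w+/g) ?? []);

// Fake similarity: how many words two "embeddings" share.
const similarity = (a, b) => [...a].filter((w) => b.has(w)).length;

// Phase 1: indexing — chunk each document and store (vector, text) pairs.
function indexDocuments(docs, chunkSize = 80) {
  const store = [];
  for (const doc of docs) {
    for (let i = 0; i < doc.length; i += chunkSize) {
      const text = doc.slice(i, i + chunkSize);
      store.push({ vector: embed(text), text });
    }
  }
  return store;
}

// Phase 2: retrieval — embed the query and return the top-k closest chunks.
function retrieve(store, question, topK = 2) {
  const q = embed(question);
  return store
    .map((entry) => ({ ...entry, score: similarity(entry.vector, q) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.text);
}
```

In the real system, the retrieved chunks are handed to the chat model as context for generating the answer.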
Start by retrieving the content you want your chatbot to know about. In this example, we'll use GitHub's OpenAPI specification, but you could use any documents, PDFs, or web content.
{
  "url": "https://raw.githubusercontent.com/github/rest-api-description/main/descriptions/api.github.com/api.github.com.json",
  "method": "GET",
  "responseFormat": "json"
}

Large documents need to be split into smaller chunks for effective retrieval. Use the Document Splitter or a Code node to break your content into manageable pieces (typically 500-1000 characters per chunk).
// Example chunking logic for an n8n Code node
const chunkSize = 800;  // characters per chunk
const overlap = 100;    // characters shared between adjacent chunks
const text = $json.content;
const chunks = [];

for (let i = 0; i < text.length; i += chunkSize - overlap) {
  chunks.push({
    text: text.slice(i, i + chunkSize),
    metadata: {
      source: $json.source,
      chunk: chunks.length // sequential chunk index
    }
  });
}

// n8n Code nodes expect an array of items with a `json` key
return chunks.map((chunk) => ({ json: chunk }));

Transform each text chunk into a numerical vector representation using OpenAI's embedding model. These vectors capture semantic meaning, allowing for intelligent similarity searches.
{
  "model": "text-embedding-3-small",
  "input": "{{$json.text}}",
  "dimensions": 1536
}

Use the Pinecone Vector Store node to index your embeddings. Configure your Pinecone index with dimensions matching your embedding model (1536 for text-embedding-3-small) and connect it to your n8n workflow.
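Under the hood, the vector store ranks chunks by how close their embeddings are to the query embedding, typically using cosine similarity. A minimal sketch of that computation (Pinecone does this for you server-side):

```javascript
// Cosine similarity between two embedding vectors: the dot product of the
// vectors divided by the product of their lengths. Values near 1 mean the
// texts are semantically similar; values near 0 mean they are unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```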
Create a new workflow for the chatbot. Add a Chat Trigger node to provide the user interface where people can ask questions about your indexed knowledge.
Add an AI Agent node and write a clear system message that explains the agent's role and knowledge domain.
System Message:
You are a helpful assistant providing information about the GitHub API based on the official OpenAPI specifications.
When answering questions:
- Use the vector store tool to find relevant API documentation
- Provide accurate, specific information from the retrieved context
- Include code examples when helpful
- If information isn't in the context, say so honestly
- Be concise but thorough

Connect an OpenAI Chat Model node (gpt-4o-mini or gpt-4) to your agent. This model will generate responses based on the retrieved context.
{
  "model": "gpt-4o-mini",
  "temperature": 0.3,
  "maxTokens": 800
}

Add a Window Buffer Memory node to maintain conversation context. This allows users to ask follow-up questions that reference previous exchanges.
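Conceptually, a window buffer memory keeps only the most recent exchanges so the model has recent context without an ever-growing prompt. A sketch of the idea (illustrative only; the n8n node manages this for you):

```javascript
// Sliding-window chat memory: retain only the last `windowSize` exchanges
// (one user message plus one assistant message per exchange).
class WindowBufferMemory {
  constructor(windowSize = 5) {
    this.windowSize = windowSize;
    this.messages = [];
  }

  add(role, content) {
    this.messages.push({ role, content });
    // Trim to the most recent exchanges (2 messages per exchange).
    const max = this.windowSize * 2;
    if (this.messages.length > max) {
      this.messages = this.messages.slice(-max);
    }
  }

  // The retained history is prepended to each new prompt.
  history() {
    return this.messages;
  }
}
```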
Connect a Vector Store Tool to your agent. This gives the agent the ability to search your indexed knowledge and retrieve relevant information.
{
  "vectorStore": "pinecone",
  "topK": 4,
  "scoreThreshold": 0.7,
  "name": "github_api_docs",
  "description": "Search the GitHub API documentation to answer questions about endpoints, parameters, and usage."
}

Test with specific questions about your knowledge domain. Verify that the chatbot retrieves relevant information and generates accurate responses. Monitor retrieval quality and adjust parameters (topK, score threshold, chunk size) as needed.
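The effect of the topK and scoreThreshold settings can be pictured as a simple post-filter on retrieved matches (a sketch; Pinecone applies this ranking during the query itself):

```javascript
// Keep only matches above the score threshold, then return the topK best.
// Raising scoreThreshold trades recall for precision; raising topK gives
// the agent more context at the cost of a longer prompt.
function filterMatches(matches, topK = 4, scoreThreshold = 0.7) {
  return matches
    .filter((m) => m.score >= scoreThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```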
RAG chatbots combine the power of large language models with your specific knowledge, creating accurate, grounded AI assistants. n8n makes building RAG systems accessible through visual workflows that connect embeddings, vector databases, and chat models without complex coding.