In the rapidly evolving landscape of AI-powered applications, building solutions that maintain data privacy, offer customization flexibility, and operate without ongoing API costs has become increasingly important. This comprehensive guide will walk you through creating a powerful, self-hosted AI assistant by combining two exceptional open-source projects: n8n for workflow automation and Ollama for running large language models locally.

By the end of this tutorial, you’ll have a fully functional AI assistant that can respond to queries, access external tools, search the web, and even maintain conversations with memory, all while keeping your data and processing entirely local and under your control.

Understanding the Solution Architecture

Before diving into implementation, let’s understand what we’re building and how the components fit together:

Core Components

  1. n8n: A powerful workflow automation platform that connects various services and provides a visual interface for creating complex workflows. n8n offers native AI capabilities through its integration with LangChain.

  2. Ollama: An open-source tool for running large language models (LLMs) locally on your hardware. Ollama makes it easy to deploy, manage, and interact with state-of-the-art models like Llama 3, Mistral, and many others.

  3. AI Agent Node: n8n’s specialized node that orchestrates AI interactions using the LangChain framework. It coordinates between the LLM and various tools.

  4. PostgreSQL with pgvector (Optional): For enhanced capabilities, we’ll also look at adding vector database functionality using pgvector to enable semantic search and knowledge retrieval.

How It Works

The system operates through this workflow:

  1. User sends a message to the n8n chat interface
  2. The chat message triggers the workflow
  3. The AI Agent processes the message using a local LLM via Ollama
  4. The Agent can use various tools (web search, database lookups, etc.)
  5. A response is generated and sent back to the user
  6. Context is maintained for ongoing conversations

This architecture gives you the benefits of modern AI assistants like ChatGPT or Claude, but with complete control over your data and infrastructure.

Prerequisites

Hardware Requirements

  • CPU: 4+ cores recommended (more cores will improve performance)
  • RAM: 8GB minimum, 16GB+ recommended for larger models
  • Storage: 10GB+ for software and models
  • GPU: Optional but recommended for better performance (NVIDIA GPU with 8GB+ VRAM)

Software Requirements

  • Operating System: Linux, macOS, or Windows (WSL recommended for Windows)
  • Docker (recommended for easy setup) or direct installation capability
  • Internet connection (for initial setup and optional web search capabilities)

Knowledge Requirements

  • Basic familiarity with command-line interfaces
  • Understanding of workflow concepts
  • No coding experience required, but helpful for advanced customization

Installation and Setup

Let’s start by getting both n8n and Ollama installed on your system.

Setting Up n8n

n8n can be installed in several ways. For this tutorial, we’ll use Docker as it provides the most consistent experience across different platforms.

Installing n8n with Docker

  1. Ensure Docker and Docker Compose are installed on your system.

  2. Create a new directory for your n8n installation:

mkdir n8n-assistant
cd n8n-assistant
  3. Create a docker-compose.yml file with the following content:
version: "3"
services:
  n8n:
    image: docker.n8n.io/n8nio/n8n
    restart: always
    ports:
      - "5678:5678"
    environment:
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
      - NODE_ENV=production
      - N8N_ENCRYPTION_KEY=your-secret-key-here # replace with a long random string
      - WEBHOOK_URL=http://localhost:5678/
      - OLLAMA_HOST=${OLLAMA_HOST:-host.docker.internal:11434}
    volumes:
      - n8n_data:/home/node/.n8n

volumes:
  n8n_data:
  4. Start the n8n container:
docker-compose up -d
  5. Access the n8n editor by opening your browser and navigating to http://localhost:5678

For alternative installation methods, including npm or direct installation, refer to the n8n documentation.
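
Recent n8n versions also expose a simple health endpoint, which is handy for verifying the container from the command line (assuming the default port above):

curl http://localhost:5678/healthz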

Setting Up Ollama

Now, let’s install Ollama to run our LLMs locally:

macOS Installation

brew install ollama

Linux Installation

curl -fsSL https://ollama.com/install.sh | sh

Windows Installation

Download and run the installer from the Ollama website.

Start Ollama Service

After installation, start the Ollama service:

ollama serve

This command starts the Ollama server, which will be accessible at http://localhost:11434.
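
You can confirm the server is reachable with a quick API call; this endpoint lists the models available locally:

curl http://localhost:11434/api/tags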

Downloading a Model

Before we can use Ollama with n8n, we need to download at least one LLM. For this tutorial, we’ll use Llama 3, which offers a good balance of capabilities and resource requirements:

ollama pull llama3

You can verify the model is working by running:

ollama run llama3

Type a simple prompt like “Hello, how are you?” and make sure you get a reasonable response. Type /bye or press Ctrl+D to exit the chat.
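
You can also exercise the HTTP API directly, which is what n8n will use under the hood:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'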

Connecting n8n to Ollama

Now that both n8n and Ollama are running, let’s connect them:

  1. In your browser, navigate to http://localhost:5678 to access the n8n editor

  2. Click on “Settings” in the left sidebar, then select “Credentials”

  3. Click “Create New Credentials” and search for “Ollama”

  4. Select “Ollama API” from the list

  5. Configure the credentials:

    • Name: “Ollama Local”
    • Base URL: http://localhost:11434 (if running on the same machine)
    • If you’re running n8n in Docker and Ollama directly on the host, use http://host.docker.internal:11434 instead (Linux users: see the note after this list)
  6. Click “Test” to verify the connection

  7. Save the credentials
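
A note for Linux hosts: Docker on Linux does not define host.docker.internal by default. If n8n runs in a container and Ollama directly on the host, add this mapping to the n8n service in your docker-compose.yml:

    extra_hosts:
      - "host.docker.internal:host-gateway"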

If the test is successful, you’re now ready to start building your AI assistant workflow.

Building the AI Assistant Workflow

With our components installed and connected, let’s build the actual AI assistant workflow:

Creating a New Workflow

  1. In the n8n editor, click on “Workflows” in the left sidebar
  2. Click “Create New” to create a new workflow
  3. Name it “AI Assistant”

Setting Up the Chat Trigger

  1. Click “Add first step” in the center of the canvas
  2. Search for “Chat Trigger” and select it
  3. In the node settings, enable “Allow File Uploads” if you want your assistant to process files
  4. Save the node

Adding the AI Agent Node

  1. Click the “+” button on the Chat Trigger node
  2. Search for “AI Agent” and select it
  3. In the node settings, set the “Prompt” to “Take from previous node automatically”
  4. Under “Options”, add “System message” and enter your desired system prompt. For example:
You are a helpful, accurate, and friendly AI assistant. You answer questions to the best of your ability, using the tools provided when appropriate. When you don't know the answer, admit it rather than making something up.

Connecting the Ollama Model

  1. Click the “+” button at the bottom of the AI Agent node where it says “Chat Model”
  2. Search for “Ollama Chat Model” and select it
  3. In the node settings:
    • Credential: Select the “Ollama Local” credential you created earlier
    • Model: Select “llama3” (or whichever model you downloaded)
    • Adjust Temperature: Set to 0.7 for a balance of creativity and coherence
  4. Save the node

Adding Memory

To enable your assistant to remember the conversation context:

  1. Click the “+” button at the bottom of the AI Agent node where it says “Memory”
  2. Search for “Simple Memory” and select it
  3. Leave the default settings
  4. Save the node

Enhancing Your AI Assistant with Tools

Now that you have a basic AI assistant set up, let’s enhance it with tools that will make it more powerful and useful. The AI Agent can use various tools to extend its capabilities beyond just conversation.

Adding a Web Search Tool

Let’s enable our assistant to search the web for information:

  1. Click the “+” button at the bottom of the AI Agent node where it says “Tools”
  2. Search for “HTTP Request Tool” and select it
  3. Configure the tool:
    • Name: “Web Search”
    • Description: “Useful for searching the web for current information. Use this when you need to find facts, news, or other information that might not be in your training data.”
    • Method: “GET”
    • URL: https://ddg-api.herokuapp.com/search?query={{ $fromAI('query') }} (this community-hosted endpoint is only an example and may no longer be online; any JSON search API can be substituted)
    • Headers: Add a header with key “Accept” and value “application/json”
  4. Save the node

The $fromAI() function is special n8n syntax that lets the AI model populate a parameter dynamically; like any other n8n expression, it must be wrapped in {{ }}. When the assistant needs to search for something, it fills in the “query” value with an appropriate search string.
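
For reference, $fromAI also accepts an optional description and type that help the model fill in the value correctly (the key name “query” below is just a label):

{{ $fromAI('query', 'The search terms to look up on the web', 'string') }}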

Setting Up the Vector Store for Knowledge Base (Optional)

For more advanced capabilities, let’s set up a vector store to provide our assistant with a knowledge base:

First, Set Up PostgreSQL with pgvector

If you’re using Docker, add a PostgreSQL service with pgvector to your docker-compose.yml. Note that postgres_data belongs under the existing top-level volumes key; don’t create a second volumes section:

  postgres:
    image: pgvector/pgvector:pg16
    restart: always
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=knowledge
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Then restart your Docker containers:

docker-compose up -d
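
Depending on your versions, the vector extension may need to be enabled manually in the database; doing it up front is harmless (service name and credentials as defined above):

docker-compose exec postgres psql -U postgres -d knowledge -c "CREATE EXTENSION IF NOT EXISTS vector;"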

Adding the Vector Store Tool

  1. In n8n, create a new PostgreSQL connection in the Credentials manager:

    • Host: localhost (or postgres if using Docker networking)
    • Port: 5432
    • User: postgres
    • Password: postgres
    • Database: knowledge
  2. Click the “+” button at the bottom of the AI Agent node where it says “Tools”

  3. Search for “Vector Store Question Answer Tool” and select it

  4. Configure the tool:

    • Data Name: “Company Knowledge Base”
    • Description of Data: “Information about our company’s products, services, and policies”
    • Vector Store: Click the “+” button to add a new vector store
    • Search for “PGVector Vector Store” and select it
    • Configure PGVector:
      • Credential: Select the PostgreSQL credential you created
      • Table Name: “knowledge_base”
      • Operation: “Retrieve Documents (As Vector Store for Chain/Tool)”
    • Embeddings: Click the “+” button and select “Embeddings Ollama”
    • Configure Embeddings Ollama:
      • Credential: Select your Ollama credential
      • Model: “nomic-embed-text” (download it first with ollama pull nomic-embed-text) or any other embedding model
  5. Save the node

Populating the Vector Store

To populate your vector store with knowledge:

  1. Create a separate workflow for data ingestion
  2. Use nodes like “Read/Write Files from Disk” (called “Read Binary File” in older n8n versions) to import text files, PDFs, or other documents
  3. Connect it to a “PGVector Vector Store” node configured for the “Insert Documents” operation
  4. Run this workflow whenever you want to update your knowledge base
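
Before building the ingestion workflow, you can sanity-check the embedding model outside n8n by calling Ollama’s embeddings endpoint directly (the prompt text is arbitrary):

curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "What is our refund policy?"
}'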

Testing Your AI Assistant

It’s time to test your AI assistant:

  1. Make sure all nodes are configured and saved
  2. Save the workflow and toggle it to “Active” in the top bar to activate it
  3. Click the “Chat” button at the bottom of the canvas to open the chat interface
  4. Start interacting with your assistant

Try asking questions that would require different capabilities:

  • General knowledge questions to test the base model
  • Web searches to test the HTTP Request Tool
  • Questions about your knowledge base (if you set it up)

Advanced Customization

Fine-tuning the System Prompt

The system prompt is crucial for defining your assistant’s behavior. Here’s an enhanced system prompt you can use:

You are a helpful, accurate, and friendly AI assistant. Your primary goal is to provide useful and truthful information. Follow these principles:

1. Use the tools provided when appropriate to find information or perform actions.
2. When using web search, cite your sources.
3. When you don't know the answer, admit it rather than making something up.
4. Provide concise responses by default, but more detailed explanations when requested.
5. Maintain a friendly, professional tone.
6. For code or technical questions, include practical examples.

Creating Custom JavaScript Tools

For more specific functionality, you can create custom JavaScript tools:

  1. Click the “+” button at the bottom of the AI Agent node where it says “Tools”
  2. Search for “Code Tool” and select it
  3. Give the tool a clear name and a description that tells the agent when to use it
  4. Write JavaScript that performs your desired operation and returns a string, as sketched below
  5. Save the node

This allows you to extend your assistant with virtually any functionality you can code.
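
As a minimal sketch, assuming a recent n8n version in which the Code Tool exposes the agent’s input as the variable query and returns a string to the agent:

// Code Tool body: a simple "current time" tool.
// `query` holds whatever input the agent passed to the tool;
// the returned string is handed back to the agent as the tool result.
const now = new Date();
return `Current server time: ${now.toISOString()} (UTC). The agent asked: ${query}`;

Give the tool a description like “Returns the current server date and time” so the agent knows when to call it.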

Production Deployment Considerations

When deploying your AI assistant for production use, consider these best practices:

Security

  • Use HTTPS for all communications
  • Implement proper authentication for user access
  • Secure your API keys and credentials
  • Regularly update all components (n8n, Ollama, models)

Performance Optimization

  • Use the most efficient models for your needs
  • Consider running resource-intensive components on separate hardware
  • Implement caching mechanisms for frequent queries
  • Monitor system resource usage and scale as needed

High Availability

  • Set up monitoring and alerts for system health
  • Implement backup and recovery procedures
  • Consider redundant deployments for critical applications

Troubleshooting Common Issues

Connectivity Issues Between n8n and Ollama

Issue: n8n can’t connect to Ollama.

Solutions:

  • If using Docker, ensure that you’re using the correct host address. Try http://host.docker.internal:11434 instead of localhost.
  • For IPv6 conflicts, use http://127.0.0.1:11434 instead of localhost.
  • Verify that Ollama is running and accessible by testing with curl: curl http://localhost:11434/api/tags.
  • Check firewall settings that might block the connection.

Model Loading Issues

Issue: Selected model doesn’t appear in the dropdown or fails to load.

Solutions:

  • Verify that you’ve downloaded the model with ollama pull [model-name].
  • Check Ollama’s console output for any errors.
  • Try restarting the Ollama service.
  • For large models, ensure your system has enough RAM.

AI Agent Response Errors

Issue: The AI Agent is not using tools properly or gives errors when responding.

Solutions:

  • Check if your system prompt clearly instructs the agent to use tools.
  • Verify the tool configurations (especially URLs, credentials, and parameters).
  • Reduce the temperature setting for more deterministic responses.
  • Try using a more capable model if your current one struggles with tool use.

Vector Store Integration Issues

Issue: Vector store doesn’t return relevant information or gives errors.

Solutions:

  • Verify that pgvector is properly installed and configured.
  • Check if your embeddings model is compatible and working.
  • Ensure you’ve properly populated the vector store with data.
  • Test the vector store operations independently of the AI Agent.

Extending the Solution

Multi-Modal Capabilities

To add image understanding capabilities:

  1. Download a multimodal model in Ollama: ollama pull llava
  2. Update your Ollama Chat Model configuration to use this model
  3. Make sure “Allow File Uploads” is enabled on your Chat Trigger node
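
To spot-check a multimodal model outside n8n, Ollama’s generate endpoint accepts base64-encoded images (the placeholder below stands in for real image data):

curl http://localhost:11434/api/generate -d '{
  "model": "llava",
  "prompt": "Describe this image.",
  "images": ["<base64-encoded image data>"],
  "stream": false
}'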

Integration with Other Services

Your AI assistant can be integrated with:

  • Email (for automated responses)
  • Slack or Discord (for team collaboration)
  • CRM systems (for customer service)
  • Knowledge bases (for retrieving internal documentation)

Use n8n’s extensive node library to connect to these services.

Conclusion

Congratulations! You’ve built a powerful, open-source AI assistant that runs completely on your own infrastructure. This solution provides:

  • Complete data privacy and control
  • Customization flexibility
  • No ongoing API costs
  • Integration with your existing tools and services

As open-source LLMs continue to improve, your self-hosted assistant will only get better with time. By upgrading to newer models as they become available, you can ensure your assistant remains capable and useful.

This approach represents the future of AI deployment for privacy-conscious individuals and organizations who want to leverage AI capabilities without surrendering control of their data or processes.

Technical FAQs

Q: Which Ollama model offers the best balance between capability and resource requirements?

A: Llama 3 (8B parameter version) currently offers one of the best balances between capability and resource requirements. It can run on systems with 8GB RAM while still providing strong reasoning, instruction-following, and tool use capabilities. For even more resource-constrained environments, consider smaller models like Phi-3-mini or Gemma 2B.

Q: Can I use GPU acceleration with Ollama and n8n?

A: Yes, Ollama automatically uses CUDA-compatible NVIDIA GPUs if present. To enable GPU support with Docker, add the appropriate GPU configurations to your docker-compose file. For non-NVIDIA GPUs, Ollama supports Metal (on Apple Silicon) and limited AMD GPU support.

Q: How can I improve the context window for handling longer conversations?

A: You can set the context length through environment variables when starting Ollama: OLLAMA_CONTEXT_LENGTH=8192 ollama serve. Additionally, using a model with natively larger context, like Llama 3, can help. For persistent memory beyond the context window, implement a vector store-based memory system.
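
If your Ollama version predates the OLLAMA_CONTEXT_LENGTH variable, an alternative is to bake a larger context window into a model variant with a Modelfile (the llama3-8k name below is arbitrary):

FROM llama3
PARAMETER num_ctx 8192

Build it with ollama create llama3-8k -f Modelfile and select llama3-8k in the Ollama Chat Model node.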

Q: Is it possible to run n8n and Ollama on separate machines?

A: Yes, you can run n8n and Ollama on separate machines. In this case, you would need to:

  1. Configure Ollama to listen on all interfaces: OLLAMA_HOST=0.0.0.0 ollama serve
  2. Set up network security to allow the required connections
  3. In n8n, configure the Ollama credentials to point to the remote machine: http://ollama-server-ip:11434
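
On Linux installs where Ollama runs as a systemd service (the default when using the install script), a persistent way to set this is a service override:

sudo systemctl edit ollama.service
# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl daemon-reload
sudo systemctl restart ollama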

Q: How can I automate the update of the knowledge base when new documents become available?

A: Create a separate n8n workflow that:

  1. Monitors a directory, email account, or other source for new documents
  2. Processes the documents (text extraction, chunking)
  3. Generates embeddings using Ollama
  4. Stores the vectors in pgvector

Set this workflow to run on a schedule or trigger it when new documents are detected.