In the rapidly evolving landscape of AI-powered applications, building solutions that maintain data privacy, offer customization flexibility, and operate without ongoing API costs has become increasingly important. This comprehensive guide will walk you through creating a powerful, self-hosted AI assistant by combining two exceptional open-source projects: n8n for workflow automation and Ollama for running large language models locally.

By the end of this tutorial, you’ll have a fully functional AI assistant that can respond to queries, access external tools, search the web, and even maintain conversations with memory, all while keeping your data and processing entirely local and under your control.

Understanding the Solution Architecture

Before diving into implementation, let’s understand what we’re building and how the components fit together:

Core Components

  1. n8n: A powerful workflow automation platform that connects various services and provides a visual interface for creating complex workflows. n8n offers native AI capabilities through its integration with LangChain.

  2. Ollama: An open-source tool for running large language models (LLMs) locally on your hardware. Ollama makes it easy to deploy, manage, and interact with state-of-the-art models like Llama 3, Mistral, and many others.

  3. AI Agent Node: n8n’s specialized node that orchestrates AI interactions using the LangChain framework. It coordinates between the LLM and various tools.

  4. PostgreSQL with pgvector (Optional): For enhanced capabilities, we’ll also look at adding vector database functionality using pgvector to enable semantic search and knowledge retrieval.

How It Works

The system operates through this workflow:

  1. User sends a message to the n8n chat interface
  2. The chat message triggers the workflow
  3. The AI Agent processes the message using a local LLM via Ollama
  4. The Agent can use various tools (web search, database lookups, etc.)
  5. A response is generated and sent back to the user
  6. Context is maintained for ongoing conversations

This architecture gives you the benefits of modern AI assistants like ChatGPT or Claude, but with complete control over your data and infrastructure.

Prerequisites

Hardware Requirements

  • CPU: 4+ cores recommended (more cores will improve performance)
  • RAM: 8GB minimum, 16GB+ recommended for larger models
  • Storage: 10GB+ for software and models
  • GPU: Optional but recommended for better performance (NVIDIA GPU with 8GB+ VRAM)

Software Requirements

  • Operating System: Linux, macOS, or Windows (WSL recommended for Windows)
  • Docker (recommended for easy setup) or direct installation capability
  • Internet connection (for initial setup and optional web search capabilities)

Knowledge Requirements

  • Basic familiarity with command-line interfaces
  • Understanding of workflow concepts
  • No coding experience required, but helpful for advanced customization

Installation and Setup

Let’s start by getting both n8n and Ollama installed on your system.

Setting Up n8n

n8n can be installed in several ways. For this tutorial, we’ll use Docker as it provides the most consistent experience across different platforms.

Installing n8n with Docker

  1. Ensure Docker and Docker Compose are installed on your system.

  2. Create a new directory for your n8n installation:

mkdir n8n-assistant
cd n8n-assistant
  3. Create a docker-compose.yml file with the following content:
version: "3"
services:
  n8n:
    image: docker.n8n.io/n8nio/n8n
    restart: always
    ports:
      - "5678:5678"
    environment:
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
      - NODE_ENV=production
      - N8N_ENCRYPTION_KEY=your-secret-key-here # replace with a long random string
      - WEBHOOK_URL=http://localhost:5678/
      - OLLAMA_HOST=${OLLAMA_HOST:-host.docker.internal:11434}
    volumes:
      - n8n_data:/home/node/.n8n

volumes:
  n8n_data:
  4. Start the n8n container:
docker-compose up -d
  5. Access the n8n editor by opening your browser and navigating to http://localhost:5678

For alternative installation methods, including npm or direct installation, refer to the n8n documentation.
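
Recent n8n versions also expose a simple health endpoint, which is handy for verifying the container from the command line (assuming the default port above):

curl http://localhost:5678/healthz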

Setting Up Ollama

Now, let’s install Ollama to run our LLMs locally:

macOS Installation

brew install ollama

Linux Installation

curl -fsSL https://ollama.com/install.sh | sh

Windows Installation

Download and run the installer from the Ollama website.

Start Ollama Service

After installation, start the Ollama service:

ollama serve

This command starts the Ollama server, which will be accessible at http://localhost:11434.
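
You can confirm the server is reachable with a quick API call; this endpoint lists the models available locally:

curl http://localhost:11434/api/tags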

Downloading a Model

Before we can use Ollama with n8n, we need to download at least one LLM. For this tutorial, we’ll use Llama 3, which offers a good balance of capabilities and resource requirements:

ollama pull llama3

You can verify the model is working by running:

ollama run llama3

Type a simple prompt like “Hello, how are you?” and make sure you get a reasonable response. Type /bye or press Ctrl+D to exit the chat.
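
You can also exercise the HTTP API directly, which is what n8n will use under the hood:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'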

Connecting n8n to Ollama

Now that both n8n and Ollama are running, let’s connect them:

  1. In your browser, navigate to http://localhost:5678 to access the n8n editor

  2. Click on “Settings” in the left sidebar, then select “Credentials”

  3. Click “Create New Credentials” and search for “Ollama”

  4. Select “Ollama API” from the list

  5. Configure the credentials:

    • Name: “Ollama Local”
    • Base URL: http://localhost:11434 (if running on the same machine)
    • If you’re running n8n in Docker and Ollama directly on the host, use http://host.docker.internal:11434 instead (Linux users: see the note after this list)
  6. Click “Test” to verify the connection

  7. Save the credentials
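
A note for Linux hosts: Docker on Linux does not define host.docker.internal by default. If n8n runs in a container and Ollama directly on the host, add this mapping to the n8n service in your docker-compose.yml:

    extra_hosts:
      - "host.docker.internal:host-gateway"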

If the test is successful, you’re now ready to start building your AI assistant workflow.

Building the AI Assistant Workflow

With our components installed and connected, let’s build the actual AI assistant workflow:

Creating a New Workflow

  1. In the n8n editor, click on “Workflows” in the left sidebar
  2. Click “Create New” to create a new workflow
  3. Name it “AI Assistant”

Setting Up the Chat Trigger

  1. Click “Add first step” in the center of the canvas
  2. Search for “Chat Trigger” and select it
  3. In the node settings, enable “Allow File Uploads” if you want your assistant to process files
  4. Save the node

Adding the AI Agent Node

  1. Click the “+” button on the Chat Trigger node
  2. Search for “AI Agent” and select it
  3. In the node settings, set the “Prompt” to “Take from previous node automatically”
  4. Under “Options”, add “System message” and enter your desired system prompt. For example:
You are a helpful, accurate, and friendly AI assistant. You answer questions to the best of your ability, using the tools provided when appropriate. When you don't know the answer, admit it rather than making something up.

Connecting the Ollama Model

  1. Click the “+” button at the bottom of the AI Agent node where it says “Chat Model”
  2. Search for “Ollama Chat Model” and select it
  3. In the node settings:
    • Credential: Select the “Ollama Local” credential you created earlier
    • Model: Select “llama3” (or whichever model you downloaded)
    • Adjust Temperature: Set to 0.7 for a balance of creativity and coherence
  4. Save the node

Adding Memory

To enable your assistant to remember the conversation context:

  1. Click the “+” button at the bottom of the AI Agent node where it says “Memory”
  2. Search for “Simple Memory” and select it
  3. Leave the default settings
  4. Save the node

Enhancing Your AI Assistant with Tools

Now that you have a basic AI assistant set up, let’s enhance it with tools that will make it more powerful and useful. The AI Agent can use various tools to extend its capabilities beyond just conversation.

Adding a Web Search Tool

Let’s enable our assistant to search the web for information:

  1. Click the “+” button at the bottom of the AI Agent node where it says “Tools”
  2. Search for “HTTP Request Tool” and select it
  3. Configure the tool:
    • Name: “Web Search”
    • Description: “Useful for searching the web for current information. Use this when you need to find facts, news, or other information that might not be in your training data.”
    • Method: “GET”
    • URL: https://ddg-api.herokuapp.com/search?query={{ $fromAI('query') }} (this community-hosted endpoint is only an example and may no longer be online; any JSON search API can be substituted)
    • Headers: Add a header with key “Accept” and value “application/json”
  4. Save the node

The $fromAI() function is special n8n syntax that lets the AI model populate a parameter dynamically; like any other n8n expression, it must be wrapped in {{ }}. When the assistant needs to search for something, it fills in the “query” value with an appropriate search string.
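
For reference, $fromAI also accepts an optional description and type that help the model fill in the value correctly (the key name “query” below is just a label):

{{ $fromAI('query', 'The search terms to look up on the web', 'string') }}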

Setting Up the Vector Store for Knowledge Base (Optional)

For more advanced capabilities, let’s set up a vector store to provide our assistant with a knowledge base:

First, Set Up PostgreSQL with pgvector

If you’re using Docker, add a PostgreSQL service with pgvector to your docker-compose.yml. Note that postgres_data belongs under the existing top-level volumes key; don’t create a second volumes section:

  postgres:
    image: pgvector/pgvector:pg16
    restart: always
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=knowledge
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Then restart your Docker containers:

docker-compose up -d
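
Depending on your versions, the vector extension may need to be enabled manually in the database; doing it up front is harmless (service name and credentials as defined above):

docker-compose exec postgres psql -U postgres -d knowledge -c "CREATE EXTENSION IF NOT EXISTS vector;"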

Adding the Vector Store Tool

  1. In n8n, create a new PostgreSQL connection in the Credentials manager:

    • Host: localhost (or postgres if using Docker networking)
    • Port: 5432
    • User: postgres
    • Password: postgres
    • Database: knowledge
  2. Click the “+” button at the bottom of the AI Agent node where it says “Tools”

  3. Search for “Vector Store Question Answer Tool” and select it

  4. Configure the tool:

    • Data Name: “Company Knowledge Base”
    • Description of Data: “Information about our company’s products, services, and policies”
    • Vector Store: Click the “+” button to add a new vector store
    • Search for “PGVector Vector Store” and select it
    • Configure PGVector:
      • Credential: Select the PostgreSQL credential you created
      • Table Name: “knowledge_base”
      • Operation: “Retrieve Documents (As Vector Store for Chain/Tool)”
    • Embeddings: Click the “+” button and select “Embeddings Ollama”
    • Configure Embeddings Ollama:
      • Credential: Select your Ollama credential
      • Model: “nomic-embed-text” (download it first with ollama pull nomic-embed-text) or any other embedding model
  5. Save the node

Populating the Vector Store

To populate your vector store with knowledge:

  1. Create a separate workflow for data ingestion
  2. Use nodes like “Read/Write Files from Disk” (called “Read Binary File” in older n8n versions) to import text files, PDFs, or other documents
  3. Connect it to a “PGVector Vector Store” node configured for the “Insert Documents” operation
  4. Run this workflow whenever you want to update your knowledge base
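
Before building the ingestion workflow, you can sanity-check the embedding model outside n8n by calling Ollama’s embeddings endpoint directly (the prompt text is arbitrary):

curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "What is our refund policy?"
}'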

Testing Your AI Assistant

It’s time to test your AI assistant:

  1. Make sure all nodes are configured and saved
  2. Save the workflow and toggle it to “Active” in the top bar to activate it
  3. Click the “Chat” button at the bottom of the canvas to open the chat interface
  4. Start interacting with your assistant

Try asking questions that would require different capabilities:

  • General knowledge questions to test the base model
  • Web searches to test the HTTP Request Tool
  • Questions about your knowledge base (if you set it up)

Advanced Customization

Fine-tuning the System Prompt

The system prompt is crucial for defining your assistant’s behavior. Here’s an enhanced system prompt you can use:

You are a helpful, accurate, and friendly AI assistant. Your primary goal is to provide useful and truthful information. Follow these principles:

1. Use the tools provided when appropriate to find information or perform actions.
2. When using web search, cite your sources.
3. When you don't know the answer, admit it rather than making something up.
4. Provide concise responses by default, but more detailed explanations when requested.
5. Maintain a friendly, professional tone.
6. For code or technical questions, include practical examples.

Creating Custom JavaScript Tools

For more specific functionality, you can create custom JavaScript tools:

  1. Click the “+” button at the bottom of the AI Agent node where it says “Tools”
  2. Search for “Code Tool” and select it
  3. Give the tool a clear name and a description that tells the agent when to use it
  4. Write JavaScript that performs your desired operation and returns a string, as sketched below
  5. Save the node

This allows you to extend your assistant with virtually any functionality you can code.
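
As a minimal sketch, assuming a recent n8n version in which the Code Tool exposes the agent’s input as the variable query and returns a string to the agent:

// Code Tool body: a simple "current time" tool.
// `query` holds whatever input the agent passed to the tool;
// the returned string is handed back to the agent as the tool result.
const now = new Date();
return `Current server time: ${now.toISOString()} (UTC). The agent asked: ${query}`;

Give the tool a description like “Returns the current server date and time” so the agent knows when to call it.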

Production Deployment Considerations

When deploying your AI assistant for production use, consider these best practices:

Security

  • Use HTTPS for all communications
  • Implement proper authentication for user access
  • Secure your API keys and credentials
  • Regularly update all components (n8n, Ollama, models)

Performance Optimization

  • Use the most efficient models for your needs
  • Consider running resource-intensive components on separate hardware
  • Implement caching mechanisms for frequent queries
  • Monitor system resource usage and scale as needed

High Availability

  • Set up monitoring and alerts for system health
  • Implement backup and recovery procedures
  • Consider redundant deployments for critical applications

Troubleshooting Common Issues

Connectivity Issues Between n8n and Ollama

Issue: n8n can’t connect to Ollama.

Solutions:

  • If using Docker, ensure that you’re using the correct host address. Try http://host.docker.internal:11434 instead of localhost.
  • For IPv6 conflicts, use http://127.0.0.1:11434 instead of localhost.
  • Verify that Ollama is running and accessible by testing with curl: curl http://localhost:11434/api/tags.
  • Check firewall settings that might block the connection.

Model Loading Issues

Issue: Selected model doesn’t appear in the dropdown or fails to load.

Solutions:

  • Verify that you’ve downloaded the model with ollama pull [model-name].
  • Check Ollama’s console output for any errors.
  • Try restarting the Ollama service.
  • For large models, ensure your system has enough RAM.

AI Agent Response Errors

Issue: The AI Agent is not using tools properly or gives errors when responding.

Solutions:

  • Check if your system prompt clearly instructs the agent to use tools.
  • Verify the tool configurations (especially URLs, credentials, and parameters).
  • Reduce the temperature setting for more deterministic responses.
  • Try using a more capable model if your current one struggles with tool use.

Vector Store Integration Issues

Issue: Vector store doesn’t return relevant information or gives errors.

Solutions:

  • Verify that pgvector is properly installed and configured.
  • Check if your embeddings model is compatible and working.
  • Ensure you’ve properly populated the vector store with data.
  • Test the vector store operations independently of the AI Agent.

Extending the Solution

Multi-Modal Capabilities

To add image understanding capabilities:

  1. Download a multimodal model in Ollama: ollama pull llava
  2. Update your Ollama Chat Model configuration to use this model
  3. Make sure “Allow File Uploads” is enabled on your Chat Trigger node
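
To spot-check a multimodal model outside n8n, Ollama’s generate endpoint accepts base64-encoded images (the placeholder below stands in for real image data):

curl http://localhost:11434/api/generate -d '{
  "model": "llava",
  "prompt": "Describe this image.",
  "images": ["<base64-encoded image data>"],
  "stream": false
}'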

Integration with Other Services

Your AI assistant can be integrated with:

  • Email (for automated responses)
  • Slack or Discord (for team collaboration)
  • CRM systems (for customer service)
  • Knowledge bases (for retrieving internal documentation)

Use n8n’s extensive node library to connect to these services.

Conclusion

Congratulations! You’ve built a powerful, open-source AI assistant that runs completely on your own infrastructure. This solution provides:

  • Complete data privacy and control
  • Customization flexibility
  • No ongoing API costs
  • Integration with your existing tools and services

As open-source LLMs continue to improve, your self-hosted assistant will only get better with time. By upgrading to newer models as they become available, you can ensure your assistant remains capable and useful.

This approach represents the future of AI deployment for privacy-conscious individuals and organizations who want to leverage AI capabilities without surrendering control of their data or processes.

Technical FAQs

Q: Which Ollama model offers the best balance between capability and resource requirements?

A: Llama 3 (8B parameter version) currently offers one of the best balances between capability and resource requirements. It can run on systems with 8GB RAM while still providing strong reasoning, instruction-following, and tool use capabilities. For even more resource-constrained environments, consider smaller models like Phi-3-mini or Gemma 2B.

Q: Can I use GPU acceleration with Ollama and n8n?

A: Yes, Ollama automatically uses CUDA-compatible NVIDIA GPUs if present. To enable GPU support with Docker, add the appropriate GPU configurations to your docker-compose file. For non-NVIDIA GPUs, Ollama supports Metal (on Apple Silicon) and limited AMD GPU support.

Q: How can I improve the context window for handling longer conversations?

A: You can set the context length through environment variables when starting Ollama: OLLAMA_CONTEXT_LENGTH=8192 ollama serve. Additionally, using a model with natively larger context, like Llama 3, can help. For persistent memory beyond the context window, implement a vector store-based memory system.
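
If your Ollama version predates the OLLAMA_CONTEXT_LENGTH variable, an alternative is to bake a larger context window into a model variant with a Modelfile (the llama3-8k name below is arbitrary):

FROM llama3
PARAMETER num_ctx 8192

Build it with ollama create llama3-8k -f Modelfile and select llama3-8k in the Ollama Chat Model node.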

Q: Is it possible to run n8n and Ollama on separate machines?

A: Yes, you can run n8n and Ollama on separate machines. In this case, you would need to:

  1. Configure Ollama to listen on all interfaces: OLLAMA_HOST=0.0.0.0 ollama serve
  2. Set up network security to allow the required connections
  3. In n8n, configure the Ollama credentials to point to the remote machine: http://ollama-server-ip:11434
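
On Linux installs where Ollama runs as a systemd service (the default when using the install script), a persistent way to set this is a service override:

sudo systemctl edit ollama.service
# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl daemon-reload
sudo systemctl restart ollama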

Q: How can I automate the update of the knowledge base when new documents become available?

A: Create a separate n8n workflow that:

  1. Monitors a directory, email account, or other source for new documents
  2. Processes the documents (text extraction, chunking)
  3. Generates embeddings using Ollama
  4. Stores the vectors in pgvector

Set this workflow to run on a schedule or trigger it when new documents are detected.