# CodeCompanion + Ollama Setup Guide
This guide explains how to use Ollama with CodeCompanion across your network via Tailscale.
## Overview
Your CodeCompanion configuration now supports both Claude (via Anthropic API) and Ollama models. You can:
- Use Ollama locally on your main machine
- Access Ollama from other machines on your network via Tailscale
- Switch between Claude and Ollama models seamlessly
## Prerequisites

### On Your Ollama Server Machine
1. **Install Ollama** (if not already done)

   ```bash
   curl -fsSL https://ollama.ai/install.sh | sh
   ```

2. **Start Ollama with network binding**

   By default, Ollama only listens on `localhost:11434`. To access it from other machines, you need to expose it to your network:

   ```bash
   # Option 1: Run Ollama with network binding (temporary)
   OLLAMA_HOST=0.0.0.0:11434 ollama serve

   # Option 2: Set it permanently in systemd (recommended)
   sudo systemctl edit ollama
   ```

   Add this to the systemd service file:

   ```ini
   [Service]
   Environment="OLLAMA_HOST=0.0.0.0:11434"
   ```

   Then restart:

   ```bash
   sudo systemctl restart ollama
   ```

3. **Pull a model** (if not already done)

   ```bash
   ollama pull mistral
   # Or try other models:
   # ollama pull neural-chat
   # ollama pull dolphin-mixtral
   # ollama pull llama2
   ```

4. **Find your Tailscale IP**

   ```bash
   tailscale ip -4
   # Output example: 100.123.45.67
   ```
## Configuration

### On Your Main Machine (with Ollama)
**Default behavior:** The config will use `http://localhost:11434` automatically.

To override, set the environment variable:

```bash
export OLLAMA_ENDPOINT="http://localhost:11434"
```
### On Other Machines (without Ollama)

Set the `OLLAMA_ENDPOINT` environment variable to point to your Ollama server's Tailscale IP:

```bash
export OLLAMA_ENDPOINT="http://100.123.45.67:11434"
```

Make it persistent by adding the same line to your shell config (`~/.zshrc`, `~/.bashrc`, etc.).
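If you share the same dotfiles across machines, a parameter-expansion fallback keeps the override optional. This is a convenience sketch, not something the plugin requires:

```shell
# Fall back to the local server when OLLAMA_ENDPOINT is not already set.
# Safe to put in ~/.zshrc / ~/.bashrc on every machine.
export OLLAMA_ENDPOINT="${OLLAMA_ENDPOINT:-http://localhost:11434}"
echo "Ollama endpoint: $OLLAMA_ENDPOINT"
```

Machines that should reach the remote server can still set `OLLAMA_ENDPOINT` earlier in their config; the `${VAR:-default}` expansion leaves an existing value untouched.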
## Usage

### Keymaps
- `<leader>cll` - Toggle chat with Ollama (normal and visual modes)
- `<leader>cc` - Toggle chat with Claude Haiku (default)
- `<leader>cs` - Toggle chat with Claude Sonnet
- `<leader>co` - Toggle chat with Claude Opus
- `<leader>ca` - Show CodeCompanion actions
- `<leader>cm` - Show current model
### Switching Models

You can also use the `:CodeCompanionSwitchModel` command:

```vim
:CodeCompanionSwitchModel haiku
:CodeCompanionSwitchModel sonnet
:CodeCompanionSwitchModel opus
```
To add Ollama to this command, you would need to extend the configuration.
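One possible shape for that extension is sketched below. The command body and the `set_model()` helper are assumptions about how your `codecompanion.lua` is organized, not CodeCompanion's documented API; adapt the names to match your config.

```lua
-- Hypothetical sketch: teach the existing switch command about "ollama".
-- set_model(adapter, model) is an assumed helper from your own config.
vim.api.nvim_create_user_command("CodeCompanionSwitchModel", function(opts)
  if opts.args == "ollama" then
    set_model("ollama", "mistral")     -- route to the Ollama adapter
  else
    set_model("anthropic", opts.args)  -- existing Claude aliases
  end
end, {
  nargs = 1,
  complete = function()
    return { "haiku", "sonnet", "opus", "ollama" }
  end,
})
```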
## Troubleshooting

### "Connection refused" error
Problem: You're getting connection errors when trying to use Ollama.
Solutions:
- Verify Ollama is running:

  ```bash
  curl http://localhost:11434/api/tags
  ```

- Check if it's bound to the network:

  ```bash
  sudo netstat -tlnp | grep 11434
  ```

- Verify Tailscale connectivity:

  ```bash
  ping 100.x.x.x   # use the Tailscale IP
  ```

- Check the firewall:

  ```bash
  sudo ufw status   # if using UFW
  ```
### "Model not found" error
Problem: The model you specified doesn't exist on the Ollama server.
Solution:
- List available models:

  ```bash
  curl http://localhost:11434/api/tags
  ```

- Pull the model:

  ```bash
  ollama pull mistral
  ```

- Update the default model in `lua/shelbybark/plugins/codecompanion.lua` if needed
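If you only want the model names out of the `/api/tags` JSON, a `sed` one-liner works without `jq`. The sample payload below is a made-up illustration; in practice pipe from `curl -s http://localhost:11434/api/tags`:

```shell
# Abridged, hypothetical /api/tags response for demonstration.
sample='{"models":[{"name":"mistral:latest"},{"name":"neural-chat:latest"}]}'

# One model name per line: split on commas, then pull out each "name" value.
echo "$sample" | tr ',' '\n' | sed -n 's/.*"name":"\([^"]*\)".*/\1/p'
# → mistral:latest
#   neural-chat:latest
```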
### Slow responses
Problem: Responses are very slow.
Causes & Solutions:
- **Network latency:** Tailscale adds minimal overhead, but check your network
- **Model size:** Larger models (7B+) are slower; try smaller models like `mistral` or `neural-chat`
- **Server resources:** Check CPU/RAM on the Ollama server with `top` or `htop`
### Tailscale not connecting
Problem: Can't reach the Ollama server via Tailscale IP.
Solutions:
- Verify Tailscale is running:

  ```bash
  tailscale status
  ```

- Check that both machines are on the same Tailscale network
- Verify the Tailscale IP is correct:

  ```bash
  tailscale ip -4
  ```

- Check firewall rules on the Ollama server
## Recommended Models for CodeCompanion
| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| mistral | 7B | Fast | Good | General coding |
| neural-chat | 7B | Fast | Good | Chat/conversation |
| dolphin-mixtral | 8x7B | Slower | Excellent | Complex tasks |
| llama2 | 7B/13B | Medium | Good | General purpose |
| orca-mini | 3B | Very Fast | Fair | Quick answers |
## Advanced Configuration

### Custom Model Selection
To change the default Ollama model, edit `lua/shelbybark/plugins/codecompanion.lua`:

```lua
schema = {
  model = {
    default = "neural-chat", -- Change this to your preferred model
  },
},
```
### Multiple Ollama Servers

If you have multiple Ollama servers, you can create multiple adapters:

```lua
ollama_main = function()
  return require("codecompanion.adapters").extend("ollama", {
    env = { url = "http://100.123.45.67:11434" },
    schema = { model = { default = "mistral" } },
  })
end,

ollama_backup = function()
  return require("codecompanion.adapters").extend("ollama", {
    env = { url = "http://100.123.45.68:11434" },
    schema = { model = { default = "neural-chat" } },
  })
end,
```
Then add keymaps for each.
## Performance Tips
- Use smaller models for faster responses (`mistral`, `neural-chat`)
- Run Ollama on a machine with good specs (8GB+ RAM, modern CPU)
- Keep Tailscale updated for best network performance
- Monitor network latency with `ping` to your Ollama server
- Consider running Ollama on a GPU, if available, for faster inference