# CodeCompanion + Ollama Setup Guide
This guide explains how to use Ollama with CodeCompanion across your network via Tailscale.
## Overview
Your CodeCompanion configuration now supports both Claude (via the Anthropic API) and Ollama models. You can:
- Use Ollama locally on your main machine
- Access Ollama from other machines on your network via Tailscale
- Switch between Claude and Ollama models seamlessly
## Prerequisites
### On Your Ollama Server Machine
1. **Install Ollama** (if not already done)
```bash
curl -fsSL https://ollama.ai/install.sh | sh
```
2. **Start Ollama with network binding**
By default, Ollama only listens on `localhost:11434`. To access it from other machines, you need to expose it to your network:
```bash
# Option 1: Run Ollama with network binding (temporary)
OLLAMA_HOST=0.0.0.0:11434 ollama serve
# Option 2: Set it permanently in systemd (recommended)
sudo systemctl edit ollama
```
Add this to the override file that `systemctl edit` opens:
```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
```
Then restart:
```bash
sudo systemctl restart ollama
```
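To confirm the binding took effect, check the listening socket and hit the API from the server itself. This is a quick sanity check; `ss` ships with most modern Linux distributions (substitute `netstat -tlnp` on older systems):
```bash
# Confirm Ollama listens on all interfaces, not just localhost
sudo ss -tlnp | grep 11434
# Expect an entry like 0.0.0.0:11434 (or *:11434)

# Hit the API locally; a JSON response means the server is up
curl http://localhost:11434/api/tags
```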
3. **Pull a model** (if not already done)
```bash
ollama pull mistral
# Or try other models:
# ollama pull neural-chat
# ollama pull dolphin-mixtral
# ollama pull llama2
```
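You can confirm the download succeeded with `ollama list`, which prints each installed model along with its size on disk:
```bash
# mistral should appear here once the pull completes
ollama list
```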
4. **Find your Tailscale IP**
```bash
tailscale ip -4
# Output example: 100.123.45.67
```
## Configuration
### On Your Main Machine (with Ollama)
**Default behavior:** The config will use `http://localhost:11434` automatically.
To override, set the environment variable:
```bash
export OLLAMA_ENDPOINT="http://localhost:11434"
```
### On Other Machines (without Ollama)
Set the `OLLAMA_ENDPOINT` environment variable to point to your Ollama server's Tailscale IP:
```bash
export OLLAMA_ENDPOINT="http://100.123.45.67:11434"
```
**Make it persistent** by adding the same line to your shell config (`~/.zshrc`, `~/.bashrc`, etc.):
```bash
export OLLAMA_ENDPOINT="http://100.123.45.67:11434"
```
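Before launching Neovim, it's worth confirming the endpoint is actually reachable from this machine. A quick check using the variable you just set (it falls back to localhost if the variable is unset):
```bash
# Should return a JSON list of the models installed on the Ollama server
curl -s "${OLLAMA_ENDPOINT:-http://localhost:11434}/api/tags"
```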
## Usage
### Keymaps
- **`<leader>cll`** - Toggle chat with Ollama (normal and visual modes)
- **`<leader>cc`** - Toggle chat with Claude Haiku (default)
- **`<leader>cs`** - Toggle chat with Claude Sonnet
- **`<leader>co`** - Toggle chat with Claude Opus
- **`<leader>ca`** - Show CodeCompanion actions
- **`<leader>cm`** - Show current model
### Switching Models
You can also use the `:CodeCompanionSwitchModel` command:
```vim
:CodeCompanionSwitchModel haiku
:CodeCompanionSwitchModel sonnet
:CodeCompanionSwitchModel opus
```
To add Ollama to this command, you would need to extend the configuration.
## Troubleshooting
### "Connection refused" error
**Problem:** You're getting connection errors when trying to use Ollama.
**Solutions:**
1. Verify Ollama is running: `curl http://localhost:11434/api/tags`
2. Check that it's bound to all interfaces, not just localhost: `sudo ss -tlnp | grep 11434` (or `sudo netstat -tlnp | grep 11434` on older systems)
3. Verify Tailscale connectivity: `ping 100.x.x.x` (use the Tailscale IP)
4. Check firewall: `sudo ufw status` (if using UFW)
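If all four checks pass but CodeCompanion still fails, test the full request path with a minimal generation call against Ollama's `/api/generate` endpoint. Unlike `/api/tags`, this forces the server to actually load and run the model (substitute your server's Tailscale IP and an installed model):
```bash
# End-to-end test: makes the server run real inference
curl -s http://100.123.45.67:11434/api/generate \
  -d '{"model": "mistral", "prompt": "Say hello", "stream": false}'
# A JSON reply containing a "response" field means inference works
```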
### "Model not found" error
**Problem:** The model you specified doesn't exist on the Ollama server.
**Solution:**
1. List available models: `curl http://localhost:11434/api/tags`
2. Pull the model: `ollama pull mistral`
3. Update the default model in `lua/shelbybark/plugins/codecompanion.lua` if needed
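The raw `/api/tags` response is verbose JSON. If you have `jq` installed, you can extract just the model names to compare against the default in your config:
```bash
# Print one installed model name per line (requires jq)
curl -s http://localhost:11434/api/tags | jq -r '.models[].name'
```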
### Slow responses
**Problem:** Responses are very slow.
**Causes & Solutions:**
1. **Network latency**: Tailscale adds minimal overhead, but check your network
2. **Model size**: Bigger models are slower; 13B models and 8x7B mixtures like `dolphin-mixtral` lag well behind 7B models such as `mistral` or `neural-chat`, and the 3B `orca-mini` is fastest of all
3. **Server resources**: Check CPU/RAM on the Ollama server with `top` or `htop`
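To separate network latency from inference time, time a cheap metadata call against a real generation. If the first is fast and the second is slow, the bottleneck is the model or the server hardware rather than Tailscale (replace the IP and model with your own):
```bash
# Cheap call: dominated by the network round-trip
time curl -s http://100.123.45.67:11434/api/tags > /dev/null

# Expensive call: dominated by model inference
time curl -s http://100.123.45.67:11434/api/generate \
  -d '{"model": "mistral", "prompt": "Say hello", "stream": false}' > /dev/null
```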
### Tailscale not connecting
**Problem:** Can't reach the Ollama server via Tailscale IP.
**Solutions:**
1. Verify Tailscale is running: `tailscale status`
2. Check both machines are on the same Tailscale network
3. Verify the Tailscale IP is correct: `tailscale ip -4`
4. Check firewall rules on the Ollama server
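`tailscale ping` is often more informative than plain `ping` here, since it reports whether traffic flows directly between the machines or is relayed through a DERP server (relayed connections are noticeably slower):
```bash
# Reports whether the connection is direct or relayed via DERP
tailscale ping 100.123.45.67
```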
## Recommended Models for CodeCompanion
| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| mistral | 7B | Fast | Good | General coding |
| neural-chat | 7B | Fast | Good | Chat/conversation |
| dolphin-mixtral | 8x7B | Slower | Excellent | Complex tasks |
| llama2 | 7B/13B | Medium | Good | General purpose |
| orca-mini | 3B | Very Fast | Fair | Quick answers |
## Advanced Configuration
### Custom Model Selection
To change the default Ollama model, edit `lua/shelbybark/plugins/codecompanion.lua`:
```lua
schema = {
  model = {
    default = "neural-chat", -- Change this to your preferred model
  },
},
```
### Multiple Ollama Servers
If you have multiple Ollama servers, you can create multiple adapters:
```lua
ollama_main = function()
  return require("codecompanion.adapters").extend("ollama", {
    env = { url = "http://100.123.45.67:11434" },
    schema = { model = { default = "mistral" } },
  })
end,
ollama_backup = function()
  return require("codecompanion.adapters").extend("ollama", {
    env = { url = "http://100.123.45.68:11434" },
    schema = { model = { default = "neural-chat" } },
  })
end,
```
Then add keymaps for each.
## Performance Tips
1. **Use smaller models** for faster responses (mistral, neural-chat)
2. **Run Ollama on a machine with good specs** (8GB+ RAM, modern CPU)
3. **Keep Tailscale updated** for best network performance
4. **Monitor network latency** with `ping` to your Ollama server
5. **Run Ollama on a GPU** if one is available; GPU inference is dramatically faster than CPU (see the check below)
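On the GPU point: recent Ollama releases include an `ollama ps` command whose PROCESSOR column shows whether a loaded model is running on the GPU or has fallen back to CPU, which is a quick way to confirm the GPU is actually being used:
```bash
# Run on the Ollama server after sending at least one request;
# PROCESSOR shows e.g. "100% GPU" or "100% CPU" per loaded model
ollama ps
```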
## References
- [Ollama Documentation](https://github.com/ollama/ollama)
- [Tailscale Documentation](https://tailscale.com/kb/)
- [CodeCompanion Documentation](https://github.com/olimorris/codecompanion.nvim)