Ollama lets you run AI models locally on your machine. screenpipe integrates natively with Ollama — no API keys, no cloud, completely private.

setup

1. install Ollama & pull a model

# install from https://ollama.com then:
ollama run ministral-3
this downloads the model and starts Ollama. you can use any model — ministral-3 is a good starting point (fast, works on most machines).

2. select Ollama in screenpipe

  1. open the screenpipe app
  2. click the AI preset selector (top of the chat/timeline)
  3. click Ollama
  4. pick your model from the dropdown (screenpipe auto-detects pulled models)
  5. start chatting
that’s it. screenpipe talks to Ollama on localhost:11434 automatically.
model            size    best for
ministral-3      ~2 GB   fast, general use, recommended starting point
gemma3:4b        ~3 GB   strong quality for size, good for summaries
qwen3:4b         ~3 GB   multilingual, good reasoning
deepseek-r1:8b   ~5 GB   strong reasoning, needs 16 GB+ RAM
pull any model with:
ollama pull <model-name>

requirements

  • Ollama installed and running
  • at least one model pulled
  • screenpipe running
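a quick way to sanity-check these requirements from the command line. this sketch assumes the default ports (Ollama on 11434, screenpipe's local API on 3030); the /health path on the screenpipe API is an assumption to adjust if your build differs:

```shell
# check that Ollama and screenpipe are both reachable on their default ports;
# the /health path is assumed here and may differ per build
ollama_status=$(curl -fsS --max-time 3 http://localhost:11434/api/tags \
  >/dev/null 2>&1 && echo "running" || echo "not reachable")
screenpipe_status=$(curl -fsS --max-time 3 http://localhost:3030/health \
  >/dev/null 2>&1 && echo "running" || echo "not reachable")
echo "ollama:     $ollama_status"
echo "screenpipe: $screenpipe_status"
```

if Ollama reports "not reachable", start it with `ollama serve` before retrying.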

custom OpenAI-compatible endpoints

if you’re running a custom LLM server (Qwen, vLLM, Text Generation WebUI, etc.), screenpipe auto-detects the endpoint format:
  1. first tries OpenAI-compatible format: GET {endpoint}/v1/models
  2. falls back to Ollama format: GET {endpoint}/api/tags
if your endpoint uses neither format, you may need to:
  • check what path your server uses for model listing (/models, /v1/list, etc.)
  • if unsure, test with curl first: curl {your-endpoint}/path-to-models
  • join our Discord — we can help troubleshoot custom setups
example: a Qwen server on http://localhost:5000 with an OpenAI-compatible API should work automatically. if screenpipe can’t find models, verify the server responds to:
curl http://localhost:5000/v1/models
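the probe order above can be sketched as a small shell function. this is an assumption-level sketch (a plain curl probe standing in for whatever screenpipe does internally), and the endpoint passed at the bottom is just an example:

```shell
# sketch of the endpoint auto-detection order: try the OpenAI-compatible
# model-listing path first, then fall back to the Ollama-native one
detect_endpoint_format() {
  endpoint="$1"
  if curl -fsS --max-time 3 "$endpoint/v1/models" >/dev/null 2>&1; then
    echo "openai-compatible"    # GET {endpoint}/v1/models answered
  elif curl -fsS --max-time 3 "$endpoint/api/tags" >/dev/null 2>&1; then
    echo "ollama"               # GET {endpoint}/api/tags answered
  else
    echo "unknown"              # neither format responded
  fi
}

detect_endpoint_format "http://localhost:11434"
```

if this prints `unknown` for your server, screenpipe won’t detect it either, so fix the endpoint path or auth first.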

troubleshooting

“ollama not detected”
  • make sure Ollama is running: ollama serve
  • check it’s responding: curl http://localhost:11434/api/tags
model not showing in dropdown?
  • pull it first: ollama pull ministral-3
  • you can also type the model name manually in the input field
slow responses?
  • try a smaller model (ministral-3)
  • close other GPU-heavy apps
  • ensure you have enough free RAM (model size + ~2 GB overhead)
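the RAM rule of thumb above, as shell arithmetic. the model size and the 2 GB overhead are rough figures from this page, not measurements:

```shell
# rough rule of thumb: free RAM needed = model size + ~2 GB overhead
model_gb=5          # e.g. deepseek-r1:8b weighs in around 5 GB
overhead_gb=2       # approximate runtime overhead, not an exact figure
needed_gb=$((model_gb + overhead_gb))
echo "plan for at least ${needed_gb} GB of free RAM"
```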

troubleshooting Azure & custom OpenAI endpoints

Error: “unsupported tool use” or “does not support more than one tool call”

screenpipe sends multiple tool calls to the LLM for agentic features. some models (especially older Azure-hosted models like Phi-4, older Llama versions) don’t support this. fixes:
  • use a model that supports tool use: gpt-4, gpt-4-turbo, claude-3-5-sonnet, gpt-oss-120b
  • or disable agentic features in your pipe prompts (remove tool calls, just ask for text summaries)
  • on Azure, try switching to the latest model version available

Error: “max tokens is not supported”

your endpoint doesn’t recognize the max_tokens parameter that screenpipe sends. fixes:
  1. verify your endpoint supports OpenAI-compatible API: curl -H "Authorization: Bearer YOUR_KEY" https://your-endpoint/v1/models
  2. if using Azure, ensure you’re using the OpenAI-compatible endpoint format (not the old REST API format)
  3. try a custom endpoint URL wrapper if your server needs parameter translation
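one way to confirm this is to send a minimal OpenAI-style chat request that includes max_tokens and read the error you get back. your-endpoint, YOUR_KEY, and your-model below are placeholders, not real values:

```shell
# minimal probe for max_tokens support; replace the placeholder URL, key,
# and model name with your own before running
BODY='{"model":"your-model","messages":[{"role":"user","content":"hi"}],"max_tokens":8}'
curl -s "https://your-endpoint/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d "$BODY" || echo "request failed (fill in real values first)"
```

if the error specifically names max_tokens, that parameter is the mismatch; note that some newer OpenAI-hosted models accept max_completion_tokens instead of max_tokens, which produces exactly this kind of rejection.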

API key not being passed to screenpipe API

if screenpipe says “unauthorized” when accessing the local API even though your custom LLM endpoint is configured:
cause: the screenpipe CLI doesn’t automatically share API credentials with the local REST API server.
fix: configure your pipe or app to use the API key explicitly:
curl "http://localhost:3030/search?limit=5" \
  -H "Authorization: Bearer YOUR_SCREENPIPE_API_KEY"
or set the API key in screenpipe settings → API security → enable API key auth, then provide that key in your requests.

Custom endpoint not responding / models not detected

screenpipe tries both OpenAI and Ollama formats. if neither works:
  1. test your endpoint manually:
    curl https://your-endpoint/v1/models
    curl https://your-endpoint/api/tags
    
    (one should return a model list; if neither does, your server may use a different path)
  2. check authorization:
    curl -H "Authorization: Bearer YOUR_KEY" https://your-endpoint/v1/models
    
  3. verify TLS/SSL: if using https, ensure your certificate is valid (self-signed certs need special config)
  4. common endpoint paths:
    • OpenAI-compatible: /v1/models, /v1/chat/completions
    • Ollama-compatible: /api/tags, /api/generate
    • vLLM: /v1/models (OpenAI-compatible)
    • Text Generation WebUI: /api/v1/models (may vary)
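for reference, a healthy Ollama /api/tags response contains a models array. the payload below is illustrative (trimmed, not live output) and shows how to list just the model names without needing jq:

```shell
# illustrative /api/tags payload (trimmed); a real server returns more fields
response='{"models":[{"name":"ministral-3:latest"},{"name":"gemma3:4b"}]}'
# pull out the model names with standard tools
echo "$response" | grep -o '"name":"[^"]*"' | cut -d'"' -f4
# prints:
#   ministral-3:latest
#   gemma3:4b
```

if your server’s model list doesn’t look roughly like this (or like an OpenAI /v1/models `data` array), screenpipe has nothing to parse, which matches the “models not detected” symptom.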
if stuck, join our Discord and share your endpoint URL structure and error logs; the community can also recommend models and configs for your setup.