32dots HEIDELBERG AI
Session 3 easy

Power Hermes with your local model

USE 0 - 20 min

Point Hermes at your Ollama server with two config lines

Hermes is a personal AI assistant you install locally (covered in the Hermes course). It needs a model to think with — by default you point it at a cloud provider, but pointing it at Ollama instead means Hermes runs entirely on your machine: zero cloud calls, zero per-token cost, full privacy. Because Ollama speaks the OpenAI-compatible API you set up in the last lesson, the bridge is just two config lines. If you have not installed Hermes yet, do this lesson anyway — the config steps are the same whenever you are ready.

  1. 1 Make sure Ollama is running and a model is pulled — run ollama list to confirm you have a model such as llama3.
  2. 2 Open ~/.hermes/config.yaml in any text editor. (If the file does not exist, run hermes setup first — see the Hermes course, lesson hermes-00.)
  3. 3 Set these two lines: `yaml provider: custom base_url: "http://localhost:11434/v1/" ` Save the file.
  4. 4 Add a placeholder API key — open ~/.hermes/.env and add (or confirm the presence of): ` OPENAI_API_KEY=ollama ` Ollama ignores the key value; Hermes still needs the variable to be set.
  5. 5 Start Hermes: run hermes --tui in a terminal, or open Hermes Desktop. Type: What model are you running on? — it should identify the Ollama model you have loaded.
  6. 6 Optional — use a model on another machine on your LAN. Replace localhost with that machine's IP, e.g. base_url: "http://192.168.1.42:11434/v1/". That machine must have Ollama bound to the network (see lesson ollama-04).

Hermes responds from your local Ollama model. No internet connection is required for the conversation to work.

BUILD 20 - 30 min

Compare local vs cloud Hermes on a real task

Switching between a local and a cloud model in Hermes is a two-line config change. Use that to find where the local model is good enough and where you need the cloud.

Run the same Hermes task twice — once with Ollama, once with a cloud provider — and decide which you would use for each kind of job in your research.

  1. 1 With Ollama active, give Hermes a research task: Search for recent papers on [your topic] and summarise three key findings.
  2. 2 Switch config.yaml to a cloud provider (anthropic, openai, or openrouter) and repeat the exact same task.
  3. 3 Compare quality, speed, and what left your machine.
  4. 4 Write a one-sentence rule for your own use: 'I will use local for X and cloud for Y.'
Deliverable

Both outputs side by side and your one-sentence local-vs-cloud rule.