Call the OpenAI-compatible local API

USE 0 - 20 min

Hit your local model from curl and Python — like the OpenAI API

Once Ollama is installed, its server normally runs in the background at http://localhost:11434. It exposes two APIs: a native one at /api/chat, and an OpenAI-compatible one at /v1/. The second is the powerful part: any script, tool, or SDK that talks to the OpenAI API can talk to your local model instead. Swap one base URL, keep your existing code, and no data leaves the machine. This is what makes Ollama the local engine behind your analysis scripts and tools like Hermes.
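As a rough illustration of the layout described above, the sketch below builds the URLs for both APIs and checks whether anything answers at the server address. The helper names `endpoint_urls` and `server_is_up` are hypothetical, chosen only for this example. The `/v1/chat/completions` path is the standard OpenAI chat route appended to the `/v1/` base, and the check assumes the server replies with HTTP 200 at its root URL.

```python
import urllib.error
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # default address given in the text


def endpoint_urls(host: str = OLLAMA_HOST) -> dict:
    """Return the URLs of the native and OpenAI-compatible APIs (hypothetical helper)."""
    host = host.rstrip("/")
    return {
        "native_chat": f"{host}/api/chat",             # Ollama's native chat endpoint
        "openai_base": f"{host}/v1/",                  # base_url to hand to an OpenAI SDK
        "openai_chat": f"{host}/v1/chat/completions",  # the route the SDK calls under that base
    }


def server_is_up(host: str = OLLAMA_HOST, timeout: float = 2.0) -> bool:
    """Return True if the host's root URL answers with HTTP 200 (hypothetical helper)."""
    try:
        with urllib.request.urlopen(host, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as "not up".
        return False
```

Calling `server_is_up()` before a long analysis run is a cheap way to fail early if the server is not running.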

1 Confirm the server is up: run ollama list — if it responds, the server at localhost:11434 is running. (If not, run ollama serve to start it manually.)
2 Test the native API with curl:

```
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [{"role": "user", "content": "Name three open-access genomics databases in one sentence each."}],
  "stream": false
}'
```

With "stream": false you get one JSON object back; the message.content field holds the text.
3 Test the OpenAI-compatible API with Python. If you have the openai package (pip install openai), run:

```python
from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')
response = client.chat.completions.create(
    model='llama3',
    messages=[{'role': 'user', 'content': 'Summarise the central dogma in 30 words.'}]
)
print(response.choices[0].message.content)
```

The api_key value is required by the SDK but ignored by Ollama; any non-empty string works.
4 Notice how small the difference from real OpenAI code is: the client line gains a base_url and a placeholder api_key, and model names a local model. Everything else is identical.
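For comparison, the native request from step 2 can also be sent from Python with only the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, and the two `extract_*` helpers exist only to show where the reply text sits in each API's JSON (`message.content` for the native API, `choices[0].message.content` for the OpenAI-compatible one).

```python
import json
import urllib.request


def build_native_request(prompt, model="llama3", host="http://localhost:11434"):
    """Build the same POST request that the curl command in step 2 sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream
    }
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_native_reply(body: dict) -> str:
    """Native API: the text is in message.content."""
    return body["message"]["content"]


def extract_openai_reply(body: dict) -> str:
    """OpenAI-compatible API: the text is in choices[0].message.content."""
    return body["choices"][0]["message"]["content"]


def ask_native(prompt, model="llama3"):
    """Send the prompt to the native API and return the reply text."""
    request = build_native_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return extract_native_reply(json.load(resp))
```

With the server running, `print(ask_native("Summarise the central dogma in 30 words."))` should print a model-generated reply.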

✓

curl and Python both returned a model-generated response from `localhost:11434`. You have a private OpenAI-compatible endpoint running on your own machine.

BUILD 20 - 30 min

Write a one-function Python helper that wraps your local model

A reusable wrapper means you can call your local model from any analysis script with one import — the same way you would use the real OpenAI SDK.

Your task

Write a short Python function `ask_local(prompt, model='llama3')` that hits your Ollama server and returns the text reply. Test it with a science question.

1 Create a file local_llm.py with a function that creates an openai.OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama') client and returns response.choices[0].message.content.
2 Accept model as a parameter with a sensible default (the model name you use most).
3 Call it with: print(ask_local('List three bioinformatics tools for differential expression analysis.'))
4 Check the reply for factual accuracy and confirm the response time is acceptable for your hardware.
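One possible shape for local_llm.py is sketched below, as a starting point rather than the expected solution. It follows steps 1 and 2 as written. The optional `client` argument is an addition that the task does not ask for, included here only so the function can be exercised without a running server.

```python
def ask_local(prompt, model="llama3", client=None):
    """Send one user prompt to the local Ollama server and return the reply text.

    `client` is a hypothetical extra parameter (not part of the task): pass a
    pre-built or fake client to reuse a connection or to test offline.
    """
    if client is None:
        # Imported here so this module still loads where openai is not installed.
        from openai import OpenAI

        # Ollama ignores the key, but the SDK requires a non-empty string.
        client = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

From another script, `from local_llm import ask_local` followed by the call in step 3 is all that is needed. Creating the client on every call keeps the sketch short; a script that makes many calls may prefer to build one client and pass it in.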

Deliverable

A working `local_llm.py` file with the `ask_local` function and one test output.