Call the OpenAI-compatible local API
Hit your local model from curl and Python — like the OpenAI API
Once Ollama is installed, its server normally runs in the background at http://localhost:11434. It exposes two APIs: a native one at /api/chat, and an OpenAI-compatible one at /v1/. The second is the powerful part: any script, tool, or SDK that talks to the OpenAI API can talk to your local model instead. Swap one base URL, keep your existing code, and no data leaves the machine. This is what makes Ollama the local engine behind your analysis scripts and tools like Hermes.
1. Confirm the server is up: run `ollama list`. If it responds, the server at `localhost:11434` is running. (If not, run `ollama serve` to start it manually.)
2. Test the native API with curl:

   ```
   curl http://localhost:11434/api/chat -d '{
     "model": "llama3",
     "messages": [{"role": "user", "content": "Name three open-access genomics databases in one sentence each."}],
     "stream": false
   }'
   ```

   With `"stream": false` you get one JSON object back; the `message.content` field holds the text.
3. Test the OpenAI-compatible API with Python. If you have the `openai` package (`pip install openai`), run:

   ```python
   from openai import OpenAI

   client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')
   response = client.chat.completions.create(
       model='llama3',
       messages=[{'role': 'user', 'content': 'Summarise the central dogma in 30 words.'}]
   )
   print(response.choices[0].message.content)
   ```

   The `api_key` value is required by the SDK but ignored by Ollama; any non-empty string works.
4. Notice the one-line difference from real OpenAI code: only the `base_url` changed. Everything else is identical.
curl and Python both returned a model-generated response from `localhost:11434`. You have a private OpenAI-compatible endpoint running on your own machine.
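If you would rather not shell out to curl, the native `/api/chat` request can also be made from Python's standard library. The following is a minimal sketch that mirrors the curl call above. The names `build_chat_payload` and `chat_native` and the `timeout` default are illustrative assumptions, not part of Ollama.

```python
import json
import urllib.request

# Native chat endpoint, same URL as the curl example.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(prompt, model="llama3"):
    """Build the same JSON body the curl example sends (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream
    }


def chat_native(prompt, model="llama3", timeout=120):
    """POST the payload to /api/chat and return the text in message.content."""
    body = json.dumps(build_chat_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["message"]["content"]
```

With the server running and the model pulled, `print(chat_native("Name three open-access genomics databases."))` should print the model's reply. The timeout is set generously on the assumption that a first call may be slow while the model loads.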
Write a one-function Python helper that wraps your local model
A reusable wrapper means you can call your local model from any analysis script with one import — the same way you would use the real OpenAI SDK.
Write a short Python function `ask_local(prompt, model='llama3')` that hits your Ollama server and returns the text reply. Test it with a science question.
1. Create a file `local_llm.py` with a function that creates an `openai.OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')` client and returns `response.choices[0].message.content`.
2. Accept `model` as a parameter with a sensible default (the model name you use most).
3. Call it with: `print(ask_local('List three bioinformatics tools for differential expression analysis.'))`
4. Confirm the reply is correct and the response time is acceptable for your hardware.
A working `local_llm.py` file with the `ask_local` function and one test output.
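One possible shape for `local_llm.py` is sketched below, to compare against your own attempt. It assumes the `openai` package is installed and the `llama3` model is available locally. The optional `client` argument and the `_default_client` helper are additions for illustration, not part of the exercise: they let you pass in a stub client to test the function without a running server.

```python
def _default_client():
    """Create an OpenAI SDK client that points at the local Ollama server."""
    # Imported here so this module still imports when openai is not installed.
    from openai import OpenAI

    # The api_key is required by the SDK but ignored by Ollama.
    return OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")


def ask_local(prompt, model="llama3", client=None):
    """Send one user prompt to the local model and return the text reply."""
    if client is None:
        client = _default_client()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

From any analysis script in the same directory, `from local_llm import ask_local` followed by `print(ask_local('List three bioinformatics tools for differential expression analysis.'))` exercises the helper end to end. This sketch builds a new client on every call to keep the function self-contained. If you call it in a tight loop, creating one client up front and passing it in may be preferable.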