32dots HEIDELBERG AI
Session 3 intermediate

Turn on the local OpenAI-compatible API server

USE 0 - 20 min

Expose your local model as an API and call it from curl

Jan can expose your local model as an OpenAI-compatible API server at http://localhost:1337. Once that server is running, any script, tool, or app that knows how to talk to the OpenAI API can talk to your local model instead — with one URL change and no real API key. This is what lets you swap a cloud API call for fully local inference in your own Python or Node projects, or wire Jan into editor extensions for private, on-machine coding assistance.

  1. Make sure a model is loaded in Jan (from lesson 00).
  2. Enable the local API server in Jan's settings (the local API server / developer section). Confirm it is listening on http://localhost:1337.
  3. Test with curl. Open a terminal and run:

     ```
     curl http://localhost:1337/v1/chat/completions \
       -H 'Content-Type: application/json' \
       -d '{
         "messages": [{"role": "user", "content": "Name three open-access genomics databases in one sentence each."}]
       }'
     ```

     You will see a JSON response. If your build requires it, add a "model" field to the JSON body, set to the name of the model you have loaded in Jan.
  4. Test with Python. If you have the openai package (pip install openai), run:

     ```python
     from openai import OpenAI

     # Point the client at Jan's local server instead of the OpenAI cloud.
     client = OpenAI(base_url='http://localhost:1337/v1', api_key='local')
     response = client.chat.completions.create(
         model='local-model',  # replace with the model name shown in Jan
         messages=[{'role': 'user', 'content': 'Summarise the central dogma in 30 words.'}]
     )
     print(response.choices[0].message.content)
     ```

     The api_key value can be anything; local servers ignore it. Use the model name shown in Jan for the model field.
  5. Turn the server off in settings when you are done.

You are done when curl or Python returns a model-generated response from `localhost:1337`. You now have a private, OpenAI-compatible endpoint running on your own machine.
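If you would rather not install the openai package, the curl call from the steps above can be reproduced with the Python standard library alone. The following is a minimal sketch, not part of Jan: the function names `build_chat_request` and `send_chat_request` and the 120-second timeout are illustrative choices, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].message.content` structure.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this lesson.
CHAT_URL = "http://localhost:1337/v1/chat/completions"


def build_chat_request(prompt, model=None):
    """Build the same POST request that the curl command sends."""
    payload = {"messages": [{"role": "user", "content": prompt}]}
    if model is not None:
        # Only include the model field when a name is given.
        payload["model"] = model
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=120):
    """Send the request and return the reply text (needs the server running)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumption: OpenAI-style responses keep the text here.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(send_chat_request(build_chat_request('Summarise the central dogma in 30 words.')))` should print the model's reply. Keeping the request construction separate from the network call makes it easy to inspect what will be sent before anything leaves your script.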

BUILD 20 - 30 min

Write a one-function Python helper that wraps your local model

A reusable wrapper means you can call your local Jan model from any script with one import — the same way you would use the real OpenAI SDK.

Write a short Python function `ask_local(prompt)` that hits Jan's local server and returns the text reply. Test it with a science question.

  1. Create a file local_llm.py with a function ask_local(prompt) that creates an openai.OpenAI(base_url='http://localhost:1337/v1', api_key='local') client, sends the prompt as a single user message, and returns response.choices[0].message.content.
  2. Use the model name shown in Jan for the model argument.
  3. Call it with: print(ask_local('List three bioinformatics tools for differential expression analysis.'))
  4. Check that the reply is factually correct and that the response time is acceptable for your hardware.

Deliverable

A working `local_llm.py` file with the `ask_local` function and one test output.
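One possible shape for `local_llm.py` is sketched below; treat it as a starting point rather than the reference solution. The `BASE_URL` and `MODEL` constants and the optional `client` parameter are illustrative additions that the lesson does not prescribe, and `'local-model'` is only a placeholder that you should replace with the model name shown in Jan.

```python
"""local_llm.py: a minimal sketch of a wrapper around Jan's local server."""

BASE_URL = "http://localhost:1337/v1"  # Jan's local server address from this lesson
MODEL = "local-model"  # placeholder: replace with the model name shown in Jan


def ask_local(prompt, model=MODEL, client=None):
    """Send one user prompt to the local server and return the text reply."""
    if client is None:
        # Imported here so the module still loads if the package is missing.
        from openai import OpenAI

        # The key is a dummy value, as in the lesson's Python example.
        client = OpenAI(base_url=BASE_URL, api_key="local")
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

From another script in the same folder, `from local_llm import ask_local` followed by `print(ask_local('List three bioinformatics tools for differential expression analysis.'))` exercises the helper. Accepting an optional `client` is a design choice that lets you reuse one client across many calls, or pass in a stand-in object to test the function without the server running.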