32dots HEIDELBERG AI
Session 2 easy

Turn it into a local API server

USE 0 - 20 min

Start the server and call it from a curl command

LM Studio can expose your local model as an OpenAI-compatible API at http://localhost:1234/v1. Once that server is running, any script, tool, or application that knows how to talk to the OpenAI API can talk to your local model instead — with one URL change and no API key. This is what makes LM Studio the local engine that tools like Hermes, AnythingLLM, and your own Python scripts can point at.

  1. Start the server via the GUI: click the Developer tab (angle-bracket icon in the left sidebar), then click 'Start Server'. The status bar should show a green dot and localhost:1234.
  2. Alternatively, use the CLI: open a terminal and run `lms server start`. Run `lms server status` to confirm it is up.
  3. Test with curl. Open a terminal and run:

     ```
     curl http://localhost:1234/v1/chat/completions \
       -H 'Content-Type: application/json' \
       -d '{
         "model": "lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF",
         "messages": [{"role": "user", "content": "Name three open-access genomics databases in one sentence each."}]
       }'
     ```

     You will see a single JSON response come back, with the reply text under `choices[0].message.content`. The model name must match what you have loaded.
  4. Test with Python. If you have the openai package (`pip install openai`), run:

     ```python
     from openai import OpenAI

     # Point the standard OpenAI client at the local LM Studio server.
     client = OpenAI(base_url='http://localhost:1234/v1', api_key='local')

     response = client.chat.completions.create(
         model='lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF',
         messages=[{'role': 'user', 'content': 'Summarise the central dogma in 30 words.'}]
     )
     print(response.choices[0].message.content)
     ```

     Note: the api_key value can be anything; the local server ignores it.
  5. Stop the server when done: Developer tab → Stop Server, or `lms server stop` in the terminal.
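Because the model name in the request must match what you have loaded, it can help to ask the server which names it accepts before sending a chat request. The following is a minimal sketch using only the standard library. It assumes the server is running at the default address and answers the OpenAI-style `GET /v1/models` listing; the helper names `extract_model_ids` and `list_local_models` are made up for illustration.

```python
import json
import urllib.request

# Default LM Studio address used throughout this session.
BASE_URL = "http://localhost:1234/v1"


def extract_model_ids(payload: dict) -> list[str]:
    """Pull the model identifiers out of an OpenAI-style /models response."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_local_models(base_url: str = BASE_URL, timeout: float = 5.0) -> list[str]:
    """Ask the running server which model names it currently offers.

    Raises urllib.error.URLError if the server is not reachable.
    """
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return extract_model_ids(json.load(resp))
```

With the server started, `print(list_local_models())` should print a list of identifiers; copy one of them into the `model` field of the curl or Python call.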

curl or Python returned a model-generated response from `localhost:1234`. You have a private OpenAI-compatible endpoint running on your own machine.

BUILD 20 - 30 min

Write a one-function Python helper that wraps your local model

A reusable wrapper means you can call your local model from any script with one import — the same way you would use the real OpenAI SDK.

Write a short Python function `ask_local(prompt, model=None)` that hits your LM Studio server and returns the text reply. Test it with a science question.

  1. Create a file `local_llm.py` with a function that creates an `openai.OpenAI(base_url='http://localhost:1234/v1', api_key='local')` client, sends the prompt as a single user message, and returns `response.choices[0].message.content`.
  2. Accept `model` as a parameter with a sensible default (the model name you use most).
  3. Call it with: `print(ask_local('List three bioinformatics tools for differential expression analysis.'))`
  4. Confirm the reply is correct and the response time is acceptable for your hardware.

Deliverable

A working `local_llm.py` file with the `ask_local` function and one test output.
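If you get stuck, here is one possible shape for `local_llm.py`, offered as a sketch and not the only valid solution. It assumes the server address and model name from the USE section; replace `DEFAULT_MODEL` with whatever you have loaded. The optional `client` argument and the `_default_client` helper are additions for illustration that go beyond the exercise: they let you reuse one client or pass a stand-in object when testing without a server.

```python
"""local_llm.py: a thin wrapper around a local LM Studio server (sketch)."""

BASE_URL = "http://localhost:1234/v1"
# Assumption: swap in the model name you actually have loaded.
DEFAULT_MODEL = "lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF"


def _default_client():
    """Build an OpenAI client that talks to the local server."""
    # Imported here so this module can be loaded before `pip install openai`.
    from openai import OpenAI

    # The key is a placeholder; the local server does not check it.
    return OpenAI(base_url=BASE_URL, api_key="local")


def ask_local(prompt, model=None, client=None):
    """Send one user message to the local model and return the reply text.

    model=None falls back to DEFAULT_MODEL. `client` is a hypothetical
    extra parameter, not required by the exercise.
    """
    if client is None:
        client = _default_client()
    response = client.chat.completions.create(
        model=model or DEFAULT_MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

From any other script in the same folder, `from local_llm import ask_local` followed by `print(ask_local('List three bioinformatics tools for differential expression analysis.'))` produces the test output for the deliverable, provided the server is running.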