Install LM Studio & download your first model

USE 0 - 20 min

Get a local model answering questions in under 20 minutes

LM Studio is a free desktop app that turns downloading and running open AI models into a point-and-click experience: no command line, no account, and no data leaving your machine. The first step is always picking a model that fits your RAM. Models come in GGUF format at several quantization levels, which set how many bits each weight is stored in; fewer bits means a smaller file and less memory, at some cost in answer quality. A Q4_K_M (4-bit) version of a 7–8 billion parameter model needs roughly 5–6 GB of RAM and runs on most modern laptops. A 13B Q4 needs around 8–10 GB; a 70B Q4 needs 40+ GB and usually requires a GPU. When in doubt, start with a 7B or 8B Q4_K_M model.
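
The RAM figures above follow from simple arithmetic: parameter count times bits per weight, divided by 8 to get bytes, plus some working memory. The Python sketch below is a rough estimate, not LM Studio's own calculation. The 4.85 bits-per-weight average for Q4_K_M and the 1 GB overhead are assumptions chosen for illustration, and real usage grows with context length.

```python
def estimate_ram_gb(params_billion: float,
                    bits_per_weight: float = 4.85,   # assumed average for Q4_K_M
                    overhead_gb: float = 1.0) -> float:  # assumed runtime/context overhead
    """Rough RAM estimate in GB for a quantized model."""
    # billions of parameters * bits per weight / 8 bits per byte = gigabytes of weights
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


if __name__ == "__main__":
    for size in (8, 13, 70):
        print(f"{size}B at Q4_K_M: about {estimate_ram_gb(size):.1f} GB")
```

If the estimate for a model is close to your machine's total RAM, pick a smaller model or a lower quantization level; the operating system and other apps need memory too.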

1 Download LM Studio at lmstudio.ai — choose the installer for your platform (macOS with Apple Silicon, Windows, or Linux). Run it; no account is required.
2 Open the Discover tab (the magnifying-glass icon on the left sidebar). Search for a model — a good first pick is lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF. Look for the Q4_K_M variant. Click Download.
3 Wait for the download to finish (the model file is 4–6 GB). The progress bar is in the bottom status bar.
4 Switch to the Chat tab (speech-bubble icon), select your downloaded model in the top dropdown, and type: "Explain RNA-seq in two sentences for a biologist who has never heard of it."
5 Read the reply. If it arrives — even slowly — your setup is working. Nothing was sent to the internet.

✓

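You received a coherent reply from a local model. Your operating system's network monitor (for example, the Network tab in Activity Monitor on macOS or in Resource Monitor on Windows) shows no outbound traffic to any AI service.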

BUILD 20 - 30 min

Find the smallest model that answers well enough for your work

Bigger is not always better when RAM is the constraint. The goal is the smallest model that gives you answers you can trust for your actual tasks.

Your task

Download one more model at a different size or quantization level, ask both the same science question, and decide which is your daily driver.

1 In the Discover tab, find a smaller variant of the same model family (e.g. a Q2_K quantization or a 3B-parameter model) or a different family entirely (Qwen, Mistral, Phi).
2 Download it and chat with it using the same prompt you used in the USE phase.
3 Compare: response quality, response speed, and RAM usage (visible in the LM Studio status bar).
4 Pick a winner and note why — quality, speed, or memory fit.
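
The comparison in the steps above can be written down as a small decision rule. This is a minimal Python sketch, assuming you rate answer quality yourself on a 1–5 scale and note speed and RAM by hand; the `ModelRun` record, the `pick_daily_driver` helper, the thresholds, the model names, and the numbers in the example are all hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelRun:
    name: str
    quality: int           # your own 1-5 rating of the answer
    tokens_per_sec: float  # response speed as you observed it
    ram_gb: float          # RAM usage as you observed it


def pick_daily_driver(runs: list[ModelRun],
                      min_quality: int = 3,
                      ram_budget_gb: float = 8.0) -> Optional[ModelRun]:
    """Return the lowest-RAM run that is good enough and fits the budget."""
    good_enough = [r for r in runs
                   if r.quality >= min_quality and r.ram_gb <= ram_budget_gb]
    if not good_enough:
        return None  # nothing qualifies; try another model or relax a threshold
    # Smallest memory footprint wins; faster response breaks ties.
    return min(good_enough, key=lambda r: (r.ram_gb, -r.tokens_per_sec))


if __name__ == "__main__":
    # Placeholder values for illustration only; replace with your own notes.
    runs = [
        ModelRun("model-8b-q4", quality=4, tokens_per_sec=12.0, ram_gb=5.8),
        ModelRun("model-3b-q4", quality=3, tokens_per_sec=30.0, ram_gb=2.9),
    ]
    winner = pick_daily_driver(runs)
    print(winner.name if winner else "No model met the bar")
```

The rule encodes the goal of this phase: quality is a minimum bar to clear, and among models that clear it, the one with the smallest memory footprint wins.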

Deliverable

A one-sentence verdict: which model you chose and the reason (quality / speed / RAM).