Inference Providers — Cheat Sheet
Groq · Cerebras · OpenRouter · all three use the OpenAI-compatible API — change 3 lines to switch · COS · 32dots.de
Provider quick-reference
Groq
  Free tier: ~14,400 req/day · ~30 req/min · per-model daily caps
  Speed: ~500–800 tok/s
  base_url: https://api.groq.com/openai/v1
  Key: console.groq.com
  Models: Llama 3.x, DeepSeek-R1, gpt-oss, Whisper STT
  Best for: Fast iteration, n8n pipelines, voice-to-text

Cerebras
  Free tier: 1,000,000 tokens/day · ~8K context cap · ~30 req/min
  Speed: ~1,800–3,000 tok/s
  base_url: https://api.cerebras.ai/v1
  Key: cloud.cerebras.ai
  Models: Llama 3.3 70B, Llama 4 Scout, Qwen3-235B
  Best for: High-volume free usage, fastest big models

OpenRouter
  Free tier: BYO key, pay per model; some models have free tiers
  Speed: Varies by model
  base_url: https://openrouter.ai/api/v1
  Key: openrouter.ai
  Models: GPT-4o, Claude, Gemma, Llama, hundreds more
  Best for: Widest model choice, one key for everything
The OpenAI-compatible swap: 3 lines to switch provider
Change base_url · api_key · model
curl
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama-3.1-8b-instant",
    "messages": [{"role":"user",
      "content":"Hello"}]
  }'

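# Swap to Cerebras: change the URL, the key, and the model:
# https://api.cerebras.ai/v1/chat/completions
# Authorization: Bearer <your Cerebras key>
# model: llama-3.3-70b

# Swap to OpenRouter: same three changes:
# https://openrouter.ai/api/v1/chat/completions
# Authorization: Bearer <your OpenRouter key>
# model: openai/gpt-4o  (or any slug)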
Python — openai SDK
from openai import OpenAI

# --- Groq ---
client = OpenAI(
  base_url="https://api.groq.com/openai/v1",
  api_key="YOUR_GROQ_KEY"
)

# --- Cerebras (swap base_url, api_key, and model) ---
# base_url="https://api.cerebras.ai/v1",
# api_key="YOUR_CEREBRAS_KEY"
# model="llama-3.3-70b"

# --- OpenRouter (swap base_url, api_key, and model) ---
# base_url="https://openrouter.ai/api/v1",
# api_key="YOUR_OR_KEY"
# model="openai/gpt-4o"

r = client.chat.completions.create(
  model="llama-3.1-8b-instant",  # Groq model ID; change it together with base_url and api_key
  messages=[{"role":"user",
             "content":"Hello"}]
)
print(r.choices[0].message.content)
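The three-value swap can also be captured as a lookup table, so that switching provider becomes a one-word change. The following is a minimal sketch and not part of the original sheet: the `PROVIDERS` table, the helpers `client_settings` and `ask`, and the environment variable names `CEREBRAS_KEY` and `OPENROUTER_KEY` are assumptions for illustration (`GROQ_KEY` follows the curl example). The base URLs and model IDs are the ones listed in this sheet.

```python
import os

# Base URLs and model IDs are the ones listed in this cheat sheet.
# The env var names are assumptions; use whatever names hold your own keys.
PROVIDERS = {
    "groq": {
        "base_url": "https://api.groq.com/openai/v1",
        "key_env": "GROQ_KEY",
        "model": "llama-3.1-8b-instant",
    },
    "cerebras": {
        "base_url": "https://api.cerebras.ai/v1",
        "key_env": "CEREBRAS_KEY",
        "model": "llama-3.3-70b",
    },
    "openrouter": {
        "base_url": "https://openrouter.ai/api/v1",
        "key_env": "OPENROUTER_KEY",
        "model": "openai/gpt-4o",
    },
}


def client_settings(provider, env=None):
    """Return (OpenAI constructor kwargs, model ID) for a provider name."""
    env = os.environ if env is None else env
    cfg = PROVIDERS[provider]  # raises KeyError for an unknown provider
    api_key = env.get(cfg["key_env"])
    if not api_key:
        raise RuntimeError(f"Set {cfg['key_env']} to your own API key first")
    return {"base_url": cfg["base_url"], "api_key": api_key}, cfg["model"]


def ask(provider, prompt):
    """Send one chat message; needs the openai package and a valid key."""
    from openai import OpenAI  # lazy import: the table works without the SDK

    kwargs, model = client_settings(provider)
    client = OpenAI(**kwargs)
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return r.choices[0].message.content
```

With the matching environment variable set, `print(ask("cerebras", "Hello"))` would send the same request as above to Cerebras. Keeping the model ID in the table next to the URL avoids the common mistake of switching the endpoint while still asking for another provider's model.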
n8n — HTTP Request node
Method: POST

# Groq
URL: https://api.groq.com/openai/v1/chat/completions

# Cerebras (change URL, key, and model)
URL: https://api.cerebras.ai/v1/chat/completions

# OpenRouter (change URL, key, and model)
URL: https://openrouter.ai/api/v1/chat/completions

Headers:
  # Use the key of whichever provider the URL points at
  Authorization: Bearer {{ $env.GROQ_KEY }}
  Content-Type: application/json

Body (JSON):
{
  "model": "llama-3.1-8b-instant",
  "messages": [
    {"role":"user","content":"{{ $json.prompt }}"}
  ]
}
The rule: all three providers speak the OpenAI chat completions format. Your SDK code, error handling, and response parsing stay unchanged; only the URL, key, and model name change. Whisper STT is Groq-only: POST to /audio/transcriptions under the Groq base_url, with model whisper-large-v3.
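Because the response shape and the error handling are shared, one parsing helper and one retry wrapper can serve all three providers. The sketch below is an illustration under assumptions, not a prescribed method: `extract_text`, `is_rate_limited`, and `call_with_backoff` are hypothetical names, and the retry policy (exponential backoff on HTTP 429) is one common way to stay within the roughly 30 requests per minute quoted for the free tiers.

```python
import random
import time


def extract_text(response):
    """Pull the assistant text out of a chat completions response (as a dict)."""
    return response["choices"][0]["message"]["content"]


def is_rate_limited(exc):
    """True if the exception carries HTTP status 429.

    urllib's HTTPError exposes `.code`; the openai SDK's status errors
    expose `.status_code`. Both are checked here.
    """
    return 429 in (getattr(exc, "code", None), getattr(exc, "status_code", None))


def call_with_backoff(send, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call send() and retry on HTTP 429 with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return send()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failure
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Waits of about 2 s, 4 s, 8 s, ... plus up to 1 s of random jitter
            sleep(base_delay * 2 ** attempt + random.random())
```

`send` is any zero-argument function that performs one request, for example a lambda wrapping `client.chat.completions.create(...)`. The same wrapper works regardless of which provider the client points at.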

Own-key warning: always use your own free key — never a shared course key. The 32dots project Groq key was suspended after a silent batch job spent €19.71 overnight. Personal free keys are instant to create and isolated to you.
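
Unattended batch jobs are where spending goes unnoticed. One possible safeguard, sketched here under assumptions, is a hard cap on the number of requests a job may make. `RequestBudget`, `BudgetExceeded`, and `run_batch` are hypothetical names, and `send` stands for any function you supply that makes one API call. The cap counts requests rather than money, since this sheet gives no pricing.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a job tries to make more calls than it was allowed."""


class RequestBudget:
    """Hard cap on the number of API calls a single job may make."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.used = 0

    def spend(self):
        """Count one request; raise once the cap has been reached."""
        if self.used >= self.max_requests:
            raise BudgetExceeded(f"request cap of {self.max_requests} reached")
        self.used += 1


def run_batch(prompts, send, budget):
    """Send each prompt through send(prompt), stopping hard at the budget."""
    results = []
    for prompt in prompts:
        budget.spend()  # raises instead of silently continuing to spend
        results.append(send(prompt))
    return results
```

A job started with `RequestBudget(500)` fails loudly on the 501st call instead of running all night. Combined with a personal key, any overrun stays small and affects only you.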