AI that reads your documents
Ask a document any question — get a grounded answer
- 1 Go to Downloads (curriculum.32dots.de/share) and download 'Session 3 — AI that reads your documents'.
- 2 In n8n: click the ⋯ menu → Import from file. Select the downloaded JSON.
- 3 The workflow opens. Click 'Chat' (bottom right) to open the chat panel.
- 4 Ask: 'What is a transformer architecture?'
- 5 Ask: 'What is the difference between a language model and an embedding model?'
- 6 Ask: 'Who invented the Higgs boson?' (Not in the document — notice what happens.)
- 7 Open the Document URL node and change the URL to any Wikipedia page about your research topic.
You got grounded answers for the first two questions and a correct refusal (or honest 'not in document') for the third.
How document QA works in n8n
This pattern — fetch a document, stuff it into context, answer from it — works for any public URL. The limit is the model's context window: this workflow truncates the page to roughly 12,000 characters, so anything past that cut-off is invisible to the AI. For longer documents you need chunking and retrieval (RAG); Session 14 covers that.
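For orientation, here is a minimal sketch of what a Prepare Context step like this might do, written in the style of an n8n Code node. The use of a Code node and the field names (data, context, question) are assumptions; the imported workflow may wire this differently.

```javascript
// Sketch only: assumes Prepare Context is an n8n Code node and that the
// fetched page body arrives in a `data` field (field names may differ).
const pageText = $input.first().json.data || '';

// The question is pulled from the chat trigger node by name, because the item
// flowing into this node is the fetched page, not the chat message.
const question = $('When chat message received').first().json.chatInput;

// Truncate so the document fits the model's context window (~12,000 chars here).
const context = pageText.substring(0, 12000);

// Hand both to the AI Agent; its system prompt tells it to answer only from `context`.
return [{ json: { context, question } }];
```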
- ?What happens when you ask about something not in the document? Is the refusal consistent?
- ?Open the Prepare Context node. Where does $('When chat message received').first().json.chatInput appear — and why not just $json.chatInput? (A hint follows these questions.)
- ?What would break if you removed the Simple Memory node?
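As a hint for the second question: in n8n, $json refers to the item flowing into the current node, while $('Node Name') reaches back to a specific node's output by name. A minimal illustration, assuming the page body sits in a data field:

```javascript
// $json is whatever flows INTO the current node. Here the HTTP fetch sits
// directly upstream, so $json holds the fetched page, not the chat message.
const page = $json.data;   // assumption: the page body field is `data`

// $('Node Name') addresses a specific node's output by name, regardless of
// what is directly upstream, which is how Prepare Context recovers the question.
const question = $('When chat message received').first().json.chatInput;
```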
Point the workflow at your own document
Change the Document URL to a Wikipedia article, a PubMed abstract, or any public page in your field. Verify the AI answers correctly and refuses questions outside the document.
- 1 Open the Document URL node. Paste a URL for a page relevant to your research.
- 2 Open the chat. Ask three questions: one clearly answered by the page, one at the edge, one definitely outside.
- 3 In the Prepare Context node, change substring(0, 12000) to a larger or smaller value and test how this affects response quality (see the sketch after this list).
- 4 Change the system prompt in the AI Agent to add a citation format: 'Always end your answer with: Source: [section name]'.
- 5 Try two different types of document (e.g. a Wikipedia article and a PubMed abstract). Which one does the AI answer from more reliably, and why?
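For step 3, one way to make the limit easy to experiment with is to pull it into a constant at the top of the Prepare Context code. This is a sketch under the same assumptions as above (Code node, page body in a data field):

```javascript
// Try a few values and compare answer quality and failure modes.
const MAX_CHARS = 12000;   // e.g. 4000, 8000, 20000

const pageText = $input.first().json.data || '';
const question = $('When chat message received').first().json.chatInput;

return [{ json: { context: pageText.substring(0, MAX_CHARS), question } }];
```

For step 4, the citation instruction belongs in the AI Agent's system prompt, not here; note that the model can only cite section names that actually appear in the truncated context.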
Share a screenshot of three test questions (one inside, one edge, one outside) with a one-sentence explanation of why the edge case worked or failed.
✎This workflow always reads the full page on every question. What are the tradeoffs compared to chunking the document once and storing it in a vector database?
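To make the reflection concrete, here is a rough, illustrative sketch of the chunk-once approach (the RAG pattern Session 14 covers). Nothing here exists in the current workflow, and the embedding step is only a placeholder.

```javascript
// Illustrative only: split the document once into fixed-size chunks.
function chunkText(text, size = 1000) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.substring(i, i + size));
  }
  return chunks;
}

// In a RAG setup each chunk is embedded once and stored in a vector database;
// at question time only the most similar chunks are retrieved, instead of
// re-fetching and truncating the whole page on every question.
// const vectors = await embedAll(chunkText(pageText));   // hypothetical helper
```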