Multi-stage literature pipeline
Run a five-stage literature pipeline — query to comparison table
- 1 Go to Downloads (curriculum.32dots.de/share) and download 'Session 8 — Multi-stage literature pipeline'.
- 2 In n8n: ⋯ → Import from file. Open the chat panel.
- 3 Type: 'mTOR inhibitor resistance mechanisms in breast cancer'.
- 4 Wait while the pipeline runs its five stages (watch the execution log on the right as each node lights up).
- 5 Read the Markdown table in the response. Check: does the AI correctly identify methods and limitations?
- 6 Run again with your own research topic.
- 7 Click into the 'Stage 3 — AI Extract' node in the execution log. Read the raw JSON it returned.
You see a comparison table with at least 3-4 papers. You can identify which stage is Stage 3 (AI extraction) and describe what it does.
Five-stage pipeline design
Keep each stage's responsibility obvious and testable. You can copy one abstract into the Stage 3 AI Agent and run it alone to check extraction quality. Separation of concerns is not just good engineering — it is good scientific workflow design.
- ?What happens if the AI returns slightly malformed JSON in Stage 3? Open the Code node and find where this is handled.
- ?The extraction prompt asks for 5 fields. What happens if an abstract does not mention methodology? Is the result filtered out?
- ?How would you extend Stage 5 to also produce a BibTeX citation file alongside the Markdown table?
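The malformed-JSON question above usually comes down to a parse-with-fallback pattern in the Code node after Stage 3. A minimal sketch in the style of an n8n Code node function (the field names `title` and `methods` are assumptions for illustration; your workflow's actual fields may differ):

```javascript
// Parse AI output that may be wrapped in markdown fences or contain a
// trailing comma. Returns null when it still cannot be parsed, so a
// later node can filter bad extractions instead of crashing the run.
function parseExtraction(raw) {
  // Strip ```json fences the model sometimes adds around its answer
  let text = raw.replace(/```json|```/g, "").trim();
  // Remove trailing commas before } or ] (a common LLM slip)
  text = text.replace(/,\s*([}\]])/g, "$1");
  try {
    return JSON.parse(text);
  } catch (e) {
    return null; // signal "unparseable" rather than throwing
  }
}

const good = parseExtraction('```json\n{"title": "Paper A", "methods": "RNA-seq",}\n```');
const bad = parseExtraction("Sorry, I cannot extract that.");
```

Returning `null` instead of throwing keeps one bad abstract from aborting the whole pipeline; the downstream node decides whether to drop or flag it.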
Add a sixth stage: citation counts
After Stage 2 (Fetch), add a Semantic Scholar API call that retrieves citation counts, then incorporate them into the Stage 5 table.
- 1 After Stage 2 — Fetch (PubMed efetch), add an HTTP Request: GET https://api.semanticscholar.org/graph/v1/paper/PMID:{pmid}?fields=citationCount — start with one PMID ($('Stage 2 — Fetch').first().json.ids.split(',')[0]).
- 2 Add a Set node that extracts citationCount and passes it forward alongside the abstracts.
- 3 Update the Stage 3 AI Extract system prompt: add a 'citations' field to the requested JSON (pass the count as context).
- 4 Update the Stage 4+5 Code node to include a Citations column in the Markdown table.
- 5 Test with a well-known paper. Does the count roughly match Google Scholar? (Exact agreement is unlikely; the two indexes cover different sources.)
- 6 Test with a paper from 2024. What happens when citation data is not yet available?
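Steps 2 and 6 hinge on how you handle a missing `citationCount`. A minimal sketch of the Set/Code-node logic, assuming the HTTP Request node returns the Semantic Scholar JSON body as-is (the `"n/a"` placeholder is my choice for the table, not something the API returns):

```javascript
// Pull citationCount out of a Semantic Scholar response and make the
// "no data yet" case (very recent papers) explicit rather than blank.
function readCitationCount(response) {
  // citationCount can legitimately be 0, so test for null/undefined,
  // not for falsiness.
  if (response == null || response.citationCount == null) {
    return "n/a"; // rendered as-is in the Citations column
  }
  return response.citationCount;
}

readCitationCount({ paperId: "abc", citationCount: 0 }); // → 0, not "n/a"
readCitationCount({ paperId: "xyz" });                   // → "n/a"
```

The `== null` check is deliberate: a brand-new 2024 paper with zero citations should show `0`, while a paper the index has no data for should show `n/a`.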
Screenshot of a comparison table with a Citations column, plus a one-sentence note on what happened with the newest paper.
✎Your pipeline runs in about 60 seconds for 5 papers. How would you adapt it to run nightly on a saved PubMed search and send you a Mattermost message when new papers appear — without triggering it manually?
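One way to approach the scheduled-run question: a Schedule Trigger replaces the chat trigger, and a Code node compares today's PMIDs against those already seen on earlier runs before anything is sent to Mattermost. A sketch of just the dedup step, assuming you persist the seen list somewhere (e.g. n8n workflow static data; the names below are illustrative):

```javascript
// Given today's PMIDs and the list already seen on earlier runs,
// return only the new ones; only these go into the notification.
function findNewPmids(todaysPmids, seenPmids) {
  const seen = new Set(seenPmids);
  return todaysPmids.filter((pmid) => !seen.has(pmid));
}

const fresh = findNewPmids(["111", "222", "333"], ["111", "222"]);
// fresh is ["333"]; if fresh is empty, skip the Mattermost message entirely
```

An IF node after this step can then short-circuit the workflow on nights with no new papers.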