Pulling data from scientific databases
Query PubMed with natural language — no credentials needed
- 1 Go to Downloads (curriculum.32dots.de/share) and download 'Session 5 — Pulling data from scientific databases'.
- 2 In n8n: ⋯ → Import from file. Open the chat panel.
- 3 Type: 'Find papers about CRISPR base editing from 2024'.
- 4 Wait for the response — the workflow makes two API calls before the AI answers.
- 5 Type: 'What are the main limitations mentioned in those papers?'
- 6 Type: 'Find me papers about mRNA vaccine immunogenicity'. Note: a new search runs.
- 7 Look at the execution log on the right — click each node to see its output at each stage.
You see AI-summarised paper lists for both queries. You can explain what each node in the execution log produced.
The two-step PubMed API pattern
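Scientific databases are not magic: they are REST APIs returning structured data. For PubMed the pattern is: search → get IDs → fetch details → extract fields. Once you know this pattern, connecting to UniProt, ChEMBL, or Semantic Scholar follows the same logic; only the URLs, parameters, and response field names change (some APIs return the details directly in the search response, which collapses the two calls into one).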
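The pattern can be sketched in plain JavaScript as below. This is a minimal illustration, not the code inside the downloaded workflow: the function names are hypothetical, and the parameter choices (`retmode=json` for the search, `rettype=abstract` with `retmode=text` for the fetch) are assumptions about how the workflow's nodes might be configured. The endpoints themselves are the public NCBI E-utilities `esearch` and `efetch`.

```javascript
// Sketch of search -> get IDs -> fetch details against NCBI E-utilities.
// Function names are hypothetical; they are not taken from the workflow.
const EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils";

// Step 1: build the esearch URL that returns matching PMIDs as JSON.
function buildSearchUrl(term, retmax = 5) {
  const params = new URLSearchParams({
    db: "pubmed",
    term,
    retmax: String(retmax),
    retmode: "json",
  });
  return `${EUTILS}/esearch.fcgi?${params}`;
}

// Step 2: pull the PMIDs out of the esearch JSON response
// and join them into the comma-separated form efetch expects.
function extractIds(esearchJson) {
  const ids = esearchJson?.esearchresult?.idlist ?? [];
  return ids.join(",");
}

// Step 3: build the efetch URL that returns the abstracts as plain text.
function buildFetchUrl(ids) {
  const params = new URLSearchParams({
    db: "pubmed",
    id: ids,
    rettype: "abstract",
    retmode: "text",
  });
  return `${EUTILS}/efetch.fcgi?${params}`;
}

// Steps 1-3 chained together (requires a runtime with a global fetch,
// e.g. Node 18+). Not called here, so nothing hits the network on load.
async function searchPubMed(term, retmax = 5) {
  const searchRes = await fetch(buildSearchUrl(term, retmax));
  const ids = extractIds(await searchRes.json());
  if (!ids) return "";
  const fetchRes = await fetch(buildFetchUrl(ids));
  return fetchRes.text();
}
```

In n8n the two `fetch` calls correspond to HTTP Request nodes, and the pure functions correspond to what a Code or Set node would do between them.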
- ? What is the NCBI rate limit for unauthenticated API calls? Where in the workflow would you add a delay to avoid hitting it?
- ? Change retmax=5 to retmax=20 in the Build Search URL Code node. What happens to response quality vs. cost?
- ? UniProt also has a REST API. What would the esearch-equivalent URL look like to find all human proteins involved in apoptosis?
Extend to a second database
Add a Semantic Scholar API call to retrieve citation counts for the top papers, and include that data in the AI's response. The call has to run before the AI node so the citation data is available as context.
- 1 Note the PMIDs in the Extract IDs node output.
- 2 After the Fetch Abstracts node, add an HTTP Request: GET https://api.semanticscholar.org/graph/v1/paper/PMID:{pmid}?fields=citationCount
- 3 For simplicity, do this for just the first PMID (use $json.ids.split(',')[0]).
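- 4 Add a Set node: citationCount = $json.citationCount, paperId = $json.paperId. Note that paperId is Semantic Scholar's own identifier, not the PMID; if you need the PMID downstream, take it from the Extract IDs node output.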
- 5 Update the Prepare Context Set node to include the citation count.
- 6 Update the AI system prompt to mention citation counts in the summary.
- 7 Test: does the AI now include citation data in its response?
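The data handling in the steps above can be sketched as plain JavaScript. This is an illustration under assumptions, not the workflow's actual node code: the function names are hypothetical, `ids` is assumed to be the comma-separated string produced by the Extract IDs node, and the PMIDs in the usage note are made-up placeholders. The URL shape and the `citationCount` and `paperId` response fields follow the Semantic Scholar request given in step 2.

```javascript
// Sketch of the citation-count extension. Names are hypothetical.
const S2_BASE = "https://api.semanticscholar.org/graph/v1/paper";

// Step 3: take only the first PMID from a comma-separated ID string.
function firstPmid(ids) {
  return ids.split(",")[0].trim();
}

// Step 2: build the Semantic Scholar lookup URL for one PMID.
function buildCitationUrl(pmid) {
  return `${S2_BASE}/PMID:${encodeURIComponent(pmid)}?fields=citationCount`;
}

// Step 4: keep the fields the Set node would keep. The PMID is passed in
// separately because paperId is Semantic Scholar's own ID, not the PMID.
function toCitationRecord(pmid, s2Json) {
  return {
    pmid,
    citationCount: s2Json.citationCount ?? null,
    paperId: s2Json.paperId ?? null,
  };
}

// Step 5: append the citation count to the context text sent to the AI.
function appendCitation(context, record) {
  const count = record.citationCount === null ? "unknown" : record.citationCount;
  return `${context}\n\nCitation count for PMID ${record.pmid}: ${count}`;
}
```

For example, with a placeholder ID string `"12345678,23456789"`, `buildCitationUrl(firstPmid(ids))` yields the URL for PMID 12345678 only. Falling back to "unknown" when no count comes back keeps the AI from treating a missing value as zero citations.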
Screenshot of a workflow run that includes citation count data in the AI's response.
✎ Your workflow runs a fresh PubMed query on every message. What would you need to change to cache results, so that repeat queries for the same topic don't burn rate limits?