Architecture
System Overview
Olink RAG is organized into four main backend layers, with the React frontend maintained in a separate repository:
┌─────────────────────────────────────────────────────────┐
│ Frontend (React + Sigma.js) — separate repo │
├─────────────────────────────────────────────────────────┤
│ API Layer (FastAPI + Granian) │
│ ├── ingestion/ query/ workspace/ platform/ │
│ └── shared/ (middleware, security, rate limiting, SSE) │
├─────────────────────────────────────────────────────────┤
│ Core Services & Agents │
│ ├── agents/ (TwoPhase, Dynamic, Neo4j, LangGraph) │
│ ├── services/ (query, ingestion, workspace, platform) │
│ ├── factories/ (database, LLM, embedding, agent) │
│ └── kg_tools/ (registry, CLI, MCP, LangChain adapter) │
├─────────────────────────────────────────────────────────┤
│ Pipeline Layer │
│ ├── ingest/ (fetchers, chunkers, extractors, KG pipe) │
│ ├── enrichment/ (column analysis, node annotation) │
│ └── processors/ (41 modules: consolidation, enrichment, │
│ science skills integrators) │
├─────────────────────────────────────────────────────────┤
│ Data Layer │
│ ├── Neo4j (knowledge graph, vector indexes) │
│ └── Redis (sessions, query cache, job state) │
└─────────────────────────────────────────────────────────┘
Key Architectural Patterns
- Factory Pattern: `src/factories/` creates database, LLM, embedding, and agent instances based on configuration. Supports Ollama, Bedrock, SageMaker, and OpenAI backends.
- Domain-Based Routing: the API is organized by business domain (ingestion, query, workspace, platform), each with its own router and routes subdirectory.
- Two-Phase Query Architecture: Phase 1 executes all graph tools deterministically (no LLM). Phase 2 makes a single LLM call to synthesize the results. Every tool is guaranteed to run, and all data in the synthesis context comes from tool output rather than from the LLM.
- Cascading Entity Consolidation: UniProt ID matching → synonym/gene symbol matching → fuzzy name matching. Each stage catches what the previous one missed.
- Relationship Consolidation: merges duplicate edges between the same entity pair while preserving all evidence (PMIDs, confidence scores, extraction methods, temporal tracking).
- Registry Pattern for Tools: a single `ToolRegistry` is the source of truth for all 16 KG tools. The CLI, MCP, and LangChain interfaces derive from it automatically.
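The cascading consolidation order (UniProt ID, then synonyms, then fuzzy name) can be sketched as follows. This is a minimal illustration and not the project's implementation: the `Entity` record, the `resolve` function, and the 0.9 similarity threshold are all assumptions, and the standard-library `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the pipeline actually uses.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Entity:
    """Hypothetical canonical entity record; field names are illustrative."""
    name: str
    uniprot_id: Optional[str] = None
    synonyms: set[str] = field(default_factory=set)


def resolve(candidate: Entity, known: list[Entity],
            fuzzy_threshold: float = 0.9) -> Optional[Entity]:
    """Return the known entity that `candidate` should merge into, or None."""
    # Stage 1: exact UniProt ID match (the most reliable signal).
    if candidate.uniprot_id:
        for entity in known:
            if entity.uniprot_id == candidate.uniprot_id:
                return entity

    # Stage 2: synonym / gene symbol match, case-insensitive.
    cand_names = {candidate.name.lower()} | {s.lower() for s in candidate.synonyms}
    for entity in known:
        names = {entity.name.lower()} | {s.lower() for s in entity.synonyms}
        if cand_names & names:
            return entity

    # Stage 3: fuzzy name match for whatever the earlier stages missed.
    best, best_score = None, 0.0
    for entity in known:
        score = SequenceMatcher(None, candidate.name.lower(),
                                entity.name.lower()).ratio()
        if score > best_score:
            best, best_score = entity, score
    return best if best_score >= fuzzy_threshold else None
```

Ordering the stages from most to least reliable means a cheap exact match short-circuits the cascade, and the fuzzy stage only sees candidates that have no identifier or synonym overlap.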
Architecture Diagrams
Full System
flowchart TB
PUBMED[("PubMed<br/>(Entrez API)")]
CSV[("CSVs<br/>(Ontologies, Proteins)")]
subgraph INGEST ["INGESTION PIPELINE"]
direction TB
subgraph FETCH ["Data Fetching"]
ING["ingestor.py<br/>fetch_pubmed_abstracts()"]
SCRAPER["pubmed_mass_scraper.py"]
end
subgraph KG_BUILD ["KG Construction"]
KGP["KGPipeline"]
subgraph STATE ["StateGraph"]
direction LR
EXT["extract_kg<br/>LLM → JSON"]
FILT["filter_ontology"]
LINK["link_protein_entities<br/>UniProt matching"]
EXT --> FILT --> LINK
end
KGP --> STATE
end
subgraph ENRICH ["Enrichment & Resolution"]
ER["entity_resolver.py<br/>MONDO mapping"]
IC["incremental_consolidation.py"]
RC["relationship_counter.py"]
end
subgraph EMBED ["Embedding Pipeline"]
EP["embedding_pipeline.py"]
CHUNK["chunker.py"]
end
end
subgraph STORAGE ["STORAGE"]
direction LR
NEO4J[("Neo4j")]
VECTOR[("Vector Index")]
end
subgraph FACTORIES ["FACTORIES"]
direction TB
LLM_F["llm_factory → Ollama/Bedrock/SageMaker"]
EMB_F["embedding_factory → SentenceTransformers"]
DB_F["database_factory → Neo4j"]
QA_F["query_agent_factory"]
end
subgraph QUERY ["QUERY LAYER"]
direction TB
subgraph AGENTS ["Agents"]
DQA["DynamicQueryAgent"]
NQA["Neo4jQueryAgent"]
TPA["TwoPhaseAgent"]
end
subgraph RETRIEVAL ["Retrieval"]
VR["VectorRetriever"]
HR["HybridRetriever"]
T2C["Text2CypherRetriever"]
end
subgraph PROCESSING ["Processing"]
QP["QueryProcessor"]
ED["EntityDiscovery"]
RP["ResultProcessor<br/>CrossEncoder"]
RF["REFRAG compression"]
end
end
subgraph API ["API LAYER"]
FAST["FastAPI + Granian"]
QS["QueryService"]
FAST --> QS
end
USER(("User"))
REACT["React Frontend"]
PUBMED --> ING
CSV --> KGP
ING --> KGP
LINK --> ER & IC & RC
KGP --> EP --> CHUNK
RC --> NEO4J
EP --> VECTOR --> NEO4J
USER --> REACT --> FAST
QS --> QA_F --> AGENTS
AGENTS --> RETRIEVAL --> NEO4J
RETRIEVAL --> RP --> RF --> LLM_F
LLM_F -.-> EXT & AGENTS
EMB_F -.-> EP
DB_F -.-> KGP & AGENTS
API + Query Agent Flow
flowchart TB
subgraph API["API Layer"]
EP_SESSION["POST /v1/sessions"]
EP_QUERY["POST /v1/sessions/{id}/query"]
end
subgraph SM["Session Manager"]
SM_CREATE["create_session()"]
SM_EXEC["execute_query()"]
end
subgraph QS["Query Service"]
QS_CREATE["create_agent()"]
QS_EXEC["execute_query()"]
QS_REFRAG["REFRAG compression"]
end
subgraph AGENTS["Agent Layer"]
TPA["TwoPhaseAgent<br/>(default)"]
NQA["Neo4jQueryAgent"]
DQA["DynamicQueryAgent"]
end
subgraph TPA_INTERNAL["TwoPhaseAgent Pipeline"]
direction LR
TP_DISC["Phase 1: Discovery<br/>(all tools, no LLM)"] --> TP_EXP["Phase 2: Expansion<br/>(neighbor traversal)"] --> TP_SYNTH["Synthesis<br/>(single LLM call)"]
end
subgraph DYN_INTERNAL["DynamicQueryAgent Pipeline"]
direction LR
QP["QueryProcessor"] --> ED["EntityDiscovery"] --> SS["6 Search Strategies"] --> RP["CrossEncoder rerank"]
end
subgraph RETRIEVERS["Neo4j Retrievers"]
direction LR
VR["VectorRetriever"]
HR["HybridRetriever"]
T2CR["Text2CypherRetriever"]
KNN["APOC KNN"]
end
NEO4J[("Neo4j")]
EP_SESSION --> SM_CREATE --> QS_CREATE
EP_QUERY --> SM_EXEC --> QS_EXEC --> QS_REFRAG
QS_CREATE -->|"default"| TPA
QS_CREATE -->|"Standard"| NQA
QS_CREATE -->|"Dynamic"| DQA
TPA --> TPA_INTERNAL
TPA_INTERNAL --> NEO4J
NQA -.->|extends| DQA
DQA --> DYN_INTERNAL
NQA --> RETRIEVERS
RETRIEVERS --> NEO4J
Directory Structure
API Layer (api/)
api/
├── app.py # FastAPI app, lifespan, middleware, router mounting
├── ingestion/ # /v1/ingestion/* — KG assembly
│ ├── router.py
│ └── routes/
├── query/ # /v1/sessions/*, /v1/evidence, /v1/communities/*
│ ├── router.py
│ └── routes/
├── workspace/ # /v1/cells/*, /v1/my-files/*, /v1/auto-discovery/*
│ ├── router.py
│ └── routes/
├── platform/ # /health, /v1/metrics/*, /v1/dashboard/*
│ ├── router.py
│ └── routes/
└── shared/ # Cross-cutting concerns
  ├── models/ # Pydantic request/response models
  ├── middleware.py # CORS, logging, error handling
  ├── rate_limit.py # slowapi rate limiting
  ├── security.py # Cypher injection prevention, PII detection
  ├── sse_streamer.py # Server-sent events for streaming
  └── validation.py # Request/response validation
Core Library (src/)
src/
├── agents/ # Query agents (TwoPhase, Dynamic, Neo4j, LangGraph)
├── services/ # Business logic by domain
│ ├── ingestion/ # Job management
│ ├── query/ # Query service, sessions, memory, cache
│ ├── workspace/ # Cells, files, auto-discovery
│ ├── platform/ # Audit, cost, telemetry, feedback, Glicko
│ └── sdcg/ # Semantic Dynamic Context Graph
├── factories/ # Factory pattern for DI
├── models/ # Pydantic/dataclass models
├── core/ # Database interfaces (Neo4j)
├── cache/ # Redis caching
├── security/ # Cypher builder, guardrails, sanitizer
├── kg_tools/ # Tool registry, CLI, MCP adapter
├── mcp_server/ # MCP server for biomarker/pathway tools
├── ml/ # ML scoring, evidence weighting
├── evaluation/ # Eval framework (scorers, adapters, runners)
├── tools/ # Graph tools, migration tools
└── utils/ # Logging, retry, token counting, mappers
Pipeline Layer (pipeline/)
pipeline/
├── ingest/ # 25 modules: fetchers, chunkers, extractors, KG pipeline
├── enrichment/ # CSV/Parquet enrichment: column analysis, annotation
└── processors/ # 41 modules: entity resolution, community detection,
# evidence scoring, external integrations, PDF handling,
# science skills integrators (STRING, HPA, Reactome,
# ChEMBL, ClinVar, OpenAlex)
Data Flow
Ingestion Flow
Data Sources (PubMed, bioRxiv, PMC, PDF, CSV, External APIs)
↓
Fetchers (pubmed, biorxiv, pmc, pdf)
↓
Token-based Chunking (512 tokens, 64 overlap, sentence boundaries)
↓
LLM Entity/Relationship Extraction (with optional gleaning)
↓
Neo4j Storage (nodes + relationships)
↓
Entity Consolidation (UniProt → synonyms → fuzzy)
↓
Relationship Consolidation (merge duplicates, preserve evidence)
↓
Vector Embeddings (sentence-transformers → Neo4j vector indexes)
↓
External API Enrichment (STRING, HPA, Reactome, ChEMBL, ClinVar, OpenAlex)
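The chunking step in this flow (512 tokens, 64 overlap, sentence boundaries) can be sketched roughly as below. This is an illustration under stated assumptions: `chunk_text` is a hypothetical name, tokens are approximated by whitespace-separated words rather than a model tokenizer, and sentences are split with a simple regex that will mis-split abbreviations such as "e.g.".

```python
import re


def chunk_text(text: str, max_tokens: int = 512, overlap: int = 64) -> list[str]:
    """Pack whole sentences into chunks of up to `max_tokens` tokens.

    Tokens are approximated by whitespace-separated words (an assumption).
    A single sentence longer than `max_tokens` still becomes its own chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []  # sentences in the chunk being built
    current_len = 0

    for sentence in sentences:
        n = len(sentence.split())
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward, keeping at most `overlap` tokens.
            carried: list[str] = []
            carried_len = 0
            for prev in reversed(current):
                prev_len = len(prev.split())
                if carried_len + prev_len > overlap:
                    break
                carried.insert(0, prev)
                carried_len += prev_len
            # Drop the overlap if it would push the next chunk over the limit.
            if carried_len + n > max_tokens:
                carried, carried_len = [], 0
            current, current_len = carried, carried_len
        current.append(sentence)
        current_len += n

    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because the overlap is made of whole sentences, consecutive chunks share complete statements, which keeps an entity mention and its context together for the extraction step that follows.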
Query Flow
User Query
↓
Security Guardrails (PII detection, injection check)
↓
Session Manager (restore history, create agent)
↓
Query Mode Router (auto-classify: local/global/hybrid/naive)
↓
Agent Execution (TwoPhase: deterministic tools → LLM synthesis)
↓
REFRAG Compression (optional context compression)
↓
SSE Streaming Response (token-by-token)
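The agent execution step (deterministic tools, then LLM synthesis) might look like the sketch below. It is not the `TwoPhaseAgent` code: the `two_phase_answer` function, the `Tool` type, and the prompt wording are hypothetical, and the LLM is modeled as a plain callable so the example runs without any backend.

```python
from typing import Callable

# A "tool" here is any function from the query string to a list of graph facts.
Tool = Callable[[str], list[str]]


def two_phase_answer(query: str, tools: dict[str, Tool],
                     llm: Callable[[str], str]) -> str:
    # Phase 1: run every tool deterministically; no LLM decides what to call.
    results: dict[str, list[str]] = {}
    for name, tool in tools.items():
        try:
            results[name] = tool(query)
        except Exception as exc:  # one failing tool should not abort the rest
            results[name] = [f"tool error: {exc}"]

    # Phase 2: a single LLM call that sees only the collected tool output.
    context = "\n".join(
        f"[{name}] {fact}" for name, facts in results.items() for fact in facts
    )
    prompt = (
        "Answer using only the facts below.\n"
        f"Facts:\n{context}\n\nQuestion: {query}"
    )
    return llm(prompt)
```

Since the tool loop never consults the model, the set of executed tools is the same for every query, and the model's only job is to summarize what was retrieved.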
LLM Service Configuration
| Service | Backend | Use Case |
|---|---|---|
| `local` | Ollama | Development (default) |
| `bedrock` | AWS Bedrock | Production |
| `sagemaker-llama3` | AWS SageMaker | Production (custom endpoints) |
| `openai` | OpenAI API | Alternative |
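A factory that selects a backend from these service names could be sketched as follows. The names `LLMClient`, `_BACKENDS`, and `create_llm` are hypothetical and the clients are placeholders. A real factory in `src/factories/` would wrap the vendor SDKs and read the service name from configuration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LLMClient:
    """Stand-in for a real client; a real factory would wrap the vendor SDK."""
    backend: str

    def complete(self, prompt: str) -> str:
        raise NotImplementedError(f"{self.backend} client is only a placeholder")


# Service name -> constructor, mirroring the table above.
_BACKENDS: dict[str, Callable[[], LLMClient]] = {
    "local": lambda: LLMClient("Ollama"),
    "bedrock": lambda: LLMClient("AWS Bedrock"),
    "sagemaker-llama3": lambda: LLMClient("AWS SageMaker"),
    "openai": lambda: LLMClient("OpenAI API"),
}


def create_llm(service: str = "local") -> LLMClient:
    """Return a client for `service`, defaulting to the development backend."""
    try:
        return _BACKENDS[service]()
    except KeyError:
        known = ", ".join(sorted(_BACKENDS))
        raise ValueError(
            f"Unknown LLM service {service!r}; expected one of: {known}"
        ) from None
```

Keeping the mapping in one dictionary means adding a backend is a one-line change, and an unknown service name fails early with the list of valid options.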
Related Pages
- Query Agents — detailed agent architecture
- Ingestion Pipeline — pipeline components
- Infrastructure — deployment architecture
- Advanced Features (SDCG) — future architecture vision