Buckets:

372 MB
279 files
Updated 4 days ago
NameSize
.oncoagent
agents
data
data_prep
deploy
docs
logs
rag_engine
scripts
tests
ui
.env.production795 Bytes
xet
Dockerfile1.97 kB
xet
README.es.md8.32 kB
xet
README.md7.71 kB
xet
app.py22.4 kB
xet
config.json43.2 kB
xet
oncoagent_master_directive.md3.5 kB
xet
requirements.txt1.71 kB
xet
screenshot.py860 Bytes
xet
test_pipeline.py1.64 kB
xet
test_ui.py74 Bytes
xet
ui_app.log5.66 kB
xet
ui_app.pid7 Bytes
xet
upload_output.txt5.59 kB
xet
README.md

🧬 OncoAgent β€” Multi-Agent Oncology Triage System

ROCm Python vLLM LangGraph Gradio

AMD Developer Hackathon 2026 Β· Powered by AMD Instinctβ„’ MI300X Β· ROCm 7.2

🌍 100% Open-Source: Democratizing Oncology

OncoAgent is proudly 100% open-source. We believe that life-saving clinical intelligence should not be locked behind proprietary APIs. Our solution is designed to:

  • Guarantee Patient Privacy: Run locally on AMD MI300X hardware or private clouds, ensuring zero patient data leaves the hospital.
  • Foster Global Contribution: Allow medical communities worldwide to easily audit, modify, and contribute to the RAG knowledge base.

OncoAgent is a state-of-the-art multi-agent clinical triage system designed to combat unstructured data blindness in primary care oncology. It leverages a tier-adaptive architecture featuring Qwen 3.5-9B (Speed Triage) and Qwen 3.6-27B (Deep Reasoning) models. Orchestrated via a sophisticated LangGraph state machine, it provides evidence-based oncological reasoning strictly grounded in NCCN/ESMO clinical guidelines, with built-in human-in-the-loop (HITL) safety gates and a Reflexion-based critic loop.


πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Router │──▢│Ingestion│──▢│Corrective│──▢│ Specialist │◀────│ Critic     β”‚   β”‚ Formatterβ”‚
β”‚(Triage)β”‚   β”‚ (PHI)   β”‚   β”‚  RAG    β”‚   β”‚ (Qwen 9B/  β”‚     β”‚(Reflexion  β”‚   β”‚(Output)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚    27B)    │────▢│ Validation)β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚           β”‚             β”‚          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β–²
    β”‚           β”‚             β”‚                 β”‚                   β”‚              β”‚
    β–Ό           β–Ό             β–Ό                 β–Ό                   β–Ό              β”‚
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚                           Fallback Node                           β”‚      β”‚ HITL Gate  β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β”‚(Acuity Chk)β”‚
                                                                             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key Components:

Module Description
data_prep/ Dataset builder: PMC-Patients/OncoCoT β†’ Strict JSONL (Llama 3 chat template)
rag_engine/ The "Brain": PyMuPDF extraction, Adaptive Semantic Chunking of NCCN/ESMO PDFs, & ChromaDB + PubMedBERT vectorization.
agents/ The "Reasoning": LangGraph multi-agent orchestration (Router β†’ Corrective RAG β†’ Specialist ↔ Critic β†’ HITL Gate).
ui/ The "Face": Gradio 6 UI with Glassmorphism for clinical note input, real-time source citations, and reasoning output.

🧠 Dual-Tier Model Strategy (Qwen)

To maximize the compute capabilities of the AMD MI300X, OncoAgent implements a dynamic Dual-Tier routing strategy using the Qwen model family. Both tiers have been fine-tuned on +200,000 real-world oncological cases covering all major cancer types (derived from PMC-Patients and OncoCoT datasets) to ensure hyper-specialized medical reasoning:

  • Tier 1: Qwen 3.5-9B (Speed Triage): A lightweight, extremely fast model used by the Router to assess initial complexity, perform simple triage, and handle low-risk queries.
  • Tier 2: Qwen 3.6-27B (Deep Reasoning): The heavy-lifter. Activated for high-complexity clinical cases (e.g., metastasis, multi-mutations). It performs deep reasoning and entailment checks, avoiding confirmation bias through rigorous Reflexion loops.

⚑ Hardware Target

  • GPU: AMD Instinctβ„’ MI300X (192GB HBM3)
  • Software Stack: ROCm 7.2.x, PyTorch (HIP), vLLM with PagedAttention
  • Models: Qwen/Qwen3.5-9B (Speed Triage) & Qwen/Qwen3.6-27B-Instruct (Deep Reasoning)
  • Precision: QLoRA 4-bit NormalFloat4 via bitsandbytes (ROCm compatible)

πŸš€ Quick Start

# 1. Clone and setup
git clone <repo-url>
cd OncoAgent

# 2. Install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# 3. Start Inference Server (vLLM on Docker)
# This spins up the Qwen models optimized for AMD MI300X via ROCm PagedAttention
docker run --device /dev/kfd --device /dev/dri -p 8000:8000 rocm/vllm:latest \
    --model Qwen/Qwen3.6-27B-Instruct --tensor-parallel-size 1

# 4. Configure environment & Run UI
cp .env.example .env
# Set VLLM_API_BASE=http://localhost:8000/v1 in .env
python -m ui.app

πŸ“ Project Structure

β”œβ”€β”€ docs/                   # Documentation & research
β”‚   β”œβ”€β”€ research/           # Deep Research analysis documents
β”‚   β”œβ”€β”€ ADR/                # Architectural Decision Records
β”‚   β”œβ”€β”€ oncoagent_master_directive.md
β”‚   └── antigravity_rules.md
β”œβ”€β”€ data_prep/              # Dataset preparation (Fase 0)
β”œβ”€β”€ rag_engine/             # RAG ingestion & retrieval (Fase 0-3)
β”œβ”€β”€ agents/                 # LangGraph orchestration (Fase 3)
β”œβ”€β”€ ui/                     # Gradio frontend (Fase 4)
β”œβ”€β”€ tests/                  # Unit & integration tests
β”œβ”€β”€ scripts/                # Utility scripts
β”œβ”€β”€ logs/                   # Paper log & social media log
β”œβ”€β”€ requirements.txt        # Pinned dependencies
└── Dockerfile              # HF Spaces deployment

🩺 Safety Guarantees

  • Reflexion-based Critic Loop: A dedicated safety node audits the Specialist's output against the RAG context (entailment verification). It forces the Specialist to regenerate its output if it detects ungrounded claims or invented dosages.
  • Human-In-The-Loop (HITL) Gate: An acuity-based checkpoint that stops the pipeline for human clinician approval on high-risk cases (e.g., Stage IV + complex mutations).
  • Corrective RAG: The system grades retrieved context relevance. If insufficient evidence is found, it safely falls back instead of guessing.
  • Zero-PHI: Regex-based PII redaction before any processing
  • Reproducibility: Fixed seeds (torch.manual_seed(42)) across all ML scripts

πŸ“„ License

This project was built for the AMD Developer Hackathon 2026.


πŸ‘₯ Team

Built with ❀️ and AMD Instinct MI300X.

Total size
372 MB
Files
279
Last updated
May 10
Pre-warmed CDN
US EU US EU

Contributors