Buckets:

lablab-ai-amd-developer-hackathon
/

OncoAgent

372 MB

279 files

Updated 4 days ago

Ctrl+K

Name	Size	Uploaded	Xet hash
.oncoagent		5 days ago
agents		5 days ago
data		4 days ago
data_prep		5 days ago
deploy		5 days ago
docs		4 days ago
logs		4 days ago
rag_engine		5 days ago
scripts		4 days ago
tests		5 days ago
ui		5 days ago
.env.production	795 Bytes xet	4 days ago	f50acbc2
Dockerfile	1.97 kB xet	5 days ago	460a1342
README.es.md	8.32 kB xet	5 days ago	f605ae6f
README.md	7.71 kB xet	5 days ago	b57b0bcf
app.py	22.4 kB xet	4 days ago	0a4f1ac5
config.json	43.2 kB xet	5 days ago	762d2e64
oncoagent_master_directive.md	3.5 kB xet	5 days ago	28615910
requirements.txt	1.71 kB xet	5 days ago	c9472c3e
screenshot.py	860 Bytes xet	5 days ago	8d22896c
test_pipeline.py	1.64 kB xet	5 days ago	71d37f7b
test_ui.py	74 Bytes xet	5 days ago	b39bd317
ui_app.log	5.66 kB xet	5 days ago	577a0809
ui_app.pid	7 Bytes xet	5 days ago	d1f9b243
upload_output.txt	5.59 kB xet	4 days ago	91f9ece6

README.md

🧬 OncoAgent — Multi-Agent Oncology Triage System

AMD Developer Hackathon 2026 · Powered by AMD Instinct™ MI300X · ROCm 7.2

🌍 100% Open-Source: Democratizing Oncology

OncoAgent is proudly 100% open-source. We believe that life-saving clinical intelligence should not be locked behind proprietary APIs. Our solution is designed to:

Guarantee Patient Privacy: Run locally on AMD MI300X hardware or private clouds, ensuring zero patient data leaves the hospital.
Foster Global Contribution: Allow medical communities worldwide to easily audit, modify, and contribute to the RAG knowledge base.

OncoAgent is a state-of-the-art multi-agent clinical triage system designed to combat unstructured data blindness in primary care oncology. It leverages a tier-adaptive architecture featuring Qwen 3.5-9B (Speed Triage) and Qwen 3.6-27B (Deep Reasoning) models. Orchestrated via a sophisticated LangGraph state machine, it provides evidence-based oncological reasoning strictly grounded in NCCN/ESMO clinical guidelines, with built-in human-in-the-loop (HITL) safety gates and a Reflexion-based critic loop.

🏗️ Architecture

┌────────┐   ┌─────────┐   ┌─────────┐   ┌────────────┐      ┌────────────┐   ┌─────────┐
│ Router │──▶│Ingestion│──▶│Corrective│──▶│ Specialist │◀────│ Critic     │   │ Formatter│
│(Triage)│   │ (PHI)   │   │  RAG    │   │ (Qwen 9B/  │     │(Reflexion  │   │(Output)  │
└────────┘   └─────────┘   └─────────┘   │    27B)    │────▶│ Validation)│   └─────────┘
    │           │             │          └────────────┘      └────────────┘        ▲
    │           │             │                 │                   │              │
    ▼           ▼             ▼                 ▼                   ▼              │
  ┌───────────────────────────────────────────────────────────────────┐      ┌────────────┐
  │                           Fallback Node                           │      │ HITL Gate  │
  └───────────────────────────────────────────────────────────────────┘      │(Acuity Chk)│
                                                                             └────────────┘

Key Components:

Module	Description
`data_prep/`	Dataset builder: PMC-Patients/OncoCoT → Strict JSONL (Llama 3 chat template)
`rag_engine/`	The "Brain": PyMuPDF extraction, Adaptive Semantic Chunking of NCCN/ESMO PDFs, & ChromaDB + PubMedBERT vectorization.
`agents/`	The "Reasoning": LangGraph multi-agent orchestration (Router → Corrective RAG → Specialist ↔ Critic → HITL Gate).
`ui/`	The "Face": Gradio 6 UI with Glassmorphism for clinical note input, real-time source citations, and reasoning output.

🧠 Dual-Tier Model Strategy (Qwen)

To maximize the compute capabilities of the AMD MI300X, OncoAgent implements a dynamic Dual-Tier routing strategy using the Qwen model family. Both tiers have been fine-tuned on +200,000 real-world oncological cases covering all major cancer types (derived from PMC-Patients and OncoCoT datasets) to ensure hyper-specialized medical reasoning:

Tier 1: Qwen 3.5-9B (Speed Triage): A lightweight, extremely fast model used by the Router to assess initial complexity, perform simple triage, and handle low-risk queries.
Tier 2: Qwen 3.6-27B (Deep Reasoning): The heavy-lifter. Activated for high-complexity clinical cases (e.g., metastasis, multi-mutations). It performs deep reasoning and entailment checks, avoiding confirmation bias through rigorous Reflexion loops.

⚡ Hardware Target

GPU: AMD Instinct™ MI300X (192GB HBM3)
Software Stack: ROCm 7.2.x, PyTorch (HIP), vLLM with PagedAttention
Models: Qwen/Qwen3.5-9B (Speed Triage) & Qwen/Qwen3.6-27B-Instruct (Deep Reasoning)
Precision: QLoRA 4-bit NormalFloat4 via bitsandbytes (ROCm compatible)

🚀 Quick Start

# 1. Clone and setup
git clone <repo-url>
cd OncoAgent

# 2. Install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# 3. Start Inference Server (vLLM on Docker)
# This spins up the Qwen models optimized for AMD MI300X via ROCm PagedAttention
docker run --device /dev/kfd --device /dev/dri -p 8000:8000 rocm/vllm:latest \
    --model Qwen/Qwen3.6-27B-Instruct --tensor-parallel-size 1

# 4. Configure environment & Run UI
cp .env.example .env
# Set VLLM_API_BASE=http://localhost:8000/v1 in .env
python -m ui.app

📁 Project Structure

├── docs/                   # Documentation & research
│   ├── research/           # Deep Research analysis documents
│   ├── ADR/                # Architectural Decision Records
│   ├── oncoagent_master_directive.md
│   └── antigravity_rules.md
├── data_prep/              # Dataset preparation (Fase 0)
├── rag_engine/             # RAG ingestion & retrieval (Fase 0-3)
├── agents/                 # LangGraph orchestration (Fase 3)
├── ui/                     # Gradio frontend (Fase 4)
├── tests/                  # Unit & integration tests
├── scripts/                # Utility scripts
├── logs/                   # Paper log & social media log
├── requirements.txt        # Pinned dependencies
└── Dockerfile              # HF Spaces deployment

🩺 Safety Guarantees

Reflexion-based Critic Loop: A dedicated safety node audits the Specialist's output against the RAG context (entailment verification). It forces the Specialist to regenerate its output if it detects ungrounded claims or invented dosages.
Human-In-The-Loop (HITL) Gate: An acuity-based checkpoint that stops the pipeline for human clinician approval on high-risk cases (e.g., Stage IV + complex mutations).
Corrective RAG: The system grades retrieved context relevance. If insufficient evidence is found, it safely falls back instead of guessing.
Zero-PHI: Regex-based PII redaction before any processing
Reproducibility: Fixed seeds (torch.manual_seed(42)) across all ML scripts

📄 License

This project was built for the AMD Developer Hackathon 2026.

👥 Team

Built with ❤️ and AMD Instinct MI300X.

Total size: 372 MB

Files: 279

Last updated: May 10

Pre-warmed CDN: US EU US EU