A Causal Language Modeling Detour Improves Encoder Continued Pretraining • Paper • 2605.12438 • Published 5 days ago
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content • Paper • 2506.20331 • Published Jun 25, 2025