UnityShots: Memory-Driven Multi-Shot Audio-Video Generation with Boundary-Aware Gating Paper • 2606.21661 • Published 8 days ago • 24
RepFusion: Leveraging Multimodal Priors for Denoising in Representation Space Paper • 2606.14700 • Published 15 days ago • 18
Reinforcing Dual-Path Reasoning in Spatial Vision Language Models Paper • 2606.17539 • Published 11 days ago • 15
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data Paper • 2606.13432 • Published 16 days ago • 111
DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory Paper • 2605.31336 • Published 29 days ago • 12
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance Paper • 2405.14677 • Published May 23, 2024 • 11
SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer Paper • 2605.30409 • Published about 1 month ago • 41
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published Mar 26 • 155
GARDO: Reinforcing Diffusion Models without Reward Hacking Paper • 2512.24138 • Published Dec 30, 2025 • 30
SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder Paper • 2512.11749 • Published Dec 12, 2025 • 39