
Takanori Yoshimoto

ramu0e

AI & ML interests

robotics, image generation, RL, Foundation Models

Recent Activity

updated a model 17 days ago

ramu0e/lapo_p

published a model 17 days ago

ramu0e/lapo_p

published a model about 2 months ago

ramu0e/diffusion-latent-action-model-final


Organizations

upvoted a paper 6 months ago

MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation

Paper • 2511.22989 • Published Nov 28, 2025 • 17

upvoted a paper 7 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 189

upvoted an article 9 months ago

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

Article • Alibaba-DAMO-Academy • Aug 11, 2025 • 28

upvoted a collection 9 months ago

Awesome SFT datasets

Collection

A curated list of interesting datasets to fine-tune language models with. • 41 items • Updated Mar 2 • 152

upvoted a paper 12 months ago

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Paper • 2503.16365 • Published Mar 20, 2025 • 41

upvoted a collection about 1 year ago

OpenX-LeRobot

Collection

Open X-Embodiment datasets in LeRobot format with standard transformation (https://github.com/Tavish9/any4lerobot) • 32 items • Updated Mar 2 • 36

upvoted 2 papers over 1 year ago

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14, 2025 • 128

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published Feb 18, 2025 • 58

upvoted an article over 1 year ago

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Article • danaaubakirova, Molbap, mshukor, cadene • Feb 4, 2025 • 192

upvoted 2 papers over 1 year ago

Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Paper • 2312.06640 • Published Dec 11, 2023 • 49

upvoted a collection over 1 year ago

Cosmos

Collection

⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/nvidia-cosmos-2 • 14 items • Updated 3 days ago • 302

upvoted 8 papers over 1 year ago

PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

Paper • 2303.08789 • Published Mar 15, 2023 • 2

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 53

MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control

Paper • 2411.13807 • Published Nov 21, 2024 • 11

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

Paper • 2411.13543 • Published Nov 20, 2024 • 19

How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4, 2024 • 34

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5, 2024 • 27

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published Nov 7, 2024 • 69

JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation

Paper • 2411.07975 • Published Nov 12, 2024 • 32
