Phoebe13

AI & ML interests

None yet

Recent Activity

updated a model about 14 hours ago

Phoebe13/Video-MTR

updated a model 14 days ago

Phoebe13/Video-MTR

upvoted a paper 8 months ago

Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards

Organizations

None yet

Papers 1

arxiv:2508.20478

Models 15

Phoebe13/Video-MTR

Visual Question Answering • 8B • Updated about 14 hours ago • 26 • 7

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.5_30k_stage234_v1.2_ev_handcomp_simple_with_handtype

Updated Apr 20, 2025

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.5_30k_stage234_v1.2_ev_handcomp_simple

Updated Apr 8, 2025

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.5_30k_stage234_ev_handcomp_simple

Updated Apr 8, 2025

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.25_12k_stage234_ev_handcomp_simple

Updated Apr 7, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-30k_stage234_ev-by-handcomp-simple

Updated Mar 31, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-30k_stage1234_ev-by-handcomp-simple

Updated Mar 31, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-16k_stage234_ev-by-handcomp-simple

Updated Mar 30, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-ev-by-handcomp-simple

Updated Mar 30, 2025

Phoebe13/Qwen-2.5-7B-Poker-RL-StrictFormat-ev-by-handcomp-simple

Updated Mar 28, 2025

Datasets 0

None public yet