SeanWang0027 PRO
SeanWang0027
AI & ML interests
Continual Learning
Recent Activity
published a dataset about 4 hours ago
SeanWang0027/extreme_hard_4B
updated a dataset about 4 hours ago
SeanWang0027/extreme_hard_4B
published a model 1 day ago
SeanWang0027/rl_warm_up_stitch_survo-parquet_qwen3-1.7b_epoch_3_mask
Continual-SFT-Olmo
This collection contains the SFT-ed Olmo models and some models built on them.
SeanWang0027/olmo-7b-synlogic-survo-sft
7B • Updated • 1
SeanWang0027/olmo-7b-synlogic-survo-space_reasoning-sft
7B • Updated • 1
SeanWang0027/olmo-7b-synlogic-survo-space_reasoning-math_path-sft
7B • Updated • 1
SeanWang0027/sci-10k-olmo-7b-synlogic-survo-space_reasoning-math_path-sft
7B • Updated • 1
Curriculum-RL
This collection contains curriculum-RLed Olmo models.
models 42
SeanWang0027/rl_warm_up_stitch_survo-parquet_qwen3-1.7b_epoch_3_mask
2B • Updated • 13
SeanWang0027/rl_warm_up_mixed_survo_correct-parquet_qwen3-1.7b_epoch_3_mask
2B • Updated • 8
SeanWang0027/rl_warm_up_mixed_minesweeper_correct-parquet_qwen3-1.7b_epoch_3_mask
2B • Updated • 14
SeanWang0027/rl_warm_up_stitch_minesweeper-parquet_qwen3-1.7b_epoch_3_mask
2B • Updated • 13
SeanWang0027/rl_warm_up_stitch_minesweeper-parquet_qwen3-1.7b_epoch_1_mask
2B • Updated • 11
SeanWang0027/rl_warm_up_mixed_minesweeper_correct-parquet_qwen3-1.7b_epoch_1_mask
2B • Updated • 23
SeanWang0027/mixed_sdft_solution_sudoku_qwen3_4b_thinking_1_epoch_8192_32_batch_2e-5_lr_qwen3_1_7b
Updated • 1
SeanWang0027/dolci-wildchat-think-singleturn
Updated
SeanWang0027/student_prefix_kukurasu_20K_nemotron8b_continual_Q_nemotron-cascade-8b_cutoff2048_epoch_3_mask
8B • Updated • 5
SeanWang0027/student_prefix_kukurasu_20K_nemotron8b_continual_Q_nemotron-cascade-8b_cutoff1024_epoch_3_mask
Updated
datasets 30
SeanWang0027/extreme_hard_4B
Viewer • Updated • 5.27k
SeanWang0027/rlve_mixed_20envs_stitch_full
Viewer • Updated • 16k • 34
SeanWang0027/verl_mask_training
Updated • 37
SeanWang0027/rlve_30b_qwen_1.7b_mixed_20envs_10
Viewer • Updated • 16k • 23
SeanWang0027/teacher_prefix_sudoku_10K_sequential_qwen3_4b_thinking_continual_nemotron-cascade-8b
Updated • 43
SeanWang0027/student_prefix_sequential
Viewer • Updated • 3k • 82 • 1
SeanWang0027/RAGEN
Updated • 1.11k
SeanWang0027/mixed_sdft_solution_sequential_minesweeper_kukurasu_qwen3_4b_thinking
Updated • 62
SeanWang0027/teacher_prefix_sudoku_10K_qwen3_4b_thinking_continual_qwen3-1-7b-parquet_qwen3-1.7b_epoch_3
Updated • 42
SeanWang0027/mixed_sdft_solution_kukurasu_qwen3_4b_thinking_1_epoch_8192_32_batch_2e-5_lr_qwen3_1_7b
Updated • 43