arxiv:2504.01931
Chakraborty
souradip24
AI & ML interests
Reinforcement Learning, Machine Learning, NLP
Recent Activity
updated a model 3 days ago
souradip24/dpo-llama-3.2-3b-set2-samp500-pref100
published a model 3 days ago
souradip24/dpo-llama-3.2-3b-set2-samp500-pref100
updated a model 6 days ago
souradip24/dpo-llama-3.2-3b-set1-pref100