Open CoT Leaderboard

community

AI & ML interests

Chain of Thought, LLM Evaluation

Recent Activity

yakazimir authored a paper 6 days ago

AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite

yakazimir authored a paper 6 days ago

TinyScientist: An Interactive, Extensible, and Controllable Framework for Building Research Agents

yakazimir authored a paper 6 days ago

Probabilistic Programs of Thought

cot-leaderboard's datasets (4)

cot-leaderboard/cot-leaderboard-requests

Preview • Updated Feb 26, 2025 • 30

cot-leaderboard/cot-leaderboard-results

Viewer • Updated Feb 26, 2025 • 133 • 22

cot-leaderboard/cot-eval-results

Updated Feb 26, 2025 • 122

cot-leaderboard/cot-eval-traces-2.0

Viewer • Updated Feb 26, 2025 • 3.75M • 16.7k • 8