DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published 7 days ago • 146
Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning Paper • 2506.01710 • Published Jun 2 • 2
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models Paper • 2310.15147 • Published Oct 23, 2023 • 2