Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data Paper • 2510.03264 • Published Sep 26, 2025 • 23
RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems Paper • 2510.02263 • Published Oct 2, 2025 • 8
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 76
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20, 2025 • 39
Think Only When You Need with Large Hybrid-Reasoning Models Paper • 2505.14631 • Published May 20, 2025 • 20
Optimizing Anytime Reasoning via Budget Relative Policy Optimization Paper • 2505.13438 • Published May 19, 2025 • 36
Model Merging in Pre-training of Large Language Models Paper • 2505.12082 • Published May 17, 2025 • 40
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14, 2025 • 74