Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 20 days ago • 100
Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 18 items • Updated 5 days ago • 43
TimeBill: Time-Budgeted Inference for Large Language Models Paper • 2512.21859 • Published 12 days ago • 22
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 18 days ago • 25
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation Paper • 2512.16913 • Published 19 days ago • 33
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published 20 days ago • 59
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 265
Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance 28 days ago • 82
Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day 29 days ago • 47
Article Introducing swift-huggingface: The Complete Swift Client for Hugging Face Dec 5, 2025 • 34
Article Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms Nov 20, 2025 • 36
ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review Paper • 2510.08867 • Published Oct 9, 2025 • 5
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory Paper • 2511.20857 • Published Nov 25, 2025 • 2
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 96