Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14 • 73
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published Apr 24 • 120
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published Apr 15 • 63
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning Paper • 2504.07128 • Published Apr 2 • 86
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 200
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 300
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 143
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models Paper • 2402.14207 • Published Feb 22, 2024 • 10
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 123
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 549
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10 • 65
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 696