Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 120
The Smol Training Playbook 📚 The secrets to building world-class LLMs • 2.91k
ByteDance-Seed/Seed-OSS-36B-Instruct Text Generation • 36B • Updated Aug 26, 2025 • 7.47k • 477
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 273
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published May 28, 2025 • 44
Inference-Time Hyper-Scaling with KV Cache Compression Paper • 2506.05345 • Published Jun 5, 2025 • 27