Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation • Paper • 2506.09991 • Published Jun 11, 2025
Faster Video Diffusion with Trainable Sparse Attention • Paper • 2505.13389 • Published May 19, 2025
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding • Paper • 2408.11049 • Published Aug 20, 2024