Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 120
The Smol Training Playbook 📚 The secrets to building world-class LLMs • 2.91k
ByteDance-Seed/Seed-OSS-36B-Instruct Text Generation • 36B • Updated Aug 26, 2025 • 7.47k • 477
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 273
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published May 28, 2025 • 44
Inference-Time Hyper-Scaling with KV Cache Compression Paper • 2506.05345 • Published Jun 5, 2025 • 27