Ankit

Ajax0564

AI & ML interests

NLP

Recent Activity

upvoted a paper 11 days ago

Bolmo: Byteifying the Next Generation of Language Models

upvoted an article 22 days ago

Why You Should Care About Partial Differential Equations (PDEs)

reacted to sergiopaniego's post with 👍 about 1 month ago

you gotta go fast and go read the latest blog by @ror et al. explaining Continuous Batching in depth https://huggingface.co/blog/continuous_batching

Organizations

None yet

upvoted a paper 11 days ago

Bolmo: Byteifying the Next Generation of Language Models

Paper • 2512.15586 • Published 17 days ago • 14

upvoted an article 22 days ago

Article

Why You Should Care About Partial Differential Equations (PDEs)

23 days ago

upvoted an article about 1 month ago

Article

Continuous batching from first principles

Nov 25, 2025 • 291

upvoted a paper about 2 months ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 132

upvoted an article 2 months ago

Article

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

Oct 23, 2025

upvoted 2 papers 4 months ago

AToken: A Unified Tokenizer for Vision

Paper • 2509.14476 • Published Sep 17, 2025 • 36

SAIL-VL2 Technical Report

Paper • 2509.14033 • Published Sep 17, 2025 • 44

upvoted an article 5 months ago

Article

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Aug 8, 2025

upvoted 2 papers 5 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

∇NABLA: Neighborhood Adaptive Block-Level Attention

Paper • 2507.13546 • Published Jul 17, 2025 • 124

upvoted an article 6 months ago

Article

Understanding Gemma 3n: How MatFormer Gives You Many Models in One

Jun 26, 2025

upvoted 3 papers 6 months ago

Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2, 2025 • 130

Ovis-U1 Technical Report

Paper • 2506.23044 • Published Jun 29, 2025 • 61

Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published Jun 19, 2025 • 88

upvoted an article 7 months ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

Jun 12, 2025 • 151

upvoted a paper 7 months ago

LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

Paper • 2505.16933 • Published May 22, 2025 • 34

upvoted 2 papers 8 months ago

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 97

Multi-Token Prediction Needs Registers

Paper • 2505.10518 • Published May 15, 2025 • 14

upvoted an article 9 months ago

Article

The NLP Course is becoming the LLM Course

Apr 3, 2025 • 103

upvoted an article 10 months ago

Article

Open R1: How to use OlympicCoder locally for coding

Mar 20, 2025
