Building on HF

Prithiv Sakthi PRO

prithivMLmods

https://linktr.ee/prithivsakthi

AI & ML interests

Computer vision, NLP, multimodality - Hugging Face Fellow 🤗

Recent Activity

updated a Space about 2 hours ago

prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast

upvoted 2 papers about 2 hours ago

HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models

Paper • 2512.09928 • Published about 14 hours ago • 4

InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

Paper • 2512.08829 • Published 1 day ago • 8

upvoted a collection 2 days ago

Dynamic markdowns

Document-AI, HTML, OCR • 4 items • Updated 4 days ago • 2

upvoted 3 papers 2 days ago

Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning

Paper • 2512.03667 • Published 8 days ago • 3

EditThinker: Unlocking Iterative Reasoning for Any Image Editor

Paper • 2512.05965 • Published 6 days ago • 36

EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Paper • 2512.04810 • Published 7 days ago • 24

upvoted a collection 3 days ago

GLM-4.6V

3 items • Updated 3 days ago • 38

upvoted 2 papers 3 days ago

COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

Paper • 2512.04563 • Published 7 days ago • 13

PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Paper • 2512.04784 • Published 9 days ago • 23

upvoted an article 3 days ago

Building Deep Research: How we Achieved State of the Art

Article • 17 days ago • 22

upvoted a paper 3 days ago

Fara-7B: An Efficient Agentic Model for Computer Use

Paper • 2511.19663 • Published 17 days ago • 12

upvoted 3 articles 4 days ago

Gemini-3 Benchmarkathon

Article • 13 days ago • 10

Building Jobly: Semantic Job Matching with RAG and Vector Embeddings

Article • 13 days ago • 12

TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval

Article • 7 days ago • 18

upvoted 2 papers 4 days ago

DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling

Paper • 2512.03000 • Published 9 days ago • 34

Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion

Paper • 2512.04926 • Published 7 days ago • 40

upvoted 2 papers 5 days ago

SIMA 2: A Generalist Embodied Agent for Virtual Worlds

Paper • 2512.04797 • Published 7 days ago • 18

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published 7 days ago • 165

upvoted an article 5 days ago

We Got Claude to Fine-Tune an Open Source LLM

Article • 7 days ago • 441

upvoted a paper 6 days ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published 7 days ago • 146