DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Paper • 2512.24165 • Published 4 days ago • 20
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published 3 days ago • 30
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents Paper • 2512.23343 • Published 5 days ago • 19
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process Paper • 2512.23988 • Published 4 days ago • 12
Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems Paper • 2512.24385 • Published 4 days ago • 7
Factorized Learning for Temporally Grounded Video-Language Models Paper • 2512.24097 • Published 4 days ago • 5
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation Paper • 2512.24551 • Published 3 days ago • 16
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published 3 days ago • 39
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 3 days ago • 60
VL-JEPA: Joint Embedding Predictive Architecture for Vision-language Paper • 2512.10942 • Published 23 days ago • 22
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement Paper • 2512.21185 • Published 10 days ago • 21
SpotEdit: Selective Region Editing in Diffusion Transformers Paper • 2512.22323 • Published 8 days ago • 36
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone Paper • 2512.22615 • Published 7 days ago • 39