Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface Paper • 2512.19402 • Published 11 days ago • 7
Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding Paper • 2512.17532 • Published 14 days ago • 64
StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published 15 days ago • 37
End-to-End Training for Autoregressive Video Diffusion via Self-Resampling Paper • 2512.15702 • Published 16 days ago • 14
UniUGP: Unifying Understanding, Generation, and Planning for End-to-end Autonomous Driving Paper • 2512.09864 • Published 23 days ago • 10
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 119
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing Paper • 2510.19808 • Published Oct 22, 2025 • 29
GigaBrain-0: A World Model-Powered Vision-Language-Action Model Paper • 2510.19430 • Published Oct 22, 2025 • 49
Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model Paper • 2510.12276 • Published Oct 14, 2025 • 145
ByteWrist: A Parallel Robotic Wrist Enabling Flexible and Anthropomorphic Motion for Confined Spaces Paper • 2509.18084 • Published Sep 22, 2025 • 13
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction Paper • 2509.19297 • Published Sep 23, 2025 • 24
FlexPainter: Flexible and Multi-View Consistent Texture Generation Paper • 2506.02620 • Published Jun 3, 2025 • 14
LayerAnimate: Layer-specific Control for Animation Paper • 2501.08295 • Published Jan 14, 2025 • 1
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints Paper • 2503.16408 • Published Mar 20, 2025 • 42
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published Feb 16, 2025 • 59
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14, 2025 • 55