VINO: A Unified Visual Generator with Interleaved OmniModal Context Paper • 2601.02358 • Published 4 days ago • 27
AI for Service: Proactive Assistance with AI Glasses Paper • 2510.14359 • Published Oct 16, 2025 • 74
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published Sep 15, 2025 • 106
Symbolic Graphics Programming with Large Language Models Paper • 2509.05208 • Published Sep 5, 2025 • 46
Training Superior Sparse Autoencoders for Instruct Models Paper • 2506.07691 • Published Jun 9, 2025 • 2
wan2.2 controlnets Collection See code on github: https://github.com/TheDenk/wan2.2-controlnet • 6 items • Updated Oct 7, 2025 • 7
π^3: Scalable Permutation-Equivariant Visual Geometry Learning Paper • 2507.13347 • Published Jul 17, 2025 • 65
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework Paper • 2503.21758 • Published Mar 27, 2025 • 22
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis Paper • 2503.21749 • Published Mar 27, 2025 • 26
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published Jan 23, 2025 • 19
Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT Paper • 2502.06782 • Published Feb 10, 2025 • 15
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models Paper • 2406.05862 • Published Jun 9, 2024 • 4
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Paper • 2409.18943 • Published Sep 27, 2024 • 28