Introducing Visual Perception Token into Multimodal Large Language Model • Paper 2502.17425 • Published Feb 24, 2025
CoT-Valve: Length-Compressible Chain-of-Thought Tuning • Paper 2502.09601 • Published Feb 13, 2025
Attention Prompting on Image for Large Vision-Language Models • Paper 2409.17143 • Published Sep 25, 2024