VL-JEPA: Joint Embedding Predictive Architecture for Vision-language • Paper 2512.10942 • Published about 1 month ago