LAION-5B: An open large-scale dataset for training next generation image-text models Paper • 2210.08402 • Published Oct 16, 2022 • 5
A Challenging Multimodal Video Summary: Simultaneously Extracting and Generating Keyframe-Caption Pairs from Video Paper • 2312.01575 • Published Dec 4, 2023 • 1