Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods Paper • 2502.01384 • Published Feb 3, 2025 • 1
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 225
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published Nov 27, 2025 • 29
LLMs as In-Context Meta-Learners for Model and Hyperparameter Selection Paper • 2510.26510 • Published Oct 30, 2025 • 2
From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation Paper • 2510.07624 • Published Oct 8, 2025 • 7
AutoEdit: Automatic Hyperparameter Tuning for Image Editing Paper • 2509.15031 • Published Sep 18, 2025 • 4
SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark Paper • 2506.21549 • Published Jun 26, 2025 • 1
Towards Reliable Identification of Diffusion-based Image Manipulations Paper • 2506.05466 • Published Jun 5, 2025
Test Time Training for Industrial Anomaly Segmentation Paper • 2404.03743 • Published Apr 4, 2024 • 1
Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping Paper • 2312.04521 • Published Dec 7, 2023 • 1
Booster: a Benchmark for Depth from Images of Specular and Transparent Surfaces Paper • 2301.08245 • Published Jan 19, 2023 • 1
Learning Depth Estimation for Transparent and Mirror Surfaces Paper • 2307.15052 • Published Jul 27, 2023 • 1
Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing Paper • 2509.01984 • Published Sep 2, 2025 • 6
Inherently Faithful Attention Maps for Vision Transformers Paper • 2506.08915 • Published Jun 10, 2025 • 3
Robustness in Both Domains: CLIP Needs a Robust Text Encoder Paper • 2506.03355 • Published Jun 3, 2025 • 6
On the Adversarial Robustness of Multi-Modal Foundation Models Paper • 2308.10741 • Published Aug 21, 2023
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models Paper • 2402.12336 • Published Feb 19, 2024
FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens Paper • 2506.03096 • Published Jun 3, 2025 • 4