pencaharlangit's Collections
Interesting Papers
ReZero: Enhancing LLM search ability by trying one-more-time
Paper
•
2504.11001
•
Published
•
16
FonTS: Text Rendering with Typography and Style Controls
Paper
•
2412.00136
•
Published
•
1
GenEx: Generating an Explorable World
Paper
•
2412.09624
•
Published
•
97
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper
•
2412.13663
•
Published
•
158
An Empirical Study of GPT-4o Image Generation Capabilities
Paper
•
2504.05979
•
Published
•
64
OmniSVG: A Unified Scalable Vector Graphics Generation Model
Paper
•
2504.06263
•
Published
•
182
DreamO: A Unified Framework for Image Customization
Paper
•
2504.16915
•
Published
•
24
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning
Paper
•
2504.14509
•
Published
•
50
Tina: Tiny Reasoning Models via LoRA
Paper
•
2504.15777
•
Published
•
56
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
Paper
•
2504.15585
•
Published
•
12
Personalized Text-to-Image Generation with Auto-Regressive Models
Paper
•
2504.13162
•
Published
•
18
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Paper
•
2504.17192
•
Published
•
120
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing
Paper
•
2505.02823
•
Published
•
5
Style Customization of Text-to-Vector Generation with Image Diffusion Priors
Paper
•
2505.10558
•
Published
•
16
InstanceGen: Image Generation with Instance-level Instructions
Paper
•
2505.05678
•
Published
•
7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis
Paper
•
2505.09358
•
Published
•
26
SageAttention2++: A More Efficient Implementation of SageAttention2
Paper
•
2505.21136
•
Published
•
45
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data
Paper
•
2505.18445
•
Published
•
63
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
Paper
•
2505.17908
•
Published
•
3
Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals
Paper
•
2505.21062
•
Published
•
3
ARM: Adaptive Reasoning Model
Paper
•
2505.20258
•
Published
•
45
Jodi: Unification of Visual Generation and Understanding via Joint Modeling
Paper
•
2505.19084
•
Published
•
20
D-AR: Diffusion via Autoregressive Models
Paper
•
2505.23660
•
Published
•
34
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Paper
•
2505.23758
•
Published
•
22
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Paper
•
2505.18600
•
Published
•
48
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
Paper
•
2506.03147
•
Published
•
58
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers
Paper
•
2506.02528
•
Published
•
15
Native-Resolution Image Synthesis
Paper
•
2506.03131
•
Published
•
18
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation
Paper
•
2506.04225
•
Published
•
28
Image Editing As Programs with Diffusion Models
Paper
•
2506.04158
•
Published
•
24
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
Paper
•
2506.05301
•
Published
•
58
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
Paper
•
2506.02865
•
Published
•
33
FlexPainter: Flexible and Multi-View Consistent Texture Generation
Paper
•
2506.02620
•
Published
•
14
SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers
Paper
•
2506.00830
•
Published
•
7
MARBLE: Material Recomposition and Blending in CLIP-Space
Paper
•
2506.05313
•
Published
•
2
Text-Aware Image Restoration with Diffusion Models
Paper
•
2506.09993
•
Published
•
43
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation
Paper
•
2506.10540
•
Published
•
37
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework
Paper
•
2506.10741
•
Published
•
27
DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers
Paper
•
2506.10568
•
Published
•
8
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
Paper
•
2506.16054
•
Published
•
60
Align Your Flow: Scaling Continuous-Time Flow Map Distillation
Paper
•
2506.14603
•
Published
•
19
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression
Paper
•
2506.09482
•
Published
•
45
Auto-Regressively Generating Multi-View Consistent Images
Paper
•
2506.18527
•
Published
•
8
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Paper
•
2507.13344
•
Published
•
57
AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
Paper
•
2507.08616
•
Published
•
14
Neural-Driven Image Editing
Paper
•
2507.05397
•
Published
•
26
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models
Paper
•
2508.12880
•
Published
•
46
Paper
•
2508.10104
•
Published
•
291
Thyme: Think Beyond Images
Paper
•
2508.11630
•
Published
•
81
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale
Paper
•
2508.10711
•
Published
•
145
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper
•
2508.05748
•
Published
•
141
VertexRegen: Mesh Generation with Continuous Level of Detail
Paper
•
2508.09062
•
Published
•
38