estudanteBr's Collections
papers
updated
Less is More: Recursive Reasoning with Tiny Networks
Paper
•
2510.04871
•
Published
•
501
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Paper
•
2510.07499
•
Published
•
48
Improving Context Fidelity via Native Retrieval-Augmented Reasoning
Paper
•
2509.13683
•
Published
•
8
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering
Paper
•
2509.00798
•
Published
•
1
Retrieval Feedback Memory Enhancement Large Model Retrieval Generation Method
Paper
•
2508.17862
•
Published
Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction
Paper
•
2509.03540
•
Published
•
1
Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation
Paper
•
2508.09755
•
Published
MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains
Paper
•
2508.18260
•
Published
From Ranking to Selection: A Simple but Efficient Dynamic Passage Selector for Retrieval Augmented Generation
Paper
•
2508.09497
•
Published
MemMamba: Rethinking Memory Patterns in State Space Model
Paper
•
2510.03279
•
Published
•
72
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper
•
2408.03314
•
Published
•
63
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning
Paper
•
2502.15425
•
Published
•
9
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper
•
2503.01785
•
Published
•
85
Qwen2.5-Omni Technical Report
Paper
•
2503.20215
•
Published
•
168
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Paper
•
2406.06469
•
Published
•
29
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots
Paper
•
2409.10277
•
Published
•
1
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
Paper
•
2504.17432
•
Published
•
40
ARM: Adaptive Reasoning Model
Paper
•
2505.20258
•
Published
•
45
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR
Paper
•
2509.23808
•
Published
•
47
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models
Paper
•
2510.03561
•
Published
•
24
JULI: Jailbreak Large Language Models by Self-Introspection
Paper
•
2505.11790
•
Published
Paper
•
2505.14674
•
Published
•
37
CodeContests+: High-Quality Test Case Generation for Competitive
Programming
Paper
•
2506.05817
•
Published
•
9
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper
•
2508.01191
•
Published
•
238
Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot
Paper
•
2506.14641
•
Published
The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
Paper
•
2507.07562
•
Published
•
1
Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation
Paper
•
2506.17088
•
Published
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Paper
•
2503.14476
•
Published
•
144
Towards General-Purpose Model-Free Reinforcement Learning
Paper
•
2501.16142
•
Published
•
30
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Paper
•
2504.13837
•
Published
•
139
Learning to Reason under Off-Policy Guidance
Paper
•
2504.14945
•
Published
•
88
TTRL: Test-Time Reinforcement Learning
Paper
•
2504.16084
•
Published
•
120
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Paper
•
2505.17612
•
Published
•
81
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper
•
2505.03335
•
Published
•
188
MemOS: A Memory OS for AI System
Paper
•
2507.03724
•
Published
•
157
A Survey of Context Engineering for Large Language Models
Paper
•
2507.13334
•
Published
•
259
Fast-dLLM v2: Efficient Block-Diffusion LLM
Paper
•
2509.26328
•
Published
•
55
CoDA: Coding LM via Diffusion Adaptation
Paper
•
2510.03270
•
Published
•
42
Drax: Speech Recognition with Discrete Flow Matching
Paper
•
2510.04162
•
Published
•
27
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning
Paper
•
2510.04081
•
Published
•
23