papers - a estudanteBr Collection

updated Oct 12, 2025

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 501
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Paper • 2510.07499 • Published Oct 8, 2025 • 48
Improving Context Fidelity via Native Retrieval-Augmented Reasoning

Paper • 2509.13683 • Published Sep 17, 2025 • 8
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering

Paper • 2509.00798 • Published Aug 31, 2025 • 1
Retrieval Feedback Memory Enhancement Large Model Retrieval Generation Method

Paper • 2508.17862 • Published Aug 25, 2025
Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction

Paper • 2509.03540 • Published Aug 31, 2025 • 1
Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation

Paper • 2508.09755 • Published Aug 13, 2025
MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains

Paper • 2508.18260 • Published Aug 25, 2025
From Ranking to Selection: A Simple but Efficient Dynamic Passage Selector for Retrieval Augmented Generation

Paper • 2508.09497 • Published Aug 13, 2025
MemMamba: Rethinking Memory Patterns in State Space Model

Paper • 2510.03279 • Published Sep 28, 2025 • 72
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 63
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning

Paper • 2502.15425 • Published Feb 21, 2025 • 9
Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3, 2025 • 85
Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 168
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Paper • 2406.06469 • Published Jun 10, 2024 • 29
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots

Paper • 2409.10277 • Published Sep 16, 2024 • 1
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Paper • 2504.17432 • Published Apr 24, 2025 • 40
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26, 2025 • 45
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

Paper • 2509.23808 • Published Sep 28, 2025 • 47
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models

Paper • 2510.03561 • Published Oct 3, 2025 • 24
JULI: Jailbreak Large Language Models by Self-Introspection

Paper • 2505.11790 • Published May 17, 2025
Reward Reasoning Model

Paper • 2505.14674 • Published May 20, 2025 • 37
CodeContests+: High-Quality Test Case Generation for Competitive Programming

Paper • 2506.05817 • Published Jun 6, 2025 • 9
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2, 2025 • 238
Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot

Paper • 2506.14641 • Published Jun 17, 2025
The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs

Paper • 2507.07562 • Published Jul 10, 2025 • 1
Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation

Paper • 2506.17088 • Published Jun 20, 2025
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 144
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27, 2025 • 30
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 139
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21, 2025 • 88
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 120
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 188
MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4, 2025 • 157
A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17, 2025 • 259
Fast-dLLM v2: Efficient Block-Diffusion LLM

Paper • 2509.26328 • Published Sep 30, 2025 • 55
CoDA: Coding LM via Diffusion Adaptation

Paper • 2510.03270 • Published Sep 27, 2025 • 42
Drax: Speech Recognition with Discrete Flow Matching

Paper • 2510.04162 • Published Oct 5, 2025 • 27
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning

Paper • 2510.04081 • Published Oct 5, 2025 • 23