RMCBench: Benchmarking Large Language Models' Resistance to Malicious Code Paper • 2409.15154 • Published Sep 23, 2024
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought Paper • 2505.15431 • Published May 21, 2025
ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation Paper • 2507.04952 • Published Jul 7, 2025
Adaptive Termination for Multi-round Parallel Reasoning: An Universal Semantic Entropy-Guided Framework Paper • 2507.06829 • Published Jul 9, 2025