OpenDataArena

community

https://opendataarena.github.io

Activity Feed

AI & ML interests

Data-centric AI, LLM, MLLM

Recent Activity

yu0226

authored 4 papers 7 days ago

REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once

Paper • 2507.10541 • Published Jul 14 • 29

InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning

Paper • 2506.12411 • Published Jun 14

ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning

Paper • 2509.21070 • Published Sep 25 • 9

Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning

Paper • 2510.04081 • Published Oct 5 • 23

Xiaoyang318

updated a dataset 11 days ago

OpenDataArena/MathLake

Viewer • Updated 11 days ago • 8.31M • 870 • 9

Xiaoyang318

published a dataset 11 days ago

OpenDataArena/MathLake

Viewer • Updated 11 days ago • 8.31M • 870 • 9

LHL3341

authored 5 papers about 2 months ago

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Paper • 2410.09732 • Published Oct 13, 2024 • 54

A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

Paper • 2504.12322 • Published Apr 11 • 28

Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning

Paper • 2507.17512 • Published Jul 23 • 36

Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning

Paper • 2510.04081 • Published Oct 5 • 23

Where am I? Cross-View Geo-localization with Natural Language Descriptions

Paper • 2412.17007 • Published Dec 22, 2024

apeters

authored 9 papers 2 months ago

FABind: Fast and Accurate Protein-Ligand Binding

Paper • 2310.06763 • Published Oct 10, 2023

BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations

Paper • 2310.07276 • Published Oct 11, 2023 • 5

MolXPT: Wrapping Molecules with Text for Generative Pre-training

Paper • 2305.10688 • Published May 18, 2023 • 1

BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning

Paper • 2402.17810 • Published Feb 27, 2024 • 1

Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

Paper • 2403.01528 • Published Mar 3, 2024 • 1

SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction

Paper • 2206.09818 • Published Jun 20, 2022

3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization

Paper • 2406.05797 • Published Jun 9, 2024 • 2

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining

Paper • 2410.08102 • Published Oct 10, 2024 • 21

Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change

Paper • 2210.17127 • Published Oct 31, 2022 • 1

Team members 13
