Zhiyuan Ning

nzynzy

AI & ML interests

None yet

Recent Activity

upvoted a paper 30 days ago

The Station: An Open-World Environment for AI-Driven Discovery

upvoted a paper 30 days ago

GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

upvoted a paper 30 days ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Organizations

None yet

upvoted 8 papers 30 days ago

The Station: An Open-World Environment for AI-Driven Discovery

Paper • 2511.06309 • Published Nov 9 • 36

GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

Paper • 2511.17592 • Published Nov 17 • 118

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published Nov 20 • 108

Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published Nov 25 • 117

upvoted a paper about 1 month ago

What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity

Paper • 2511.15593 • Published Nov 19 • 57

upvoted a paper 2 months ago

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published Oct 24 • 99

upvoted 10 papers 3 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9 • 36

LaSeR: Reinforcement Learning with Last-Token Self-Rewarding

Paper • 2510.14943 • Published Oct 16 • 39

Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6 • 30

Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels

Paper • 2510.06499 • Published Oct 7 • 31

Vibe Checker: Aligning Code Evaluation with Human Preference

Paper • 2510.07315 • Published Oct 8 • 32

RLFR: Extending Reinforcement Learning for LLMs with Flow Environment

Paper • 2510.10201 • Published Oct 11 • 35

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published Oct 9 • 41

MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Paper • 2510.03632 • Published Oct 4 • 41

Training-Free Group Relative Policy Optimization

Paper • 2510.08191 • Published Oct 9 • 44

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3 • 75
