Post Training Versions - Qwen 0.6B - an AIPlans Collection

updated 13 days ago

Different versions of Qwen3-0.6B, where the only difference is the post-training method used. The post-training dataset is HelpSteer2.

AIPlans/Qwen3-0.6B-ORPO

Text Generation • Updated Nov 28, 2025 • 4

AIPlans/Qwen3-0.6B-DPO_NOTLORA

Text Generation • 0.6B • Updated Nov 25, 2025 • 3

AIPlans/Qwen3-0.6B-GRPO_Epoch2

Text Generation • 0.6B • Updated 17 days ago • 12

AIPlans/Qwen3-0.6B-ReMax

Reinforcement Learning • 0.6B • Updated 13 days ago • 25 • 1

AIPlans/Qwen3-0.6B-IPO

Reinforcement Learning • 0.6B • Updated 24 days ago • 9

AIPlans/Qwen3-0.6B-KTO

Text Generation • Updated Nov 22, 2025 • 2 • 1
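
Since the checkpoints listed above are meant to differ only in post-training method, one natural use is to run the same prompt through each and compare the outputs. The following is a minimal sketch of that idea, not something taken from the collection itself. The repo IDs are copied from the listing; `method_label`, `generate`, and `compare` are hypothetical helper names, and the sketch assumes each repo holds full causal-LM weights loadable with the `transformers` Auto classes, which the listing does not confirm (some repos could be adapters).

```python
# Repo IDs copied from the collection listing.
MODEL_IDS = [
    "AIPlans/Qwen3-0.6B-ORPO",
    "AIPlans/Qwen3-0.6B-DPO_NOTLORA",
    "AIPlans/Qwen3-0.6B-GRPO_Epoch2",
    "AIPlans/Qwen3-0.6B-ReMax",
    "AIPlans/Qwen3-0.6B-IPO",
    "AIPlans/Qwen3-0.6B-KTO",
]

BASE_PREFIX = "AIPlans/Qwen3-0.6B-"


def method_label(repo_id: str) -> str:
    """Return the post-training suffix of a repo ID, e.g. 'ORPO'."""
    if not repo_id.startswith(BASE_PREFIX):
        raise ValueError(f"unexpected repo id: {repo_id}")
    return repo_id[len(BASE_PREFIX):]


def generate(repo_id: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Download one checkpoint and return the prompt plus a greedy continuation.

    Assumption: the repo contains full model weights and a tokenizer.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    # do_sample=False gives greedy decoding, so runs are repeatable.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def compare(prompt: str, model_ids=MODEL_IDS) -> dict:
    """Map each post-training label to that model's output for one prompt."""
    return {method_label(m): generate(m, prompt) for m in model_ids}
```

Calling `compare("Explain what a hash table is.")` would download all six checkpoints in turn, so it needs network access and some disk space. Greedy decoding is used here so that differences between outputs come from the models rather than from sampling noise.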
