Model Diffing Project
AIPlans/Qwen3-0.6B-KTO • Text Generation • Updated Nov 22 • 4 • 1
AIPlans/Qwen3-0.6B-ORPO • Text Generation • Updated 29 days ago • 35
AIPlans/Qwen3-0.6B-DPO_NOTLORA • Text Generation • 0.6B • Updated Nov 25 • 5
AIPlans/Qwen3-0.6B-DPO • Text Generation • Updated Nov 22 • 4

Red Teaming Alignment Evals
AIPlans/Qwen-HHH-Cipher-Eng • Text Generation • 0.5B • Updated Jun 14 • 10
AIPlans/Qwen-HHH-Sans-Eng • Text Generation • 0.5B • Updated Jun 11 • 10
AIPlans/Qwen3-HHH-Cipher-Eng • Text Generation • 0.6B • Updated Jun 15 • 16
AIPlans/Ethics_Commonsense • Preview • Updated Jun 21 • 30

Post Training Versions - Qwen 0.6B
Different versions of Qwen 0.6B, where the only difference is the post-training method used. The post-training dataset will be HelpSteer2.
AIPlans/Qwen3-0.6B-ORPO • Text Generation • Updated 29 days ago • 35
AIPlans/Qwen3-0.6B-DPO_NOTLORA • Text Generation • 0.6B • Updated Nov 25 • 5
AIPlans/Qwen3-0.6B-GRPO_Epoch2 • Text Generation • 0.6B • Updated 9 days ago • 12
AIPlans/Qwen3-0.6B-ReMax • Reinforcement Learning • 0.6B • Updated 5 days ago • 24 • 1

Model Diffing
AIPlans/qwen3-8b-dpo-hh-rlhf • Updated Jul 4
AIPlans/qwen3-8b-ipo-hh-rlhf • Text Generation • Updated Jul 17 • 4
AIPlans/dpo_qwen0_6b_fft • 0.6B • Updated Sep 24 • 5
AIPlans/qwen3-0.6b-dpo-lora • Text Generation • 0.6B • Updated Sep 18 • 8 • 1