SelectiveDPO - a glorgao Collection

SelectiveDPO

Updated May 15, 2025

Released models trained with Selective DPO.

glorgao/SelectiveDPO-Gemma2-9B-SFT-UFBinarized

Text Generation • 9B • Updated May 15, 2025 • 12

glorgao/SelectiveDPO-Llama3-8B-SFT-UFBinarized

Text Generation • 8B • Updated May 15, 2025 • 14 • 1

glorgao/SelectiveDPO-Qwen2.5-7B-SFT-UFBinarized

Text Generation • 7B • Updated May 15, 2025 • 11 • 1

glorgao/SelectiveDPO-Mistral-7B-SFT-UFBinarized

Text Generation • 7B • Updated May 15, 2025 • 7

Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples

Paper • 2502.09650 • Published Feb 11, 2025