D-RLAIF
openai/summarize_from_feedback (updated Jan 3, 2023)
trl-internal-testing/tldr-preference-sft-trl-style (updated Aug 20, 2024)