RLHF -- DPO
Author: Lynx Li. Posted on May 12, 2025. Last updated on November 23, 2025.
Category: LLM > RLHF. Tags: #DeepLearning #IntelligentSystems #AIGC
https://lynx-li.github.io/2025/05/12/llms/rlhf/dpo/