RLHF -- From Zero to PPO: Code Edition
Last updated on November 23, 2025 pm
1 A Simple Reinforcement Learning Example
ongoing
2 The PPO Implementation as Seen in OpenRLHF
ongoing
https://lynx-li.github.io/2025/05/12/llms/rlhf/ppo_from_start_code/