RLHF -- From Zero to PPO: Code Edition

Last updated: February 16, 2025

1 A Simple Reinforcement Learning Example

ongoing

2 The PPO Implementation as Seen in OpenRLHF

ongoing

https://jesseprince.github.io/2025/02/16/aigcs/rlhf/ppo_from_start_code/

Author: 林正

Posted on: February 16, 2025
