Lynx Li Blog

57 posts in total


2025

05-12
Optimizer Factory -- Writing an Optimizer Factory with Layer-wise Decay
05-12
RLHF -- DPO
05-12
RLHF -- GRPO
05-12
RLHF -- From Zero to PPO: Code
05-12
RLHF -- From Zero to PPO: Theory
05-12
The Original sin/cos Encoding
05-12
Why model.enable_input_require_grads()?
05-08
Rethinking R1-like Rule-based RL
05-08
05: Matrix Decomposition

2024

01-07
02: Signal Analysis Methods