Lynx Li
58 posts in total
2025
05-08
Differentiable Permutation Layer
05-08
Rethinking R1-like Rule-based RL
04-30
01-EM Models
02-16
RLHF -- GRPO
02-16
RLHF -- DPO
02-16
RLHF -- From Zero to PPO (Code)
02-13
RLHF -- From Zero to PPO (Theory)
02-06
The Original sin/cos Encoding
2024
07-26
Basic Applications of Stacks
03-11
Mamba