Lynx Li
May 8, 2025 am

Rethinking R1-like Rule-based RL

Last updated: May 8, 2025 am

Introduction


Research Blogs
#LLM #Reasoning
https://jesseprince.github.io/2025/05/08/research/rule_based_rl/
Author
林正
Posted on
May 8, 2025
