arXiv Daily
July 8, 2025
Kim Seonghyeon
Jul 8
Pre-Trained Policy Discriminators are General Reward Models