Hacker News | mluo's submissions
1. DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL (pretty-radio-b75.notion.site)
19 points by mluo 11 months ago
