
Seems like the model isn't limited to those though, from the paper:

> as well as some additional relevant languages (Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian).

https://arxiv.org/pdf/2409.16235

The paper also goes into detail on the training set sources; I feel like the curation of those might be considered the main contribution of this publication?


