Hacker News
Compiling LLMs into a MegaKernel: A path to low-latency inference (zhihaojia.medium.com)
314 points by matt_d 4 months ago | 76 comments
