
Author of related work here. This is very cool! I was hoping they would try to invert layer by layer, from the output back to the input, but it seems they run a search process at the input layer instead. They rightly point out that the residual connections make a layer-by-layer approach difficult. I would point out, though, that an RMSNorm layer should be invertible: the epsilon term in the denominator can be used to recover the input magnitude.
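To make the RMSNorm point concrete, here is a minimal sketch of one way to read it. It assumes the common form y = x / sqrt(mean(x^2) + eps) with the learned gain left out (or already divided out, if it is known and nonzero). The function names and the eps value are illustrative, not taken from the paper. Writing m = mean(x^2), we get mean(y^2) = m / (m + eps), which is strictly below 1 and can be solved for m + eps.

```python
import math

EPS = 1e-6  # assumed epsilon for illustration; real models choose their own


def rmsnorm(x, eps=EPS):
    """RMSNorm without a learned gain: x / sqrt(mean(x^2) + eps)."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = math.sqrt(mean_sq + eps)
    return [v / scale for v in x]


def rmsnorm_inverse(y, eps=EPS):
    """Recover x from y = rmsnorm(x, eps).

    With m = mean(x^2): mean(y^2) = m / (m + eps),
    so m + eps = eps / (1 - mean(y^2)) and x = y * sqrt(m + eps).
    """
    s = sum(v * v for v in y) / len(y)
    scale = math.sqrt(eps / (1.0 - s))
    return [v * scale for v in y]


x = [0.5, -1.25, 2.0, 0.1]
x_rec = rmsnorm_inverse(rmsnorm(x))
```

With eps = 0 the magnitude would be lost entirely, since every rescaling of x maps to the same y. The catch is conditioning: when mean(x^2) is much larger than eps, mean(y^2) sits extremely close to 1, so the subtraction 1 - mean(y^2) loses most of its precision. The sketch uses Python's 64-bit floats; at the lower precisions models typically run in, this recovery may not be usable.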




What is meant by "residual connections" here?


