
As embedding models become more performant and context windows increase, “ideal chunking” becomes less relevant.

Cost isn’t as important to us, so we use small chunks and then just pull in the page before and after each hit (rough sketch below). If you do this on 20+ matches (since you’re decomposing the query into multiple sub-queries), you’re very likely to find the relevant content.

Queries can get more expensive, but you’re also building a corpus of “great answers” to test against as you refine your approach. Model costs are plummeting too, which makes brute-forcing it more and more viable.
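
A minimal sketch of that approach, assuming a generic vector index with a search(query, k) method and a page store keyed by (doc_id, page number); the names are placeholders, not any particular library:

    # Hypothetical sketch of "small chunks, then pull in the neighboring pages".
    # vector_index.search() and the pages dict are assumed interfaces, not a real API.
    def retrieve_with_neighbor_pages(vector_index, pages, sub_queries, k_per_query=5):
        seen, context_blocks = set(), []
        for q in sub_queries:                       # one pass per decomposed sub-query
            for hit in vector_index.search(q, k=k_per_query):
                # expand each small-chunk hit to the page before and after it
                for page_no in (hit.page - 1, hit.page, hit.page + 1):
                    key = (hit.doc_id, page_no)
                    if key not in seen and key in pages:
                        seen.add(key)
                        context_blocks.append(pages[key])
        return context_blocks

With four-plus sub-queries and k=5 you’re in the “20+ matches” range mentioned above, and the dedup keeps the expanded pages from blowing up the prompt too badly.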



Bigger context windows don't work as well as advertised. The "needle in a haystack" problem is not solved yet.

Bigger context windows also mean a lot more time waiting for an answer. Given the quadratic cost of attention in context length, we are stuck using transformers on smaller chunks. Other architectures like Mamba may solve that, but even then, long-context accuracy hasn't improved anywhere near the 1000x that window sizes have.
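
A back-of-the-envelope illustration of that quadratic scaling (the numbers are illustrative token counts, not benchmarks of any particular model):

    # Self-attention compares every token with every other token,
    # so pairwise work grows with the square of the context length.
    for n_tokens in (8_000, 32_000, 128_000):
        pairwise = n_tokens ** 2
        print(f"{n_tokens:>7} tokens -> {pairwise:.1e} attention scores per layer/head")

Going from 8K to 128K tokens is 16x more input but roughly 256x more attention work, which is where the extra waiting comes from.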



