
> There's also the question of pushing the semantic aspect to the ranker vs the index

Could it make sense to run the vector lookup dynamically, over only the best fragments of the FTS result set? This could save a lot of money on a massive corpus, because you would only pay to embed content that is actually being searched for at runtime.
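A rough sketch of what that two-stage flow might look like. Everything here is a stand-in: `fts_candidates` fakes the FTS step with plain term overlap, and `toy_embed` is a bag-of-words placeholder for whatever paid embedding call you would actually make. The names are hypothetical, not from any particular library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_embed(text):
    """Placeholder for a real (paid) embedding call: sparse term counts."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fts_candidates(query, fragments, limit=50):
    """Stand-in for FTS: keep fragments sharing a query term, best overlap first."""
    terms = set(tokenize(query))
    hits = []
    for fragment in fragments:
        overlap = len(terms & set(tokenize(fragment)))
        if overlap > 0:
            hits.append((overlap, fragment))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [fragment for _, fragment in hits[:limit]]


def search(query, fragments, top_k=3):
    """Embed only the FTS survivors, at query time, then rank by similarity."""
    candidates = fts_candidates(query, fragments)
    query_vec = toy_embed(query)
    ranked = sorted(
        candidates,
        key=lambda fragment: cosine(query_vec, toy_embed(fragment)),
        reverse=True,
    )
    return ranked[:top_k]
```

Nothing is embedded at index time here, so the embedding cost scales with query traffic rather than corpus size. In practice you would probably want to cache embeddings per fragment so repeated queries don't pay twice, but that is an assumption about the workload, not something this sketch does.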

Focusing on just the best fragments could also improve the SNR going into the final vector search phase, especially if the fragment length is managed appropriately for each kind of document. If we are dealing with a method from a codebase, we might prefer an unlimited fragment length. For a 20-megabyte PDF, it could be closer to the size of a tweet.
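One way the per-kind fragment length could be expressed, as a minimal sketch. The kind labels, the 280-character "tweet" budget, and the default are all made-up values for illustration, and `best_fragment` is a hypothetical helper that assumes the FTS step hands back the character offsets of the match.

```python
# Hypothetical character budgets per document kind; None means "do not truncate".
FRAGMENT_BUDGET = {
    "code_method": None,  # keep the whole method
    "pdf": 280,           # roughly tweet-sized
}
DEFAULT_BUDGET = 1000     # assumed fallback for unlisted kinds


def best_fragment(text, match_start, match_end, kind):
    """Return a window of text around the match, sized by document kind."""
    budget = FRAGMENT_BUDGET.get(kind, DEFAULT_BUDGET)
    if budget is None or len(text) <= budget:
        return text
    # Center the window on the match, clamped to the bounds of the text.
    center = (match_start + match_end) // 2
    start = max(0, min(center - budget // 2, len(text) - budget))
    return text[start:start + budget]
```

The window is what would get embedded at query time, so a smaller budget means both less noise and fewer tokens billed per candidate.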


