
Is anyone working on, or does anyone know of, a library for evaluating LLMs for application features and/or application features that use LLMs? I'm wondering what people use, or whether anyone has built their own solution.




There would be so much subjectivity to this. I like the idea, but executing it in a reliable, repeatable way would be very challenging imo.


