Is anyone working on or knows a library for evaluating LLMs for application feat... | Hacker News


koakuma-chan 4 days ago | on: Ask HN: What Are You Working On? (Nov 2025)

Is anyone working on, or does anyone know of, a library for evaluating LLMs for application features and/or application features that use LLMs? I am wondering what people use, or if anyone has their own solution.

Supercompressor 4 days ago

There would be so much subjectivity to this. I like the idea, but executing it in a reliable, repeatable way would be very challenging imo.
