
This paper suggests that you don't need the debating part: just have the LLM work on the problem independently several times and choose the most popular answer.
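Roughly, that sampling-and-voting loop looks like the sketch below (ask_llm is a hypothetical stand-in for whatever completion API you're calling, not something from the paper):

    import random  # only used by the illustrative stub
    from collections import Counter

    def ask_llm(prompt: str, temperature: float = 0.7) -> str:
        # Hypothetical stub for a real LLM call; returns one sampled answer.
        return random.choice(["42", "42", "41"])

    def sample_and_vote(prompt: str, n_samples: int = 10) -> str:
        # Query the model independently n times, then majority-vote the answers.
        answers = [ask_llm(prompt) for _ in range(n_samples)]
        return Counter(answers).most_common(1)[0][0]

    print(sample_and_vote("What is 6 * 7?"))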


The paper says the method enhances existing techniques such as prompt engineering (chain of thought) and LLM debate; the agent method is orthogonal to debate, not a replacement for it.


Interesting. Somehow it seems odd to add randomness (temperature) and then wash it away by averaging it out.


In optimization problems, randomness can often get you out of local minima/maxima, so aggregating a bunch of random search paths can give you better results, especially in the worst case. Something similar might be happening here: the training set will be biased in various ways that can create weird local min/max points, and this process could avoid those kinks.
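The classic form of that idea is random restarts: run a noisy local search from many random starting points and keep the best result, which is far more robust to local minima than a single run. A toy sketch (the objective f is made up just to have lots of local minima):

    import math
    import random

    def f(x: float) -> float:
        # Bumpy objective: many local minima, global minimum near x ~ -0.5.
        return x * x + 10 * math.sin(3 * x)

    def hill_climb(x: float, step: float = 0.05, iters: int = 5000) -> float:
        # Greedy local search: accept a random neighbor only if it improves f.
        for _ in range(iters):
            cand = x + random.uniform(-step, step)
            if f(cand) < f(x):
                x = cand
        return x

    # Many independent random starts, then keep the best; a single start can
    # easily get stuck in whatever basin it happens to land in.
    starts = [random.uniform(-10.0, 10.0) for _ in range(20)]
    best = min((hill_climb(s) for s in starts), key=f)
    print(best, f(best))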


Temperature applies to each token, so the randomness it injects spans far more decisions (every token choice) than the answer-level averaging that pulls it back out.
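For concreteness: temperature scales the logits inside the per-token softmax, so every token in the generation is its own random draw, while the vote/average happens only once over the final answers. A minimal sketch of the per-token step (plain Python, not any particular library's API):

    import math
    import random

    def sample_token(logits: list[float], temperature: float = 1.0) -> int:
        # Temperature-scaled softmax sampling for a single token position;
        # this runs once per generated token, not once per answer.
        scaled = [l / temperature for l in logits]
        m = max(scaled)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        return random.choices(range(len(logits)), weights=probs, k=1)[0]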



