
>Sorry, it won’t. This might even be peak GPT. Training data comes from human content, and currently there is decades worth of pure human content available. But new content will come in slowly, and it will probably take decades just to double the amount of training data we have today.

One word: AlphaZero. DeepMind ran out of human Go games to study, but it turned out that self-play was dramatically better. Your argument only holds if a) there's a linear relationship between the amount of training data and the quality of a model and b) GPT is close to maximally efficient in converting training data into useful weights. Both of these premises are demonstrably false.

GPT-4 is, in the scheme of what's possible, an incredibly primitive model that uses training data very inefficiently. In spite of that, a dumb brute force architecture still managed to vastly exceed everyone's expectations and advance the SOTA by a huge leap.



In Go, or similarly chess, the AI can play a stupendous number of games against itself and get accurate feedback for every single game. Everything needed to create your own training set is there just from knowing the rules. But outside of such games, how does an AI create its own training data when there is no function to tell you how well you are doing? This might be a dumb question; I don't have any idea how LLMs work.


One such function is “what happens next?” which may work as well in the real world as on textual training data. Certainly it’s part of how human babies learn, via schemas.
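To make that concrete, here is a toy version of the "what happens next?" objective: a bigram lookup table built from raw text. The point is that the text supplies its own labels, so no external grading function is needed; the tiny corpus and counting model below are stand-ins for a real tokenizer and neural network.

    # "What happens next?" as a self-supervised signal: the next token
    # in the raw text is the label, so no human annotation is required.
    from collections import Counter, defaultdict

    text = "the cat sat on the mat because the cat was tired"
    tokens = text.split()

    # Count, for each token, what tends to follow it.
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1

    # Predict what comes after the word "the".
    print(following["the"].most_common())  # [('cat', 2), ('mat', 1)]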


Creating something is much harder than verifying it.

A simple setup for improving coding skills is the following (a code sketch follows the list):

1. GPT is given a coding task to implement as a high level prompt.

2. It generates unit tests to verify that the implementation is correct.

3. It generates code to implement the algorithm.

4. It runs the generated code against the generated unit tests. If there are errors generated by the interpreter/compiler, go back to Step 3, modify the code appropriately and try again.

5. If there are no errors found, take the generated code as a positive example and update the model weights with reinforcement learning.
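A hedged sketch of that loop, assuming a hypothetical `llm` object with `generate` and `reinforce` methods (neither is a real library API) and pytest as the test runner:

    # Sketch of the five-step loop above. The `llm` object is a
    # hypothetical stand-in for a model's sampling and RL-update
    # interfaces; only the test-running part uses real tooling.
    import os
    import subprocess
    import tempfile

    MAX_ATTEMPTS = 5

    def run_tests(code, tests):
        """Run the generated tests against the generated code.
        Returns None on success, or the error output otherwise."""
        with tempfile.TemporaryDirectory() as d:
            with open(os.path.join(d, "solution.py"), "w") as f:
                f.write(code)
            with open(os.path.join(d, "test_solution.py"), "w") as f:
                f.write(tests)
            result = subprocess.run(
                ["python", "-m", "pytest", d],
                capture_output=True, text=True,
            )
            return None if result.returncode == 0 else result.stdout

    def self_improve(llm, task_prompt):
        # Step 2: generate unit tests from the high-level prompt.
        tests = llm.generate(f"Write pytest tests for: {task_prompt}")
        # Step 3: generate a candidate implementation.
        code = llm.generate(f"Implement in solution.py: {task_prompt}")
        for _ in range(MAX_ATTEMPTS):
            # Step 4: run the candidate against the generated tests.
            errors = run_tests(code, tests)
            if errors is None:
                # Step 5: tests pass; reinforce as a positive example.
                llm.reinforce(prompt=task_prompt, completion=code,
                              reward=1.0)
                return code
            # Otherwise, back to step 3 with the error feedback.
            code = llm.generate(
                f"Fix this code so the tests pass.\n"
                f"Errors:\n{errors}\nCode:\n{code}"
            )
        return None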


What if it’s wrong at step 2?


The most naive approach would be to procedurally generate immense amounts of Python code, then ask the model to predict whether the code will compile, whether it will crash, what its outputs will be given certain inputs, etc.
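A minimal sketch of that data factory, using Python's own compile() and exec() to produce the ground-truth labels the model would be trained to predict (the templates are toy stand-ins for a real program generator):

    # Procedurally generate tiny snippets, then label each one by
    # actually compiling and running it. The (snippet, label) pairs
    # become the model's prediction targets.
    import random

    TEMPLATES = [
        "x = {a} / {b}",       # may raise ZeroDivisionError
        "x = [1, 2, 3][{a}]",  # may raise IndexError
        "x = {a} + '{b}'",     # always raises TypeError
        "x = {a} +",           # never compiles
        "x = {a} * {b}",       # always runs
    ]

    def make_snippet():
        t = random.choice(TEMPLATES)
        return t.format(a=random.randint(-3, 3), b=random.randint(-3, 3))

    def label(snippet):
        try:
            compiled = compile(snippet, "<generated>", "exec")
        except SyntaxError:
            return "does not compile"
        try:
            exec(compiled, {})
        except Exception as e:
            return f"crashes: {type(e).__name__}"
        return "runs"

    for _ in range(5):
        s = make_snippet()
        print(f"{s!r:25} -> {label(s)}")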


Code execution is also a good way to collect feedback signals.


Well, there is a fairly predictable relationship between the amount of training data and the quality of the model: the Chinchilla scaling laws [1], though it's a power law rather than a strictly linear one.

[1]: Hoffmann et al., "Training Compute-Optimal Large Language Models" (2022), https://arxiv.org/abs/2203.15556
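For context, the fit in that paper models loss as a power law in both parameter count N and training tokens D; the constants below are approximate, as reported in the paper:

    % Chinchilla fit (Hoffmann et al., 2022): loss as a function of
    % parameters N and training tokens D, with the paper's constants.
    L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
    \qquad E \approx 1.69,\quad A \approx 406.4,\quad B \approx 410.7,
    \quad \alpha \approx 0.34,\quad \beta \approx 0.28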



