> The prompt used to create the code should also be provided. The LLM-generated code should be clearly marked.
I have a feeling the people who write these haven't really used LLMs for programming because even just playing around with them will make it obvious that this makes no sense - especially if you try to use something local based that lets you rewrite the discussion at will, including any code the LLM generated. E.g. sometimes when trying to get Devstral make something for me, i let it generate whatever (sometimes buggy/not working) code it comes up with[0] and then i start editing its response to fix the bug so that further instructions are under the assumption it generated the correct code from the get go instead of trying to convince it[0] to fix the code it generated. In such a scenario there is no clear separation between LLM-generated code and manually written code nor any specific "prompt" (unless you count all snapshots of the entire discussion every time one hits the "submit" button as a series of prompts, which technically is what the LLM using as a prompt instead of what the user types, but i doubt this was what the author had in mind).
And all that without taking into account what someone commented in the article about code not even done in a single session but with plans, restarting from scratch, summarizing, etc (and there are tools to automate these too and those can use a variety of prompts by themselves that the end user isn't even aware of).
TBH i think if FSF wants to "consider LLMs" they should begin by gaining some real experience using them first - and bringing people with such experience on board to explain things for them.
[0] i do not like anthropomorphizing LLMs, but i cannot think of another description for that :-P
> I have a feeling the people who write these haven't really used LLMs for programming because even just playing around with them will make it obvious that this makes no sense
This is one problem with LLM generated code. It is very greenfield. There’s no correct or even good way to do it. Because it’s a little bit unbounded in possible approaches and quality of output.
I’ve tried tracking prompt history in many permutations as a means to documenting and making rollbacks more possible. I hasn’t felt like that's the right way to think about it.
What you're describing isn't any different from a branch of commits between two people practicing a form of continuous integration where they commit whatever they have (whether it breaks the build or not, or is buggy, etc.), capped off by a merge commit when it's finally in the finished state.
Eh, i do not think these are comparable, unless you really stretch the idea of what is a "commit", who makes it and you consider all sorts of destructive modifications of branch history and commits normal.
Hal and Dave work together. Hal is going home at 6:00 PM, but before it's time to leave, Dave tells Hal to go ahead and start working on some new feature. At 5:50 PM, Hal hits Cmd+Q, saving whatever unfinished work there is and no matter what state it's in and commits it to a new development branch with the commit message "Start on $X" followed by a copy of the explanation of that Dave first gave Hal about what they needed to do. Then Hal pushes that commit upstream for Dave and leaves. At 6:00 PM Dave, still at the office, runs git-pull, spends a little time fixing up several issues with the code Hal wrote, then commits the result and pushes it to the development branch of the shared repo. Dave's changes mainly focus on getting the project to build again and making sure some or all of the existing tests pass. Dave then writes an email to Hal about this progress. At 8:30 PM Hal reads Dave's email about what Dave fixed and what Hal should do now. Hal then runs git-pull and writes some more code, pushing the result to the development branch before watching a movie and going to bed. Around midnight, Dave runs git-pull, fixes some more problems with the code that Hal wrote, and then pushes that to the repo. The next day at the office, they resume their work together following this pattern, where Hal writes the bulk of the code followed by Dave fixing it up and/or providing instruction for Hal about how to proceed. When they're done, one of them switches to the main branch with `git checkout main` and runs `git merge $OUR_DEVELOPMENT_BRANCH_NAME`.
Which part of this entails "destructive modifications of branch history"?
I have a feeling the people who write these haven't really used LLMs for programming because even just playing around with them will make it obvious that this makes no sense - especially if you try to use something local based that lets you rewrite the discussion at will, including any code the LLM generated. E.g. sometimes when trying to get Devstral make something for me, i let it generate whatever (sometimes buggy/not working) code it comes up with[0] and then i start editing its response to fix the bug so that further instructions are under the assumption it generated the correct code from the get go instead of trying to convince it[0] to fix the code it generated. In such a scenario there is no clear separation between LLM-generated code and manually written code nor any specific "prompt" (unless you count all snapshots of the entire discussion every time one hits the "submit" button as a series of prompts, which technically is what the LLM using as a prompt instead of what the user types, but i doubt this was what the author had in mind).
And all that without taking into account what someone commented in the article about code not even done in a single session but with plans, restarting from scratch, summarizing, etc (and there are tools to automate these too and those can use a variety of prompts by themselves that the end user isn't even aware of).
TBH i think if FSF wants to "consider LLMs" they should begin by gaining some real experience using them first - and bringing people with such experience on board to explain things for them.
[0] i do not like anthropomorphizing LLMs, but i cannot think of another description for that :-P