What was supposed to be the important part, "AI bad"? The author is not some clueless pedestrian; they are clearly online enough to be fully aware that all social media companies treat their users like cattle. So why the Pikachu face when Instagram (of all things!) does something it is designed to do: squeeze every last bit of value from its digital serfs?
Llama was not great, it was barely good: it wasn't very smart or creative, and it had its guardrails cranked up to 11. Local models didn't get interesting until Mistral and China entered the game. Meta still hasn't released its image models, which have been trained on tens of thousands of my photos.
yeah, well, it was all we had, hence llama.cpp, ollama, r/localllama, etc, all of which look increasingly silly now that it's highly unlikely we'll ever have another Llama.
Yeah, I’ll give Meta a bit of credit for that, but I remember Llama’s first release was a leak, and I remember frantically downloading the weights just in case they got permanently taken down. To everyone’s surprise, Meta decided to roll with it and embrace the open source community. Unfortunately they face-planted with Llama 4, which was weird since they were supposed to have so much “talent” working on it.
The new SAM (Segment Anything) and SAM3D are actually impressive, and good on them for releasing them to the public. They still need to release an image model.
I honestly believe the weird pursuit of “safety” is what sabotaged them; it seems to lobotomize models. It’s also the reason Stable Diffusion went from the hot thing to a joke. Stable Diffusion 3 was so safe you couldn’t generate a woman lying down on some grass, because that’s apparently dangerous for reasons unknown.
All models have had their “safety” and guardrails removed by the community and the world didn’t end.
You wouldn’t though. And that’s part of the allure. Someone doing something transgressive that the rest of us wouldn’t, and (sometimes) making something beautiful. I certainly wouldn’t give the label superhero. But it can be interesting.
If solutions always come first then you might never get a chance to vent. Maybe venting clears the annoyance from the brain enough to make it easier to understand any solutions that might be offered. Also sometimes I have been offered solutions that seem obvious to me, like did you really think I hadn’t thought of that? Which is especially piquing haha
> Also sometimes I have been offered solutions that seem obvious to me, like did you really think I hadn’t thought of that? Which is especially piquing haha
Yes, but that's still a solution-minded thing. I sometimes complain as well, but, as mentioned, as sort of a rubber-ducking method. I listen to the proposals again and go: nah, tried that, it leads to X; that doesn't work because of Y. But sometimes, even with these obvious solutions, there are tiny aspects I overlooked or bypasses I did not consider, so this is still potentially useful. And, yes, if we both can't find a solution that is acceptable, then commiseration is in order. But I'd never manifest anger or disapproval about someone wanting to help.
Yes. The learning comes from running tests on the program and ensuring they pass, so running as an agent. Tests and the compiler give hard feedback; that's the data outside the model that it learns from.
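Roughly what I mean, as a Python sketch; the file names, the reward values, and the use of pytest are placeholder assumptions, not a real pipeline:

```python
import pathlib
import subprocess
import sys
import tempfile

def hard_feedback(candidate_src: str, test_src: str) -> float:
    """Only the interpreter and the test suite judge the candidate program."""
    workdir = pathlib.Path(tempfile.mkdtemp())
    (workdir / "candidate.py").write_text(candidate_src)
    (workdir / "test_candidate.py").write_text(test_src)

    # "Compiler" feedback: does the candidate even parse?
    parse = subprocess.run(
        [sys.executable, "-m", "py_compile", "candidate.py"],
        cwd=workdir, capture_output=True,
    )
    if parse.returncode != 0:
        return 0.0

    # Test feedback: pytest (assumed installed) exits 0 only when every test passes.
    tests = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", "test_candidate.py"],
        cwd=workdir, capture_output=True,
    )
    return 1.0 if tests.returncode == 0 else 0.1
```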
I think modern RLHF schemes have models that train LLMs. LLMs teaching each other isn't new.
It’s basically called “reinforcement learning”, and it’s a common technique in machine learning.
You provide a goal as a big reward (e.g. tests passing), and smaller rewards for any particular behaviours you want to encourage, and then leave the machine to figure out the best way to achieve those rewards through trial and error.
After a few million attempts, you generally either have a decent result, or more data around additional weights you need to apply before reiterating on the training.
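A minimal sketch of that loop, assuming a hypothetical `model` object with `sample()` and `update()` methods; the goal check, the shaping terms, and the weights are all invented for illustration:

```python
import subprocess
import sys

def tests_pass(candidate_src: str) -> bool:
    # Placeholder goal check: run the candidate (with its asserts) and see if it exits cleanly.
    # In practice this would be the compiler-plus-test-suite feedback described above.
    proc = subprocess.run([sys.executable, "-c", candidate_src], capture_output=True)
    return proc.returncode == 0

def shaped_reward(candidate_src: str) -> float:
    reward = 10.0 * tests_pass(candidate_src)     # big reward: the goal (tests pass)
    if "eval(" not in candidate_src:              # small reward: avoid a discouraged pattern
        reward += 0.5
    reward -= 0.001 * len(candidate_src)          # small reward: prefer shorter programs
    return reward

def train(model, attempts: int = 1_000_000):
    # Trial and error: sample a candidate, score it, nudge the policy toward higher reward.
    for _ in range(attempts):
        candidate = model.sample()
        model.update(candidate, shaped_reward(candidate))
```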
Defining the goal is the easy part: as I said in my OP, the goal is unit tests passing.
It’s the other weights that are harder. You might want execution speed to be one metric. But how do you add weights to prevent cheating (e.g. hardcoding the results)? Or the use of anti-patterns like global variables? (For example. Though one could argue that scoped variables aren’t something an AI-first language would need.)
This is where the human feedback part comes into play.
It’s definitely not an easy problem. But it’s still more pragmatic than having a human curate the corpus. Particularly considering the end goal (no pun intended) is having an AI-first programming language.
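To illustrate how those competing weights might be folded together, here is a rough sketch of a composite reward; every number and check in it is made up, and the hold-out tests are just one crude way to catch hardcoded answers:

```python
def composite_reward(candidate_src: str,
                     passes_public_tests: bool,   # the stated goal
                     passes_holdout_tests: bool,  # inputs the model never saw
                     runtime_s: float,            # measured execution time
                     human_score: float) -> float:
    """All weights here are illustrative, not tuned values."""
    r = 10.0 * passes_public_tests

    # Hardcoding the public answers won't survive a hold-out suite.
    r += 5.0 * passes_holdout_tests

    # Crude anti-pattern penalty (a real check would inspect the AST).
    if "global " in candidate_src:
        r -= 1.0

    # Execution-speed term.
    r -= 0.1 * runtime_s

    # The human-feedback term, scored in [0, 1], folded in with its own weight.
    r += 2.0 * human_score
    return r
```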
I should close off by saying that I’m very skeptical that there’s any real value in an AI-first PL, so all of this is just a thought experiment rather than something I’d advocate.
With such learning, your model needs to be able to provide some kind of solution, or at least approximate one, right off the bat. Otherwise it will keep producing random sequences of tokens and never learn anything, because there will be nothing in its output to reward, so no guidance.
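A toy illustration of that failure mode: in a REINFORCE-style update the gradient is scaled by the reward, so if every sampled output scores zero, the policy never moves at all. The sizes and learning rate below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(50)                       # tiny hypothetical "policy" over 50 tokens

def reinforce_step(logits, reward, lr=0.1):
    probs = np.exp(logits) / np.exp(logits).sum()
    token = rng.choice(len(logits), p=probs)
    grad = -probs
    grad[token] += 1.0                      # d log p(token) / d logits
    return logits + lr * reward * grad      # reward == 0  =>  no change

for _ in range(1000):
    logits = reinforce_step(logits, reward=0.0)   # sparse reward: never triggered
assert np.allclose(logits, 0.0)                   # the model has learned nothing
```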
I don’t agree it needs to provide a solution off the bat. But I do agree there are some initial weights you need to define.
With an AI-first language, I suspect the primitives would be more similar to assembly or WASM than to something human-readable like Rust or Python. So the pre-training preparation would be a little easier, since there would be fewer syntax errors arising from parser constraints.
I’m not suggesting this would be easy though haha. I think it’s a solvable problem but that doesn’t mean it’s easy.