This is wild


I've found the way CNNs map to the visual cortex to be very clear. But I've always been a bit confused about how LLMs map to the brain. Is that even the case?


They probably don't. They're very different. LLMs seem to be based on pragmatic, mathematical techniques developed over time to extract patterns from data.

There are at least three fields in this:

1. Machine learning using non-neurological techniques (most stuff). These use a combination of statistical algorithms stitched together with hyperparameter tweaking, usually trained by global optimization with heavy methods like backpropagation.

2. "Brain-inspired" or "biologically accurate" algorithms that try to imitate the brain. They sometimes include evidence that their behavior matches experimental observations of brain behavior. Many of these use complex neurons, spiking nets, and/or local learning (Hebbian; see the sketch after this list).

(Note: There is some work on hybrids such as integrating hippocampus-like memory or doing limited backpropagation on Hebbian-like architectures.)

3. Computational neuroscience which aims to make biologically-accurate models at various levels of granularity. Their goal is to understand brain function. A common reason is diagnosing and treating neurological disorders.
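
Since Hebbian/local learning keeps coming up under (2), here is a toy sketch of the core idea in NumPy (sizes and data made up purely for illustration; real brain-inspired nets use spiking neurons and far richer dynamics):

    import numpy as np

    # "Neurons that fire together wire together": each weight grows in
    # proportion to correlated pre/post activity, using only local
    # information -- no backpropagated error signal.
    rng = np.random.default_rng(0)
    n_in, n_out = 8, 4
    W = rng.normal(scale=0.1, size=(n_out, n_in))
    eta = 0.01  # learning rate

    for _ in range(1000):
        x = rng.random(n_in)       # presynaptic activity
        y = W @ x                  # postsynaptic activity (linear neurons)
        W += eta * np.outer(y, x)  # Hebb's rule: dW = eta * y * x^T
        # crude renormalization so weights don't blow up
        # (Oja's rule is the principled version of this)
        W /= np.linalg.norm(W, axis=1, keepdims=True)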

Making an LLM like the brain would require brain-inspired components, multiple systems specialized for certain tasks, memory integrated into all of them, and a brain-like model for reinforcement. Imitating God's complex design is simply much more difficult than combining proven algorithms that work well enough. ;)

That said, I keep collecting work on both efficient ML and brain-inspired ML. I think some combination of the techniques might have high impact later. The lower training costs of some brain-inspired methods, especially Hebbian learning, justify more experimentation by small teams with small GPU budgets. We might find something cost-effective in that research. We need more of it on common platforms, too, like Hugging Face libraries and cheap VMs.


> how LLMs map to the brain

For the lower level - word embeddings (word2vec, "King – Man + Woman = Queen") - one can see a similarity.
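
A toy version of that arithmetic (made-up 3-d vectors purely for illustration; real word2vec embeddings are a few hundred dimensions, learned from text):

    import numpy as np

    vec = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "man":   np.array([0.9, 0.1, 0.1]),
        "woman": np.array([0.1, 0.1, 0.9]),
        "queen": np.array([0.1, 0.8, 0.9]),
    }

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # nearest word to king - man + woman, excluding the query words
    target = vec["king"] - vec["man"] + vec["woman"]
    best = max((w for w in vec if w not in ("king", "man", "woman")),
               key=lambda w: cosine(vec[w], target))
    print(best)  # queen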

https://www.nature.com/articles/d41586-019-00069-1 and https://gallantlab.org/viewer-huth-2016/

"The map reveals how language is spread throughout the cortex and across both hemispheres, showing groups of words clustered together by meaning."


That is the latent space.

Very different from a feed-forward network with perceptrons, autograd, etc.

Inner product spaces are fixed points; mapping between models is less surprising because the general case is a merger set, IIRC.


You have to be careful with Nature. There are the scientific articles, which are top of the line and written by deep experts.

Then you have the journalism side, which is basically just another mainstream news website. (I'm not even against mainstream news, but it's not the same as scientific articles.)


You should probably have a requirements.txt file instead of just a list of requirements. It's often hard to tell which combination of package versions will actually work when running these things.
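
i.e. pin the exact versions known to work together, e.g. via `pip freeze > requirements.txt` (the packages and versions below are just placeholders):

    numpy==1.26.4
    scipy==1.11.4
    torch==2.2.1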


Forgot that. Fixed now


My guess is it will be like Pascal or Smalltalk: an important development for illustrating a concept, but ultimately replaced by something more rigorous.


I kinda suspect it might be that ChatGPT is excellent at getting you to an "average" performance in any field.

My background is computational materials science, but more on the materials than the computational part. I have an OK broad knowledge of most CS topics, but I'm always finding I'm playing catch-up. My work also involves a lot of making research prototypes in areas I don't have time to get a proper background in.

For me GPT has had a transformative impact on my work.

For example, I had a lot of projects that needed Docker. I have an OK idea of what Docker is and what I want to do with it. But I don't have the time of a real software developer to learn the syntax and deal with subtle bugs, or to figure out how to do basic things, e.g., "how do I ssh into my Docker container X?"
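
(For that particular one, the standard answer turns out to be that you don't ssh at all, you run something like:

    docker exec -it <container-name> /bin/bash

which is exactly the kind of thing GPT hands you in seconds.)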

I think I'm on the end of users that is best poised to make use of LLMs: a decent knowledge of what strategy I want to go for, but not the tactics. And I'm mediocre enough at programming that the LLM can usually beat me. Another example: I would just never write any unit tests, not enough time. With LLMs I can get simple dirty tests done, plus I know enough about testing to filter out the bad ones and tune the best ones.
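
By "simple dirty tests" I mean roughly this kind of thing (toy pytest example; parse_temperature is a made-up stand-in for my own helper code):

    import pytest

    def parse_temperature(s: str) -> float:
        # made-up helper: "25.0 C" -> 25.0
        value, unit = s.split()
        if unit not in ("C", "K"):
            raise ValueError(f"unknown unit: {unit}")
        return float(value)

    def test_parses_celsius():
        assert parse_temperature("25.0 C") == 25.0

    def test_rejects_unknown_unit():
        with pytest.raises(ValueError):
            parse_temperature("25.0 F")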

I see poor responders at the two extremes on either side of me: people who really don't know what they're doing and can't prompt the LLM into doing anything better, and people who really know what they're doing, generally work on one tech stack/project, don't need help getting the dumb basics in place, and have more time to write things themselves.


I really wish these public articles tried to give some estimate of the magnitude of the effect. Is it a big difference? Or barely noticeable without using advanced statistics?

I often think of David Mitchell's commentary on whether stripes make you look fat: https://youtu.be/ISZyJ5MHApI?si=4_hJVLfsvMWWDXKE


"Well, Stan, the truth is marijuana probably isn't gonna make you kill people, and it most likely isn't gonna fund terrorism, but, well son, pot makes you feel fine with being bored, and it's when you're bored that you should be learning some new skill or discovering some new science or being creative." Randy Marsh, South Park


If you can channel this okay-with-boredom attitude into something productive, you can be super productive. I don't think everyone can do it, but I feel like it's definitely true of some people.


Not really a new problem:

Take this quote from George Orwell on the Spanish Civil War:

"I saw newspaper reports which did not bear any relation to the facts, not even the relationship which is implied in an ordinary lie. I saw great battles reported where there had been no fighting, and complete silence where hundreds of men had been killed. I saw troops who had fought bravely denounced as cowards and traitors, and others who had never seen a shot fired hailed as the heroes of imaginary victories, and I saw newspapers in London retailing these lies and eager intellectuals building emotional superstructures over events that had never happened. I saw, in fact, history being written not in terms of what happened but of what ought to have happened according to various ‘party lines’. Yet in a way, horrible as all this was, it was unimportant."


"the researchers built a simulation of a real social network"

I just find this to be such a load-bearing detail that it's hard to gloss over. What are the assumptions of the model in the simulation? How accurate is the model? How did they evaluate that accuracy? How sensitive are the results to the model's parameters?
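
Even a crude sensitivity sweep would help, something like this (all names and numbers hypothetical; run_simulation is a dummy stand-in for whatever their model actually is):

    import itertools, random

    def run_simulation(ban_rate, rewire_prob, seed):
        # dummy stand-in for the paper's agent-based model
        random.seed(seed)
        return ban_rate * (1 - rewire_prob) + random.gauss(0, 0.01)

    # rerun across a parameter grid with several seeds and see
    # whether the headline effect survives
    for p, q in itertools.product([0.01, 0.05, 0.10], [0.0, 0.1, 0.5]):
        effects = [run_simulation(p, q, seed=s) for s in range(20)]
        print(f"ban_rate={p:.2f} rewire={q:.1f} "
              f"-> mean effect {sum(effects) / len(effects):.3f}")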


It's disappointing to me that 10+ hours after this post, your comment or one like it isn't (even close to) the top one. The article doesn't explain, AT ALL, how their model works. Does TikTok "shadow ban" the way they model it? Does any social network?

