"Attention is all you need" paper digested (2018) (youtube.com)
139 points by binidxaba on Oct 7, 2023 | hide | past | favorite | 29 comments


For a higher level shorter overview, I found this video informative: https://youtu.be/SZorAJ4I-sA?si=pnfzZ17PYQfV4aqq


IMO understanding this kind of content depends on where you're starting out; otherwise you're getting too much or too little explanation of the supporting components.

Personally, I used ChatGPT to help me understand this paper from the stance of a senior engineer with some ML experience, and followed up to clarify where I had questions: https://chat.openai.com/share/c17dbcfc-7e3a-44fd-aa88-28c117...


Attention is super helpful when you have enough data but it didn’t really help me on a problem with a tiny dataset



I need this breakdown. I tried to read the paper, but it's not accessible to a layman, even one with more than average interest.


Google regrets publishing this paper!


Maybe, but it would have surfaced regardless, either directly or through related things. While the transformer may evolve into the next thing, it's equally likely the next evolution will be unrelated to the transformer.

Moreover while the transformers and current LLMs are a leap, the monoculture around them is not necessarily a good thing, defocusing many good researchers from otherwise promising tech.

Finally, cross-pollination of ideas is where the magic happens.


It’s probably Nobel-worthy based on its world impact.


Nobel is a vanity certificate. The authors of that paper must have made more money than whatever the Nobel committee would have given them.


Isn’t Nobel like $1M?


You mean a Turing Award...


What impact?


Transformers are used in all sorts of advanced models, like LLMs and image generators. They lie at the heart of so-called foundation models. https://en.wikipedia.org/wiki/Foundation_models


Right, but what is their "world impact"?


That of allowing people to generate, synthesize, and transform all major modalities of media, from text, to audio and video, for peanuts, instantly.

They create immense economic value and increase productivity.


It's debatable right now. The future looks bright for them, but the impact _now_ is more smoke than actual effect.


On top of that, they face massive lawsuits for violating existing IP etc. It's far from a proven business model indeed (heck, OpenAI isn't even profitable yet), and even further as a technology (because it is so inherently unreliable - no matter the model size).

So atm they just sell hype, and not much more.


World impact is not how Nobels are won. If that were the case, Elon Musk, Jeff Bezos, Zuckerberg and others would all have multiple ones.

I honestly do not see this paper as being of the same magnitude of brilliance as a typical Nobel would be. Not to mention that it barely counts as science (actually it probably does not). Don't get me wrong. It is a huge achievement for both the machine learning research field and for humanity as a whole, but putting it alongside the achievements of Nobel physicists and such feels wrong.


objectively speaking what makes them so much better?


Here’s the thing: if Google didn’t publish this paper, it would probably just collect dust somewhere at Google.

OpenAI (ClosedAI) saw the potential and demonstrated its impressive and unbelievable capabilities!

If Google never shared the transformer model with the world, we would probably not have what we have today.


Very much disagree. Google clearly saw the potential of this as well, and did a ton of work and created a lot of leading models based on this.

The big difference between Google and OpenAI is that Google "had a ton more to lose" so to speak and went forward much more cautiously. See all the hullabaloo they had to deal with, e.g. with their "Ethical AI" group and the Timnit Gebru fiasco, as well as cases like the one where that dim bulb Google employee claimed that LaMDA was sentient. OpenAI, on the other hand, was "full speed ahead" from the get-go.

As a result, many of the top AI researchers left Google. After all, wouldn't you rather work at a place where you could see your work productized as fast as you could build it, rather than at a place where other sizable teams in your company were actively working in an adversarial role to put up roadblocks and vetoes wherever they could?


Very sensible comment, agree with it.

I still think that the big cos. (Google, MS, Oracle, ...) will win the AI race in the medium/long term as they have a huge momentum behind them, so they didn't really "miss" anything.

ChatGPT is much better than Google Search, though!


OpenAI is basically a front for Microsoft now.



Not sure if sarcasm or not.



Why do you think it's sarcasm?


Love Yannic


Actually, all you need is all you need—anything that claims otherwise is just wrong. In other words, you need all the things that you need.



