"Attention is all you need" paper digested (2018)

wanderingstan · on Oct 7, 2023

For a higher-level, shorter overview, I found this video informative: https://youtu.be/SZorAJ4I-sA?si=pnfzZ17PYQfV4aqq

personjerry · on Oct 8, 2023

IMO understanding this kind of content depends on where you're starting out; otherwise you get too much or too little explanation of the supporting components.

Personally, I used ChatGPT to help me understand this paper from the stance of a senior engineer with some ML experience, and followed up to clarify where I had questions: https://chat.openai.com/share/c17dbcfc-7e3a-44fd-aa88-28c117...

alexfromapex · on Oct 8, 2023

Attention is super helpful when you have enough data, but it didn’t really help me on a problem with a tiny dataset.

machinelearning · on Oct 7, 2023

https://www.askyoutube.ai/share/6521aa6077733b7c0ad24b55

piyh · on Oct 7, 2023

I need this breakdown. I tried to read the paper, but it's not accessible even to a layman with above-average interest.

dzign · on Oct 7, 2023

Google regrets publishing this paper!

random3 · on Oct 7, 2023

Maybe, but it would have surfaced regardless, either directly or through related things. While the transformer may evolve into the next thing, it's equally likely the next evolution will be unrelated to the transformer.

Moreover, while transformers and current LLMs are a leap, the monoculture around them is not necessarily a good thing: it pulls many good researchers away from otherwise promising tech.

Finally, cross-pollination of ideas is where the magic happens.

mensetmanusman · on Oct 7, 2023

It’s probably Nobel-worthy based on its world impact.

akomtu · on Oct 7, 2023

Nobel is a vanity certificate. The authors of that paper must have made more money than whatever the Nobel committee would have given them.

mensetmanusman · on Oct 8, 2023

Isn’t the Nobel like $1M?

dzign · on Oct 7, 2023

You mean a Turing Award...

dinvlad · on Oct 7, 2023

What impact?

esafak · on Oct 7, 2023

Transformers are used in all sorts of advanced models, like LLMs and image generators. They lie at the heart of so-called foundation models. https://en.wikipedia.org/wiki/Foundation_models

dinvlad · on Oct 8, 2023

Right, but what is their "world impact"?

esafak · on Oct 8, 2023

That of allowing people to generate, synthesize, and transform all major modalities of media, from text to audio and video, for peanuts, instantly.

They create immense economic value and increase productivity.

KptMarchewa · on Oct 8, 2023

It's debatable right now. The future looks bright for them, but the impact _now_ is more smoke than actual effect.

dinvlad · on Oct 9, 2023

On top of that, they face massive lawsuits for violating existing IP, etc. It's far from a proven business model (heck, OpenAI isn't even profitable yet), and even further from proven as a technology (because it is so inherently unreliable, no matter the model size).

So atm they just sell hype, and not much more.

numbers_guy · on Oct 7, 2023

World impact is not how Nobels are won. If that were the case, Elon Musk, Jeff Bezos, Zuckerberg and others would all have multiple ones.

I honestly do not see this paper as being of the same magnitude of brilliance as a typical Nobel would be. Not to mention that it barely counts as science (actually it probably does not). Don't get me wrong. It is a huge achievement for both the machine learning research field and for humanity as a whole, but putting it alongside the achievements of Nobel physicists and such feels wrong.

dvngnt_ · on Oct 7, 2023

Objectively speaking, what makes them so much better?

Alifatisk · on Oct 7, 2023

Here’s the thing: if Google hadn’t published this paper, it would probably just be collecting dust somewhere at Google.

OpenAI (ClosedAI) saw the potential and demonstrated its impressive and unbelievable capabilities!

If Google had never shared the transformer model with the world, we would probably not have what we have today.

hn_throwaway_99 · on Oct 7, 2023

Very much disagree. Google clearly saw the potential of this as well, and did a ton of work and created a lot of leading models based on this.

The big difference between Google and OpenAI is that Google "had a ton more to lose" so to speak and went forward much more cautiously. See all the hullabaloo they had to deal with, e.g. with their "Ethical AI" group and the Timnit Gebru fiasco, as well as cases like the one where that dim-bulb Google employee claimed that LaMDA was sentient. OpenAI, on the other hand, was "full speed ahead" from the get-go.

As a result, many of the top AI researchers left Google. After all, wouldn't you rather work at a place where you could see your work productized as fast as you could build it, rather than at a place where other sizable teams in your company were actively working in an adversarial role to put up roadblocks and vetoes wherever they could?

moralestapia · on Oct 7, 2023

Very sensible comment, agree with it.

I still think that the big cos. (Google, MS, Oracle, ...) will win the AI race in the medium/long term as they have huge momentum behind them, so they didn't really "miss" anything.

ChatGPT is much better than Google Search, though!

KptMarchewa · on Oct 8, 2023

OpenAI is basically a front for Microsoft now.

dzign · on Oct 7, 2023

https://www.washingtonpost.com/technology/2023/05/04/google-...

abhishekjha · on Oct 7, 2023

Not sure if sarcasm or not.

dzign · on Oct 7, 2023

https://www.washingtonpost.com/technology/2023/05/04/google-...

nothrowaways · on Oct 7, 2023

Why do you think it's sarcasm?

artninja1988 · on Oct 7, 2023

Love Yannic

behnamoh · on Oct 8, 2023

Actually, all you need is all you need—anything that claims otherwise is just wrong. In other words, you need all the things that you need.