Right, collecting more corner-case training data won't solve an architecture problem. AI in general impresses quickly when it's mostly right, but improving from there is the challenge.
More data won't help if the problem is the tools themselves.
Grandparent said the hard part is getting rid of the last 1%; parent claimed Elon said the same when he said 99.99% of the training data is useless.
But it's not the same.
Elon thinks he just needs the right data to solve the problem, but it could be impossible even if he gets that data, because of the limitations of the type of AI being used.
If you need a screwdriver but only have a hammer, more nails won't help.
I'm using "AI" here as an umbrella term for computer decision systems.
And yet on a very simple drive I have to intervene 4-6 times over a distance of 8 miles. How is this not useful? At this point it would have been easier to ask people to record how to drive roads and use video-game track logic, where you race a ghost…
The only time FSD works 'ok' is on single-lane roads with 90-degree stop signs / turns.
I don’t believe that the current hardware can handle what is needed to have passable FSD for an average consumer.
No. For the easy 99.999% of driving they keep very little of the training data.
Basically you want to minimize manual interventions (aka disengagements). When the driver intervenes, they keep a short window of data before (30 seconds?) and after the intervention and add it to the training data.
So their training data is basically just the exceptional cases.
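The selection scheme described above can be sketched roughly like this (the window lengths, function name, and data layout are illustrative assumptions, not Tesla's actual pipeline):

```python
# Sketch: keep only short clips around driver interventions as training data,
# discarding the easy 99.999% of the drive log. Window sizes are guesses.

def select_training_clips(disengagement_times, pre_s=30.0, post_s=10.0):
    """Return (start, end) time windows around each disengagement timestamp."""
    return [(max(0.0, t - pre_s), t + post_s) for t in disengagement_times]

# Two interventions on a drive -> two small clips, everything else dropped.
clips = select_training_clips([120.0, 305.5])
```

The effect is that the training set is dominated by exceptional cases, which is exactly why the overfitting concern in the next comment matters.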
They just need to make sure they don't overfit, so that the learned model actually has some "understanding" of why decisions are made and can generalize.
It's not clear that a bunch of cascaded rectified linear functions will ever generalize to near 100%. The error floor is at a dangerous level regardless of training. AGI is needed to tackle the final 1%.
The universal approximation theorem disagrees. The question is how large the network should be and how much training data it needs. And for now it can only be tested experimentally.
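For reference, a standard form of the theorem (Cybenko/Hornik style) states that for any continuous $f$ on a compact set $K$ and any tolerance $\varepsilon > 0$, there exists a one-hidden-layer network that approximates it:

$$
\sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} v_i\, \sigma(w_i^\top x + b_i) \right| < \varepsilon
$$

Note this is purely an existence result about representation: it says nothing about how large $N$ must be or whether gradient-based training will actually find such weights.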
The universal approximation theorem does not apply once you include any realistic training algorithms / stochastic gradient descent. There isn't a learnability guarantee.
You said it only depends on network size; I'm saying it's more likely impossible regardless of network size, due to fundamental limits in the training methods.
https://www.teslarati.com/fsd-distance-driven-training-musk/