Most people don't realize that logistic regression can get ~90% accuracy on MNIST.
As a big fan of starting with simple models first and adding complexity later, I've frequently been told that "logistic regression won't work!" for problems where it can in fact perform very well.
When faced with this resistance to logistic regression I'll often ask what they think the baseline performance of it would be on MNIST. The guesses I hear most often are 20-30%.
People, even machine learning people, often don't realize the rapidly diminishing returns you get for adding a lot of complexity in your models. Sometimes it's worth it, but it's always good to start simple first.
It's also been my experience that if you don't get good performance with a simple model, you are very unlikely to get great performance from a more complex one.
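For anyone who wants to check the number, here's roughly what that baseline looks like with scikit-learn. Pulling MNIST via OpenML is just how I'd load it, and the exact accuracy depends on the solver and regularization, but it lands right around 92%:

```python
# Minimal logistic-regression baseline on MNIST with scikit-learn.
# Loading via OpenML is one convenient option; accuracy is typically ~92%.
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 70k 28x28 images as flat 784-dimensional vectors.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0  # scale pixel values to [0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=10000, random_state=0
)

# Multinomial logistic regression; saga copes fine with the problem size.
clf = LogisticRegression(max_iter=200, solver="saga", tol=0.01)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```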
Came here to say the same thing; actually, NCD can probably do much better than 78%. Li & Vitanyi's book on Kolmogorov complexity has some interesting unsupervised examples.
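For context, NCD (normalized compression distance) just uses compressed length as a crude stand-in for Kolmogorov complexity. A rough sketch of a 1-NN classifier built on it with zlib, purely illustrative, since results depend heavily on the compressor and on how the images are serialized:

```python
# Sketch of 1-nearest-neighbour classification with normalized compression
# distance (NCD), using zlib as the compressor. Purely illustrative.
import zlib

def clen(b: bytes) -> int:
    """Compressed length, a crude proxy for Kolmogorov complexity."""
    return len(zlib.compress(b, 9))

def ncd(x: bytes, y: bytes) -> float:
    """NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = clen(x), clen(y), clen(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def predict(query: bytes, train: list[tuple[bytes, int]]) -> int:
    """Return the label of the training example closest to `query` under NCD."""
    return min(train, key=lambda item: ncd(query, item[0]))[1]
```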
A simple CNN, as implemented in the Keras tutorial, can easily exceed 98%. 78% is very poor performance on MNIST even if model complexity is penalized.
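Something along these lines (my paraphrase of the Keras example, not the exact tutorial code) gets there in a few epochs:

```python
# Small MNIST CNN in the spirit of the Keras examples; typically ~99% test accuracy.
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0  # add channel dim, scale
x_test = x_test[..., None].astype("float32") / 255.0

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=5, validation_split=0.1)
print(model.evaluate(x_test, y_test))
```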
When would 90% accuracy on a dataset like MNIST ever be useful? And I mean useful as in usable for actual products or software, especially considering MNIST is more of a toy dataset.
I think that's why machine learning is the way to go for this type of detection: why go with anything other than a CNN (in this case) when it is now trivial to set up and train? Again, unless it's just to mess around with, 90% MNIST accuracy is not useful in the real world.
I don't think the point was that you should use logistic regression on MNIST. In lesser-known problems, say a custom in-house model, if you don't try the simpler approach first, you'll never know that your more complex solution is not worth the extra expense, or is actually worse than a simpler, cheaper model. MNIST is well-known to have nearly perfect solutions at this point, but for most novel problems, the data scientist has no idea what is theoretically possible.
Now, you can say that CNNs or other techniques are easily accessible these days, and almost trivial to set up. But they may not be trivial to train and run in terms of compute in the real world.
> People, even machine learning people, often don't realize the rapidly diminishing returns you get for adding a lot of complexity in your models.
Are you me? I was just having this argument at work with someone about using an old (fasttext/word2vec) model vs. the overhead of a fine-tuned BERT model for a fairly simple classification problem.
To a large degree this happens because logistic regression is not a sexy approach that one can add to their CV. Everyone wants to solve problems with big, complicated and buzzwordy models, because that sells (and is perhaps more interesting as well).
It's really a tragedy, because so many classic models would work fine for real world applications.
Extending the existing features with non-linear mappings would improve logistic regression too, probably to SVC-level performance (an RBF or polynomial kernel does exactly that, just implicitly; a sketch of the explicit version is below).
Linear models are really well researched, and today with the compute and with proper training and data preparation they can easily get to satisfying levels of performance for a variety of tasks.
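Here's what that explicit version could look like with scikit-learn's kernel approximation. RBFSampler stands in for the implicit RBF feature map, and I'm using the small built-in digits dataset just to keep the sketch self-contained:

```python
# Logistic regression on top of an approximate RBF feature map, as a sketch of
# "extend the features with non-linear mappings". Uses the small digits dataset.
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Random Fourier features approximate the implicit map of an RBF-kernel SVC.
model = make_pipeline(
    RBFSampler(gamma=0.001, n_components=2000, random_state=0),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```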
This is true unless you can use large pretrained models, which are very simple to use, are very resistant to all sorts of noise, and would get ultra-high accuracy with just logistic regression on the output of the model.
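Something like a linear probe over frozen embeddings. The library and model name here are just examples I reached for, not anything from the thread:

```python
# Sketch: frozen embeddings from a pretrained model + logistic regression on top.
# sentence-transformers and the specific model name are example choices only.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # small, fast pretrained encoder

def fit_linear_probe(texts, labels):
    """Encode once with the frozen model, then fit a plain linear classifier."""
    X = encoder.encode(texts)  # (n_samples, embedding_dim) array
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, labels)
    return clf

def predict(clf, texts):
    """Classify new texts using the same frozen encoder."""
    return clf.predict(encoder.encode(texts))
```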