_ntka's comments | Hacker News

Radiation would perhaps be better pictured with a light blue glow (Cherenkov radiation).

The green glow used in pop culture has its origin in the widespread use of radium paint to achieve a glow-in-the-dark effect (e.g. on watch faces) in the early 20th century. I still own a radium watch. The paint was always fluorescent green. And it did glow.


Isn't it usually the phosphor wearing out, rather than the radium decaying, that stops them glowing? Would it be possible for someone with the correct safety equipment to apply a new layer of fluorescent paint?


It's possible, but it's a bad idea, unless you just want it for display in a sealed case.

If you want a sane version, get modern tritium-based glowing capsules.


Isn't JAX the most widely used framework in the GenAI space? Most companies there use it -- Cohere, Anthropic, Character.AI, xAI, Midjourney, etc.


Most of the GenAI players use both PyTorch and JAX, depending on the hardware they're running on. Character, Anthropic, Midjourney, etc. are dual shops (they use both); xAI only uses JAX afaik.


Just guessing that tech leadership at all of those traces back to Google somehow.


JAX is used by almost every large GenAI player (Anthropic, Cohere, DeepMind, Midjourney, Character.ai, xAI, Apple, etc.). Its actual market share in foundation model development is something like 80%.

Also, JAX is not just for TPU. It's mainly for GPU, where it's usually 2-3x faster than PyTorch: https://keras.io/getting_started/benchmarks/

Far more industry users of JAX run it on GPU than on TPU.


Are there any resources going into detail about why the big players prefer JAX? I've heard this before but have never seen explanations of why/how this happened.


It's all about cost and performance. If you can train a foundation model 2x faster with JAX on the same hardware, you are effectively slashing your training costs by 2x, which is significant for a multi-million-dollar training run.
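
Concretely, the speedup people point to mostly comes from XLA tracing, fusing, and compiling the whole training step into a single device program. A toy sketch of the pattern (hypothetical linear model, not anyone's actual setup):

```python
import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Toy linear model standing in for a real network.
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

@jax.jit  # XLA compiles and fuses the entire step into one GPU/TPU program
def train_step(params, x, y, lr=1e-3):
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```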


“Faster” might have been true pre-Hopper, but there’s no such guarantee now.


Are you on one of those (usually small) teams? No? Then it’s probably not a good choice for you.


Or alternatively, do you want faster training runs (and thus lower training costs)? Then JAX is a good choice for you.


The current SOTA models (GPT-4, DALL-E, Sora) were trained on GPUs. The next one (GPT-5) will be, too. And the one after that. Besides, very few people train models that need more than a few hundred H100s at a time, and PyTorch works well at that scale. And when you do train large-scale stuff, the scaling problems are demonstrably surmountable, unlike, say, the capacity problems you will run into if you need a ton of modern TPU quota, because Google itself is pretty compute-starved at the moment. Also, gone are the days when TPUs were significantly faster: GPUs have “TPUs” (tensor cores) inside them nowadays, too.


No, I am saying, with JAX you train on G.P.U., with a G, and your training runs are >2x faster, so your training costs are 2x lower, which matters whether your training spend is $1k or $100M. You're not interested in that? That's ok, but most people are.


Have you actually tried that, or are you just regurgitating Google’s marketing? I’ve seen JAX perform _slower_ than PyTorch on practical GPU workloads on the exact same machine, and not by a little: by something like 20%. I too thought I’d be getting great performance and “saving money”, but reality turned out to be a bit more complicated than that; you have to benchmark and tune.
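
And if you do benchmark, beware that JAX dispatches work asynchronously, so naive timing measures the Python loop rather than the GPU. A minimal harness (a sketch; `step` is whatever jitted function you're comparing):

```python
import time
import jax

def benchmark(step, *args, iters=100):
    # Warm-up call: triggers XLA compilation, which must be excluded
    # or it dominates the measurement.
    jax.block_until_ready(step(*args))
    start = time.perf_counter()
    for _ in range(iters):
        out = step(*args)
    jax.block_until_ready(out)  # wait for the device to actually finish
    return (time.perf_counter() - start) / iters
```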


For smaller-scale projects, it's basically still a no-brainer to use PyTorch.


This is not true: the initial Keras port of the model was done by Divam Gupta, who is not affiliated with Keras or Google. He works at Meta.

The benchmark in the article uses mixed precision (and equivalent generation settings) for both implementations, it's a fair benchmark.

In the latest StackOverflow global developer survey, TensorFlow had 50% more users than PyTorch.


Two Keras creators are listed as authors on this post; if they were not involved, that should be specified. I was specifically talking about research, and StackOverflow is not in any way representative of what's used there. Do you disagree that the majority of neural-net research papers now ship only PyTorch implementations, not TensorFlow? Also, according to Google Trends, PyTorch is more popular: https://trends.google.com/trends/explore?geo=US&q=pytorch,te....

BTW, I would love it if TF made a strong comeback; it's always better to have two big competing frameworks, and I have some issues with PyTorch, including with its performance.


> In the latest StackOverflow global developer survey, TensorFlow had 50% more users than PyTorch.

It also doesn't help that PyTorch has its own discussion forum [1], where most PyTorch questions end up.

[1]: https://discuss.pytorch.org/


7 cubic km of olivine is a staggering amount. Makes you wonder whether the effect of deploying the olivine would even offset the energy cost and CO2 emissions of extracting and moving around all that rock.
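
For scale, a rough back-of-envelope (my own assumptions: pure forsterite and complete weathering):

```python
# Weathering: Mg2SiO4 + 4 CO2 + 4 H2O -> 2 Mg(2+) + 4 HCO3(-) + H4SiO4
volume_m3 = 7e9                  # 7 cubic km
density = 3.3                    # t/m^3, typical for olivine
mass_t = volume_m3 * density     # ~2.3e10 t of rock
co2_per_t = 4 * 44.0 / 140.7     # ~1.25 t CO2 per t olivine (molar masses)
print(f"up to {mass_t * co2_per_t:.1e} t CO2")  # ~2.9e10 t
```

That's on the order of 29 Gt of CO2 captured, against which the emissions from extracting, grinding, and shipping the rock would have to be netted.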


I would assume that the cost of extracting and moving is baked into that number

Also, see https://news.ycombinator.com/item?id=20407355


If I were to divide large tech companies between "good" and "bad", the only one I would classify as "bad" is Facebook. Most are neutral-to-good. Facebook has a huge societal and human cost and has hardly any benefits to show for it (unlike, say, fossil energy companies, which are a threat to civilization but at least serve a useful role by fulfilling our energy needs).

Facebook is a democracy-threatening, attention-wasting, extractive institution, with a deeply unethical leadership. Most tech companies are neutral -- Amazon, Microsoft, Uber etc., I assume they're doing business for profit and not social good, but they do provide value and largely play by the rules. Some other companies I would classify as "good" as they provide considerable value to society as a by-product of doing business, like Apple and Google (disclaimer: I work for Google, and I am quite happy about that).

Facebook is the only large tech company I can think of that is just plain evil. There's no other like it. It's in a league of its own.


I would easily throw Uber into the evil category: the "contractor" loophole to evade minimum wage (minimum wage being already pretty pathetic), Greyball, and a general culture of knowingly ignoring laws cultivated by its founder.


> there isn't anyone living today that has any idea how to get to AGI

Not true.


Well then, who has?


When someone shows you who they are, believe them the first time.


> IMO, AssertionErrors should indicate bugs in the called API. ValueErrors indicate bugs in the caller's use of the API.

Yes, I agree with this stance. `ValueError` should be used for user-provided input validation (as well as `TypeError` in some cases). But as it happens, many Python developers use `assert` statements to do input validation, and generally don't provide any error messages in their `assert` statements. I'm suggesting going with `ValueError` (and a nice message) instead.
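
To illustrate the pattern (hypothetical function; note also that `assert` statements are stripped entirely under `python -O`, so they can't reliably guard user input):

```python
def set_rate(rate: float) -> None:
    # Input validation with explicit exceptions and messages,
    # instead of a bare `assert rate > 0`.
    if not isinstance(rate, (int, float)):
        raise TypeError(f"`rate` should be a number, got {type(rate).__name__}")
    if rate <= 0:
        raise ValueError(f"`rate` should be positive, got {rate}")
    ...
```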


To clarify: `tf.keras` is an implementation of the entire Keras API, written from the ground up in pure TensorFlow. The first benefit of that is a greater level of blending between non-Keras TF workflows and TF-Keras workflows: for instance, layers from `tf.layers` and `tf.keras` are interchangeable in all use cases.

Additionally, this enables us to add TensorFlow-specific features that would be difficult to add to the multi-backend version of Keras, and to do performance optimizations that would otherwise be impossible.

Such features include support for Eager mode (dynamic graphs), support for TensorFlow Estimators, which enables distributed training and training on TPUs, and more to come.
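
For example, the kind of mixing this enables (a minimal sketch against the TF 1.x-era APIs discussed here):

```python
import tensorflow as tf

inputs = tf.placeholder(tf.float32, shape=(None, 64))
h = tf.keras.layers.Dense(32, activation="relu")(inputs)  # tf.keras layer
logits = tf.layers.dense(h, 10)  # tf.layers op applied to the Keras output
```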


Thanks for clarifying that. BTW, I love your new book. I bought the MEAP last month and I really enjoyed it. Looking forward to the final version.


What is the book? It sounds like something I might want to read!


