That is comparing an all-to-all switched NVLink fabric to a 3D torus for TPUs. Those are completely different network topologies with different tradeoffs.
For example, the currently very popular Mixture of Experts architectures require a lot of all-to-all traffic (for expert parallelism), which works a lot better on the switched NVLink fabric, where it doesn't need to traverse multiple links as it does in the torus.
This is an underrated point. Comparing just the peak bandwidth is like saying Bulldozer was the far superior CPU of the era because it had a really high frequency ceiling.
Really? Fully-connected hardware is unbuildable (at scale), which we already know from the HPC world. Fat trees and dragonfly networks are pretty scalable, but a 3D torus is a very good tradeoff, and respects the dimensionality of reality.
Bisection bandwidth is a useful metric, but is hop count? Per-hop cost tends to be pretty small.
Latency (of different types), jitter, and guaranteed bandwidth are the real underlying metrics. Hop count is just one potential driver of those, but different approaches may or may not tackle each of these parts differently.
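To put rough numbers on the hop-count part of this: a minimal Python sketch, assuming a 3D torus with wraparound links and shortest-path routing. The function names and the torus sizes are made up for illustration and do not describe any real TPU pod; a single-stage switched fabric would instead have the same path length for every pair.

```python
from itertools import product


def torus_hops(a, b, dims):
    """Shortest-path hop count between nodes a and b on a torus with wraparound."""
    return sum(min(abs(x - y), n - abs(x - y)) for x, y, n in zip(a, b, dims))


def torus_hop_stats(dims):
    """Return (mean, max) hops from one node to every other node.

    A torus looks the same from every node, so measuring from the origin
    gives the same numbers as averaging over all pairs.
    """
    origin = tuple(0 for _ in dims)
    hops = [
        torus_hops(origin, node, dims)
        for node in product(*(range(n) for n in dims))
        if node != origin
    ]
    return sum(hops) / len(hops), max(hops)


if __name__ == "__main__":
    for dims in [(4, 4, 4), (8, 8, 8), (16, 16, 16)]:  # hypothetical sizes
        mean, worst = torus_hop_stats(dims)
        print(dims, f"mean hops {mean:.2f}, worst case {worst}")
```

For a 16x16x16 torus this comes out to roughly 12 hops on average and 24 in the worst case. Whether that matters depends on the per-hop cost, which is exactly the open question above.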
It's the one exception in the semiconductor supply chain where Europe is still leading. For all other parts of the value creation Europe is either a niche player at best or completely absent, well into the actual application layer.
Not since the Ozaki scheme appeared. Good high-precision perf from low-precision tensor units has unlocked some very interesting uses of low-fp64-perf GPUs.
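For anyone who hasn't seen the idea, here is a toy Python sketch of slice-based matrix multiplication in the spirit of the integer variants of the Ozaki scheme. It is a simplification under my own assumptions, not the published algorithm: `split_rows`, `sliced_matmul`, the 6-bit slice width and the slice count of 9 are all illustrative choices, and plain Python integers stand in for the low-precision tensor units.

```python
import math


def split_rows(rows, bits, num_slices):
    """Split each row into small-integer slices sharing one power-of-two scale.

    Returns (exps, slices) where, for row i and column k,
    rows[i][k] ~= 2**exps[i] * sum_s slices[s][i][k] * 2**(-bits * (s + 1)).
    """
    exps, slices = [], [[] for _ in range(num_slices)]
    for row in rows:
        biggest = max(abs(x) for x in row)
        # frexp gives biggest = m * 2**e with 0.5 <= m < 1, so |x| * 2**-e < 1
        e = math.frexp(biggest)[1] if biggest else 0
        exps.append(e)
        rem = [math.ldexp(x, -e) for x in row]
        for s in range(num_slices):
            scaled = [math.ldexp(r, bits) for r in rem]
            digits = [int(v) for v in scaled]  # truncate toward zero, |d| < 2**bits
            slices[s].append(digits)
            rem = [v - d for v, d in zip(scaled, digits)]  # exact remainder
    return exps, slices


def sliced_matmul(A, B, bits=6, num_slices=9):
    """Approximate A @ B in float64 using only small-integer dot products."""
    ea, sa = split_rows(A, bits, num_slices)
    Bt = [list(col) for col in zip(*B)]  # split B column-wise
    eb, sb = split_rows(Bt, bits, num_slices)
    m, n = len(A), len(Bt)
    C = [[0.0] * n for _ in range(m)]
    for s in range(num_slices):
        for t in range(num_slices):
            shift = -bits * (s + t + 2)
            for i in range(m):
                for j in range(n):
                    # Exact integer dot product: the part a low-precision
                    # tensor unit would compute in the real scheme.
                    dot = sum(a * b for a, b in zip(sa[s][i], sb[t][j]))
                    C[i][j] += math.ldexp(float(dot), shift + ea[i] + eb[j])
    return C
```

Each inner dot product only involves integers below 64 in magnitude, so it is exact; the float64 work is reduced to scaling and summing the partial results. For well-scaled inputs the result should agree with a plain float64 product to roughly double-precision accuracy.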
That, plus the quoted text basically says the model homed in on the monomyth (Hero's journey) structure; while the pattern was identified and named by a 20th century American writer, the pattern itself is common and as ancient as it gets. Wouldn't really call it anglo-american bias.
The monomyth is also writing 101 these days, and considered the default structure you can and should use if you have little experience writing stories, so naturally it'll be a high-probability result of an LLM prompted to write a story - especially prompted in a way that implies the user is inexperienced at writing and needs a result suitable for an inexperienced writer.
> a protagonist lives in or returns home to a small town and resolves a minor conflict by reconnecting with tradition and organising community events
That's... not the Hero's Journey?
(The same study run against Claude Opus would be interesting - if we're going to test models, we might as well play to their strengths. My prediction: better writing, not better plotting).
But these social third places have also shifted. Younger generations aren't going out as much but e.g. playing video games specifically with other close friends is very popular.
Inference throughput scales really well with larger batch sizes (at the cost of latency) due to rising arithmetic intensity and the fact that it's almost always memory BW limited.
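A back-of-the-envelope roofline sketch of why that is, in Python. It assumes a single fp16 linear layer whose weights are read from memory once per step, ignores the KV cache and attention, and uses made-up hardware numbers (1e15 FLOP/s peak, 3e12 B/s memory bandwidth), so treat the output as a trend, not a benchmark.

```python
def layer_cost(batch, d_in, d_out, bytes_per_value=2):
    """FLOPs and bytes moved for one (batch x d_in) @ (d_in x d_out) matmul."""
    flops = 2 * batch * d_in * d_out
    # Weights are read once regardless of batch; activations scale with batch.
    bytes_moved = bytes_per_value * (d_in * d_out + batch * (d_in + d_out))
    return flops, bytes_moved


def arithmetic_intensity(batch, d_in, d_out):
    """FLOPs per byte moved; grows with batch until activations dominate."""
    flops, bytes_moved = layer_cost(batch, d_in, d_out)
    return flops / bytes_moved


def roofline_tokens_per_second(batch, d_in, d_out, peak_flops=1e15, mem_bw=3e12):
    """Tokens/s if the step takes max(compute time, memory time).

    peak_flops and mem_bw are hypothetical round numbers, not a real chip.
    """
    flops, bytes_moved = layer_cost(batch, d_in, d_out)
    step_time = max(flops / peak_flops, bytes_moved / mem_bw)
    return batch / step_time


if __name__ == "__main__":
    for batch in [1, 8, 64, 512, 4096]:
        ai = arithmetic_intensity(batch, 4096, 4096)
        tps = roofline_tokens_per_second(batch, 4096, 4096)
        print(f"batch {batch:5d}: {ai:8.1f} FLOP/byte, {tps:.3e} tokens/s")
```

At batch 1 the intensity is about 1 FLOP/byte, far below the roughly 333 FLOP/byte these assumed numbers would need to become compute-bound, so throughput grows almost linearly with batch size until that point.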
If it's just filtered out in the training sets, adding the information as context should work out fine - after all this is exactly how o3, Gemini 2.5 and co deal with information that is newer than their training data cutoff.
Something where you're reachable for any legal purposes - in Germany this sadly remains a physical address.
There are various services that offer a 'virtual' address with digital forwarding of letters for less than 10 EUR/month, so it's not an insurmountable obstacle.
Interestingly enough, the price for these giant heat pumps is pretty much in line with domestic ~10 kW units.