Hacker News | fancyfredbot's comments

I hate to break this to you, but the article is wrong. There are pictures of him.

Yup. The very first Google hit, https://gamesworkshop.fandom.com/wiki/Kevin_D._Rountree, has a photo of him. It's in the fscking wiki for Games Workshop, the company he runs. I have no idea how TFA couldn't find this.

The Financial Times also asserted in 2024 that it had no public-domain image of him: https://www.ft.com/content/369279f5-6f44-4248-a0f1-5347864ea...

There's a possibility the Games Workshop wiki image is a generic stock business-person placeholder.

The Australian science fiction writer Greg Egan strongly asserts that no image of him appears on the internet ... and yet images are returned if you search for his name and profession ... they are all different.

See: https://www.gregegan.net/images/GregEgan.htm



Two things stand out to me:

1) Battery life claims are specific and very impressive, possibly best in class.
2) Performance claims are vague and uninspiring.

Either this is an awful press release or this generation isn't taking back the performance crown.


Windows 11 usage is declining. The Xbox sells vastly fewer units than Sony's and Nintendo's consoles. PC gamers are moving to SteamOS and Linux. The billions poured into OpenAI no longer look so smart given very competitive offerings elsewhere.

Despite all this they still have a hugely profitable business, a pretty decent OS under all the adware, and a de facto monopoly on business productivity software.


Is the first paragraph why we should be selling, or the second?

Corporations get their software into businesses through the exact same process by which software gets replaced in those companies… usually through IT, and/or users who use things personally and become their champions.

So which paragraph do you think was more relevant to their recommendation… the one where they already have most of the customers they will ever have, or the one where people are increasingly moving away from them in their daily lives?


In just the last 5 months they got two new corporate customers, with 1400 and 550 employees. And those are just the ones that I, one nobody, know about. If you think they are not getting new corporate customers not just daily but hourly, you might be a tad misinformed.

As an exercise, see how many job openings there are where you won’t be using MSFT products if you get the gig :)


Likely using a rather generous definition of “new”. There is a difference between a new customer and buying a license. I’m also fairly doubtful that every server, Docker container, VM, and appliance is also running Windows. And even if said 2000 users are using Windows for absolutely every system, it’s still a meaningless anecdote about a drop in the bucket. I don’t think anyone suggested that Microsoft doesn’t have customers? But I suspect they were far from “new” customers, even if a new company, because I guarantee something somewhere was replaced for every one of them; bankrupt businesses they replaced, old hardware, whatever. Arguing the opposite would certainly seem naive on its face.

I wasn't expecting to read that Microsoft is not getting new corporate customers, but here we are; you learn something new every day :)

None of this is anecdotal. I make a living as a contractor, and in just the past two years I have worked on numerous moving-to-Microsoft projects: Oracle to SQL Server, AWS to Azure, SharePoint, etc. etc... I am not a fan of MSFT by any means, but what you are writing makes absolutely no sense. You should read MSFT's quarterly earnings reports and not the few anecdotal things people on HN write about MSFT. It is in the M7 for a reason and practically has no competition (which is why they are able to do shit like Windows 11 and Copilot and... people on HN might be bitching, but it is just for entertainment purposes).


Anecdotes like “I’ve done blah blah over two years”? Correct, I ignore anecdotes just like that. You can argue whatever you like — you seem to be heavily financially motivated to do so while I neither own Microsoft stock nor earn my money by convincing people to use their products. As a result, feel free to continue your evangelism while I go ahead and extricate myself from your sphere of biases.

It's not the official reason, but also worth noting that many waterproof devices have headphone jacks.

I'm very sceptical that a 3 kW speaker can cause "earthquake like vibrations with a radius of 2km".

I can't help but think it would be fun to try to verify the claim, though.
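
For anyone tempted to start with the armchair version, here's a back-of-the-envelope check in Python. The 10% electroacoustic efficiency and free-field spreading are my own (generous) assumptions, not figures from the article:

    import math

    # Rough SPL estimate at 2 km from a 3 kW speaker.
    # Assumptions (mine, and generous): 10% electroacoustic efficiency,
    # free-field spherical spreading, no ground or atmospheric effects.
    acoustic_power_w = 3000.0 * 0.10

    # Sound power level re 1 pW
    lw_db = 10 * math.log10(acoustic_power_w / 1e-12)

    # Free-field SPL at distance r: Lp = Lw - 20*log10(r) - 11 dB
    r_m = 2000.0
    lp_db = lw_db - 20 * math.log10(r_m) - 11

    print(f"Sound power level: {lw_db:.0f} dB")  # ~145 dB
    print(f"SPL at 2 km: {lp_db:.0f} dB")        # ~68 dB

Roughly 68 dB SPL at 8 Hz is far below the infrasound perception threshold (around 100 dB at that frequency), never mind anything that shakes buildings, so the scepticism seems well founded.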

Fun? Sure. It is indeed fun to play with big speakers.

Direct-radiating bass reproduction is all about displacement, and the area of the piston (cone) is certainly a factor in that. More tends to be... well, more.

And this mysterious speaker (of which there seem to be no color photos, despite the 1981 date) has a radiating area of perhaps about 2 square meters.

That's around the same as eighteen 18" woofers.

It's easy to find collections of way, way more than that. People even charge money to hear them; they're on the ground between the stage and the crowd barrier at any big rock show. :)
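
As a rough sanity check of that driver-count comparison (the ~0.12 m² effective piston area per 18" driver is a typical figure I'm assuming, not one from the article):

    # How many 18" woofers does ~2 m^2 of radiating area correspond to?
    # Assumption: a typical 18" pro driver has an effective piston area
    # (Sd) of roughly 0.115-0.12 m^2.
    sd_per_driver_m2 = 0.12
    target_area_m2 = 2.0

    drivers = target_area_m2 / sd_per_driver_m2
    print(f"{drivers:.1f} drivers")  # ~16.7, i.e. around eighteen 18" woofers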


The Japanese version of the article, linked in another comment, has a color photo (which appears to be a magazine scan): https://audio-heritage.jp/DIATONE/diatonesp/d-160(1).jpg

Marty McFly is volunteering to test it.

Resonance! Very minor earthquakes can knock pictures off the walls, items off the shelves, etc. if they just happen to hit the right resonant frequency. So if you flood the area with 8Hz-ish acoustic energy, some stuff will start to shake.

You will probably end up in court. But you might not get convicted.

Shakeeb Ahmed was convicted of wire fraud for exploiting a smart contract bug.

Avi Eisenberg was also convicted for exploiting a smart contract bug, but he had his conviction overturned on appeal.

The Peraire-Bueno brothers were in court for exploiting a bug in the MEV mechanism, but it ended in a mistrial, so we're going to have to wait to find out.

Not legal advice ;-)


Top Tip: If you find the orange site's conversation on crypto repetitive, you can change the top bar. The conversation stays the same, but the colour can be changed!

Readers will want to note that this delightful feature is only available to users above 251 karma, or to those with a knack for UserCSS.
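
For the UserCSS route, a minimal sketch along these lines should work in a user-style extension such as Stylus (assuming HN still renders the top bar as a table cell with bgcolor="#ff6600"; the selector may need adjusting):

    /* Assumed markup: HN's top bar is a <td bgcolor="#ff6600">. */
    td[bgcolor="#ff6600"] {
      background-color: #0000FF !important; /* the blue site */
    }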

Yeah, it always takes me a minute when people say 'the orange site' (especially elsewhere) - it's green if I'm logged in, so I rarely see it orange, and then it's 'wuh, I'm logged out, [logs in]'.

Fortunately I'm not prone to referring to the green site.


Wow, thank you, I'm about to be on the blue site. I never knew about this and really don't like the orange.

0000FF gang, unite!


Does it really matter? The article definitely shows signs of being LLM-assisted, but it doesn't read like pure slop to me. It reads more like the author used an LLM to summarise his thoughts.

If it feels like I'm reading a human essay that received some help from a copyeditor, no. If it feels like I'm reading the output of "hey LLM, write a contrarian take on the use of AI in business"... yeah, I think it matters, because it shifts the balance toward effortless production of endless HN bait.

In my view the reasons why LLMs may be less effective in a corporate environment are quite different from the human factors in The Mythical Man-Month.

I think that the reason LLMs don't work as well in a corporate environment with large codebases and complex business logic, but do work well in greenfield projects, is linked to the amount of context the agents can maintain.

Many types of corporate overhead can be reduced using an LLM, especially following "well meant but inefficient" processes around JIRA tickets, testing evidence, code review, documentation, etc.


I've found that something very similar to those "inefficient" processes works incredibly well when applied to LLMs. All of those processes are designed to allow for seamless handoff to different people who may not be familiar with the project or code, which is exactly how an LLM behaves when you clear its context.

The limited LLM context windows could be an argument in favor of a microservices architecture with each service or library in its own repository.

That just moves the complexity to the interactions between repositories, where it’s more difficult to understand and fix.

The METR study cited here is very interesting.

"In the METR study, developers predicted AI would make them 24% faster before starting. After finishing 19% slower, they still believed they'd been 20% faster."

I hadn't heard of this study before. It seems it's been mentioned on HN before but never got much traction.


I see it brought up almost every week! It's a firm favorite of the "LLMs don't actually help write code" contingent, probably because there are very few other credible studies they can point to in support of their position.

Most people who cite it clearly didn't read as far as the table where METR themselves say:

> We do not provide evidence that:

> 1) AI systems do not currently speed up many or most software developers. Clarification: We do not claim that our developers or repositories represent a majority or plurality of software development work

> 2) AI systems do not speed up individuals or groups in domains other than software development. Clarification: We only study software development

> 3) AI systems in the near future will not speed up developers in our exact setting. Clarification: Progress is difficult to predict, and there has been substantial AI progress over the past five years [3]

> 4) There are not ways of using existing AI systems more effectively to achieve positive speedup in our exact setting. Clarification: Cursor does not sample many tokens from LLMs, it may not use optimal prompting/scaffolding, and domain/repository-specific training/finetuning/few-shot learning could yield positive speedup

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...


Weird, you shouldn't really need to list the things your study doesn't prove! I guess they anticipated that the study might be misrepresented and wanted to get ahead of that.

Their study still shows something interesting, and quite surprising. But if you choose to extrapolate from this specific setting and say coding assistants don't work in general then that's not scientific and you need to be careful.

I think the study should probably decrease your prior that AI assistants actually speed up development, even if developers using AI tell you otherwise. The fact that it feels faster when it is slower is super interesting.


The lesson I took from the study is that developers are terrible at estimating their own productivity based on a new tool.

Being armed with that knowledge is useful when thinking about my own productivity, as I know that there's a risk of me over-estimating the impact of this stuff.

But then I look at https://github.com/simonw which currently lists 530 commits over 46 repositories for the month of December, which is the month I started using Opus 4.5 in Claude Code. That looks pretty credible to me!


The lesson I learned is that agentic coding uses intermittent reinforcement to mimic a slot machine.

It (along with the hundreds of billions in investments hinging on it) explains the legions of people online who passionately defend their "system". Every gambler has a "system", and they usually earnestly believe it is helping them.

Some people even write popular (and profitable!) blogs about playing slot machines where they share their tips and tricks.


I really wish this meme would die.

We know LLMs follow instructions meaningfully and relatively consistently; we know they are in-context learners and also pull from their context window for knowledge; we also know that prompt phrasing, and especially organization, can have a large effect on their behavior in general; we know from first principles that you can improve the reliability of their results by putting them in a loop with compilers / linters / tests, because they do actually fix things when you tell them to. None of this is equivalent to a gambler's superstitions. It may not be perfectly effective, but neither are a million other systems and best practices and paradigms in software.
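
A minimal sketch of that compile/test loop, for concreteness (call_llm is a hypothetical stand-in for whatever model API you use; this is illustrative, not anyone's actual agent):

    import subprocess

    def call_llm(prompt: str) -> str:
        """Hypothetical stand-in for your model API of choice."""
        raise NotImplementedError

    def fix_until_green(source_path: str, max_rounds: int = 5) -> bool:
        # Put the model in a loop with the test runner: feed failures
        # back into the prompt until the suite passes or we give up.
        for _ in range(max_rounds):
            result = subprocess.run(
                ["pytest", "--tb=short"], capture_output=True, text=True
            )
            if result.returncode == 0:
                return True  # tests pass
            with open(source_path) as f:
                code = f.read()
            prompt = (
                f"These tests fail:\n{result.stdout}\n\n"
                f"Current code:\n{code}\n\n"
                "Return a corrected version of the whole file."
            )
            with open(source_path, "w") as f:
                f.write(call_llm(prompt))
        return False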

Also, it doesn't "use" anything. It may be a feature of the program but it isn't intentionally designed that way.

Also, who sits around rerunning the same prompt over and over again to see if you get a different outcome, like it's a slot machine? You just directly tell it to fix whatever was bad about the output and it does so. Sometimes initial outputs have a larger or smaller amount of bad, but still. It isn't really analogous to a slot machine.

Also, you talk as if the whole "do something -> might work / might not, stochastic to a degree, but also meaningfully directable -> dopamine rush if it does; if not goto 1" loop isn't inherent to coding lol


I don't think the "meme" that LLMs follow instructions inconsistently will ever die, because they do. It's in the nature of how LLMs function under the hood.

> Also, who sits around rerunning the same prompt over and over again to see if you get a different outcome, like it's a slot machine?

Nobody. Plenty of people do like to tell the LLM that somebody might die if it doesn't do X properly, and other such faith-based interventions with their "magic box", though.

Boy do their eyes light up when they hit the "jackpot", too (LLM writes what appears to be the correct code on the first shot).


They're so much more consistent now than they used to be. New LLM releases almost always boast about how much better they are at "instruction following", and it really shows; I find the Claude 4.5 and GPT-5.x models do exactly what I tell them to most of the time.

I am going to preface this by saying that I could be completely wrong.

Simon - you are an outlier in the sense that basically your job is to play with LLMs. You don't have stakeholders with requirements that they themselves don't understand; you don't have to go to meetings, deal with a team, shout at people, do PRs, etc. The whole SDLC/process of SWE is compressed for you.


That's mostly (though not 100%) true, and a fair comment to make here.

Something that's a little relevant to how I work here is that I deliberately use big-team software engineering methods - issue trackers, automated tests, CI, PR code reviews, comprehensive documentation, well-tuned development environments - for all of my personal projects, because I find they help me move faster: https://simonwillison.net/2022/Nov/26/productivity/

But yes, it's entirely fair to point out that my use of LLMs is quite detached from how they might be used on large team commercial projects.


I think this shows where the real value of AI coding is: brand new repos, on tiny throwaway projects.

I'm not going to browse every commit in those repos, but half of the projects were created in December. The rest are either a few months old or less than a year.

This is not representative of the industry.


That's certainly an impressive month! However, it's conceivable that you are an outlier (in the best possible way!)

I liked the way they did that study and I would be interested to see an updated version with new tools.

I'm not particularly sceptical myself and my guess is that using Opus 4.5 would probably have produced a different result to the one in the original study.


I'm definitely an outlier - I've been pushing the boundaries of these tools for three years now and this month I've been deliberately throwing some absurdly ambitious problems at Opus 4.5 (like this one: https://static.simonwillison.net/static/2025/claude-code-mic...) to see how far it can go.

Very interesting example. It's an insanely complex task even with a reference implementation in another language.

It's surprising that it manages the majority of the test cases but not all of them. That's not a very human-like result; I would expect humans to be bimodal, with some people getting stuck earlier and the rest completing everything. Fractal intelligence strikes again, I guess?

Do you think the way you specified the task at such a high level made it easier for Claude? I would probably have tried to be much more specific, for example by translating on a file-by-file or function-by-function basis. But I've no idea if this is a good approach. I'm really tempted to try this now! Very inspiring.


> Do you think the way you specified the task at such a high level made it easier for Claude?

Absolutely. The trick I've found works best for these longer tasks is to give it an existing test suite and a goal to get those tests to pass; see also: https://simonwillison.net/2025/Dec/15/porting-justhtml/

In this case ripping off the MicroQuickJS test suite was the big unlock.

I have a WebAssembly runtime demo I need to publish where I used the WebAssembly specification itself, which it turns out has a comprehensive test suite built in as well.


In the 80s, when the mouse was just becoming common, there was a study comparing programming using a mouse vs. just a keyboard. Programmers thought they were faster using a keyboard, but they were actually faster using a mouse.

That's the Ask Tog "study"[1]. It wasn't programmers, just regular users. The problem is he just asserts it, and of course Apple, at the time of the Macintosh's development, had a strong motivation to prove mousing superior to keyboarding to skeptical users. Additionally, the experience level of the users was never specified.

[1]: https://www.asktog.com/TOI/toi06KeyboardVMouse1.html


This surprises me, because at the time user interfaces were optimised for the keyboard - the only input device most people had. Also, screen resolutions were lower, so there were fewer things you could click on anyway.

METR has some substantial AI industry ties, so I wonder if those clarifications (especially the one pointing at their own studies describing AI progress) are a way to mitigate concerns that industry would have with the apparent results of this study.

Plenty of people have been (too) quick to dismiss that study as not generally applicable because it was about highly experienced OSS devs rather than your average corporate programmer drone.

The issue I have with the paper is that it seems (based on my skimming) that they did not pick developers who were already versed in AI tooling. So they're comparing (experienced dev working in the way they're comfortable) vs (experienced dev working with a new tool for the first time, not yet past the onboarding productivity slump).

The thing I find interesting is that there are trillions of dollars in valuations hinging upon this question, and yet the appetite to spend a little bit of money to repeat this study and release the results publicly is apparently very low.

It reminds me of global warming, where on one side of the debate there were some scientists with very little money running experiments, and on the other side there were some ridiculously wealthy corporations publicly poking holes in those experiments who had secretly known they were valid since the 1960s.


Yeah, it's kind of a Bayesian probability thing, where the impressiveness of either outcome depends on what we expected to happen by default.

1. There are bajillions of dollars in incentives for a study declaring "Insane Improvements", so we should expect a bunch to be funded, launched, and released... Yet we don't see many.

2. There is comparatively no money (and little fame) behind a study saying "This Is Hot Air", so even a few seem significant.


Longitudinal studies are definitely needed, but of course at the time the research for this paper was done there weren't any programmers experienced with AI assistance out there yet.

That's interesting context for sure, but the fact that these were experienced developers makes it all the more surprising that they didn't realise the LLM slowed them down.

Measuring programming productivity in general is notoriously difficult; subjectively measuring your own programming productivity is even worse. A magic LoC machine going brrrrrt gives an overoptimistic sense of getting things done.

I can believe it.

It will zero-shot a full system for you in 5 minutes, but then if you ask for a minor change to that system it will completely shit the bed.

And you have no understanding of what it has written, so you’d have to check everything.

