
What I don't understand about this whole "get on board the AI train or get left behind" narrative is this: what advantage does an early adopter of AI tools actually have?

The way I see it, I can just start using AI once they get good enough for my type of work. Until then I'm continuing to learn instead of letting my brain atrophy.


This is a pretty common position: "I don't worry about getting left behind - it will only take a few weeks to catch up again".

I don't think that's true.

I'm really good at getting great results out of coding agents and LLMs. I've also been using LLMs for code on an almost daily basis since ChatGPT's release on November 30th 2022. That's more than three years ago now.

Meanwhile I see a constant flow of complaints from other developers who can't get anything useful out of these machines, or find that the gains they get are minimal at best.

Using this stuff well is a deep topic. These things can be applied in so many different ways, and to so many different projects. The best asset you can develop is an intuition for what works and what doesn't, and getting that intuition requires months if not years of personal experimentation.

I don't think you can just catch up in a few weeks, and I do think that the risk of falling behind isn't being taken seriously enough by much of the developer population.

I'm glad to see people like antirez ringing the alarm bell about this - it's not going to be a popular position but it needs to be said!


I think you are right that there is some deep intuition about current models that takes months, if not years, to hone. However, the intuition of someone who did nothing but talk about and use LLMs nonstop two years ago would be just as good today as that of someone starting from scratch, if not worse, because of antipatterns that no longer apply, such as always starting a new chat and never using a CLI because of context drift.

Also, Simon, with all due respect, and I mean it, I genuinely look in awe at the amount of posts you have on your blog and your dedication, but it’s clear to anyone that the projects you created and launched before 2022 far exceed anything you’ve done since. And I will be the first to say that I don’t think that’s because of LLMs not being able to help you. But I do think it’s because, slowly but surely, month by month, you kept replacing with LLMs the very thing that makes you really, really good at engineering.

If I look at Django, I can clearly see your intelligence, passion, and expertise there. Do you feel that any of the projects you’ve written since LLMs became your main focus are similar?

Think about it this way: 100% of you wins against 100% of me any day. 100% of Claude running on your computer is the same as 100% of Claude running on mine. 95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.

I do worry when I see great programmers like you diluting their work.


My great regret from the past few years is that experimenting with LLMs has been such a huge distraction from my other work! My https://llm.datasette.io/ tool is from that era though, and it's pretty cool.

I do think your Datasette work is fantastic and I genuinely hope you take my previous message the right way. I’m not saying you’re doing something bad, quite the opposite, I feel like we need more of you, and I’m afraid that because of LLMs we’re getting less of you.

(Breaking the 4th wall for a minute):

It’s not just Simon that we’re getting less of, it’s YOU we’re getting less of too. And we want you around. Don’t go.


> because of antipatterns that don’t apply anymore, such as always starting a new chat

I’m keen to understand your reasoning on this. I don’t agree, but maybe I’m just stuck with old practices, so help me?

What’s your justification as to why starting a new chat is an antipattern?


> 95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.

I see what you're saying, but I'm not sure it is true. Take simonw and tymscar, put them each in charge of a team of 19 engineers (of identical capabilities). Is the result "nowhere near the same jump" as simonw vs. tymscar alone? I think it's potentially a much bigger jump, if there are differences in who has better ideas and not just who can code the fastest.


I agree; however, there you’re not comparing technical knowledge alone, you’re also comparing managerial skills.

With LLMs it’s admittedly a bit closer to doing it yourself, because the feedback loop is much tighter.


It needs to be said that your opinion on this is well understood by the community, respected, but also far from impartial. You have a clear vested interest in the success of _these_ tools.

There's a learning curve to any toolset, and it may be that using coding agents effectively is more than a few weeks of upskilling. It may be, and likely will be, that people make their whole careers about being experts on this topic.

But it's still a statistical text prediction model, wrapped in fancy gimmicks, sold at a loss by mostly bad faith actors, and very far from its final form. People waiting to get on the bandwagon could well be waiting to pick up the pieces once it collapses.


How does he have a vested interest in the success of these tools? He doesn't work for an AI company. Why must he have some shady ulterior motive rather than just honestly believing the things he has stated? Yes, he blogs a lot about AI, but don't you have the cart profoundly before the horse if you are asserting that's a "vested interest"? He is free to blog about whatever he wants. Why would he fervently start blogging about AI if he didn't earnestly believe it was an interesting topic to blog about?

> But it's still a statistical text prediction model

This is reductive to the point of absurdity. What other statistical text prediction model can make tool calls to CLI apps and web searches? It's like saying "a computer is nothing special -- it's just a bunch of wires stuck together"


> Why must he have some shady ulterior motive rather than just honestly believing the things he has stated?

I wouldn't say it's shady or even untoward. Simon writes prolifically and he seems quite genuinely interested in this. That he has attached his public persona, and what seems like basically all of his time from the last few years, to LLMs and their derivatives is still a vested interest. I wouldn't even say that's bad. Passion about technology is what drives many of us. But it still needs saying.

> This is reductive to the point of absurdity. What other statistical text prediction model can make tool calls to CLI apps and web searches?

It's just a fact that these things are statistical text prediction models. Sure, they're marvels, but they're not deterministic, nor are they reliable. They are like a slot machine with surprisingly good odds: pull the lever and you're almost guaranteed to get something, maybe a jackpot, maybe you'll lose those tokens. For many people it's cheap enough to just keep pulling the lever until they get what they want, or go bankrupt.


I have a lot of respect for Simon and read a lot of his articles.

But I'm still seeing clear evidence it IS a statistical text prediction model. You ask it the right niche thing and it can only pump out a few variations of the same code, which is clearly someone else's code stolen almost verbatim.

And I just use it 2 or 3 times a day.

How are SimonW and AntiRez not seeing the same thing?

How are they not seeing the propensity for both Claude + ChatGPT to spit out tons of completely pointless error handling code, making what should be a 5 line function a 50 line one?

How are they not seeing that you constantly have to nag it to use modern syntax. Typescript, C#, Python, doesn't matter what you're writing in, it will regularly spit out code patterns that are 10 years out of date. And woe betide you using a library that got updated in the last 2 years. It will constantly revert back to old syntax over and over and over again.

I've also had to deal with a few of my colleagues using AI code on codebases they don't really understand. Wrong sort, id instead of timestamp. Wrong limit. Wrong json encoding, missing key converters. Wrong timezone on dates. A ton of subtle, not obvious, bugs unless you intimately know the code, but would be things you'd look up if you were writing the code.

And that's not even including the bit where the AI obviously decided to edit the wrong search function in a totally different part of the codebase that had nothing to do with what my colleague was doing. But it didn't break anything or trigger any tests, because it was wrapped in an impossible-to-hit if clause. And it created a bunch of extra classes to support this phantom code, so hundreds of new lines of code are just lurking there, not doing anything, but if I hadn't caught it, everyone would think they do something.


It's mostly a statistical text model, although the RL "reasoning" stuff added in the past 12 months makes that a slightly less true statement - it now has extra tricks to bias the bits of code it statistically predicts towards ones that are more likely to work.

The real unlock though is the coding agent harnesses. It doesn't matter any more if it statistically predicts junk code that doesn't compile, because it will see the compiler error and fix it. If you tell it "use red/green TDD" it will write the tests first, then spot when the code fails to pass them and fix that too.

> How are they not seeing the propensity for both Claude + ChatGPT to spit out tons of completely pointless error handling code, making what should be a 5 line function a 50 line one?

TDD helps there a lot - it makes it less likely the model will spit out lines of code that are never executed.

> How are they not seeing that you constantly have to nag it to use modern syntax. Typescript, C#, Python, doesn't matter what you're writing in, it will regularly spit out code patterns that are 10 years out of date.

I find that if I use it in a codebase with modern syntax it will stick to that syntax. A prompting trick I use a lot is "git clone org/repo into /tmp and look at that for inspiration" - that way even a fresh codebase will be able to follow some good conventions from the start.

Plus the moment I see it write code in a style I don't like I tell it what I like instead.

> And that's not even including the bit where the AI obviously decided to edit the wrong search function in a totally different part of the codebase that had nothing to do with what my colleague was doing.

I usually tell it which part of the codebase to work in - or if it decides itself I spot that and tell it that it did the wrong thing - or discard the session entirely and start again with a better prompt.


Ok, but given the level of detail you're supplying, at that point isn't it quicker to write the code yourself than it is to prompt?

As you have to explain much of this, the natural-language description ends up being much longer than the code itself and less precise, so it actually takes much longer to type and is more ambiguous. And obviously, at the moment, ChatGPT tends to make assumptions without asking you; Claude is a little better at asking for clarification.

I find it so much faster to just ask Claude/ChatGPT for an example of what I'm trying to do and then cut/paste/modify it myself. So just use them as SO on steroids, no agents, no automated coding. Give me the example, and I'll integrate it.

And the end code looks nothing like the supplied example.

I tried using AquaVoice (which is very good) to dictate to it, and that helped slightly, but I often found myself going so slowly fully prompting the AI that I would have already finished the new code myself by that point.

I was thinking about this last night, I do wonder if this is another example of the difference between deep/narrow coding of specialist/library code and shallow/wide of enterprise/business code.

If you're writing specialist code (like AntiRez), it's dealing with one tight problem. If you're writing enterprise code, it has to take into account so many things, explaining it all to the AI takes forever. Things like use the correct settings from IUserContext, add to the audit in the right place, use the existing utility functions from folder X, add json converters for this data structure, always use this different date encoding because someone made a mistake 10 years ago, etc.

I get that some of these would end up in agents.md/claude.md, but as many people have complained, AI agents often rapidly forget those as the context grows, so you have to go through any generated code with a fine-tooth comb, or get it to generate a disproportionate amount of tests, each and every one of which you again have to explain.

I guess that will be fixed eventually. But from my perspective, as they're still changing so rapidly and much advice from even 6/9 months ago is now utterly wrong, why not just wait?

I, like many others on this thread, also believe that it's going to take about a week to get up to speed when they're finally ready. It's not that I can't use them now, it's that they're slow, unreliable, prone to being a junior on steroids, and actually create more work when reviewing the code than if I'd just written it myself in the first place, and the code is much, much, much worse than MY code. Not necessarily worse than the code of everyone I've worked with, but definitely worse than mine: my code is usually 50-90% more concise.


Enterprise code writer here.

> If you're writing enterprise code, it has to take into account so many things, explaining it all to the AI takes forever. Things like use the correct settings from IUserContext, add to the audit in the right place, use the existing utility functions from folder X, add json converters for this data structure, always use this different date encoding because someone made a mistake 10 years ago, etc.

The fix for this is... documentation. All of these need to be documented in a place that's accessible to the agent. That's it.
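To make that concrete: a minimal sketch of the kind of agent-accessible notes meant here, reusing the conventions from the parent comment (the file name and exact wording are illustrative, not lifted from any real project):

  # CLAUDE.md / AGENTS.md - conventions the agent must follow
  - Read user settings from IUserContext, never from anywhere else.
  - Every new operation must write to the audit in the right place.
  - Reuse the existing utility functions from folder X; do not re-implement them.
  - Add JSON converters for this data structure.
  - Always use the non-standard date encoding (a mistake from 10 years ago that we live with).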

I've just about one-shotted UI features with Claude just by giving it a screenshot of the Figma design (couldn't be bothered with the MCP) and the ticket about the feature.

It used our very custom front-end components correctly, used the correct testing library, wrote playwright tests and everything. Took me maybe 30 minutes from first prompt to PR.

If I (a backend programmer) had to do it, it would've taken me about a day of trying different things to see which one of the 42 different ways of doing it worked.


I talk about why that doesn't work in the line after the one you've quoted. Everyone's having problems with context windows and CC/etc. rapidly forgetting instructions.

I'm fullstack, I use AI for FE too. They've been able to do the screenshot trick for over a year now. I know it's pretty good at making a page, but the code is usually rubbish and you'll have a bunch of totally unnecessary useEffect, useMemo and styling in that page that it's picked up from its training data. Do you have any idea what all the useEffect() and useMemo() it's littered all over your new page do? I can guarantee almost all of them are wrong or unnecessary.

I'd use that page you one-shotted as a starting point; it's not production-grade code. The final thing will look nothing like it. Good for solving the blank-page problem for me, though.


> Everyone's having problems with context windows and CC/etc. rapidly forgetting instructions.

I'm not having those problems at all... because I've developed a robust intuition for how to avoid them!


That matches my experience with LLM-aided PRs - if you see a useEffect() with an obvious LLM line-comment above it, it's 95% going to be either unnecessary or buggy (e.g. too-broad dependencies which cause lots of unwanted recomputes).

You can literally go look at some of antirez's PRs described here in this article. They're not seeing it because it's not there?

Honestly, what you're describing sounds like the older models. If you are getting these sorts of results with Opus 4.5 or 5.2-codex on high I would be very curious to see your prompts/workflow.


People have been saying "Oh use glorp 3.835 and those problems don't happen anymore" for about 3 years at this point. It's always the fact you're not using the latest model that's the problem.

> You ask it the right niche thing and it can only pump out a few variations of the same code, which is clearly someone else's code stolen almost verbatim.

There are only so many ways to express the same idea. Even clean room engineers write incidentally identical code to the source sometimes.


There was an example on here recently where an AI PR to an open source project literally had someone else's name in the comments in the code, and included their license.

That's the level of tell-tale sign that it's just stealing code and modifying a couple of variable names.

For me personally, the code I've seen might be written in a slightly weird style, or have strange additions that aren't applicable to the question.

They're so obviously not "clean room" code or incredibly generic; they're the opposite: incredibly specific.


> Using this stuff well is a deep topic.

Just like the stuff LLMs are being used for today. Why wouldn't "using LLMs well" be just one of the many things LLMs will simplify too?

Or do you believe your type of knowledge is somehow special and is resistant to being vastly simplified or even made obsolete by AI?


An interesting trend over the past year is that LLMs have learned how to prompt each other.

Back in ~2024 a lot of people were excited about having "LLMs write the prompt!" but I found the results to be really disappointing - they were full of things like "You are the world's best expert in marketing" which was superstitious junk.

As of 2025 I'm finding they actually do know how to prompt, which makes sense because there's a ton more information about good prompting approaches in the training data as opposed to a couple of years ago. This has unlocked some very interesting patterns, such as Claude Code prompting sub-agents to help it explore codebases without polluting the top level token window.

But learning to prompt is not the key skill in getting good results out of LLMs. The thing that matters most is having a robust model of what they can and cannot do. Asking an LLM "can you do X" is still the kind of thing I wouldn't trust them to answer in a useful way, because they're always constrained by training data that was only aware of their predecessors.


Unless we figure out how to make 1 billion+ tokens multimodal context windows (in a commercially viable way) and connect them to Google Docs/Slack/Notion/Zoom meetings/etc, I don't think it will simplify that much. Most of the work is adjusting your mental model to the fact that the agent is a stateless machine that starts from scratch every single time and has little-to-no knowledge besides what's in the code, so you have to be very specific about the context of the task in some ways.

It's different from assigning a task to a co-worker who already knows the business rules and cross-implications of the code in the real world. The agent can't see the broader picture of the stuff it's making, it can go from ignoring obvious (to a human that was present in the last planning meeting) edge cases to coding defensively against hundreds of edge cases that will never occur, if you don't add that to your prompt/context material.


So where’s all of this cutting edge amazing and flawless stuff you’ve built in a weekend that everybody else couldn’t because they were too dumb or slow or clueless?

I wouldn't call these flawless but here you go:

- https://github.com/simonw/denobox is a new Python library that gives you the ability to run arbitrary JavaScript and WASM in a sandbox provided by Deno, because it turns out a Python library can depend on deno these days. I built that on my phone in bed yesterday morning.

- https://github.com/simonw/pwasm is a WebAssembly runtime written in pure Python with no dependencies, built by feeding Claude Code the official WASM specification along with its conformance test suite and having it hack away at that (again via my phone) to get as many of the tests to pass as possible. It's pretty slow and not really useful yet but it's certainly interesting.

- https://github.com/datasette/datasette-transactions is a Datasette plugin which provides a JSON API for starting a SQLite transaction, running multiple queries within it and then executing or rolling back that transaction. I built that one on my phone on a BART (SF Bay Area metro) trip.

- https://github.com/simonw/micro-javascript is a pure Python, no dependency JavaScript interpreter which started as a port of MicroQuickJS. Here's a demo of that one running in a browser https://simonw.github.io/micro-javascript/playground.html - that's my JavaScript interpreter running inside Python running in Pyodide in WebAssembly in your browser of choice, which I find inherently amusing.

All of those are from the past three weeks. Most of them were built on my phone while I was doing other things.


I am not at all an AI sceptic, but probably less impressed by what LLMs are capable of.

Looking at these projects, I have a few questions:

1. These seem to be fairly self-contained and well specified problems, which is the best case scenario for “vibe coding”. Do you have any examples of projects where the solution was somewhat vague and open-ended? If not, how do you think Claude Code or similar would perform?

2. Did you feel excited or energized by having an LLM implement these projects end-to-end? Personally, I find LLMs useful as a closely guided assistant, particularly to interactively explore the space of solutions. I also don’t feel energized at all by having it implement anything non-trivial end to end, outside of writing tests (and even then, not all types of tests!).

3. Do you think others would find these projects useful? In particular, if you vibe coded them, why couldn’t someone else do the same thing? And once these projects are picked up by future model training runs, they’ll probably be even easier to one shot, reducing the value even further.

Let me provide an example of what I mean by (2), at least in the context of hobbyist dev. I could have Claude Code vibe code a Gameboy emulator and it would probably do a fine job given that it’s a well specified problem that is likely well represented in its training data. But the process would neither be exciting nor energizing. I would rather spend hours gradually getting more and more working and experience the fruits of my labor (I did this already btw).

At $DAYJOB, I simply do not have confidence in an LLM doing anything non-trivial end to end. Besides, the complexity remains in defining the requirements and constraints, designing the solution, gaining consensus, and devising a plan for implementation. The goal would be for the LLM to pick up discrete, well defined chunks of work.


"Do you have any examples of projects where the solution was somewhat vague and open-ended"

This one is pretty open ended, and I'm having a ton of fun designing and iterating on it: https://github.com/simonw/claude-code-transcripts - it's also attracting quite a few happy users now.

I have another project in the works in Go which is proving to be a ton of fun from a software design perspective, but it's not ready for outside eyes just yet.

"Did you feel excited or energized by having an LLM implement these projects end-to-end"

I'm enjoying myself so much right now. My BART rides have never been this entertaining before!

"Do you think others would find these projects useful? In particular, if you vibe coded them, why couldn’t someone else do the same thing?"

I don't think many developers have the combined taste and knowledge necessary to spin up Denobox or datasette-transactions. They both solve problems that I'm very confident need solving, but I expect to have to explain why those matter in some detail to all but a very small group of people who share my particular interests.

The other two are pretty standard - I suggest anyone who wants to learn more about JavaScript interpreters or WASM runtimes try something similar in the language of their choice as a learning exercise.


> I have another project in the works in Go which is proving to be a ton of fun from a software design perspective, but it's not ready for outside eyes just yet.

As a long-time user of the language, I'm happy to see that Go seems to be excellent for LLM agent development. The language is simple, there's only one way to do loops, etc. It hasn't changed that much syntax-wise (I think `any` is the only thing that LLMs miss).

Gofmt (or goimports) makes sure all code looks the same, there are VERY robust linters and a built-in testing framework so the LLM only needs to know one. And the code won't even compile if there are unused variables or other cruft.

It might be boring or verbose, but it's also very predictable and simple. All things LLMs like :D


Yes, I've got very interested in Go over the past year for exactly those reasons.

It's also really easy to read the code and understand exactly what it does; I'm still finding Rust a lot harder to decode - way more ampersands!


How much do you pay per month for AI services?

$200 to Anthropic, $20 to OpenAI, ~$10 in API fees for various other services, and I get GitHub Copilot in VS Code for free as an open source developer.

Based on those, it seems you are not actually using them to create big codebases from scratch, but rather for problems that would normally take quite a while, not because they are inherently difficult to implement, but because you would normally have to spend considerable time on the finicky implementation details.

I think that's the reason why LLMs work so well for some, like you, and generate slop for others: if you leave them alone with projects that require opinionated code and actual decision making, they most often don't grasp the user's intention well, or worse, misinterpret it so confidently that you end up with something where all the wrong opinions and decisions compound path-dependently into the strangest and most useless slop.


"for problems that would normally take quite a while, not because they are inherently difficult to implement, but because you would normally have to spend considerable time on the finicky implementation details"

Yes, exactly! How amazing is it that we have technology now that lets us quickly build projects where we would normally have to spend considerable time on the finicky implementation details?


Pretty nice I guess. Cool even. Impressive! And I only say this, just in case, for someone else maybe, ehh—is that it? Because that’s totally fine with me, same experience actually, funny that, really impressive tech btw! Very nice. Just, maybe, do the CEOs know that? When people talk of “not having to code anymore”—do they know that this is how it’s described by one of its most prominent champions today?

Not that I mind, of course. As you said: amazing!

Maybe someone should just check in with the CEOs who were in the news recently talking about their workforce…


> When people talk of “not having to code anymore”

You should reinterpret that as "not having to type the code out by hand any more". You still need a significant depth of coding knowledge and experience to get good results out of these things. You just don't need to type out every variable declaration and for loop yourself any more.


Automate tools, not jobs.

Every single tool or utility you have in the back of your head, you can just make it in a few hours of wall-clock time, minutes of your personal active time.

Like, I wanted a tool that could summarise different sources quickly; it took me ~3 hours to build using llm + fragments + the OpenAI API.

Now I can just go `q <url>` in my terminal and it'll summarise just about anything.

Then I built a similar tool that can download almost anything `dl <url>` will use yt-dlp, curl and various other tools depending on the domain to download the content.
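Tools like that can be just a few lines of shell. As a rough sketch of what the `q` wrapper might look like (not the actual code of the tool described above, and assuming the llm CLI's -f/--fragment option, which accepts a URL, plus a configured API key):

  # q: summarise any URL from the terminal (illustrative sketch)
  q() {
    llm -f "$1" "Summarise this content as a short list of key points"
  }

  # usage: q https://example.com/some-article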


Another lens is that many people either have terrible written communication skills, do not intuitively grasp how to describe a complex system design, or both. And yet, since everyone is a genius with 100% comprehensibility in their own mind, they simply aren't aware that the problem starts with them.

Well I think it also has to do with communication with LLMs being different to communication with humans. If you tell a developer "don't do busywork" they surely wouldn't say "Oh the repo looks like a trash dump, but no busywork so I'm not going to clean it up, quickly document that as canonical structure, then continue"

> have terrible written communication skills

More and more I think this is it.


You keep saying you "built" this or that, but did you really?

Of course I don't know for sure if you had any substantial input other than writing a few paragraphs of prompt text and sending Claude some links, because I didn't witness your workflow there. But I think this is kind of what irks some people including myself.

What's stopping me from "building" something similar also? Maybe I won't be as fast as you since you seem to be more experienced with these tools, but at the end of the day, would you be able to describe in detail what got built without you asking Claude about it? If you don't know anything about what you built other than just prompting an AI, in my opinion you didn't actually "build" anything -- Claude did.


There's an ongoing conversation among coding agent enthusiasts right now about the correct verb to use.

One of my favorite options is "directed" - "I directed this". It's not quite obvious enough for me to use it in comments on threads like this though.

I've also experimented with "We built" but that feels uncomfortably like anthropomorphizing the model.

One of the reasons I publish almost all of my prompts and transcripts is that I don't believe in gatekeeping this stuff and I want other people to be able to learn how to do what I can do. Here are the transcripts for my Denobox project, for example: https://github.com/simonw/denobox/tree/transcripts - you can view those with my new https://orphanhost.github.io/ tool like this: https://orphanhost.github.io/?simonw/denobox/transcripts/ses...


Thanks for sharing, I'll take a look!

I don't think it's wise to bend to those with FUD.

I don't say "my tablesaw and I built this table"; I say "I built this table".


This is such a tired response at this point.

People are under zero obligation to release their work to the public. Simon actually publishes and writes about a remarkable amount of the side projects he builds with AI.

The rest of us just build tons of cool stuff for personal use or for $JOB. Releasing stuff to the public is, in general, a massive amount of extra work for very little benefit. There are loads of FOSS maintainers trapped spending as much time managing their communities as they do their actual projects and many of us just don't have time for that.


> The rest of us just build tons of cool stuff for personal use or for $JOB. Releasing stuff to the public is, in general, a massive amount of extra work for very little benefit. There are loads of FOSS maintainers trapped spending as much time managing their communities as they do their actual projects and many of us just don't have time for that.

I wouldn't worry about this.

There are many examples of people sharing a project they've used LLMs to help write, and the result was not a huge amount of attention & expectation of burden.

Perhaps "I don't share it because I'm worried people will love it too much" even suggests the opposite: you can concretely demonstrate the kinds of things you've been able to build using LLMs.

> This is such a tired response at this point.

Lack of specificity & concrete examples frequently means all that's left for discussion is emotion, for hype and anti-hype, though.

In this thread, the discussion was:

  pro: use LLMs or get left behind

  conserve: okay, I'll start using LLMs when they're good

  pro: no no they won't be that good, it takes effort to get to use them

  conserve: do you have any examples?

  pro: why should we have to share examples?

I like LLMs. But making big claims while being reticent about concrete examples and demonstrations is irksome.

I’m waiting to see a huge burst of high quality open source code, which should be happening, right?

The response may be tired when asked in this personal way, but in general, it's a fair question. Nobody is forced to share their work. But with all the high praise, we'd expect to see at least some uptick in the software world. But there is no surge in open source projects. No surge in app store entries. And the bigger companies claiming high GenAI use aren't iterating faster or building more. They are continually removing features and their software is getting worse, slower, less robust, and less secure.

Software quality and capability had already been on a steep downward curve for years before LLM coding had its breakthrough. For all the promises, I'd have expected, three years later, to at least notice that downward trajectory easing off. But it hasn't happened.


All I took from your reply was

> I could if I wanted to, but I just don't feel like it.

What am I missing that would help me understand that's not what you meant?


I find it increasingly confusing that some people seem to believe that other people not subjecting themselves to this continued interrogation gives any credence to their position.

People seem to believe that there is a burden of proof. There is not. What do I care if you are on board?

I don't know what could change your mind, but of course the answer is "nothing" as long as you are not open to it. Just look around. There is so much stuff, from so many credible people in all domains. If you can't find anything that is convincing or at least interesting to you, you are simply not looking.


> People seem to believe that there is a burden of proof. There is not. What do I care if you are on board?

The burden of proof rests on those making the positive claim. You say you don't care if others get on board, but a) clearly a lot of others do (case in point: the linked article) and b) a quick check of your posts in this very thread shows that you are indeed making positive claims about the merits of LLM assisted software development.


> What do I care if you are on board?

Without enough adoption expect some companies you are a client of to increase prices more, or close entirely down the road, due to insufficient cash inflow.

So, you would care, if you want to continue to use these tools and see them evolve, instead of seeing the bubble pop.


Over the last few days I made this ggplot2-looking plotting DSL as a CLI tool and a Rust library.

https://github.com/williamcotton/gramgraph

The motivation? I needed a declarative plotting language for another DSL I'm working on called Web Pipe:

  GET /weather.svg
    |> fetch: `https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m`
    |> jq: `
      .data.response.hourly as $h |
      [$h.time, $h.temperature_2m] | transpose | map({time: .[0], temp: .[1]})
    `
    |> gg({ "type": "svg", "width": 800, "height": 400} ): `
      aes(x: time, y: temp) 
        | line()
        | point()
    `
"Web Pipe is an experimental DSL and Rust runtime for building web apps via composable JSON pipelines, featuring native integration of GraphQL, SQL, and jq, an embedded BDD testing framework, and a sophisticated Language Server."

https://github.com/williamcotton/webpipe

https://github.com/williamcotton/webpipe-lsp

https://williamcotton.com/articles/basic-introduction-to-web...

I've been working at quite a clip for a solo developer who is building a new language with a full featured set of tooling.

I'd like to think that the approach to building the BDD-testing framework directly into the language itself and having the test runner using the production request handlers is at least somewhat novel!

  GET /hello/:world
    |> jq: `{ world: .params.world }`
    |> handlebars: `<p>hello, {{world}}</p>`

  describe "hello, world"
    it "calls the route"
      let world = "world"
      
      when calling GET /hello/{{world}}
      then status is 200
      and selector `p` text equals "hello, {{world}}"
I'm married with two young kids and I have a full-time job. Before these tools there was no way I could build all of these experiments with such limited resources.


All of the linked apps look trivial to me. Also, in the first one, the UI gives no feedback once you click the answer (plus some questions don't really make sense, as they contain the answer). There is more on the website, so there could be something interesting, but I'm having trouble finding it among all the noise. Not saying simple apps have no value. Even simple throwaway UIs can have value, especially if you develop them quickly.

How about these ones, are these trivial too? https://news.ycombinator.com/item?id=46582192

This is not really cool or impressive at all?

A page that outputs your user agent as an example of 'cool stuff built with AI'?

See my comment here - I suspect that those were deliberately picked by llmslave3 to NOT be impressive: https://news.ycombinator.com/item?id=46582209

For more impressive examples see https://simonwillison.net/2025/Dec/10/html-tools/ and https://news.ycombinator.com/item?id=46574276#46582192


I feel like I'm being punked, being told that this "bullish vs bearish flash card" thing and this "here's your user agent, something people have been doing for thirty years" thing, are "cool stuff". This guy seriously needed AI to make those?

I can't gauge the other two since I don't use those things, so maybe they are cool, idk.


Go read my replies to your sibling comments that said the same thing.

llmslave3 appears to have deliberately picked the least interesting from my HTML+JavaScript tools collection here. This post describes a bunch of much more interesting ones: https://simonwillison.net/2025/Dec/10/html-tools/

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

Did you genuinely select those examples in good faith?

If you're here to converse in good faith, what's your opinion of the examples I shared in this post over here? https://news.ycombinator.com/item?id=46574276#46582192


Where is all the amazing, much better stuff you implemented manually meanwhile?

I'm not the one making unverifiable, extravagant, pompous and extraordinary claims though :)

Did you miss the part where the guy you derisively asked replied with an extensive list of quite verifiable projects?

Are you asking for evidence that humans can write good code?

No, I am pointing out the hypocrisy in demanding evidence of production results in a derisive manner whenever someone mentions a productivity boost with AI.

To some extend it's an understandable ask, but obviously even with a decent productivity boost side projects still require a lot of time and effort before a possible public release.


While intuition takes a while, I think it can be learned in less than a month or two.

This has been my experience. When something gets good enough, someone will create some really good resource on it. Allowing the dust to settle is, to me, a more efficient strategy than constantly trying to “keep up”. Maybe also not waiting too long to do so.

This wouldn’t work of course if a person was trying to be some AI thought leader.


I'd say that it's a different type of learning process, where even a good resource doesn't help as much as it would with a traditional programming language. Sort of like you can't get very good at writing by just reading a ton of instructional books about it.

Even CRUD programming: you can’t get very good at it with just reading.

Maybe it’s just two different ways to reach the same result. You need to spend time to get great at prompting to get high-quality code from LLMs, which might just be equivalent to the fact that you need to spend time to write high-quality code without LLMs too.

From where I’m standing, I don’t see any massive difference in overall productivity between anyone all in on vibe coding and those who aren’t. There aren’t more features, higher quality, etc. from teams/companies out there than before, on any high-level metrics/observations. Maybe it will come, but there’s also no evidence it will.

I do, however, see great gains within certain specific tasks using LLMs. Smaller-scope code gen, rubber ducking, etc. But this seems much less difficult to get good at (and I hope for tooling that helps facilitate the specific types of use cases), and on the whole it amounts to marginal gains. It seems fine to be a few years late to catch up, worst case.


The core of your argument is that using LLMs is a skill that takes a significant amount of time to master. I'm not going to argue against that (although I have some doubts) because I think it's ultimately irrelevant. The question isn't "is prompting a skill that you'll need to be an effective software developer in the future" but "what other skills will you need to do so", and regardless of the answer you don't need to start adopting LLMs right away.

Maybe AI gets good enough at writing code that its users' knowledge of computer science and software development becomes irrelevant. In that case, approximately everyone on this site is just screwed. We're all in the business of selling that specialized knowledge, and if it's no longer required then companies aren't going to pay us to operate the AI, they're going to pay PMs, middle managers, executives, etc. But even that won't be particularly workable long term, because all their customers will realize they no longer need to pay the companies for software either. In this world, the price of software goes to zero (and hosting likely gets significantly more commoditized than it is now). Any time you put into learning to use LLMs for software development doesn't help you keep making money selling software, and actually stops you from picking up a new career.

If, on the other hand, CS and software engineering knowledge is still needed, companies will have to keep/restart hiring or training new developers. In terms of experience using AI, it is impossible for anyone to have less experience than these new developers. We will, however, have much more experience and knowledge of the aforementioned non-LLM skills that we're assuming (in this scenario) are still necessary for the job. In this scenario you might be better off if you'd started learning to prompt a bit earlier, but you'll still be fine if you didn't.


Strongly disagree. Claude Code is the most intuitive technology I've ever used-- way easier than learning to use even VS Code for example. It doesn't even take weeks. Maybe a day or two to get the hang of it and you're off to the races.

The difference is that AI tooling lies to you. On day 0 you think it's perfect, but the more you use AI tools the more you realize that using them wrong can give you gnarly bugs.

It's intuitive to use but hard to master


It took me a couple of days to find the right level of detail to prompt it. Too high level, and the codebase gets away from me/the tooling goes off the rails. Too low level, and I may as well do it myself. Maybe also learn the sorts of things Claude Code isn't good at yet. But once I got in the groove it was very easy from there. I think the whole process took 2-3 days.

Assuming you used AI before? Then yeah, it's the same.

If you never AI coded before then get ready for fun!


Don't underestimate the number of developers who aren't comfortable with tools that live in the terminal.

I actually don't use it in the terminal, I use the vs code extension. It's a better experience (bringing up the file being edited, nicer diffs, etc.) But both are trivial to pick up.

Well these people are left behind either way. Competent devs can easily learn to use coding assistants in a day or two

Show me what you've made with AI?

What's the impressive thing that can convince me it's equivalent, or better than anything created before, or without it?

I understand you've produced a lot of things, and that your clout (which depends on the AI fervor) is based largely on how refined a workflow you've invented. But I want to see the product, rather than the hype.

Make me say: I wish I was good enough to create this!

Without that, all I can see is the cost, or the negative impact.

edit: I've read some of your other posts, and for my question, I'd like to encourage you to pick only one. Don't use the scatter shot approach that LLMs love, giving plenty of examples, hoping I'll ignore the noise for the single that sounds interesting.

Pick only one. What project have you created that you're truly proud of?

I'll go first, (even though it's unfinished): Verse


There are so many projects named Verse (or similar) that you really need to be more specific.

The point wasn't to level set, or for it to feel like I'm promoting it. It was only so that I couldn't back out later or claim to have no skin in the game. But if you really couldn't figure out which one I meant: it's the one on my GitHub, also hosted on srctree, which I link to from the site in my profile and from my GitHub.

How many things you learned working with LLMs in 2022 are relevant today? How many things you learn now will be relevant in the future?

This question misses the point. Everything you learn today informs how you learn in the future.

> Using this stuff well is a deep topic. These things can be applied in so many different ways, and to so many different projects. The best asset you can develop is an intuition

You're basically saying that using LLMs is like using magic. Telling people to use intuition is basically saying "I don't know how it works or why, but it works for me sometimes."

That's why we programmers hate it - we have a safe space where there's no intuition - namely programming languages & runtimes with deterministic behavior. And we're shoehorned back into a mess of magic/intuition and wishful thinking.

(yes, I try LLMs, I have some results, I'm frustrated mostly by people AI-slopping _everything_ around me)


Oddly enough I wrote about the magic analogy and why I stopped using it a few years ago (pre-ChatGPT, even): https://simonwillison.net/2022/Oct/5/spell-casting/

I am eternally frustrated that "intuition" is the key skill people need to work effectively with LLMs, because it's something I can't teach people! If I could figure out how to download my intuition into other people's heads I would do that.

Instead I have to convince people that intuition is key, and the only way to get it is to invest in experimenting.


It's like any other power tool. It requires skill to use it safely and efficiently.

Anyone can use a band saw to cut things. Then go look what Jimmy DiResta makes with one and you see the difference.

The chance of an inexperienced person cutting off their finger with a bandsaw is also way over zero; there are things you should not and must not do with it, as with any power tool.


Intuition is the wrong word IMO. Tacit knowledge is the thing. Knowledge that is hard to communicate and needs experience.

Problem with AI is it isn't woodwork. The material keeps changing!


I learned Django 15 years after its inception. After 5 years of experience I'm probably not too far behind someone doing the exact same work as me but for 15 years.

Or would you say people shouldn't learn Django now? As it's useless as they're already far behind? They shouldn't study computer science, as it will be too late?

Every profession have new people continuously entering the workforce, that quickly get up to speed on whatever is in vogue.

Honestly, what you've spent years learning and experimenting with, someone else will be able to learn in months. People will figure out the best ways of using these tools after lots of attempts, and that distilled knowledge will be transferred quickly to others. This is surely painful to hear for those having spent years in the trenches, and is perhaps why you refuse to acknowledge it, but I think it's true.


I would not say that about a framework like Django - though I would encourage people not to under-invest in understanding web fundamentals since once you have those Django, Rails, Next.js etc are all quick to pick up.

I would say that about LLMs.

That's why I'm ringing the alarm bells here. LLM skills are not the same as framework or library usage skills. They aren't clearly documented or predictable - they're really weird!

If you assume learning to use coding agents is the same category of challenge as learning to use something like Django you'll get burned by that assumption.


I don't disagree; knowing how to use the tools is important. But I wanted to add that great prompting skills are nowadays far, far less necessary for top-tier models than they were years ago. If I'm clear about what I want and how I want it to behave, Claude Opus 4.5 almost always nails it first time. The "extra" that I do often, that maybe newcomers don't, is to set up a system where the LLM can easily check the results of its changes (verbose logs in the terminal and, on the web, verbose logs in the console and Playwright).

I think I'm also very good at getting great results out of coding agents and LLMs, and I disagree pretty heavily with you.

It is just way easier for someone to get up to speed today than it was a year ago. Partly because capabilities have gotten better and much of what was learned 6+ months ago no longer needs to be learned. But also partly because there is just much more information out there about how to get good results, you might have coworkers or friends you can talk to who have gotten good results, you can read comments on HN or blog posts from people who have gotten good results, etc.

I mean, ok, I don't think someone can fully catch up in a few weeks. I'll grant that for sure. But I think they can get up to speed much faster than they could have a year ago.

Of course, they will have to put in the effort at that time. And people who have been putting it off may be less likely to ever do that. So I think people will get left behind. But I think the alarm to raise is more, "hey, it's a deep topic and you're going to have to put in the effort" rather than "you better start now or else it's gonna be too late".


Why can't both be true at the same time? Maybe their problems are more complex than yours. Why do you assume it's a skill issue and ignore the contextual variables?

On the rare occasions that I can convince them to share the details of the problems they are tackling and the exact prompts they are using it becomes very clear that they haven't learned how to use the tools yet.

I'm kind of curious about the things you're seeing, since I find the best way is to have them come up with a plan for the work they're about to do and then make sure they actually finish it, because they like to skip stuff if it requires too much effort.

I mean, I just think of them like a dog that'll get distracted and go off doing some other random thing if you don't supervise them enough and you certainly don't want to trust them to guard your sandwich.


So far every new AI product and even model update has required me to relearn how to get decent results out of them. I'm honestly kind of sick of having to adjust my work flow every time.

The intuition just doesn't hold. The LLM gets trained and retrained by other LLM users so what works for me suddenly changes when the LLM models refresh.

LLMs have only gotten easier to learn and catch up on over the years. In fact, most LLM companies seem to optimise for getting started quickly over getting good results consistently. There may come a moment when the foundations solidify and not bothering with LLMs may put you behind the curve, but we're not there yet, and with the literally impossible funding and resources OpenAI is claiming they need, it may never come.


Really? Claude Code upgrades for me have been pretty seamless- basically better quality output, given the same prompts, with no discernible downsides.

I can't buy it, because for many people like you it's always the other person who is using the tools wrong. With this narrative as the base of the discourse ("you're not using it well"), it's simply impossible for skeptics who keep getting bad results from LLMs to prove the contrary. I don't even get why you need to praise yourself so much for being really good at using these tools, if not for building some tech-influencer status around here... the same thing I believe antirez is trying to do (who knows why).

Have you considered that maybe you aren't using it well? It's something that can and should be learned. It's a tool, and you can't expect to get the most out of a tool without really learning how to use it.

I've had this conversation with a few people so far, and I've offered to personally walk through a project of their choosing with them. Everyone who has done this has changed their perspective. You may not be convinced it will change the world, but if you approach it with an open mind and take the time to learn how to best use it, I'm 100% sure you will see that it has so much potential.

There are tons of youtube videos and online tutorials if you really want to learn.


> Have you considered that maybe you aren't using it well?

Here we go, as I said: again and again and again, it's always our fault that we're not using it well. It's impossible to argue against. Btw, to reply to your question: yes, many times, and it proved to be useful in very small specialized tasks and a couple of migrations. I really like how LLMs are helping me in my day to day, but it's still so far away from all this astroturfing.


I don't see how your position is compatible with the constant hype about the ever-growing capabilities of LLMs. Either they are improving rapidly, and your intuition keeps getting less and less valuable, or they aren't improving.

They're improving rapidly, which means your intuition needs to be constantly updated.

Things that they couldn't do six months ago might now be things that they can do - and knowing they couldn't do X six months ago is useful because it helps systematize your explorations.

A key skill here is to know what they can do, what they can't do and what the current incantations are that unlock interesting capabilities.

A couple I've learned in the past week:

1. Don't give Claude Code a URL to some code and tell it to use that, because by default it will use its WebFetch tool but that runs an extra summarization layer (as a prompt injection defense) which loses details. Telling it to use curl sometimes works but a guaranteed trick is to have it git clone the relevant repo to /tmp and look at the code there instead.

2. Telling Claude Code "use red/green TDD" is a quick to type shortcut that will cause it to write tests first, run them and watch them fail, then implement the feature and run the test again. This is a wildly effective technique for getting code that works properly while avoiding untested junk code that isn't needed.
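For illustration, a single prompt combining both of those tips might look something like this (the org/repo URL is a placeholder, not a real project):

  git clone https://github.com/org/repo into /tmp and look at the code there
  for inspiration. Then use red/green TDD: write a failing test for the new
  feature, run it and watch it fail, then implement until the test passes.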

Now multiply those learnings by three years. Sure, the stuff I figured out in 2023 mostly doesn't apply today - but the skills I developed back then in learning how to test and iterate on my intuitions still count and still keep compounding.

The idea that you don't need to learn these things because they'll get better to the point that they can just perfectly figure out what you need is AGI science fiction. I think it's safe to ignore.


Personally I think this is an extreme waste of time. Every week you're learning something new that is already outdated the next week. You're telling me AI can write complex code but isn't able to figure out how to properly guide the user into writing usable prompts?

A somewhat intelligent junior will dive deep for one week and be on the same knowledge level as you in roughly 3 years.


No matter how good AI gets, we will never be in a situation where a person with poor communication skills will be able to use it as effectively as someone whose communication skills are razor sharp.

But the examples you've posted have nothing to do with communication skills, they're just hacks to get particular tools to work better for you, and those will change whenever the next model/service decides to do things differently.

Yes and no. Knowing the terminology is a short-cut to make the LLM use the correct part of its "brain".

Like when working with video, if you use "timecode" instead of "timestamp", it'll use the video production part of the vector memory more. Video production people always talk about "timecodes", not "timestamps".

You can also explain the idea of red/green testing the long way without mentioning any of the keywords. It might work, but just knowing you can say "use red/green testing" is a magic shortcut to the correct result.

Thus: working with LLMs is a skill, but also an ever-changing skill.


I'm generally skeptical of Simon's specific line of argument here, but I'm inclined to agree with the point about communication skill.

In particular, the idea of saying something like "use red/green TDD" is an expression of communication skill (and also, of course, awareness of software methodology jargon).


Ehhh, I don't know. "Communication" is for sapients. I'd call that "knowing the right keywords".

And if the hype is right, why would you need to know any of them? I've seen people unironically suggest telling the LLM to "write good code", which seems even easier.


I sympathize with your view on a philosophical level, but the consequence is really a meaningless semantic argument. The point is that prompting the AI with words that you'd actually use when asking a human to perform the task, generally works better than trying to "guess the password" that will magically get optimum performance out of the AI.

Telling an intern to care about code quality might actually cause an intern who hasn't been caring about code quality to care a little bit more. But it isn't going to help the intern understand the intended purpose of the software.


I'm not making a semantic argument, I'm making a practical one.

> prompting the AI with words that you'd actually use when asking a human to perform the task, generally works better

Ok, but why would you assume that would remain true? There's no reason it should.

As AI starts training on code made by AI, you're going to get feedback loops as more and more of the training data is going to be structured alike and the older handwritten code starts going stale.

If you're not writing the code and you don't care about the structure, why would you ever need to learn any of the jargon? You'd just copy and paste prompts out of Github until it works or just say "hey Alexa, make me an app like this other app".


I'm going to resist the temptation to spend more time coming up with more examples. I'm sorry those weren't to your liking!

Why do you bother with all this discussion? Like, I get it the first x times for some low x, it's fun to have the discussion. But after a while, aren't you just tired of the people who keep pushing back? You are right, they are wrong. It's obvious to anyone who has put the effort in.

Trying to have a discussion with people who aren't actually interested in being convinced is exhausting. Simon has a lot more patience than I do.

It's a poorly considered hobby.

It's also useful for figuring out what I think and how best to express that. Sometimes I get really great replies too - I compared ethical LLM objections to veganism today on Lobste.rs and got a superb reply explaining why the comparison doesn't hold: https://lobste.rs/s/cmsfbu/don_t_fall_into_anti_ai_hype#c_oc...


I like debate as much as the next guy(almost). Your patience is either admirable or crazy, I'm not sure which.

Neither am I!

"There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists."

At some point you'll just have to accept the tool isn't for everyone =)


I feel like both of these examples are insights that won't be relevant in a year.

I agree that CC becoming omniscient is science fiction, but the goal of these interfaces is to make LLM-based coding more accessible. Any strategies we adopt to mitigate bad outcomes are destined to become part of the platform, no?

I've been coding with LLMs for maybe 3 years now. Obviously a dev who's experienced with the tools will be more adept than one who's not, but if someone started using CC today, I don't think it would take them anywhere near that time to get to a similar level of competency.


I base part of my skepticism about that on the huge number of people who seem to be unable to get good results out of LLMs for code, and who appear to think that's a commentary on the quality of the LLMs themselves as opposed to their own abilities to use them.

> huge number of people who seem to be unable to get good results out of LLMs for code

Could it be they use a different definition of "good"?


I suspect that's neither a skill issue nor a technical issue.

Being "a person who can code" carries some prestige and signals intelligence. For some, it has become an important part of their identity.

The fact that this can now be said of a machine is a grave insult if you feel that way.

It's quite sad in a way, since the tech really makes your skills even more valuable.


What are your tips? Any resources you would recommend? I use Claude code and all the chat bots, but my background isn't programming, so I sometimes feel like I'm just swimming around.

I guess this applies to the type of developer who needs years, not weeks, to become proficient in say Python?

I've been building AI apps since GPT-3, so 5 years now.

The pro-AI people don't understand what quadratic attention means, and the anti-AI people don't understand how much information can be contained in a TB of weights.

At the end of the day both will be hugely disappointed.

>The best asset you can develop is an intuition for what works and what doesn't, and getting that intuition requires months if not years of personal experimentation.

Intuition does not translate between models. Whatever you thought dense LLMs were good at, DeepSeek completely upended in an afternoon. The difference between major revisions of model families is substantial enough that intuition is a drawback, not an asset.


What does quadratic attention mean?

I've so far found that intuition travels between models of a similar generation remarkably well. The conformance suite trick (find an existing 9,200-test conformance suite and tell an agent to build a fresh implementation that passes all those tests), which I first tried with GPT-5.2, turned out to work exactly as well against Claude Opus 4.5, for example.



To save anyone else the click, this is the paper "On The Computational Complexity of Self-Attention" from September 2022, with authors from NYU and Microsoft.

It argues that the self-attention mechanism in transformers works by having every token "attend to" every other token in a sequence, which is quadratic - O(n^2) in the input length - and should limit the total context length available to models.

This would explain why the top models have been stuck at 1 million tokens since Gemini 1.5 in February 2024 (there has been a 2 million token Gemini but it's not in wide use, and Meta claimed their Llama 4 Scout could do 10 million but I don't know of anyone who's seen that actually work.)

My counter-argument here is that Claude Opus 4.5 has a comparatively tiny 200,000 token window which turns out to work incredibly well for the kinds of coding problems we're throwing at it, when accompanied by a cleverly designed harness such as Claude Code. So this limit from 2022 has been less "disappointing" than people may have expected.
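To make the quadratic point concrete, here's a rough numpy sketch (toy code, nothing like a production kernel) of naive self-attention: the score matrix has one entry per pair of tokens, so compute and memory grow with the square of the context length. Tricks like FlashAttention avoid ever materializing that matrix, which is part of why the limit has hurt less in practice than the paper might suggest.

  import numpy as np

  def naive_attention(q, k, v):
      # q, k, v: (n_tokens, d) arrays; scores is (n_tokens, n_tokens)
      scores = q @ k.T / np.sqrt(k.shape[1])
      weights = np.exp(scores - scores.max(axis=1, keepdims=True))
      weights /= weights.sum(axis=1, keepdims=True)
      return weights @ v

  # Memory for a single float32 score matrix at various context lengths:
  for n in (1_000, 100_000, 1_000_000):
      print(f"{n:>9} tokens -> {n * n * 4 / 1e9:,.1f} GB")
  # roughly 0.0 GB, 40.0 GB and 4,000.0 GB respectively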


The quadratic attention problem seems to be largely solved by practical algorithmic improvements. (Iterations on flash attention, etc.)

What's practically limiting context size IME is that results seem to get "muddy" and get off track when you have a giant context size. For a single-topic long session, I imagine you get a large number of places in the context which may be good matches for a given query, leading to ambiguous results.

I'm also not sure how much work is being put into reinforcement in extremely large context inference, as it's presumably quite expensive to do and hard to reliably test.


Indeed, filling the advertised context more than 1/4 full is a bad idea in general. 50k tokens is a fair bit, but works out to between 1 and 10k lines of code.

Perfect for a demo or work on a single self contained file.

Disastrous for a large code base with logic scattered all throughout it.


Right. It’s not practical to apply AI tools as they are today to existing, complex code bases and get reliable results.

Greenfield is easy (but it always was). Working on well-organised modules that are self contained and cleanly designed is easy - but that always was, too.


> Using this stuff well is a deep topic. These things can be applied in so many different ways, and to so many different projects. The best asset you can develop is an intuition for what works and what doesn't, and getting that intuition requires months if not years of personal experimentation.

You feel that way because it took you years or months to reach that point. But after reaching that point, do you really think that it's equally—if not more—difficult to put what you learned into words compared to, let's say, programming or engineering?

See, the thing about these tools is that they're designed to be operated via natural language, which is something most people (with a certain level of education) are quite comparable to each other at; consequently, the skill ceiling is considerably lower compared to something like programming. I am not saying there's no variance in people's ability to articulate, but that the variance is considerably less than what we get when comparing people's ability to write code or solve engineering problems.

So, whatever you learned by trial and error was just different ways or methods to get around the imperfections of the existing LLMs—not ways to use them skillfully according to their design goals. Their design goal is to achieve whatever task is given to them, as long as the intent is clear. These workarounds and tricks that you learned aren't something you build an intuition for. What you build an intuition for is finding new workarounds, but once you've found them, they're quite concrete and easy to describe to someone else who can simply use them to achieve the same results as you.

Tools that are designed to be operable via natural language aren't designed to be more thorough—it's actually the opposite. If you want more control, you have programming languages and search engines; thoroughness is where you get that high skill ceiling. The skill ceiling for using these tools is going to get lower and lower. The workarounds that you figure out may take skill to discover, but they don't take much skill to replicate.

If you share your "tips and tricks" with someone, then yeah, it will take them a week to start getting the same results as you because the skill ceiling is low and the workarounds are concrete/require less thinking.


The more I see of how different people use LLMs the more convinced I am that communication skills differ wildly between different people.

Clear, unambiguous communication is a key skill to unlock LLMs. I suspect it's a lot less common than you think!


> I don't think you can just catch up in a few weeks, and I do think that the risk of falling behind isn't being taken seriously enough by much of the developer population.

This is nonsense.

This field moves so fast the things you did more than a year ago aren't relevant anymore.

Claude code came out last year.

Anyone using random shit from before that is not using it any more. It is completely obsolete in all but a handful of cases.

To make matters worse “intuition” about models is wasted learning, because they change, significantly, often.

Stop spreading FUD.

You can be significantly less harmful to people who are trying to learn by sharing what you actually do instead of nebulously hand waving about magical BS.

Dear readers: ignore this irritating post.

Go and watch Armin Ronacher on youtube if you want to see what a real developer doing this looks like, and why its hard.


You're accusing me of spreading harmful advice here, when you're the one telling people that they don't need to worry about not investing in their skills because "This field moves so fast the things you did more than a year ago aren't relevant anymore."

One of us is right here. I hope for your sake and the people that listen to you that it's you. I don't think it is.


You're right, it's difficult to get "left behind" when the tools and workflows are being constantly reinvented.

You'd be wise with your time to just keep a high-level view until workflows become stable and aren't being reinvented every few months.

The time to consider mastering a workflow is when a casual user of the "next release" wouldn't trivially supersede your capabilities.

Similarly we're still in the race to produce a "good enough" GenAI, so there isn't value in mastering anything right now unless you've already got a commercial need for it.

This all reminds me of a time when people were putting in serious effort to learn Palm Pilot's Graffiti handwriting recognition, only for the skill to be made redundant even before they were proficient at it.


I think that whoever says you need to be accustomed to the current "tools" around AI agents is suffering from a horizon-effect issue: this stuff will change continuously for some time, and the more it evolves, the less you need to fiddle with the details. However, the skill you do need is communication. You need to be able to express yourself, and what matters for your project, fast and well. Many programmers are not great at communication. In part this is a gift, something you develop at a young age, and this will, I believe, kind of change who is good at programming: good communicators / explorers may now have an edge over very strong coders who are bad at explaining themselves. But a lot of it is attitude, IMHO. And practice.

> Many programmers are not great at communication.

This is true, but still shocking. Professional (working with others at least) developers basically live or die by their ability to communicate. If you're bad at communication, your entire team (and yourself) suffer, yet it seems like the "lone ranger" type of programmer is still somewhat praised and idealized. When trying to help some programmer friends with how they use LLMs, it becomes really clear how little they actually can communicate, and for some of them I'm slightly surprised they've been able to work with others at all.

An example from the other day: a friend complained that the LLM they worked with was using the wrong library, and the wrong color for some element, and they were surprised that the LLM didn't know these things from the get-go. Reading through the prompt, they never mentioned them once, and when asked about it, they said "it should have been obvious" - which, yeah, to someone like you who has worked on this project for 2 years it might be obvious, but for something with zero history and zero context about what you do? How do you expect it to know that? Baffling sometimes.


People anthropomorphise LLMs, not understanding that they don't have "implied context" about things. They just go by the statistical average unless directed otherwise.

Having worked with offshore consultant teams where there are language and cultural barriers - and needing clear specs myself - I somehow just naturally "got" how much context to give the agent.

People who have been working solo or with like-minded people all their career might have a harder time.


Yup. I'd wager that most complaints, even from people who have used LLMs for a long time, can be resolved by "describe your thing in detail". LLMs are such a relief on my wrists that I often get tempted to write short prompts and pray that the LLM divines my thoughts. I always get much better results, much faster, when I just turn on the mic and have Whisper transcribe a couple of minutes of my speaking though.

I am using Google Antigravity for the same type of work you mention - many things and ideas I've had over the years but couldn't justify the time they needed. Pretty non-trivial ideas, and yet with a good problem definition and communication skills I am getting unbelievable results. I am even intentionally being somewhat vague in my problem definitions at times to avoid introducing bias to the model, and the ride has been quite crazy so far. In 2 days I've implemented several substantial improvements that I'd had in my head for years.

The world has changed for good and we will need to adapt. The bigger and more important question at this point isn't whether LLMs are good enough - for the ones who want to see, they are - but, as you mention in your article, what will happen to the people who end up unemployed. There's a reality check coming for all of us.


My take: learning how to do LLM-assisted coding at a basic level gets you 80% of the returns, and takes about 30 minutes. It's a complete no-brainer.

Learning all of the advanced multi-agent worklows etc. etc... Maybe that gets you an extra 20%, but it costs a lot more time, and is more likely to change over time anyway. So maybe not very good ROI.


1. Basic vanilla LLM Agentic coding

2. Build tools for the LLM, ones that are easy to use and don't spam stuff. Like give it tools to run tests that only return "Tests OK" if nothing failed, same with builds (see the sketch after this list).

3. Look into /commands and Skills, both seem to be here to stay
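A minimal sketch of what point 2 can look like - the script name and pytest are just assumptions about your stack, swap in whatever you actually use:

  #!/usr/bin/env python3
  # run_tests.py - wrapper the agent calls instead of the raw test runner:
  # quiet on success so it doesn't flood the context window, full output on failure.
  import subprocess, sys

  result = subprocess.run(
      ["pytest", "-q", *sys.argv[1:]],
      capture_output=True, text=True,
  )
  if result.returncode == 0:
      print("Tests OK")
  else:
      print(result.stdout)
      print(result.stderr, file=sys.stderr)
  sys.exit(result.returncode)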

Maybe a weekend of messing about and you'll be pretty well off compared to the vast masses who still copy/paste code out of ChatGPT to their editor.


It seems like you're mostly focused on the tooling for actually directing the LLM but there's a whole host of other technology which becomes relevant re: building guardrails and handcuffs for your agent. For instance I've been doing a lot of contract testing lately. It's not new tech, not changing at a blistering pace, but now that generating mountains of code is cheap, techniques for dealing with those mountains are suddenly more necessary.

It took me a few months of working with the agents to get really productive with them. The gains are significant. I write highly detailed specs (equivalent to multiple A4 pages) in markdown and dictate the agent hierarchy (which agent does what, who reports to whom).

I've learned a lot of new things this year thanks to AI. It's true that the low-level skills will atrophy. The high-level skills will grow though; my learning rate is the same, just at a much higher abstraction level, thus covering more subjects.

The main concern is the centralisation. The value I can get out of this thing currently well exceeds my income. AI companies are buying up all the chips. I worry we'll end up with something like the housing market, where AI eats up about 50% of our income.

We have to fight this centralisation at all costs!


This is something I think a lot of people don't notice or worry about: the shift of programming from a local task to one controlled by big corporations, essentially turning programming into a subscription model just like everything else - if you don't pay the subscription you will no longer be able to code, i.e. PaaS (Programming as a Service). Obviously at the moment most programmers can still code without LLMs, but when autocomplete IDEs became mainstream, it didn't take long before a large proportion of programmers couldn't program without one. I expect most new programmers coming in won't be able to "program" without a remote LLM.

That ignores the possibility that local inference gets good enough to run without a subscription on reasonably priced hardware.

I don't think that's too far away. Anthropic, OpenAI, etc. are pushing the idea that you need a subscription, but if open-source tools get good enough they could easily become an expensive irrelevance.


There is that, but the way this usually works is that there is always a better closed service you have to pay for, and we see that with LLMs as well. Plus there is the fact that you currently need a very powerful machine to run these models at anywhere near the speed of the PaaS systems, and I'm not convinced we'll get the Moore's-law-style jumps required for that level of performance locally - not to mention the massive energy requirements; you can only go so small, and we are getting pretty close to the limit. Perhaps I'm wrong, but we don't see the jumps in processing power we used to see in the 80s and 90s from rising clock speeds; the clock speed of most CPUs has stayed pretty much the same for a long time. As LLMs are essentially probabilistic in nature, they do open up options not available to current deterministic CPU designs, so that might be an avenue which gets exploited to bring this to local development.

> there is always a better closed service you have to pay for

Always? I think that only holds for a certain amount of time (different for each sector) after which the open stuff is better.

I thought it was only true for dev tools, but I had to rethink it when I met a guy (not especially technical) who runs open-source firmware on his insulin pump because the closed-source stuff doesn't give him as much control.


My concern is that inference hardware is becoming more and more specialized and datacenter-only. It won’t be possible any longer to just throw in a beefy GPU (in fact we’re already past that point).

Yep, good point. If they don't make the hardware available for personal use, then we wouldn't be able to buy it even it could be used in a personal system.

Local inference is already very good on open models if you have the hardware for it.

Yep I agree, I think people haven’t woken up to that yet. Moore’s Law is only going to make that easier.

I’m surprised by how good the models I can run on my old M1 Max laptop are.

In a year’s time open models on something like a Mac Studio M5 Ultra are going to be very impressive compared to the closed models available today.

They won’t be state of the art for their time but they will be good enough and you’ll have full control.


> on reasonably priced hardware.

Thank goodness this isn't a problem!


This is the most valid criticism. Theoretically in several years we may be able to run Opus quality coding models locally. If that doesn't happen then yes, it becomes a pay to play profession - which is not great.

I have found that using more REPLs and doing leetcodes/katas prevents the atrophy to be honest.

In fact, I'd say I code even better since I started doing one hour per day of a mixture of fun coding and algo quizzes while at work I mostly focus on writing a requirements plan and implementation plan later and then letting the AI cook while I review all the output multiple times from multiple angles.


The hardware needs to catch up, I think. I asked ChatGPT (lol) how much it would cost to build a DeepSeek server that runs at a reasonable speed and it quoted ~$400k-800k (8-16 H100s plus the rest of the server).

Guess we are still in the 1970s era of AI computing. We need to hope for a few more step changes or some breakthrough on model size.


The problem is that Moore's law is dead. Silicon isn't advancing as fast as we envisioned in the past, we're fighting all sorts of quantum tunneling effects to cram as much microstructure as possible into silicon, and R&D costs for manufacturing these chips are climbing at a rapid rate. There's a limit to how far we can fight physics, and unless we discover a totally new paradigm to alleviate these issues (optical computing?) we're going to see diminishing returns at the end of the sigmoid-like tech advancement cycle.

You can run most open models (excluding kimi-k2) on hardware that costs anywhere from 45 - 85k (tbf, specced before the vram wars of late 2025 so +10k maybe?). 4-8 PRO6000s + all the other bits and pieces gives you a machine that you can host locally and run very capable models, at several quants (glm4.7, minimax2.1, devstral, dsv3, gpt-oss-120b, qwens, etc.), with enough speed and parallel sessions for a small team (of agents or humans).

[flagged]


Well, if you're programming without AI you need to understand what you're building too, lest you program yourself into a corner. Taking 3-5 minutes to speech-to-text an overview of what exactly you want to build, why, and with which general philosophies/tools seems like it should cost you almost zero extra time and brainpower.

… speech to text?

Why not just type?


I almost exclusively work from home, the LLM forgives sloppy STT, and my wrists suffer continuous low-grade pain when typing...

I thought this way for a while. I still do to a certain degree, but I'm starting to see the wisdom in hurrying off into the change.

The most advanced tooling today looks nothing like the tooling for writing software 3 years ago. We've got multi-agent orchestration with built in task and issue tracking, context management, and subagents now. There's a steep learning curve!

I'm not saying that everyone has to do it, as the tools are so nascent, but I think it is worthwhile to at least start understanding what the state of the art will look like in 12-24 months.


Early adopters get the advantage of only having to learn a trickle of new things every few weeks instead of everything all at once.

Part of the problem with things that iterate quickly is that iterations tend to reference previous versions. So, you try learning the new hotness (v261), but there are implied references to v254, v239, and v198. Then you realize, v1, v5, v48, v87, v138, v192, and v230 have cute identifiers that you aren't familiar with and are never explained anywhere. New concepts get introduced in v25, v50, v102, and v156 that later became foundational knowledge that is assumed to be understood by the reader and is never explained anywhere.

So, if you feel confident something will be the next hotness, it's usually best to be an early adopter, so you gain your knowledge slowly over years instead of having to cram when you need to pick it up.


AI development is about planning, orchestration and high throughput validation. Those skills won't go away, the quality floor of model output will just rise over time.

By their own promises it should get so good that you basically don't need to learn it. So it is reasonable to wait until that point.

If you listen to promises like that you're going get burned.

One of the key skills needed in working with LLMs is learning to ignore the hype and marketing and figure out what these things are actually capable of, as opposed to LinkedIn bluster and claims from CEOs whose net worth is tied to investor sentiment in their companies.

If someone spends more time talking about "AGI" than about what they're actually building, filter that person out.


>One of the key skills needed in working with LLMs is learning to ignore the hype and marketing and figure out what these things are actually capable of

This is precisely what led me to realize that while they have some use for code review and analyzing docs, for coding purposes they are fairly useless.

The hypesters' responses to this assertion fall exclusively into 5 categories. I've never heard a 6th.


Do you always believe what the marketing people tell you?

If so, I've got a JPEG of a monkey to sell you =)


this is a straw man, nobody serious is promising that. it is a skill like any other that requires learning

I agree about skills actually, but it's also obvious that parent is making a very real point that you cannot just dismiss. For several years now and far short of wild AGI promises, the answer to literally every issue with casual or production AI has been something like "but the rate of model improvement.." or "but the tools and ecosystem will evolve.."

If you believe that uncritically about everything else, then you have to answer why agentic workflows or MCP or whatever is the one thing that it can't evolve to do for us. There's a logical contradiction here where you really can't have it both ways.


I’m not understanding your point… (and would be genuinely curious to)? the models and systems around them have evolved and gotten better (over the past few years for LLMs and decades for “AI” more broadly)

oh I think I do get your point now after a few rereads (correct if wrong but you’re saying it should keep getting better until there’s nothing for us to do). “AI”, and computer systems more broadly, are not and cannot be viable systems. they don’t have agency (ironically) to affect change in their environment (without humans in the loop). computer systems don’t exist/survive without people. all the human concerns around what/why remain, AI is just another tool in a long line of computer systems that make our lives easier/more efficient


AI Engineer to Software Engineer: Humans writing code is a waste of time, you can only hope to add value by designing agentic workflows

Prompt Engineer to AI Engineer: Designing agentic workflows is a waste of time, just pre/postfix whatever input you'd normally give to the agentic system with the request to "build or simulate an appropriate agentic workflow for this problem"


Nobody serious, like every single AI CEO out there? I mean I agree, nobody should be taking them seriously, yet we're fast on track for a global financial meltdown because of these fraudsters and their "non-serious" words.

> nobody serious is promising that

There is a staggering number of unserious folks in the ears of people with corporate purchasing power.


OpenAI is going to get to AGI. And AGI should, in minutes, build a system that takes vague input and produces a fully functioning product out of it. Isn't that the singularity they're promising?

you’re just repeating the straw man. if you can’t think critically and just regurgitate every dumb thing you hear idk what to tell you. nobody serious thinks a “singularity” is coming. there’s not even a proper definition of “AGI”

your argument amounts to “some people said stupid shit one time and I took it seriously”


The idea, I think, is to gain experience with the loop of communicating ideas in natural language rather than code, and then reading the generated code and taking it as feedback.

It's not that different overall, I suppose, from the loop of thinking of an idea and then implementing it and running tests; but potentially very disorienting for some.


What would be the type of work you're doing where you wouldn't benefit from one or multiple of the following:

- find information about APIs without needing to open a browser

- writing a plan for your business-logic changes or having it reviewed

- getting a review of your code to find edge cases, potential security issues, potential improvements

- finding information and connecting the dots of where, what and why it works in some way in your code base?

Even without letting AI author a single line of code (where it can still be super useful) there are still major uses for AI.


I've used Cursor and Claude Code daily[0], each from within a month of their releases - I'm learning something new about how to work with and apply the tools almost every day.

I don't think it's a coincidence that some of the best developers[1] are using these tools, with some openly advocating for them, because it still requires core skills to get the most out of them

I can honestly say that building end-to-end products with claude code has made me a better developer, product designer, tester, code reviewer, systems architect, project manager, sysadmin etc. I've learned more in the past ~year than I ever have in my career.

[0] abandoned cursor late last year

[1] see Linus using Antigravity, antirez in the OP, Jared at bun, Charlie at uv/ruff, mitsuhiko, simonw et al


I started heavy usage in April 2025 (Codex CLI -> some Claude Code and trying other CLIs + a bit of Cursor -> Warp.dev -> Claude Code) and I’m still learning as well (and constantly trying to get more efficient)

(I had been using GitHub Copilot for 5+ years already, started as an early beta tester, but I don't really consider that the same)

I like to say it’s like learning a programming language. it takes time, but you start pattern matching and knowing what works. it took me multiple attempts and a good amount of time to learn Rust, learning effective use of these tools is similar

I’ve also learned a ton across domains I otherwise wouldn’t have touched


> What I don't understand about this whole "get on board the AI train or get left behind" narrative, what advantage does an early adopter have for AI tools?

Replace that with anything and you will notice that people who are building startups in this area want to push a narrative like that, as it usually greatly increases the value of their companies. When the narrative gets big enough, big companies must follow - or they look like they're "lagging behind" - whether the current thing brings value or not. It is a fire that keeps feeding itself. In the end, when it gets big enough, we call it a bubble. A bubble that may pop. Or not.

Whether the end user gets actual value or not is just a side effect. But everyone wants to believe that it brings value - otherwise they were foolish to jump on the train.


An ecosystem is being built around AI : Best prompting practices, mcps, skills, IDE integration, how to build a feedback loop so that LLM can test its output alone, plug to the outside world with browser extensions, etc...

For now i think people can still catch up quickly, but at the end of 2026 it's probably going to be a different story.


Okay, end of 2026 then what? No one ever learns how to use the tools after that? No one gets a job until the pre-2026 generation dies?

For now i think people can still catch up quickly, but at the end of 2027 it's probably going to be a different story.

I heard 2028 is when it really gets impossible to catch up.


> probably going to be a different story

Can you elaborate? Skill in AI use will be a differentiator?


Yes.

At some point you will need to combine multiple skills together:

- communication

- engineering skills (understanding requirements, finding edge cases, etc)

- architectural proficiency

- prompting

- agentic workflows and skills

- context management

- and yes, proper old fashioned coding skills to keep things tidy and consistent


> Best prompting practices, mcps, skills, IDE integration, how to build a feedback loop so that LLM can test its output alone, plug to the outside world with browser extensions, etc...

Ah yes, an ecosystem that is fundamentally built on probabilistic quicksand, where even with the "best prompting practices" you still get agents violating the basics of security and committing API keys when they were told not to. [0]

[0] https://xcancel.com/valigo/status/2009764793251664279


One of the skills needed to effectively use AI for code is to know that telling AI "don't commit secrets" is not a reliable strategy.

Design your secrets to include a common prefix, then use deterministic scanning tools like git hooks to prevent them from being checked in.

Or have a git hook that knows which environment variables have secrets in and checks for those.
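As a concrete sketch of the hook approach (the prefix and everything else here is hypothetical - adjust for your own setup):

  #!/usr/bin/env python3
  # .git/hooks/pre-commit - assumes your secrets share a known prefix;
  # "acme_sk_" here is invented for illustration.
  import subprocess, sys

  SECRET_PREFIX = "acme_sk_"

  staged = subprocess.run(
      ["git", "diff", "--cached", "-U0"],
      capture_output=True, text=True,
  ).stdout

  leaks = [line for line in staged.splitlines()
           if line.startswith("+") and SECRET_PREFIX in line]

  if leaks:
      print("Refusing to commit: staged changes appear to contain secrets:")
      for line in leaks:
          print("  " + line[:80])
      sys.exit(1)  # non-zero exit aborts the commit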


That's such an incredibly basic concept, surely AIs have evolved to the point where you don't need to explicitly state those requirements anywhere?

They can still make mistakes.

For example, what if your code (that the LLM hasn't reviewed yet) has a dumb feature in where it dumps environment variables to log output, and the LLM runs "./server --log debug-issue-144.log" and commits that log file as part of a larger piece of work you ask it to perform.

If you don't want a bad thing to happen, adding a deterministic check that prevents the bad thing from happening is a better strategy than prompting models or hoping that they'll get "smarter" in the future.


Part of why these things feel "not fit for purpose" is that they don't include the things Simon has spent three years learning? (I know someone else who's doing multi-LLM development where he uses job-specialty descriptions for each "team member" that lets them spend context on different aspects of the problem; it's a fascinating exercise to watch, but it feels even more like "if this is how the tools should be used, why don't they just work that way"?)

Doesn't seem to work for humans all the time either.

Some of this negativity I think is due to unrealistic expectations of perfection.

Use the same guardrails you should be using already for human generated code and you should be fine.


I have tons of examples of AI not committing secrets. this is one screenshot from twitter? I don’t think it makes your point

CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works


I have tons of examples of drivers not running into objects.

like my other comment, my point is one screenshot from twitter vs one anecdote. neither proves anything. cool snarky response though!

> I have tons of examples of AI not committing secrets.

"Trust only me bro".

It takes 10 seconds of looking at the many examples of API keys + prompts on GitHub to verify that tweet. The issue with AI isn't limited to that tweet, which merely demonstrates its probabilistic nature; otherwise why do we need a sandbox to run the agent in the first place?

Nevermind, we know why: Many [0] such [1] cases [2]

> CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works

Except you just made a false equivalence. CPUs can be tested / verified transparently, and even if something does go wrong, we know exactly why. Whereas you can't explain why the LLM hallucinated or decided to delete your home folder, because the way it predicts its output is fundamentally stochastic.

[0] https://old.reddit.com/r/ClaudeAI/comments/1pgxckk/claude_cl...

[1] https://old.reddit.com/r/ClaudeAI/comments/1jfidvb/claude_tr...

[2] https://www.google.com/search?q=ai+deleted+files+site%3Anews...


you could find tons of API keys on GitHub before these “agentic” tools too. that was my point, one screenshot from twitter vs one anecdote from me. I don’t think either proves the point, but posting a screenshot from twitter like it’s proof of some widespread problem is what I was responding to (N=2, 1 vs 1)

my point is more “skill issue” than “trust me this never happens”

my point on CPUs is people who don’t understand LLMs talk like “hallucinations” are a real thing — LLMs are “deciding” to make stuff up rather than just predicting the next token. yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are. can you really explain in detail how everything you use works? I’m guessing I can explain failure modes of agentic systems (and how to avoid them so you don’t look silly on twitter/github) and how neural networks work better than most people can explain the technology they use every day


> you could find tons of API keys on GitHub before these “agentic” tools too. that was my point, one screenshot from twitter vs one anecdote from me. I don’t think either proves the point, but posting a screenshot from twitter like it’s proof of some widespread problem is what I was responding to (N=2, 1 vs 1)

That doesn't refute the probabilistic nature of LLMs despite best prompting practices. In fact it emphasises it. More like your 1 anecdotal example vs my 20+ examples on GitHub.

My point tells you that not only it indeed does happen, but a previous old issue is now made even worse and more widespread, since we now have vibe-coders without security best practices assuming the agent should know better (when it doesn't).

> my point is more “skill issue” than “trust me this never happens”

So those that have this "skill issue" are also those who are prompting the AI differently then? Either way, this just inadvertently proves my whole point.

> yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are.

The additional problem is: can you explain why it went wrong as you scale the technology? CPU circuit designs go through formal verification, and if a fault happens we know exactly why; they are deterministic by design, which makes them reliable.

LLMs are not, and don't have this. Which is why OpenAI had to describe ChatGPT's misaligned behaviour as "sycophancy", but could not explain why it happened, other than that tweaking the hyper-parameters got them that result.

So LLMs being fundamentally probabilistic, and hence harder to explain, is the reason you get screenshots of vibe-coders who somehow prompted it wrong and had the agent commit their keys.

Maybe that would never have happened to you, but it won't be the last time we see more of this happening on GitHub.


I was pointing out one screenshot from twitter isn’t proof of anything just to be clear; it’s a silly way to make a point.

yes AI makes leaking keys on GH more prevalent, but so what? it’s the same problem as before with roughly the same solution

I’m saying neural networks being probabilistic doesn’t matter — everything is probabilistic. you can still practically use the tools to great effect, just like we use everything else that has underlying probabilities

OpenAI did not have to describe it as sycophancy, they chose to, and I’d contend it was a stupid choice

and yes, you can explain what went wrong just like you can with CPUs. we don’t (usually) talk about quantum-level physics when discussing CPUs; talking about neurons in LLMs is the wrong level of abstraction


> I was pointing out one screenshot from twitter isn’t proof of anything just to be clear; it’s a silly way to make a point.

Versus your anecdote being proof of what? A skill issue for vibe coders? Someone else prompting it wrong?

You do realize you are proving my entire point?

> yes AI makes leaking keys on GH more prevalent, but so what? it’s the same problem as before with roughly the same solution

Again, it exacerbates my point such that it makes the existing issue even worse. Additionally, that wasn't even the only point I made on the subject.

> I’m saying neural networks being probabilistic doesn’t matter — everything is probabilistic.

When you scale neural networks to become say, production-grade LLMs, then it does matter. Just like it does matter for CPUs to be reliable when you scale them in production-grade data centers.

But your earlier (fallacious) comparison ignores the reliability differences between CPUs and LLMs: determinism is a hard requirement for that reliability, and LLMs don't have it.

> OpenAI did not have to describe it as sycophancy, they chose to, and I’d contend it was a stupid choice

For the press, they had to, but no one knows the real reason, because it is unexplainable; which goes back to my other point on reliability.

> and yes, you can explain what went wrong just like you can with CPUs. we don’t (usually) talk about quantum-level physics when discussing CPUs; talking about neurons in LLMs is the wrong level of abstraction

It is indeed the wrong level of abstraction for LLMs, because not even the researchers can practically explain why a given neuron (for every neuron in the network) ends up with different values on every fine-tune or training run. Even if the model is "good enough", it can still go wrong at the inference level for unexplainable reasons beyond "it overfitted".

CPUs on the other hand, have formal verification methods which verify that the CPU conforms to its specification and we can trust that it works as intended and can diagnose the problem accurately without going into atomic-level details.


…what is your point exactly (and concisely)? I’m saying it doesn’t matter it’s probabilistic, everything is, the tech is still useful

No one is arguing that it isn't useful. The problem is this:

> I’m saying it doesn’t matter it’s probabilistic, everything is,

Maybe it doesn't matter for you, but it generally does matter.

The risk level of a technology failing is far higher if it is more random and unexplainable than if it is expected, verified and explainable. The former eliminates many serious use-cases.

This is why your CPU, or GPU works.

LLMs are neither deterministic, no formal verification exists and are fundamentally black-boxes.

That is why so many vibe-coders have reported "AI deleted my entire home folder" issues, even when all they asked was to move a file / folder to another location.

If it did not matter, why do you need sandboxes for the agents in the first place?


I think we agree then? the tech is useful; you need systems around them (like sandboxes and commit hooks that prevent leaking secrets) to use them effectively (along with learned skills)

very little software (or hardware) used in production is formally verified. tons of non-deterministic software (including neural networks) are operating in production just fine, including in heavily regulated sectors (banking, health care)


> I think we agree then? the tech is useful; you need systems around them (like sandboxes and commit hooks that prevent leaking secrets) to use them effectively (along with learned skills)

No.

> very little software (or hardware) used in production is formally verified. tons of non-deterministic software (including neural networks) are operating in production just fine, including in heavily regulated sectors (banking, health care)

It's what happens when it all goes wrong.

You have to explain exactly why, a system failed in heavily regulated sectors.

Saying 'everything is probabilistic' as the explanation for the cause of an issue is a non-answer if you are a chip designer, air traffic controller, investment banker or medical doctor.

So your point does not follow.


that’s not what I said. you honestly seem like you just want to argue about stuff (e.g. not elaborating on the “no” when I basically repeated and agreed with what you said). and you seem to consistently miss my point (in the second part of your response; I’m saying these non-deterministic neural networks are already widespread in industry with these regulations, and it’s fine. they can be explained despite your repeated assertions they cannot be. also the entire point on CPUs which you may have noticed I dropped from my responses because you seemed distracted arguing about it). this is not productive and we’re both clearly stubborn, glhf

> that’s not what I said. you honestly seem like you just want to argue about stuff (e.g. not elaborating on the “no” when I basically repeated and agreed with what you said). and you seem to consistently miss my point

I have repeated myself many times and you continue to ignore the reliability points that inherently impede LLMs in many use-cases and exclude them from areas where predictability is required in critical production systems.

Vibe coders can use them, but the gulf between useful-for-prototyping and useful-for-production is riddled with hard obstacles, since software like LLMs is fundamentally unpredictable and hence the risks are far greater.

> I’m saying these non-deterministic neural networks are already widespread in industry with these regulations, and it’s fine.

So when a neural network scales beyond hundreds of layers and billions of parameters, equivalent to a production-grade LLM, explain exactly how is such a black-box on that scale explainable when it messes up and goes wrong?

> they can be explained despite your repeated assertions they cannot be.

With what methods exactly?

Early on, I pointed to formal verification and testing on CPUs as the way we explain what went wrong at scale. You have provided nothing equivalent for LLMs to back up your own assertion, other than "they can be explained", without any evidence.

> also the entire point on CPUs which you may have noticed I dropped from my responses because you seemed distracted arguing about it). this is not productive and we’re both clearly stubborn, glhf

You did not make any point with that as it was a false equivalence, and I explained why the reliability of a CPU isn't the same as the reliability of a LLM.


You don't want a bit of influence over the design?

> What I don't understand about this whole "get on board the AI train or get left behind" narrative, what advantage does an early adopter have for AI tools?

The ones pushing this narrative have either the following:

* Invested in AI companies (which they will never disclose until they IPO / get acquired)

* Employees at AI companies who have stock options, which makes them effectively paid boosters of the AGI nonsense.

* Mid-life crisis / paranoia that their identity as a programmer is being eroded and have to pivot to AI.

It is no different to the crypto web3 bubble of 2021. This time, it is even more obvious and now the grifters from crypto / tech are already "pivoting to ai". [0]

[0] https://pivot-to-ai.com/


I'm not an AI booster, but I can't argue with Opus doing lots of legwork

> It is no different to the crypto web3 bubble of 2021

web3 didn't produce anything useful, just noise. I couldn't take a web3 stack to make an arbitrary app. with the PISS machine I can.

Do I worry about the future, fuck yeah I do. I think I'm up shit creek. I am lucky that I am good at describing in plain English what I want.


Web3 generated plenty of use if you're in on it. Pension funds, private investors, public companies, governments, gambling addicts, teenagers with more pocket money than sense, they've all moved billions into the pockets of Web3 grifters. You follow a tutorial on YouTube, spam the right places, maybe buy a few illegal ads, do a quick rugpull, and if you did your opsec right, you're now a millionaire. The major money sources have started to dry up (although the current American regime has been paid off by crypto companies so a Web3 revival might just happen).

With AI companies still selling services far below cost, it's only a matter of time before the money runs out and the true value of these tools will be tested.


> Pension funds, private investors, public companies

As someone who was at a large company that was dabbling in NFTs, there was no value apart from pure gambling. By the time we were doing it, it was also too late, so it was just a ginormous waste.

My issue with GenAI is the rampant copyright violation, and the effect it will have on the economy. It's also replacing all of the fun bits of the world that I inhabit.

At least with web3 it was mostly contained within the BO-infested basement that crypto bros inhabit. The AI bollocks has infected half the world.


Comparing the crypto and web3 scams with AI advancements is disingenuous at best. I am a long-time C and C++ systems programming engineer oriented at (sometimes novel) algorithmic design and high-performance large-scale systems operating at the scale of the internet. I specialize in low-level details that generally only a very small number of engineers around the globe are familiar with. We can talk at the level of CPU microarchitectural details or memory bank conflicts or OS internals, and all the way up to the line of code we are writing. AI is the most transformative technology ever designed. I'd go as far as to say that not even the industrial revolution is going to be comparable to it. I have no stakes in AI.

> B2B SaaS

Perhaps that's part of it.

People here work on all kinds of industries. Some of us are implementing JIT compilers, mission-critical embedded systems or distributed databases. In code bases like this you can't just wing it without breaking a million things, so LLM agents tend to perform really poorly.


> People here work on all kinds of industries.

Yes, it would be nice to have a lot more context (pun intended) when people post how many LoC they introduced.

B2B SaaS? Then can I assume that a browser is involved and that a big part of that 200k LoC is the verbose styling DSL we all use? On the other hand, Nginx, a production-grade web server, is 250k LoC (251,232 to be exact [1]). These two things are not comparable.

The point being that, as I'm sure we all agree, LoC is not a helpful metric for comparison without more context, and different projects have vastly different amounts of information/feature density per LoC.

[1] https://openhub.net/p/nginx


I primarily work in C# during the day but have been messing around with simple Android TV dev on occasion at night.

I’ve been blown away sometimes at what Copilot puts out in the context of C#, but using ChatGPT (paid) to get me started on an Android app - totally different experience.

Stuff like giving me code that’s using a mix of different APIs and sometimes just totally non-existent methods.

With Copilot I find it's sometimes brilliant, but it seems completely random as to when that will be.


> Stuff like giving me code that’s using a mix of different APIs and sometimes just totally non-existent methods.

That has been my experience as well. We can control the surprising choice of APIs with basic prompt files that clarify what to use and how to use it in your project. However, when using less-than-popular tools whose source code is not available, the hallucinations are unbearable and a complete waste of time.

The lesson to be learned is that LLMs depend heavily on their training set; in a simplistic way, they at best only interpolate between the data they were fed. If an LLM is not trained with a corpus covering a specific domain, then you can't expect usable results from it.

This brings up some unintended consequences. Companies like Microsoft will be able to create incentives to use their tech stack by training their LLMs with a very thorough and complete corpus on how to use their technologies. If Copilot does miracles outputting .NET whereas Java is unusable, developers have one more reason to adopt .NET to lower their cost of delivering and maintaining software.


  > when people post how many LoC they introduced.
Pretty ironic you and the GP talk about lines of code.

From the article:

  Garman is also not keen on another idea about AI – measuring its value by what percentage of code it contributes at an organization.

  “It’s a silly metric,” he said, because while organizations can use AI to write “infinitely more lines of code” it could be bad code.

  “Often times fewer lines of code is way better than more lines of code,” he observed. “So I'm never really sure why that's the exciting metric that people like to brag about.”
I'm with Garman here. There's no clean metric for how productive someone is when writing code. At best, this metric is naive, but usually it is just idiotic.

Bureaucrats love LoC, commits, and/or Jira tickets because they are easy to measure but here's the truth: to measure the quality of code you have to be capable of producing said code at (approximately) said quality or better. Data isn't just "data" that you can treat as a black box and throw into algorithms. Data requires interpretation and there's no "one size fits all" solution. Data is nothing without its context. It is always biased and if you avoid nuance you'll quickly convince yourself of falsehoods. Even with expertise it is easy to convince yourself of falsehoods. Without expertise it is hopeless. Just go look at Reddit or any corner of the internet where there are armchair experts confidently talking about things they know nothing about. It is always void of nuance and vastly oversimplified. But humans love simplicity. We need to recognize our own biases.


> Pretty ironic you and the GP talk about lines of code.

I was responding specifically to the comment I replied to, not the article, and mentioning LoC as a specific example of things that don't make sense to compare.


  > the comment I replied to
Which was the "GP", or "grandparent" (your comment is the parent of my comment), that I was referring to.


> Bureaucrats love LoC

Looks like vibe-coders love them too, now.


...but you repeat yourself (c:


Made me think of a post from a few days ago where Pournelle's Iron Law of Bureaucracy was mentioned[0]. I think vibe coders are the second group. "dedicated to the organization itself" as opposed to "devoted to the goals of the organization". They frame it as "get things done" but really, who is not trying to get things done? It's about what is getting done and to what degree is considered "good enough."

[0] https://news.ycombinator.com/item?id=44937893


On the other hand, fault-intolerant codebases are also often highly defined and almost always have rigorous automated tests already, which are two contexts where coding agents specifically excel.


I work on brain dead crud apps much of my time and get nothing from LLMs.


Try Claude Code. You’ll literally be able to automate 90% of the coding part of your job.


We really need to attach some kind of stake to people making these claims to make it more interesting. I've listened to the type of advice you're giving here on more occasions than I can remember, at least once for every major revision of every major LLM, and always walked away frustrated because it hindered me more than it helped.

> This is actually amazing now, just use [insert ChatGPT, GPT-4, 4.5, 5, o1, o3, Deepseek, Claude 3.5, 3.9, Gemini 1, 1.5, 2, ...] it's completely different from Model(n-1) you've tried.

I'm not some mythical 140-IQ 10x developer, and my work isn't exceptional, so this shouldn't happen.


The dark secret no one from the big providers wants to admit is that Claude is the only viable coding model. Everything else descends into a mess of verbose spaghetti full of hallucinations pretty quickly. Claude is head and shoulders above the rest and it isn't even remotely close, regardless of what any benchmark says.


Stopping by to concur.

Tried about four others, and as much as I marveled at the capabilities of the latest and greatest, I had to concede they didn't make me faster. I think Claude does.


As a GPT user, your comment prompted me to search for how superior Claude really is... well, these users don't think it is: https://www.reddit.com/r/ClaudeAI/comments/1l5h2ds/i_paid_fo...


>As a GPT user, your comment prompted me to search for how superior Claude really is... well, these users don't think it is: https://www.reddit.com/r/ClaudeAI/comments/1l5h2ds/i_paid_fo...

That poster isn't comparing models, he's comparing Claude Code to Cline (two agentic coding tools), both using Claude Sonnet 4. I was pretty much in the same boat all year as well; using Cline heavily at work ($1k+/month token spend) and I was sold on it over Claude Code, although I've just recently made the switch, as Claude Code has a VSCode extension now. Whichever agentic tooling you use (Cline, CC, Cursor, Aider, etc.) is still a matter of debate, but the underlying model (Sonnet/Opus) seems to be unanimously agreed on as being in a league of its own, and has been since 3.5 released last year.


I've been working on macOS and Windows drivers. Can't help but disagree.

Because of the absolute dearth of high-quality open-source driver code and the huge proliferation of absolutely bottom-barrel general-purpose C and C++, the result is... Not good.

On the other hand, I asked Claude to convert an existing, short-ish Bash script to idiomatic PowerShell with proper cmdlet-style argument parsing, and it returned a decent result that I barely had to modify or iterate on. I was quite impressed.

Garbage in, garbage out. I'm not altogether dismissive of AI and LLMs but it is really necessary to know where and what their limits are.


I'm pretty sure the GP referred to GGP's "brain dead CRUD apps" when they talked about automating 90% of the work.


I found the opposite: I'm able to get a 50% improvement in productivity for day-to-day coding (a mix of backend and frontend), mostly in JavaScript, but it has helped in other languages too. You have to review carefully, though, and have extremely well-written test cases if you have to blindly generate or replace existing code.
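
For example, before letting the model rewrite anything, I pin the current behavior with a characterization test along these lines (a minimal sketch assuming pytest; `billing.compute_invoice_total` is a hypothetical stand-in for your own function):

  # Pins existing behavior before an LLM rewrites the implementation.
  import pytest
  from billing import compute_invoice_total  # hypothetical module/function

  @pytest.mark.parametrize("items, expected", [
      ([], 0),                                                  # empty order
      ([{"price": 10, "qty": 2}], 20),                          # single line item
      ([{"price": 5, "qty": 1}, {"price": 3, "qty": 4}], 17),   # multiple items
  ])
  def test_invoice_total_is_stable(items, expected):
      assert compute_invoice_total(items) == expected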


> In code bases like this you can't just wing it without breaking a million things, so LLM agents tend to perform really poorly.

This is a false premise. LLMs themselves don't force you to introduce breaking changes into your code.

In fact, the inception of coding agents was lauded as a major improvement to the developer experience because they allow the LLMs themselves to automatically react to feedback from test suites, speeding up implementation while preventing regressions.

If tweaking your code can result in breaking a million things, that is a problem with your code and how you worked to make it resilient. LLMs are only able to introduce regressions if your automated tests are unable to catch any of those million things breaking. If that is the case, then your problems are far greater than LLMs existing, and at best LLMs only point out the elephant in the room.
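
To make that concrete, the loop a coding agent runs is roughly the sketch below. `propose_patch` and `apply_patch` are hypothetical stand-ins for whatever model API and edit tooling you use, and pytest is assumed as the test runner:

  # Rough sketch of the agent loop: run the suite, feed failures back, retry.
  import subprocess

  def run_tests() -> tuple[bool, str]:
      # Run the project's test suite and capture its output for the model.
      proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
      return proc.returncode == 0, proc.stdout + proc.stderr

  def propose_patch(task: str, test_report: str) -> str:
      # Hypothetical: ask your LLM of choice for an edit, given the failing output.
      raise NotImplementedError

  def apply_patch(patch: str) -> None:
      # Hypothetical: apply the proposed edit to the working tree.
      raise NotImplementedError

  def agent_loop(task: str, max_rounds: int = 5) -> bool:
      for _ in range(max_rounds):
          ok, report = run_tests()
          if ok:
              return True   # green suite: accept the change
          apply_patch(propose_patch(task, report))
      return False          # still red: hand it back to a human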


Which fruit will be patched next?

And people think we're 2 years away from humanity's extinction by AI. Lol.


You don’t have to spell very well to hit the big red nuclear launch button that some misguided soul put you in charge of


As ever, XKCD called it. https://xkcd.com/1838/


I can imagine many people's view of negotiation is asking "Can I have X" and when they get rejected just moving on with "Ok that's fine".


You don't have to be a star programmer; fame isn't the only form of leverage.

If you're in demand, and you're good at what you do, the road is paved for you. Top companies have already set the bar.

Them: we offer 250k-350k

Me: I don't consider anything below 500

The answers I get vary. Some tell me to politely fvck off. Some tell me they need to discuss with leadership. Some just go for it because they know how hard it is to fill that role.

The justification is simple: why would I take a job with you if I can land an HFT gig at twice the pay?


> You don't have to be a star programmer

> 250k-350k

> land an HFT gig

Respectfully, your perspective may be a bit skewed? OP's context was "us rank-and-file employee number 12887's".


Many people working in these companies are as rank-and-file as it gets. Non existent public profile, no open source contributions, no flashy portfolio.

Not really in the same category as Carmack


That's fair! I had interpreted OP's post more along the lines of, well, workers who aren't that in-demand or that high up the pay scale - I think it's fair to say 300k salaries and HFT gigs are out of consideration for most of us.


You had me until HFT...


Doesn't have to be finance. AI can pay just as well. Tech too depending on seniority.


The proliferation of AI and LLMs has completely obliterated leverage.

Don’t want the job for the salary offered? Too bad. Hire a cheaper person armed to the teeth with the best LLM coding tools and move on.

Unless you’re coming in with significant clout that will move revenue and relations to bridge partnerships across other companies, you will not be worth the extra $250k on skills alone.


This is situational - as a strong engineer I have more leverage than ever to demand ever more eye-watering compensation.

Weaker engineers and junior engineers are in more the situation you describe. This is tough and I feel for these folks but it is possible for many people to become stronger engineers if they choose to put in the work.

I'd encourage you to not take on a feeling of hopelessness here.


I agree with you to an extent. I was recently laid off (start up out of money) along with the rest of the engineering team. So I’m going through loops. The hardest part of the process for me is getting past the initial recruiter. Once I get past whatever “wall” they’ve put in place, I do pretty well. So I wonder how many recruiters now are using AI to screen applicants? Given the hundreds (thousands?) of applications, probably some of them.

Anecdotally - I've been through 2 technical screens where they asked me to use an AI prompt to solve a problem. For one, it was a pretty trivial problem (running an HMAC over some values), so I just solved it directly. They asked why I didn't use AI, and I told them honestly that I've done something like this hundreds of times; why would I use a prompt for it? Didn't make it to the next round. Now it's totally possible that I didn't make it because of something else. And maybe those were outliers, but it seems like I'll need to brush up on prompting…
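
For a sense of scale, the whole exercise is roughly this much code (a rough sketch assuming Python and SHA-256; the actual language, values, and digest in the screen may have differed):

  # Roughly the whole exercise: HMAC a few values with a shared secret.
  import hashlib
  import hmac

  def sign(key: bytes, *values: bytes) -> str:
      mac = hmac.new(key, digestmod=hashlib.sha256)
      for v in values:
          mac.update(v)
      return mac.hexdigest()

  print(sign(b"shared-secret", b"user=42", b"ts=1700000000"))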


That's what bottom-tier companies always tell me. Last decade it used to be outsourcing. I was getting lowballed left and right with phrases like "I can pay a guy from Asia a lot less for the same work".

Now it's LLMs. Same old.


Except now it’s for real.


Outsourcing was and is real too. It had and still does have tradeoffs.

LLMs have similar if not worse tradeoffs imo.


LLMs have the massive advantage of low latency. If you have a team on the other side of the clock, you get one cycle a day.

With an LLM, you get a cycle every 2-15 minutes. And LLMs are improving faster than outsourcing companies’ employees.


Latency - yes

LLMs are also much better at faking it in remarkably unpredictable ways, and are hitting diminishing returns (or even backsliding) on improvements.

And LLMs can't actually do anything without humans structuring everything and doing stuff on their behalf.


If you want equalized poverty, feel free to move to the EU. Say goodbye to owning a nice house, or building any kind of wealth - that's reserved for the old money class.

In the US, software is one of the few remaining ways to achieve the American dream. I came to this country to work hard and earn money.


Weird, I live in the USA, make $180k, and yet I still can't own a house, and building wealth is extremely hard.

The EU has better societal benefits than the US: access to healthcare, education, and mandated vacation time (often starting at 3-4 weeks).

The vast majority of people care about living a life without suffering. In the US this is only reserved for the rich it seems.


What part of the country and what is your take home? Monthly expenses? $180k can be a lot depending on your location.

Exceptions are HCOL tech hubs, but comp in those places is much higher.


I live in Boston, where I make double(-ish) the household median income ($80k to $100k). Compared to individual median incomes, I make $140k more. I'm able to save over half my monthly income and it's still not enough. I absolutely can't imagine living in this city on anything less, and I don't exactly live a life of extravagance here.

This is the sign of a broken economy.


I don't think we share the same definition of poverty.


I'm in the EU. You have no idea what you're talking about.


Privacy rights in the EU are being eroded as we speak. Unless people there get off their high horse, they'll succumb to the same level of authoritarianism and surveillance as in the states.

Also, sorry, but the idea that EU countries are in any position to build a serious hyperscaler is pure fiction. Growth, funding, risk, innovation - those are alien concepts to European entrepreneurs.


> Unless people there get off their high horse, they'll succumb to the same level of authoritarianism and surveillance as in the states.

We are very far away from the status quo in the US. Some countries are being overtaken by the extreme right, which is worrying. But it's nothing like the US, where the entire country went to shit overnight.

Also, we don't have this singular president entity which has so much power that everything can be turned upside down in just one election. We have a president but she has very little power and influence compared to the way it is in the US.

Also, our multi-party system prevents the two-party zero-sum setup present in the US, where parties resort to ever more extreme methods to make the other side look bad (because a loss for one is a win for the other). For us it doesn't work that way.


> Privacy rights in the EU are being eroded as we speak. Unless people there get off their high horse, they'll succumb to the same level of authoritarianism and surveillance as in the states.

Last time I checked, not even the US is proposing to install AI agents on everybody's phone to surveil their encrypted messages (look up Chat Control; the last meeting was not even 2 months ago). Soon people will start looking for non-EU VPNs to install Signal (the CEO said they would leave the EU if the law passed).

> Also, sorry, but the idea that EU countries are in any position to build a serious hyperscaler is pure fiction. Growth, funding, risk, innovation - those are alien concepts to European entrepreneurs.

Disagree, some of the EU clouds are already well on their way.


Yes ChatControl is a worry indeed. But we have been successfully fighting it for a long time. And it is only pushed by a small number of politicians.

Surprisingly enough the drive to do this does not come from within Europe but from the US (Ashton Kutcher and "Thorn"). They have managed to pocket some influential politicians.


Not sure why you're getting downvoted. This is factually true:

Chat control: EU Ombudsman criticises revolving door between Europol and chat control tech lobbyist Thorn

> Breyer welcomes the outcome: “When a former Europol employee sells their internal knowledge and contacts for the purpose of lobbying personally known EU Commission staff, this is exactly what must be prevented. Since the revelation of ‘Chatcontrol-Gate,’ we know that the EU’s chat control proposal is ultimately a product of lobbying by an international surveillance-industrial complex. To ensure this never happens again, the surveillance lobbying swamp must be drained.”

Source: https://www.patrick-breyer.de/en/chat-control-eu-ombudsman-c...

> And it is only pushed by a small number of politicians.

Including the chief Ursula von der Leyen and her commission.


> Disagree, some of the EU clouds are already well on their way.

Feel free to drop a few links. Digital EU projects tend to be absolute disasters run by bureaucrats. They always result in some 100-page document talking about planning a plan for creating a planning framework. Also throw in the words "sovereign" and "digital transformation", for maximum corpo-political bullshit.


Sure there's more bureaucracy here but in the end they work out fine.

Galileo works perfectly as a counterpart to GPS. GDPR was also a resounding success.


Is that the GDPR that has polluted the web with cookie notices?


Yes, but that's only one tiny aspect of GDPR. Unfortunately this is an aspect where they caved in to corporate lobbying; they should have just mandated honoring the "Do Not Track" flag (or something similar). That browsers set it by default is not a problem, because the whole idea of GDPR is that tracking should be opt-in, not opt-out. But really, this is a tiny part of GDPR; it isn't even just about the web. And as annoying as the cookie walls are, they also make the user more aware (I mean, why do you want permission to share my data with 572 "trusted partners"??). It also enforced some concepts that should already have been standard, like the purpose principle, explicit permission ("opt-in"), etc.

It has really made companies much more aware of data handling. At work we have data protection officers and privacy advocates now, and every app we onboard has to be reviewed in terms of what the data is used for, where it ends up, whether we have agreements with the vendor about what it's used for, etc. This is really great, because before we had pretty much nothing like that. It was just move fast and break things, and customers' privacy was one of the things that got broken. And our company is one that doesn't make any money from tracking our customers, so the regulation wasn't really targeted at us, but it still drove so much improvement.

I think it will become much better now that we are disconnecting Europe from US services. The main reason that tracking-informed ads are so much more valuable than context-informed ads is that Google and Meta etc. are promoting them. They control the auctions, and tracking is their moat. Nobody has tracking networks as pervasive as theirs.

The disconnection from these services could really be the trigger for an EU-based context-informed advertising service.


Counterpoint: not everyone needs a hyperscaler, especially with open source like Kubernetes out there. Of course, the more experience companies have managing it, the better the service becomes. But I don't see why it can't happen within the EU.


K8s is an orchestration tool. You still need someone managing the physical hardware, and doing it reliably at scale. That's what a hyperscaler does.


I do understand that; my point was that the pieces needed to provide it as a managed service are much easier to come up with compared to what AWS had to do with Fargate.

- https://www.scaleway.com/en/kubernetes-kapsule/

- https://www.exoscale.com/syslog/introducing-scalable-kuberne...


Dude, the EU is home to around 500 million people (correct me if I'm wrong). The EU definitely needs a hyperscaler. Every single one of these people will need a digital identity along with their compute rights.


Wouldn't it work with 100 not-so-hyper scalers as well, though? It doesn't have to be AWS, GCP, or Azure.


Wanting to work with ambitious people that match your level is not toxic.


The US is already the brain drain destination #1. Nowhere else can highly skilled people generate comparable levels of income.


Was. Was the brain drain destination #1. It's been Europe, India and SE Asia for a while now.

Not everyone with a brain just wants more money. Some want to do interesting research that doesn't have a short term commercialisation strategy.


Do you have the statistics to back this up?


Do you have the statistics to back up the opposite argument ?


Adjust for living expenses and the situation already looks very different. Then adjust for quality of life...


This has been done before, and America comes out consistently on top. Even the median purchasing power parity (PPP) in the US is frequently ranked highest in the world. The majority of American households in the poorest US states are doing better than the majority of Europeans.

This gets amplified if you're a highly sought-after professional. Top senior engineers are getting paid $500k-$1M in the US. These are figures you'll never find in Europe or Asia, not even close. Add on top of that the rising costs of living and 45% top tax brackets (France, UK, Germany, Spain), and the US is incomparable.


Yes, if you're a professional in high demand, you can live a great life in the US. But what does the quality of life of everybody below the median look like?

> This gets amplified if you're a highly sought after professional. Top senior engineers are getting paid $500k-$1M in the US. These are figures you'll never find in Europe or Asia, not even close.

But what does that buy you really, in a high cost of living area? What if you ever want to do something else? What if demands for your profession change? How expensive is it to raise children?

I have first hand experience of both the US and Europe, and while nominal salaries are (much) lower in the latter, subjective feelings of safety and quality of life seem much more comparable than the numbers might make you believe.

That said, the US system of highly rewarding relatively few people at the top certainly motivates the masses like few others: Most people are bad at statistics and like playing the lottery.


> But how does the quality of life of everybody below the median look like?

This discussion is about whether or not the US is a top brain drain destination. That means we're talking about exceptionally skilled or promising scientists/engineers/doctors at the top of their field. I'm not claiming life is great for everyone in America. I agree it isn't.

> But what does that buy you really, in a high cost of living area?

Look at the PPP in CA. It buys you a lot. People in HCOL cities who manage their finances well can become multi-millionaires in their early 30s. They will already be able to retire in 99% of the world, with enough savings to lead incredibly comfortable and luxurious lives. Meanwhile, people in Europe have, on average, lower assets and savings, low levels of home ownership, and a lower likelihood of retiring early at a comparable standard of living. Not to mention the pension crisis many of them are or will be facing in the near future.


> This discussion is about whether or not the US is a top brain drain destination.

And my point is that this is increasingly the case due to a combination of inertia and people’s bad EV calculations:

> They will already be able to retire in 99% of the world, with enough savings to lead incredibly comfortable and luxurious lives.

Just demographically, this will never be viable for a large fraction of the population, regardless of any economic concerns.

This leaves truly exceptional talents, and those that incorrectly assume that that’s them.

And this isn’t even considering how AI is potentially going to shake up knowledge work.


Have you lived in Denmark, Japan, China, the Netherlands, or some other countries in the past 10-15 years? I really don't think you factor people's personal preferences and general quality of life into your equations.

There is a very big reason why there are no longer large swaths of immigrants from European and some Asian countries flocking into the US. Yes, there are some, but the times when it was objectively much better to live and grow in the States are in the past. Money is really not the only thing people care about, but that's hard to understand for people for whom money is the only thing they care about.


>There is a very big reason why there’s no more large swaths of immigrants from European and some Asian countries flocking into the US.

Yes, and the reason is that a US visa is one of the hardest to get.


Denmark is much much harder than the US.


How so?


It's not just money. There's another aspect in which the US edges out the competition: the fact that everyone there is treated as an American, without racism/xenophobia. This is a huge benefit over impenetrable countries like Denmark/the Netherlands.


Half of your country actively rallies around the idea of sending back Chinese, Indians, Mexicans, etc. I think you just live in a bubble or educated circles where that's not tolerated, but that's not the experience of every single immigrant.


Immigrants from the Global South definitely have it worse in his examples, Denmark and the Netherlands, where even liberal parties have turned against immigration. The Republicans want to deport illegal immigrants, starting with those who have criminal records; they haven't pursued anything as perverse as Denmark's "Ghetto Law". PVV's platform in the Netherlands is self-explanatory.


Immigration remains overwhelmingly popular in the US, much more so than in other places in Europe and Asia.

https://eig.org/hsi-voter-survey/


I'm saying this as an immigrant, with a Muslim name but white.

You've no idea of the blatant ongoing racism in Europe that's so normalised it's basically a non-issue. Like Zwarte Piet in the Netherlands.

The xenophobia in the EU is head and shoulders above the US. I've lived in both places for years. In Germany, for example, you will never be considered German even if you were born in the country.


Money buys you the freedom to live your life anywhere you want and do whatever you want. Do you think grinding away for low pay until your mid-60s, only to end up facing a collapsing pension system in your final years with little savings, is the best way to spend your time on earth?

Also, how can a person who hasn't experienced the economic freedom the US provides to top talent accurately judge whether their country of choice is better? I would like to see the statistics on SWEs who got wealthy in America, regret moving to the States, and would prefer to undo all those years.

(I've lived in Europe for most of my life by the way. Lots of good places to retire, but mostly poor choices for spending my productive years there.)


My friend, not everyone makes $150k/year. Like, yes, we can do it and choose our freedom because of the industry we work in. But you might be very disconnected from an average person's life. An average or poorer person in the US does not get to choose where they retire either. I promise, retired 70-year-old ojiisans over here in Tokyo don't think their lives would've been better if they had lived in Oklahoma.

You guys think everyone wants the same as you do, but it really isn’t like that.


> Average or poorer person in the US does not get to choose where they retire either.

I've already addressed this in another thread. We're talking about brain drain. That means we are talking about highly skilled professionals in in-demand fields who can easily get this level of pay.

A skilled senior SWE can very realistically demand $300k+ comp in US tech companies. In the startup space $150k + equity has become table stakes, and their hiring bar is often significantly lower. These are not anomalies, tech companies employ hundreds of thousands of engineers.

> You guys think everyone wants the same as you do, but it really isn’t like that.

Ok, so tell me, what are the things people want? Because people in the aforementioned circles can retire in their 30s and spend the rest of their lives traveling the world, taking care of their family, and pursuing their passions without worries. Is that somehow controversial?


For an entire career and beyond (to support themselves after retirement)? For their spouse and children too?

If you're single and flexible enough to move away if any part of that calculation changes, it's definitely a great deal; but the more attachments you have in life, the worse the deal arguably becomes.

> Because people in the aforementioned circles can retire in their 30s, and spend the rest of their lives traveling the world, taking care of their family, and pursing their passions without worries. Is that somehow controversial?

I think you have an unrealistically rosy view of the average outcome here. I work in this field, and people who "retire in their 30s and travel the world while providing for their family and pursuing their passions" are still extreme outliers.

> Ok, so tell me, what are things people want?

Maybe I'm an outlier here too, but personally, I value long-term societal stability and safety quite highly, as I don't have many illusions about being able to buy my way out of certain kinds of problems caused by a large and increasing rift between people of various income levels.


    > The majority of American households in the poorest US states are doing better than the majority of Europeans.
I'm so tired of this trope on HN. It comes up over and over again, but it never considers non-economic quality-of-life issues. Take, for example, public schools: they are awful in poor US states and good-to-excellent in most highly developed European nations.


I swear, none of these people have been to Louisiana, West Virginia, or any of the empty-looking cities that used to be lively 50 years ago. And I’m saying this as a person who isn’t American, but wanted to check it out myself before I made assumptions about the world.


Also, GDP per capita is a terrible measure of standard of living: $40k in Louisiana isn't that much when you have to pay for most of your healthcare, education, etc. yourself.

The average life in these countries is undeniably better, even if poorer on paper.


Living in New Orleans right now I concur.


I agree. One other thing that the US has that no other country has is freedom of speech.

My entire family was killed in my previous country for daring to speak up against the leader at the time. It is something that has shaken me to my core. Now, even to this day, I cannot find another country besides the US that not only respects freedom of speech but encourages it among its residents. I will not move to another country no matter how drastic it gets.


This comment is not going to age well.

Aren't colleges and universities liable for any "illegal" protest on their grounds, since yesterday? Whether a protest is illegal or not will be decided by the local politburo office.


Not under this president.


> I agree. One other thing that the US has that no other country has is freedom of speech.

Unless you speak ill of America's greatest ally.


I'm not convinced that that's sustainable though, and I think it might be an artifact of the dollar reserve currency status etc., because US firms cost more for a given yearly profit. Just look at Boeing vs Airbus.

I also lived better in Sweden when I was a PhD student than I would have if I had gone to Washington and taken an H-1B job for $100k. I think the cutoff would be somewhere a bit above $120k. Maybe at $130k-140k I would be able to live in Washington approximately as well as I could live in Sweden as a PhD student, but it would be substantially more stressful. Maybe $130-140k isn't much to long-term Googlers, but I think this is closer to the salaries people actually pay for H-1Bs than those $500k+ salaries.

The built environment in the US doesn't really correspond to the nominal prices, so in a way, America is only interesting economically if you're planning to go back home.


A Chinese student graduating from Harvard is 60% likely to voluntarily return to China. It's on borrowed time now.

