If you wrote this comment 70 years ago, when computers were the size of rooms, it would make a lot of sense. Yet we know how history played out: everyone now has a supercomputer in their pocket.
For some reason it feels like people are under the assumption that hardware isn't going to improve or something?
We're 10 months into agentic coding. Claude Code came out in March. I don't understand how you can be so unimaginative about what this might look like in 5 years, even with slow progress.
It might be genuinely useful in 5 years; my issue is how it's being marketed now. We're 6 months past "AI will be writing 90% of code in three months," among other ridiculous statements.
I sort of agree. If anything I feel like they've gotten a bit worse, but the advances in the tooling around them (e.g. Claude Code) have masked that slightly.
I think they are useful as an augmentation, but largely valueless for directly outputting code. Who knows if that will change. It's still made me more productive as a dev, despite not one-shotting entire files. It's just not industry-changing, at least yet.
Agreed. It is very similar to gambling in how it tricks the human mind. I am sure some of this AI technology will prove to be useful, but the breakthrough has been just around the corner since soon after ChatGPT was released.
To add to this: the tooling or `harness` around the models has vastly improved as well. You can get far better results with older or smaller models today than you could 10 months ago.
The harnesses are where most progress is made at the moment. There are some definite differences in the major models as to what kind of code they prefer, but I feel the harnesses make the biggest difference.
Copilot + Sonnet is a complete idiot at times, while Claude Code + Sonnet is pretty good.
Might the creator of Claude Code have some … incentives … to develop like that, or at least claim that he does?
As someone who frequently uses Claude Code, I cannot say that a year's worth of features/improvements has been added in the last month. It bears repeating: if AI is truly a 10x force multiplier, you should expect to see a ~year's worth of progress in a month.
Nobody here claimed that Boris wasn't a biased source.
I do, however, think he is not an actively dishonest source. When he says "In the last thirty days, I landed 259 PRs -- 497 commits, 40k lines added, 38k lines removed. Every single line was written by Claude Code + Opus 4.5." I believe he is telling the truth.
That's what dogfooding your own product looks like!
I find it so weird that people are so bullish on the CLI form factor when they are literally just adding functionality that IDE-based agents get for free. Stuff like improved diff tools and LSP support in the terminal, instead of, idk... just using a GUI/IDE?
IDEs have LSP support because they have a plugin that connects to an LSP server. The plugin is a very small piece of code compared to the language server, so creating a new client is not reinventing the wheel. In fact, the entire philosophy of LSP is one server serving many different clients.
CLIs can also have a small piece of code that connects to an LSP server. I don’t see why IDEs should be the sole beneficiary of LSP just because they were the first clients imagined by the LSP creators.
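To make that concrete, here is a minimal sketch of such a client in Node.js/TypeScript. The server binary (`typescript-language-server`) is just one example; any LSP server that speaks stdio works the same way, since the wire format is only JSON-RPC with a `Content-Length` header:

```typescript
// Minimal LSP client sketch: spawn a language server over stdio and send
// the "initialize" request. The server binary is an assumption; swap in
// any stdio-based LSP server.
import { spawn } from "node:child_process";

const server = spawn("typescript-language-server", ["--stdio"]);

// Every LSP message is a JSON-RPC body framed with a Content-Length header.
function send(msg: object): void {
  const body = JSON.stringify(msg);
  server.stdin.write(`Content-Length: ${Buffer.byteLength(body)}\r\n\r\n${body}`);
}

server.stdout.on("data", (chunk: Buffer) => {
  // A real client would buffer and parse frames; this just prints raw output.
  console.log(chunk.toString());
});

send({
  jsonrpc: "2.0",
  id: 1,
  method: "initialize",
  params: { processId: process.pid, rootUri: null, capabilities: {} },
});
```

That is the whole trick; everything else (hover, references, rename) is just more messages over the same pipe.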
I just saw a video of a non-technical person describing how they use Claude Code to automate various workflows. They actually tried VS Code and then the desktop GUI.
Yet they preferred the CLI because it felt "more natural".
With agents, and Claude Code, we are *orchestrating* ... this is an unresolved UI/UX problem in the industry. The same reasons `kubectl` didn't evolve into a GUI probably apply here.
I use Zed, and unless there is some MCP server that provides the same thing as the LSP server, the Zed agent won't have access to that information, even though it's running in an IDE that supposedly already has it.
> It would be a huge step up if agent could interact with LSP (Language Server Protocol).
>
> It would offer:
>
> - renaming all instances of a symbol over all files in one action
> - quick navigation through code: fast find of all references to a property or method
> - organize imports, format code, etc.
And last Friday a Cursor engineer replied "Thanks for the idea!"
So how does the AI agent in Cursor currently have access to LSP?
(I am most interested in having the agent use LSP for type checking, documentation of a method call, etc. rather than running slower commands)
(Note: there is an open PR for Zed to pull LSP diagnostics into an AI agent thread, https://github.com/zed-industries/zed/pull/42270, but it would be better if agents could make arbitrary LSP queries or something like that.)
It would be so cool if LLMs could get the type of a variable when it's unclear (especially in languages with overloading and whatnot), or could get autocompletion when they get stuck on some code. Really, I think agents and LSP should be a hybrid, and maybe the agent could even inform the LSP server of things to warn about (IDE diagnostics could be driven by a combination of LSP and AI agents).
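That capability already exists in the protocol; nothing new needs inventing. As a rough sketch (the file URI and position are hypothetical), the request an agent would issue to get the type under a position is `textDocument/hover`:

```typescript
// Hypothetical hover query: ask the LSP server what sits at line 41, col 10.
// Uses the same JSON-RPC/Content-Length framing as any other LSP message.
const hoverRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "textDocument/hover",
  params: {
    textDocument: { uri: "file:///project/src/main.ts" }, // hypothetical file
    position: { line: 41, character: 10 }, // zero-based line/character
  },
};
// The reply's `contents` field carries the symbol's type signature and docs.
// Autocomplete is the same request shape via "textDocument/completion".
```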
Well, my editor is in the terminal, and so is my chatbot. I don't really want to change to an IDE to use a desktop app and a chatbot that both have half-baked UIs trying to complement each other.
It would hang for me half the time, the last time I tried it (3-4 months ago?). When it worked, it seemed really good, but it hung often. Time to try again.
BDD was trying to recapture what TDD was originally: a rename meant to shed all the confusion that had accumulated around TDD. Of course, BDD picked up all of its own confusion (e.g. Gherkin/Cucumber and all that ridiculousness). So now it is rebranded as SDD to try to shed all of that confusion, with a sprinkle of "AI" because why not. Of course, SDD is already clouded in its own confusion.
Testing is the least understood aspect of computer science, and it turns out that you cannot keep changing the name and expect everyone to suddenly get it. But that won't stop anyone. We patiently await the next rebrand.
Developers who aren't yet using AI would benefit from specs as well. They're good to have whether it's you or an LLM that's writing code. As a general rule, the clearer and less ambiguous the criteria you have, the better.
If your acceptance criteria state something like “produces output f(x) for any input x, where f(x) is defined as follows: […]”, then you can’t possibly test that, because you can’t test all possible values of x. And if the criteria don’t state that, then they don’t cover the full specification of how the software is expected to behave, hence you have to go beyond those criteria to ensure that the software always behaves as expected.
You can’t prove that something is correct by example. Examples can only disprove correctness. And tests are always only examples.
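Property-based testing gets the closest to "for any input x" by sampling inputs instead of hand-picking them, but it is still only examples underneath. A minimal sketch, assuming the fast-check library and a made-up f:

```typescript
import fc from "fast-check";

// Made-up function under test and the property it should satisfy.
const f = (x: number): number => Math.abs(x);
const holds = (x: number): boolean => f(x) >= 0 && (f(x) === x || f(x) === -x);

// fast-check draws (by default) 100 random integers and checks the property
// on each. One failure disproves correctness; 100 passes prove nothing about
// the values that were never drawn.
fc.assert(fc.property(fc.integer(), (x) => holds(x)));
```

Proving the property for all x takes a different kind of tool entirely (types, model checking, formal proof), not more examples.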