More

llmslave2 · 2026-01-09T20:32:07 1767990727

Yep. I'd say it's an order of magnitude more effort to read code you haven't written too, compared to reading code you wrote. So there is approximately zero chance the people using AI to generate code are reading it at a level where they actually understand it, or else they would lose all of their supposed productivity gains.

falloutx · 2026-01-09T20:53:21 1767992001

Yeah if people were good at reading code we wouldn't have the whole LGTM meme where the reviewer gives up as soon the PRs is bigger than 500 lines.

llmslave2 · 2026-01-09T20:30:34 1767990634

I actually don't think they would agree with most of this. Why would you think that?

llmslave2 · 2026-01-09T20:23:47 1767990227

Good news: the evidence points to it being slower than non-ai workflows. So we're destroying our economy, society, and planet to make worse software, more slowly! :)

falloutx · 2026-01-09T20:51:15 1767991875

we are also making all software much worse at the same time. I dont think every app needs AI but apparently they do. Notion used to be a Zen writing app back in the day, canva used to an app where you can do simple graphics without a complicated tools panel.

dang · 2026-01-10T01:12:42 1768007562

I think Pete's article last year made a good case for regarding this as the "horseless carriage" stage, i.e. growing pains around how to use a new technology effectively.

AI Horseless Carriages - https://news.ycombinator.com/item?id=43773813 - April 2025 (478 comments)

Or maybe it's analogous to the skeuomorphic phase of desktop software. Clumsy application of previous paradigm to new one; new wine in old bottles; etc.

llmslave2 · 2026-01-09T20:58:10 1767992290

Affinity used to be good, now it's Canva++ with AI. Disgusting work

dang · 2026-01-10T01:15:14 1768007714

> Disgusting work

Please don't do this here. Thoughtful criticism is fine on this site but snark and name-calling are not.

https://news.ycombinator.com/newsguidelines.html

Edit: on closer look, you've been breaking the HN guidelines so badly and so consistently that I've banned the account. Single-purpose accounts aren't allowed here in any case.

llmslave2 · 2026-01-09T20:22:34 1767990154

I thought the article was going to be about AI zealotry but it was just AI zealotry.

mrocklin · 2026-01-10T03:15:07 1768014907

llmslave2 · 2026-01-09T10:24:50 1767954290

ensloppification

llmslave2 · 2026-01-09T10:19:17 1767953957

> Question is: what changed? New founding round coming up, end of fiscal year, planning for IPO? Do they have to cut losses?

I'm gonna say IPO considering their recent aggressive stealth marking campaign on X, Reddit, and HN.

cheeze · 2026-01-09T18:18:57 1767982737

Can you share more on this?

llmslave2 · 2026-01-09T10:15:45 1767953745

Anthropic developers run 10 Claude Code instances at once, with unlimited access to the best models.

llmslave2 · 2026-01-09T10:08:33 1767953313

> More importantly, Anthropic should have open sourced their Claude Code CLI a year ago. (They can and should just open source it now.)

Who cares? Just have Claude vibe code it in an afternoon...

llmslave2 · 2026-01-08T22:16:09 1767910569

Not everyone just mindlessly plays the AI slot machine to generate code these days.

llmslave2 · 2026-01-08T22:01:46 1767909706

One thing I find really funny is when AI enthusiasts make claims about agents and their own productivity its always entirely anecdotally based on their own subjective experience, but when others make claims to the contrary suddenly there is some overwhelming burden of proof that has to be reached in order to make any sort of claims regarding the capabilities of AI workflows. So which is it?

misja111 · 2026-01-09T11:45:31 1767959131

A while ago someone posted a claim like that on LinkedIn again. And of course there was the usual herd of LinkedIn sheep who were full of compliments and wows about the claim he was making: a 10x speedup of his daily work.

The difference with the zillion others who did the same, is that he attached a link to a live stream where he was going to show his 10x speedup on a real life problem. Credits to him for doing that! So I decided to go have a look.

What I then saw was him struggling for one hour with some simple extension to his project. He didn't manage to finish in the hour what he was planning to. And when I had some thought about how much time it would have cost me by hand, I found it would have taken me just as long.

So I answered him in his LinkedIn thread and asked where the 10x speed up was. What followed was complete denial. It had just been a hick up. Or he could have done other things in parallel while waiting 30 seconds for the AI to answer. Etc etc.

I admit I was sceptic at the start but I honestly had been hoping that my scepticism would be proven wrong. But not.

Folcon · 2026-01-09T13:20:42 1767964842

I'm going to try and be honest with you because I'm where you were at 3 months ago

I honestly don't think there's anything I can say to convince you because from my perspective that's a fools errand and the reason for that has nothing to do with the kind of person either of us are, but what kind of work we're doing and what we're trying to accomplish

The value I've personally been getting which I've been valuing is that it improves my productivity in the specific areas where it's average quality of response as one shot output is better than what I would do myself because it is equivalent to me Googling an answer, reading 2 to 20 posts, consolidating that information together and synthesising an output

And that's not to say that the output is good, that's to say that the cost of trying things as a result is much cheaper

It's still my job to refine, reflect, define and correct the problem, the approach etc

I can say this because it's painfully evident to me when I try and do something in areas where it really is weak and I honestly doubt that the foundation model creators presently know how to improve it

My personal evidence for this is that after several years of tilting those windmills, I'm successfully creating things that I have on and off spent the last decade trying to create successfully and have had difficulty with not because I couldn't do it, but because the cost of change and iteration was so high that after trying a few things and failing, I invariably move to simplifying the problem because solving it is too expensive, I'm now solving a category of those problems now, this for me is different and I really feel it because that sting of persistent failure and dread of trying is absent now

That's my personal perspective on it, sorry it's so anecdotal :)

bigfishrunning · 2026-01-09T14:27:23 1767968843

>The value I've personally been getting which I've been valuing is that it improves my productivity in the specific areas where it's average quality of response as one shot output is better than what I would do myself because it is equivalent to me Googling an answer, reading 2 to 20 posts, consolidating that information together and synthesising an output

>And that's not to say that the output is good, that's to say that the cost of trying things as a result is much cheaper

But there's a hidden cost here -- by not doing the reading and reasoning out the result, you have learned nothing and your value has not increased. Perhaps you extended a bit less energy producing this output, but you've taken one more step down the road to atrophy.

rectang · 2026-01-09T20:38:46 1767991126

Seeing the code that the LLM generates and occasionally asking it to explain has been an effective way to improve my understanding. It's better in some ways than reading documentation or doing tutorials because I'm working on a practical project I'm highly motivated by.

I agree that there is benefit in doing research and reasoning, but in my experience skill acquisition through supervising an LLM has been more efficient because my learning is more focused. The LLM is a weird meld of domain expert/sycophant/scatterbrain but the explanations it gives about the code that it generates are quite educational.

ben_w · 2026-01-09T15:14:28 1767971668

I think there's a potential unstated assumption here, though forgive me if it was made explicit elsewhere and/or I missed it.

LLM-assisted can be with or without code review. The original meaning of "vibe coding" was without, and I absolutely totally agree this rapidly leads to a massive pile of technical debt, having tried this with some left-over credit on a free trial specifically to see what the result would be. Sure, it works, but it's a hell of a mess that will make future development fragile (unless the LLMs improve much faster than I'm expecting) for no good reason.

Before doing that, I used Claude Code the other way, with me doing code reviews to make sure it was still aligned with my ideas of best practices. I'm not going to claim it was perfect, because it did a python backend and web front end for a webcam in one case and simultaneously on a second project a browser-based game engine and example game for that engine and on a third simultaneous project a joke programming language, and I'm not a "real" python dev or "real" web dev or any kind of compiler engineer (last time I touched Yacc before this joke language was 20 years earlier at university). But it produced code I was satisfied I could follow, understand, wasn't terrible, had tests.

I wouldn't let a junior commit blindly without code review and tests because I know what junior code looks like from all the times I've worked with juniors (or gone back to 20 year old projects of my own), but even if I was happy to blindly accept a junior's code, or even if the LLM was senior-quality or lead quality, the reason you're giving here means code review before acceptance is helpful for professional development even when all the devs are at the top of their games.

bigfishrunning · 2026-01-09T15:25:25 1767972325

Yes, but I'm talking about more then code review -- there is a ton of value in discovering all of the ways not to solve a problem. When reading 25 forum posts or whatever in trying to write some function, you're learning more then just the answer. You're picking up a ton of context about how these sorts of problems are solved. If all you're doing is reviewing the output of some code generator, your mental context is not growing in the same way.

0x262d · 2026-01-09T15:54:18 1767974058

I'm curious if you think the same thing was lost with the transition from reading man pages and first-party documentation to going to stackoverflow or google first (at least, I assume the former was more common a couple decades ago)

bigfishrunning · 2026-01-09T16:04:53 1767974693

What was lost in that transition was the required quality of that first-party documentation decreased; generally that first party documentation simply didn't contain enough information, so you needed to determine things empirically or read source code to get more information. I do think the culture of "copy-and-paste from stackoverflow" harmed the general competency of programmers, but having more third-party information available was only a positive thing.

newsoftheday · 2026-01-09T17:33:57 1767980037

Before 2022 age of modern AI, man pages, SO and Google all were the results of humans, not AI fabrication and hallucination.

coldtea · 2026-01-09T20:26:15 1767990375

A lot was lost then too.

Freebytes · 2026-01-09T18:36:07 1767983767

Merely choosing lines to copy and paste from one file of your own code to another is a learning experience for your brain. AI is excellent for removing a lot of grunt work, but that type of work also reinforces your brain even if you think you are learning nothing. Something can still be lost even if AI is merely providing templates or scaffolding. The same can be said of using Google to find examples, though. You should try to come up with the correct function name or parameter list yourself in your head before using a search engine or AI. And that is for the moist simple examples, e.g. "SQL table creation example". These should be things we know off the top of our heads, so we should first try to type it out before we go to look for an answer.

Aeolun · 2026-01-09T17:37:46 1767980266

> By not doing the reading and reasoning out the result, you have learned nothing and your value has not increased

AI helps at the margins.

It’s like adding anti-piracy. Some people would simply never have bought the game unless they can pirate it.

There’s a large volume of simple tools, or experimental software that I would simply never had the time to build the traditional way.

Folcon · 2026-01-09T16:23:48 1767975828

I mean you're not wrong

I suppose the way I approach this is, I use libraries which solve problems that I have, that in principle understand, because I know and understand the theory, but in practice I don't know the specific details, because I've not implemented the solution myself

And honestly, it's not my job to solve everything, I've just got to build something useful or that serves my goals

I basically put LLM's into that category, I'm not much of a NIH kinda person, I'm happy to use libraries, including alpha ones on projects if they've been vetted over the range of inputs that I care about, and I'm not going to go into how to do that here, because honestly it's not that exciting, but there's very standard boring ways to produce good guarantees about it's behaviour, so as long as I've done that, I'm pretty happy

So I suppose what I'm saying is that isn't a hidden cost to me, it's a pragmatic decision I made that I was happy with the trade off :)

When I want to learn, and believe me I do now and again, I'll focus on that there :)

newsoftheday · 2026-01-09T17:31:14 1767979874

> I use libraries

> I basically put LLM's into that category

That says a lot to be sure.

Folcon · 2026-01-09T22:08:04 1767996484

Seeing as you've chosen to be ambiguous, I'll interpret your comment positively :)

Otherwise feel free to put forward a criticism

brianwawok · 2026-01-09T13:31:15 1767965475

Example for me: I am primarily a web dev today. I needed some kuberntes stuff setup. Usually that’s 4 hours of google and guess and check. Claude did it better in 15 minutes.

Even if all it does is speed up the stuff i suck at, that’s plenty. Oh boy docker builds, saves my bacon there too.

Draiken · 2026-01-09T13:52:54 1767966774

And you learned nothing and have no clue if what it spit out is good or not.

How can you even assume what it did is "better" if you have no knowledge of kubernetes in the first place? It's mere hope.

Sure it gets you somewhere but you learned nothing in the way and now depend on the LLM to maintain it forever given you don't want to learn the skill.

I use LLMs to help verify my work and it can sometimes spot something I missed (more often it doesn't but it's at least something). I also automate some boring stuff like creating more variations of some tests, but even then I almost always have to read its output line by line to make sure the tests aren't completely bogus. Thinking about it now it's likely better if I just ask for what scenarios could be missing, because when they write it, they screw it up in subtle ways.

It does save me some time in certain tasks like writing some Ansible, but I have to know/understand Ansible to be confident in any of it.

These "speedups" are mostly short term gains in sacrifice for long term gains. Maybe you don't care about the long term and that's fine. But if you do, you'll regret it sooner or later.

My theory is that AI is so popular because mediocrity is good enough to make money. You see the kind of crap that's built these days (even before LLMs) and it's mostly shit anyways, so whether it's shit built by people or machines, who cares, right?

Unfortunately I do, and I rather we improve the world we live in instead of making it worse for a quick buck.

IDK how or why learning and growing became so unpopular.

dpark · 2026-01-09T17:17:50 1767979070

> Sure it gets you somewhere but you learned nothing in the way and now depend on the LLM to maintain it forever given you don't want to learn the skill.

The kind of person who would vibe code a bunch of stuff and push it with zero understanding of what it does or how it does it is the kind of person who’s going to ruin the project with garbage and technical debt anyway.

Using an LLM doesn’t mean you shouldn’t look at the results it produces. You should still check it results. You should correct it when it doesn’t meet your standards. You still need to understand it well enough to say “that seems right”. This isn’t about LLMs. This is just about basic care for quality.

But also, I personally don’t care about being an expert at every single thing. I think that is an unachievable dream, and a poor use of individual time and effort. I also pay people to do stuff like maintenance on my car and installing HVAC systems. I want things done well. That doesn’t mean I have to do them or even necessarily be an expert in them.

Bombthecat · 2026-01-09T16:28:11 1767976091

I notice this already after around of 6 months heavy usage. Skills decline, even information gathering etc

jpadkins · 2026-01-09T18:45:24 1767984324

I think it is more accurate to say some skills are declining (or not developing) while a different set of skills are improving (the skill of getting an LLM to produce functional output).

Similar to if someone started writing a lot of C, their assembly coding skills may decline (or at least not develop). I think all higher levels of abstraction will create this effect.

llmslave2 · 2026-01-09T20:39:03 1767991143

> while a different set of skills are improving (the skill of getting an LLM to produce functional output

Lmaooooo

p410n3 · 2026-01-09T15:14:53 1767971693

I agree with both of your points since I use LLMs for things I am not good at and dont give a single poop about. The only things i did with LLMs are three examples from the last two years:

- Some "temporary" tool I built years ago as a pareto-style workaround broke. (As temporary tools do after some years). Its basically a wrapper that calls a bunch of XSLs on a bmecat.xml every 3-6 months. I did not care to learn XSL back then and I dont care to do it now. Its arcane and non-universal - some stuff only works with certain XSL processors. I asked the LLM to fix stuff 20 times and eventually it got it. Probably got that stuff off my back another couple years.

- Some third party tool we use has a timer feature that has a bug where it sets a cookie everytime you see the timer once per timer (for whatever reason... the timers are set to end a certain time and there is no reason to attach it to a user). The cookies have a life time of one year. We run time limited promotions twice a week so that means two cookies a week for no reason. Eventually our WAF got triggered because it has a rule to block requests when headers are crazy long - which they were because cookies. I asked an LLM to give me a script that clears the cookie when its older than 7 days because I remember the last time i hacked together cookie stuff it also felt very "wtf" in a javascript kinda way and I did not care to relive that pain. This was in place until the third party tool fixed the cookie lifetime for some weeks.

- We list products on a marketplace. The marketplace has their own category system. We have our own category system. Frankly theirs kinda suck for our use case because it lumps a lot of stuff together, but we needed to "translate" the categories anyway. So I exported all unique "breadcrumbs" we have and gave that + the categories from the marketplace to an LLM one by one by looping through the list. I then had an apprentice from another dept. that has vastly more product knowledge than me look over that list in a day. Alternative would have been to have said apprentice do that stuff by hand, which is a task I would have personally HATED so I tried to lessen the burden for them.

All these examples are free tier in whatever I used.

We also use a vector search at work. 300,000 Products with weekly updates of the vector db.

We pay 250€ / mo for all of the qdrant instances across all environments and like 5-10 € in openai tokens. And we can easily switch whatever embedding model we use at anytime. We can even selfhost a model.

misja111 · 2026-01-09T14:08:50 1767967730

No I agree with you, there are area's where AI is helping amazingly. Every now and then it helps me with some issue as well, which would have cost me hours earlier and now it's done in minutes. E.g. some framework that I'm not that familiar with, or doing the scaffolding for some unit test.

However this is only a small portion of my daily dev work. For most of my work, AI helps me little or not at all. E.g. adding a new feature to a large codebase: forget it. Debugging some production issue: maybe it helps me a little bit to find some code, but that's about it.

And this is what my post was referring to: not that AI doesn't help at all, but to the crazy claims (10x speedup in daily work) that you see all over social media.

newsoftheday · 2026-01-09T16:59:59 1767977999

> I'm going to try and be honest with you because I'm where you were at 3 months ago

> I honestly don't think there's anything I can say to convince you

> The value I've personally been getting which I've been valuing

> And that's not to say that the output is good

> My personal evidence for this is that after several years of tilting those windmills

It sounds to me like you're rationalizing and your opening sentences embed your awareness of the fallibility of what you say and clearly believe about your situation later.

I feel there are two types of programmers who use AI:

    Type A who aren't very good but AI makes them feel better about themselves.

    Type B who are good with or without AI and probably slightly better with it but at a productivity cost due to fixing AI all the way through, rather than a boost; leading to their somewhat negative but valid view of AI.

econ · 2026-01-09T19:57:53 1767988673

It's great in unfamiliar terrain.

FloorEgg · 2026-01-09T20:29:37 1767990577

It's great when the terrain is unfamiliar to the user but extremely familiar to the LLM. And it's useless in the opposite.

The best programmers are going to be extremely familiar with terrains that are unfamiliar to the LLMs, which is why their views are so negative. These are people working on core parts of complex high performing highly scalable systems, and people with extreme appreciation for the craft of programming and code quality.

But the most productive developers focused on higher level user value and functionality (e.g pumping out full stack apps or features), are more likely to be working with commonly used technologies while also jumping around between technologies as a means to a functionality or UX objective rather than an end of skill development, elegant code, or satisfying curiosity.

I think this explains a lot of the difference in perspectives. LLMs offer value in the latter but not the former.

It's a shame that so many of the people in one context can't empathize with the people in the other.

lawlessone · 2026-01-09T18:11:19 1767982279

you haven't contributed much to GitHub since 2022?

*edit unless your commits are elsewhere?

lazyfanatic42 · 2026-01-09T12:11:26 1767960686

I think people get into a dopamine hit loop with agents and are so high on dopamine because its giving them output that simulates progress that they don't see reality about where they are at. It is SO DAMN GOOD AT OUTPUT. Agents love to output, it is very easy to think its inventing physics.

Obviously my subjective experience

queueueue · 2026-01-09T13:26:06 1767965166

Ironic that I’m going to give another anecdotal experience here, but I’ve noticed this myself too. I catch myself trying to keep on prompting after an llm has not been able to solve some problem in a specific way. While I can probably do it faster at that point if I switch to doing it fully myself. Maybe because the llm output feels like its ‘almost there’, or some sunken cost fallacy.

qwery · 2026-01-09T18:44:51 1767984291

Not saying this is you, but another way to look at it is that engaging in that process is training you (again, not you, the user) -- the way you get results is by asking the chat bot, so that's what you try first. You don't need sunk cost or gambling mechanics, it's just simple conditioning.

Press lever --> pellet.

Want pellet? --> press lever.

Pressed lever but no pellet? --> press lever.

raducu · 2026-01-09T12:55:26 1767963326

> I think people get into a dopamine hit loop

I also think that's the case, but I'm open to the idea that there are people that are really really good at this and maybe they are indeed 10x.

My experience is that for SOME tasks LLMs help a lot, but overall nowhere near 10x.

Consistently it's probably.... ~1X.

The difference is I procrastinate a lot and LLMs actually help me not procrastinate BECAUSE of that dopamine kick and I'm confident I will figure it out with an LLM.

I'm sure there are many people who got to a conclusion on their to-do projects with the help of LLMs and without them, because of procrastination or whatever, they would not have had a chance to.

It doesn't mean they're now rich, because most projects won't make you rich or make you any money regardless if you finish them or not

sharadov · 2026-01-09T16:29:42 1767976182

You nailed it - like posting on social media and getting dopamine hits as you get likes and comments. Maybe that's what has got all these vibe coders hooked.

GuB-42 · 2026-01-09T15:12:10 1767971530

> What I then saw was him struggling for one hour with some simple extension to his project. He didn't manage to finish in the hour what he was planning to. And when I had some thought about how much time it would have cost me by hand, I found it would have taken me just as long.

For all who are doing that, what is the experience of coding in a livestream? It is something I never attempted, the simple idea makes me feel uncomfortable. A good portion of my coding would be rather cringe, like spending way too long on a stupid copy-paste or sign error that my audience would have noticed right away. On the other hand, sometimes, I am really fast because everything is in my head, but then I would probably lose everyone. I am impressed when looking at live coders by how fluid it looks compared to my own work, maybe there is a rubber duck effect at work here.

All this to say that I don't know how working solo compares to a livestream. It is more or less efficient, maybe it doesn't matter that much when you get used to it.

qwery · 2026-01-09T18:53:19 1767984799

Have done it, never enough of an audience to be totally humiliated. It's never going to be more efficient.

But as for your cringe issue that the audience noticed, one could see that to be a benefit -- prefer to have someone say e.g. "you typed `Normalise` (with an 's') again, C++ is written in U.S. English, don't you know / learn to spell, you slime" upfront than waiting for compiler to tell you that `Normalise` doesn't exist, maybe?

QuercusMax · 2026-01-09T18:07:42 1767982062

I suspect livestream coding, like programming competition coding and whiteboard coding for interviews, is a separate skill that's fairly well correlated with being able to solve useful problems, but it is not the same thing. You can be an excellent problem solver without being good at doing so while being watched, under time pressure.

Kerrick · 2026-01-09T18:12:46 1767982366

I feel like I've been incredibly productive with AI assisted programming over the past few weeks, but it's hard to know what folks' baselines are. So in the interest of transparency, I pushed it all up to sourcehut and added Co-Authored-By footers to the AI-assisted commits (almost all of them).

Everything is out there to inspect, including the facts that I:

- was going 12-18 hours per day

- stayed up way too late some nights

- churned a lot (+91,034 -39,257 lines)

- made a lot of code (30,637 code lines, 11,072 comment lines, plus 4,997 lines of markdown)

- ended up with (IMO) pretty good quality Ruby (and unknown quality Rust).

This is all just from the first commit to v0.8.0. https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0

What do you think: is this fast, or am I just as silly as the live-streamer?

P.S. - I had an edge here because it was a green-field project and it was not for my job, so I had complete latitude to make decisions.

qwery · 2026-01-09T18:27:05 1767983225

I don't really know Ruby, so maybe I'm missing something major, but your commit messages seem extremely verbose yet messy (I can't make heads or tails of them) and I'm seeing language like "deprecated" and a stream of "releases" within a period of hours and it just looks a bit like nonsense.

Don't take "nonsense" negatively, please -- I mean it looks like you were having fun, which is certainly to be encouraged.

Kerrick · 2026-01-09T18:58:56 1767985136

The commit messages with a Co-Authored-By footer were all generated. I recommend clicking the "tree" link to see the actual code. Specifically:

- README.md explains the basics https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/REA...

- CHANGELOG.md is better than the commit messages, and filtered to only what app devs using the library likely care about: https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/CHA...

- doc/ holds the Markdown documentation, which I heavily reviewed. https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/doc

- lib/ holds the Ruby source code of the library, which I heavily designed and reviewed. https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/lib

- examples/ holds the Ruby source code of some toy apps built with the library. https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/exa...

- bin/ holds a few Ruby scripts & apps to automate some ops (check out announce) https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/bin

- tasks/ holds some more Ruby scripts & apps to automate some ops (most I did not read, but I heavily designed and reviewed bump and terminal_preview) https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/tas...

- ext/ holds the Rust source code of the library, which I did not read most of. https://git.sr.ht/~kerrick/ratatui_ruby/tree/v0.8.0/item/ext

I was having a lot of fun, and part of the reason I took deprecations and releases seriously was because I hoped to encourage adoption. And that I did: https://todo.sr.ht/~kerrick/ratatui_ruby/4 and https://github.com/sidekiq/sidekiq/blob/main/bin/tui

ruszki · 2026-01-09T12:03:52 1767960232

There were such people also here.

Copy-pasting the code would have been faster than their work, and there were several problems with their results. But they were so convinced that their work is quick and flawless, that they post a video recording of it.

jennyholzer4 · 2026-01-09T12:42:20 1767962540

Hackernews is dominated by these people

LLM marketers have succeeded at inducing collective delusion

judahmeek · 2026-01-09T13:37:53 1767965873

> LLM marketers have succeeded at inducing collective delusion

That's the real trick & one I desperately wish I knew how to copy.

I know there's a connection to Dunning Kruger & I know that there's a dopamine effect of having a responsive artificial minion & there seems to be some of that "secret knowledge" sauce that makes cults & conspiracies so popular (there's also the promise of less effort for the same or greater productivity).

Add the list grows, I see the popularity, but I doubt I could easily apply all these qualities to anything else.

jennyholzer4 · 2026-01-09T13:52:41 1767966761

IMO algorithmically generated "social" media feeds combined with the lack of adequate mass-media alternatives have supercharged cult recruitment in the last approximately 10 years.

Stupid people in my life have been continually and recklessly joining harebrained cults for the last 5 years.

Really I think it's probably much, much easier to start a cult these days than it has ever been. Good news for tech company founders I guess, bad news for American culture, American society, and the American people.

codyb · 2026-01-09T14:25:10 1767968710

One way to help stop it is to get off social media and stop giving these tech billionaires so much money.

The less people on social media, the less real the network effect is, the less people who join in the first place, the less money the billionaires have to throw hundreds of millions into politics, the less inadvertent cult members.

I've gotten to the point where I just leave my phone at home at this point, and it has been incredibly nice. Before that I deleted most apps that I found to be time wastes, deleted all social media (HN and two small discords are my exception).

It's very nice, I'm less stressed, I feel more in the moment, I respond to my friends when I check my phone every few hours on the speaker in the other room.

I encourage others to try it, add it to your dry January.

And ya know what I ain't doing a lick of? Sending money and reams of data to these billionaires I think are really lame individuals with corrupted moral compasses.

Now it ain't perfect, I'm sure Google's still getting reams of info about me from my old Gmail account that I still use sometimes, and Apple too from a few sources. But... getting closer!

So many folk sit here and recognize the same problems I do, the way it warps your attention, the addictiveness of the handheld devices, the social media echo chambers, the rising influence of misinformation, the lack of clarity between real and fake...

Seems like there's a solution in front of us :-)

dpark · 2026-01-09T16:58:30 1767977910

> So I answered him in his LinkedIn thread and asked where the 10x speed up was. What followed was complete denial. It had just been a hick up. Or he could have done other things in parallel while waiting 30 seconds for the AI to answer. Etc etc.

So I’ve been playing with LLMs for coding recently, and my experience is that for some things, they are drastically faster. And for some other things, they will just never solve the problem.

Yesterday I had an LLM code up a new feature with comprehensive tests. It wasn’t an extremely complicated feature. It would’ve taken me a day with coding and testing. The LLM did the job in maybe 10 minutes. And then I spent another 45 minutes or so deeply reviewing it, getting it to tweak a few things, update some test comments, etc. So about an hour total. Not quite a 10x speed up, but very significant.

But then I had to integrate this change into another repository to ensure it worked for the real world use case and that ended up being a mess, mostly because I am not an expert in the package management and I was trying to subvert it to use an unpublished package. Debugging this took the better part of the day. For this case, the LLM may be saved me maybe 20% because it did have a couple of tricks that I didn’t know about. But it was certainly not a massive speed up.

So far, I am skeptical that LLM’s will make someone 10x as efficient overall. But that’s largely because not everything is actually coding. Subverting the package management system to do what I want isn’t really coding. Participating in design meetings and writing specs and sending emails and dealing with red tape and approvals is definitely not coding.

But for the actual coding specifically, I wouldn’t be surprised if lots of people are seeing close to 10x for a bunch of their work.

cmiles74 · 2026-01-09T13:26:21 1767965181

I suspect there's also a good amount of astroturfing happening here as well, making it harder to find the real success stories.

jlarocco · 2026-01-09T15:50:38 1767973838

I've noticed a similar trend. There seems to be a lot of babysitting and hand holding involved with vibe-coding. Maybe it can be a game changer for "non-technical founders" stumbling their way through to a product, but if you're capable of writing the code yourself, vibe coding seems like a lot of wasted energy.

lossyalgo · 2026-01-09T20:21:31 1767990091

Shopify's CEO just posted the other day that he's super productive using the newest AI models and many of the supportive comments responding to his claim were from CEOs of AI startups.

alex1138 · 2026-01-09T11:49:50 1767959390

You're supposed to believe in his burgeoning synergy so that one day you may collaborate to push industry leading solutions

Bombthecat · 2026-01-09T14:47:03 1767970023

Even if this would take two, three hours and a vibe coder, still cheaper then a real developer

boringg · 2026-01-09T14:30:31 1767969031

Theres too much money, time and infrastructure committed for this to be anything but successful.

Its tougher than a space race or the nuclear bomb race because there are fewer hard tangibles as evidence of success.

seidleroni · 2026-01-09T14:36:31 1767969391

I think there is also some FOMO involved. Once people started saying how AI was helping them be more productive, a lot of folks felt that if they didn't do the same, they were lagging behind.

chankstein38 · 2026-01-09T15:28:55 1767972535

Sounds like someone trying to sell a course or something.

cozzyd · 2026-01-09T20:24:35 1767990275

10 times zero is still zero!

broochcoach · 2026-01-09T23:39:49 1768001989

Maybe he would have otherwise struggled for 10 hours on that extension.

AstroBen · 2026-01-08T22:57:33 1767913053

It's an impossible thing to disprove. Anything you say can be countered by their "secret workflow" they've figured out. If you're not seeing a huge speedup well you're just using it wrong!

The burden of proof is 100% on anyone claiming the productivity gains

anonzzzies · 2026-01-09T02:51:15 1767927075

I go to meetups and enjoy myself so much; 80% of people are showing how to install 800000000 MCPs on their 92gb macbook pros, new RAG memory, n8n agent flows, super special prompting techniques, secret sauces, killer .md files, special vscode setups and after that they still are not productive vs just vanilla claude code in a git repos. You get people saying 'look I only have to ask xyz... and it does it! magic' ; then you just type in vanilla CC 'do xyz' and it does exactly the same thing, often faster.

mikestorrent · 2026-01-09T06:21:18 1767939678

This was always the case. People obsessing over keyboards, window managers, emacs setups... always optimizing around the edges of the problem, but this is all taking an incredible amount of their time versus working on real problems.

sheepscreek · 2026-01-09T06:57:17 1767941837

Yes, the thing they realize much later in life is that perhaps they enjoyed the act of gardening (curating your tools, workflows, etc) much more than farming (being downright focused and productive on the task at hand).

Sadly gardening doesn’t pay the bills!

hdjrudni · 2026-01-09T07:18:07 1767943087

If I only spend $1000 on hydroponics and 3 weeks tending to a vertical garden I can grow a $1 head of lettuce FOR FREE!

ben_w · 2026-01-09T15:25:38 1767972338

I tried growing lettuce in some cut up plastic bottles at university in soil from the nearby footpath, I think even with the cheap student approach I spent more on the (single pack of) seeds than a fully grown lettuce costs, and only managed about four individual leaves that were only about 1cm by 5cm.

DANmode · 2026-01-09T08:52:44 1767948764

What if I haven’t spent anything,

and I’m making money with lettuce I grew in the woods?

(or, in Anthropic/sama’s backyards)

anonzzzies · 2026-01-09T07:53:45 1767945225

yep, and I have the same thing, but then I am not going to tell everyone it is super productive for the actual task of farming. I say that I have a hobby farming (which I do) and talk about my tools and my meager yields (which won't make any money if sold). I am not going to say that my workflows are so productive while my neighbour who is a professional farmer just has old crap and just starts and works from 5 am to 9 pm making a living of his land.

fc417fc802 · 2026-01-09T14:24:16 1767968656

I like farming but a lot of the tools are annoying to use so I find myself tinkering with them (gardening in your analogy I guess). It's not that I prefer tinkering in the shop to farming. More that I just have very little patience for tools that don't conform to the ways in which I think about the world.

songodongo · 2026-01-09T12:45:48 1767962748

Same thing happens in music production. If only I had this guitar, or that synth, or these plugins…

multjoy · 2026-01-09T13:27:19 1767965239

Gear Acquisition Syndrome is a very different problem. Even if you haven't cured the issue the new synth was meant to fix, at least you have a new synth.

sehansen · 2026-01-09T15:05:45 1767971145

It's the four hobbies all over again: https://brooker.co.za/blog/2023/04/20/hobbies.html

whoiskevin · 2026-01-09T14:08:10 1767967690

A better keyboard is a hill I will die on.

mikestorrent · 2026-01-09T19:30:58 1767987058

I have a fantastic keyboard, but I'm not taking pictures of it, changing the keycaps, posting about it. It's a tool, not a fetish; that's how I differentiate these things.

butlike · 2026-01-09T14:46:03 1767969963

It's a keyboard attached to an article of clothing you put your head into so the keys drape over your shoulders. You then type, but also end up giving yourself a shoulder massage!

atakan_gurkan · 2026-01-09T08:04:37 1767945877

Yes, this happens quite often. So often that I wonder if it is among the symptoms of some psychiatric or neurological disorder.

Bridged7756 · 2026-01-09T14:30:25 1767969025

It's just boredom probably. Obsessing over productivity tools is relatively more productive than say, something completely unrelated to your work.

abakker · 2026-01-09T05:15:02 1767935702

That perfectly ties with my experience. Just direct prompts, with limited setup and limited context seem to work better or just as well as complex custom GPTs. There are not just diminishing, but inverting returns to complexity in GPTs

serf · 2026-01-09T06:10:35 1767939035

limited prompts work well for limited programs, or already well defined and cemented source bases.

once scope creeps up you need the guardrails of a carefully crafted prompt (and pre-prompts, tool hooks, AGENTS files, the whole gambit) -- otherwise it turns into cat wrangling rapidly.

anonzzzies · 2026-01-09T06:22:35 1767939755

Not in our (30+ year old software company) experience and we have large code bases with a lot of scope creep ; more than ever as we can deliver a lot more for a lot less (lot more revenue / profit too).

PunchyHamster · 2026-01-09T13:34:29 1767965669

No, no, you misunderstand, that's still massive productivity improvement compared to them being on their own with their own incompetence and refusal to learn how to code properly

paodealho · 2026-01-08T23:20:17 1767914417

This gets comical when there are people, on this site of all places, telling you that using curse words or "screaming" with ALL CAPS on your agents.md file makes the bot follow orders with greater precision. And these people have "engineer" on their resumes...

electroglyph · 2026-01-08T23:51:22 1767916282

there's actually quite a bit of research in this field, here's a couple:

"ExpertPrompting: Instructing Large Language Models to be Distinguished Experts"

https://arxiv.org/abs/2305.14688

"Persona is a Double-edged Sword: Mitigating the Negative Impact of Role-playing Prompts in Zero-shot Reasoning Tasks"

https://arxiv.org/abs/2408.08631

AdieuToLogic · 2026-01-09T01:23:50 1767921830

Those papers are really interesting, thanks for sharing them!

Do you happen to know of any research papers which explore constraint programming techniques wrt LLMs prompts?

For example:

  Create a chicken noodle soup recipe.

  The recipe must satisfy all of the following:

    - must not use more than 10 ingredients
    - must take less than 30 minutes to prepare
    - ...

cess11 · 2026-01-09T09:17:37 1767950257

I suspect LLM-like technologies will only rarely back out of contradictory or otherwise unsatisfiable constraints, so it might require intermediate steps where LLM:s formalise the problem in some SAT, SMT or Prolog tool and report back about it.

aix1 · 2026-01-09T08:47:29 1767948449

This is an area I'm very interested in. Do you have a particular application in mind? (I'm guessing the recipe example is just illustrate the general principle.)

AdieuToLogic · 2026-01-10T03:42:27 1768016547

> This is an area I'm very interested in. Do you have a particular application in mind? (I'm guessing the recipe example is just illustrate the general principle.)

You are right in identifying the recipe example as being illustrative and intentionally simple. A more realistic example of using constraint programming techniques with LLMs is:

  # Role
  You are an expert Unix shell programmer who comments their code and organizes their code using shell programming best practices.

  # Task
  Create a bash shell script which reads from standard input text in Markdown format and prints all embedded hyperlink URL's.

  The script requirements are:

    - MUST exclude all inline code elements
    - MUST exclude all fenced code blocks
    - MUST print all hyperlink URL's
    - MUST NOT print hyperlink label
    - MUST NOT use Perl compatible regular expressions
    - MUST NOT use double quotes within comments
    - MUST NOT use single quotes within comments

In this exploration, the list of "MUST/MUST NOT" constraints were iteratively discovered (4 iterations) and at least the last three are reusable when the task involves generating shell scripts.

Where this approach originates is in attempting to limit LLM token generation variance by minimizing use of English vocabulary and sentence structure expressivity such that document generation has a higher probability of being repeatable. The epiphany I experienced was that by interacting with LLMs as a "black box" whose results can only be influenced, and not anthropomorphizing them, the natural way to do so is to leverage their NLP capabilities to produce restrictions (search tree pruning) for a declarative query (initial search space).

aix1 · 2026-01-10T07:49:11 1768031351

If one goal is to reduce the variance of output, couldn't this be done by controlling the decoding temperature?

Another related technique is constrained decoding, whether the LLM sampler only considers tokens allowed by a certain formal grammar. This could be applicable for your "quotes within comments" requirements.

Both techniques clearly require code or hyperparameter changes to the machinery that drives the LLM. What's missing is the ability to express these, in natural language, directly to the LLM and have it comply.

The angle I was coming from was whether one could use a constraint satisfaction solver, but I don't see how that would help for your example.

Aeolun · 2026-01-09T17:42:30 1767980550

Anything involving numbers, or conditions like ‘less than 30 minutes’ is going to be really hard.

llmslave2 · 2026-01-09T02:14:36 1767924876

I've seen some interesting work going the other way, having LLMs generate constraint solvers (or whatever the term is) in prolog and then feeding input to that. I can't remember the link but could be worthwhile searching for that.

hdra · 2026-01-08T23:49:31 1767916171

I've been trying to stop the coding assistants from making git commits on their own and nothing has been working.

zmmmmm · 2026-01-09T01:24:12 1767921852

hah - i'm the opposite, I want everything done by the AI to be a discrete, clear commit so there is no human/AI entanglement. If you want to squash it later that's fine but you should have a record of what the AI did. This is Aider's default mode and it's one reason I keep using it.

vitaflo · 2026-01-09T12:41:39 1767962499

It’s the first thing I turn off in Aider.

algorias · 2026-01-09T00:48:26 1767919706

run them in a VM that doesn't have git installed. Sandboxing these things is a good idea anyways.

godelski · 2026-01-09T01:34:42 1767922482

  > Sandboxing these things is a good idea anyways.

Honestly, one thing I don't understand is why agents aren't organized with unique user or group permissions. Like if we're going to be lazy and not make a container for them then why the fuck are we not doing basic security things like permission handling.

Like we want to act like these programs are identical to a person on a system but at the same time we're not treating them like we would another person on the system? Give me a fucking claude user and/or group. If I want to remove `git` or `rm` from that user, great! Also makes giving directory access a lot easier. Don't have to just trust that the program isn't going to go fuck with some other directory

inopinatus · 2026-01-09T07:05:55 1767942355

The agents are being prompted to vibe-code themselves by a post-Docker generation raised on node and systemd. So of course they emit an ad-hoc, informally-specified, bug-ridden, slow reimplementation of things the OS was already capable of.

apetresc · 2026-01-09T02:09:51 1767924591

What's stopping you from `su claude`?

godelski · 2026-01-09T02:58:24 1767927504

I think there's some misunderstanding...

What's literally stopping me is

  su: user claude does not exist or the user entry does not contain all the required fields

Clearly you're not asking that...

But if your question is more "what's stopping you from creating a user named claude, installing claude to that user account, and writing a program so that user godelski can message user claude and watch all of user claude's actions, and all that jazz" then... well... technically nothing.

But if that's your question, then I don't understand what you thought my comment said.

apetresc · 2026-01-14T02:49:38 1768358978

Yeah, that is what I meant. I mean, it's kind of the system administrator's/user's responsibility to run processes in whatever user context they want. I don't wonder why, like, nginx doesn't forcefully switch itself to an nginx user. Obviously if I want nginx to run in some non-privileged context (which I do), then I (or my distro, or my container runtime, or whatever) am responsible for running nginx that way.

Similarly, it's not really claude-code's job to "come with" a claude user. If you want claude code to run as a low-privilege user, then you can already run it as a low-privilege user. The OS has been providing that facility for decades.

immibis · 2026-01-09T13:11:16 1767964276

Probably because Linux doesn't really have a good model for ad-hoc permission restrictions. It has enough bits to make a Docker container out of, but that's a full new system. You can't really restrict a subprocess to only write files under this directory.

newsoftheday · 2026-01-09T17:54:10 1767981250

For plain Linux, chmod, chmod's sticky bit and setfacl provide extensive ad hoc permissions restricting. Your comment is 4 hours old, I'm surprised I'm the first person to help correct its inaccuracy.

immibis · 2026-01-10T03:13:19 1768014799

How can those be used to restrict a certain subprocess to only write in a certain directory?

godelski · 2026-01-10T11:43:41 1768045421

chown

immibis · 2026-01-10T23:59:37 1768089577

godelski · 2026-01-11T01:36:14 1768095374

chgrp claude someDirectory

immibis · 2026-01-11T15:56:15 1768146975

This doesn't meet the requirement. It doesn't restrict a certain subprocess to only write in a certain directory. You are just saying these things to quickly shut down the uncomfortable thought that Linux can't do something.

godelski · 2026-01-11T19:58:59 1768161539

Or perhaps you need to go read my original comment again as you missed the premise. But if you feel you have perfect memory then perhaps look at something like firejail or read more about systemd.

But your premise of Linux "can't" do something is rather absurd. It's Linux, you can do anything, even if no one has done that thing before.

The reason people didn't respond earlier is because they probably assumed it a waste of their time. I know I have wasted mine

immibis · 2026-01-12T03:05:59 1768187159

You chose to respond to a question I posed, with an extremely poor answer. I was very specific about restricting a certain subprocess to only write to a certain directory. Your answer does not do that. I pointed that out. Now you are defending that answer by claiming you were actually answering something else entirely. This is nonsensical.

zmmmmm · 2026-01-09T01:25:30 1767921930

but then they can't open your browser to administer your account.

What kind of agentic developer are you?

Aurornis · 2026-01-09T16:51:20 1767977480

Which coding assistant are you using?

I'm a mild user at best, but I've never once seen the various tools I've used try to make a git commit on their own. I'm curious which tool you're using that's doing that.

jason_oster · 2026-01-10T00:03:14 1768003394

Same here. Using Codex with GPT-5.2 and it has not once tried to make any git commits. I've only used it about 100 times over the last few months, though.

manwds · 2026-01-09T01:10:42 1767921042

Why not use something like Amp Code which doesn't do that, people seem to rage at CC or similar tools but Amp Code doesn't go making random commits or dropping databases.

hdra · 2026-01-09T03:38:23 1767929903

just because i havent gotten to try it out really.

but what is it about Amp Code that makes it immune from doing that? from what i can tell, its another cli tool-calling client to an LLM? so fwict, i'd expect it to be subject to the indeterministic nature of LLM calling the tool i dont want it to call just like any others, no?

AstroBen · 2026-01-09T00:32:17 1767918737

Are you using aider? There's a setting to turn that off

dust-jacket · 2026-01-09T15:03:26 1767971006

require commits to be signed.

SoftTalker · 2026-01-09T00:17:15 1767917835

Don't give them a credential/permission that allows it?

AlexandrB · 2026-01-09T00:28:33 1767918513

Making a git commit typically doesn't require any special permissions or credentials since it's all local to the machine. You could do something like running the agent as a different used and carefully setting ownership on the .git directory vs. the source code but this is not very straightforward to set up I suspect.

SoftTalker · 2026-01-09T04:54:12 1767934452

IMO it should be well within the capabilities of anyone who calls himself an engineer.

godelski · 2026-01-09T01:45:11 1767923111

Typically agents are not operating as a distinct user. So they have the same permissions, and thus credentials, as the user operating them.

Don't get me wrong, I find this framework idiotic and personally I find it crazy that it is done this way, but I didn't write Claude Code/Antigravity/Copilot/etc

neal_jones · 2026-01-08T23:44:37 1767915877

Wasn’t cursor or someone using one of these horrifying type prompts? Something about having to do a good job or they won’t be paid and then they won’t be able to afford their mother’s cancer treatment and then she’ll die?

godelski · 2026-01-09T01:11:26 1767921086

How is this not any different than the Apple "you're holding it wrong" argument. I mean the critical reason for that kind of response being so out of touch is that the same people praise Apple for its intuitive nature. How can any reasonable and rational person (especially an engineer!) not see that these two beliefs are in direct opposition?

If "you're holding it wrong" then the tool is not universally intuitive. Sure, there'll always be some idiot trying to use a lightbulb to screw in a nail, but if your nail has threads on it and a notch on the head then it's not the user's fault for picking up a screwdriver rather than a hammer.

  > And these people have "engineer" on their resumes..

What scares me about ML is that many of these people have "research scientist" in their titles. As a researcher myself I'm constantly stunned at people not understanding something so basic like who has the burden of proof. Fuck off. You're the one saying we made a brain by putting lightning into a rock and shoving tons of data into it. There's so much about that that I'm wildly impressed by. But to call it a brain in the same way you say a human brain is, requires significant evidence. Extraordinary claims require extraordinary evidence. There's some incredible evidence but an incredible lack of scrutiny that that isn't evidence for something else.

citizenpaul · 2026-01-09T00:31:32 1767918692

>makes the bot follow orders with greater precision.

Gemini will ignore any directions to never reference or use youtube videos, no matter how many ways you tell it not to. It may remove it if you ask though.

rabf · 2026-01-09T02:32:59 1767925979

Positive reinforcement works better that negative reinforcement. If you the read prompt guidance from the companies themselves in their developer documentation it often makes this point. It is more effective to tell them what to do rather than what not to do.

sally_glance · 2026-01-09T08:44:23 1767948263

This matches my experience. You mostly want to not even mention negative things because if you write something like "don't duplicate existing functionality" you now have "duplicate" in the context...

What works for me is having a second agent or session to review the changes with the reversed constraint, i.e. "check if any of these changes duplicate existing functionality". Not ideal because now everything needs multiple steps or subagents, but I have a hunch that this is one of the deeper technical limitations of current LLM architecture.

citizenpaul · 2026-01-09T22:07:54 1767996474

Probably not related but it reminds me of a book I read where wizards had Additive and Subtractive magic but not always both. The author clearly eventually gave up on trying to come up with creative ways to always add something for solutions after the gimmick wore off and it never comes up again in the book.

Perhaps there is a lesson here.

nomel · 2026-01-09T02:52:05 1767927125

Could you describe what this looks like in practice? Say I don't want it to use a certain concept or function. What would "positive reinforcement" look like to exclude something?

oxguy3 · 2026-01-09T03:23:03 1767928983

Instead of saying "don't use libxyz", say "use only native functions". Instead of "don't use recursion", say "only use loops for iteration".

nomel · 2026-01-09T06:00:43 1767938443

This doesn't really answer my question, which more about specific exclusions.

Both of the answers show the same problem: if you limit your prompts to positive reinforcement, you're only allowed to "include" regions of a "solution space", which can only constrain the LLM to those small regions. With negative reinforcement, you just cut out a bit of the solution space, leaving the rest available. If you don't already know the exact answer, then leaving the LLM free to use solutions that you may not even be aware of seems like it would always be better.

Specifically:

"use only native functions" for "don't use libxyz" isn't really different than "rewrite libxyz since you aren't allowed to use any alternative library". I think this may be a bad example since it massively constrains the llm, preventing it from using alternative library that you're not aware of.

"only use loops for iteration" for "done use recursion" is reasonable, but I think this falls into the category of "you already know the answer". For example, say you just wanted to avoid a single function for whatever reason (maybe it has a known bug or something), the only way to this "positively" would be to already know the function to use, "use function x"!

Maybe I misunderstand.

bdangubic · 2026-01-09T03:26:53 1767929213

I 100% stopped telling them what not to do. I think even if “AGI” is reached telling them “don’t” won’t work

nomel · 2026-01-09T06:04:42 1767938682

I have the most success when I provide good context, as in what I'm trying to achieve, in the most high level way possible, then guide things from there. In other words, avoid XY problems [1].

[1] https://xyproblem.info

DANmode · 2026-01-09T08:54:51 1767948891

Yes, using tactics like front-loading important directives,

and emphasizing extra important concepts,

things that should be double or even triple checked for correctness because of the expected intricacy,

make sense for human engineers as well as “AI” agents.

CjHuber · 2026-01-09T00:03:54 1767917034

I‘d say such hacks don‘t make you an engineer but they are definitely part of engineering anything that has to do with LLMs. With too long systemprompts/agents.md not working well it definitely makes sense to optimize the existing prompt with minimal additions. And if swearwords, screaming, shaming or tipping works, well that‘s the most token efficient optimization of an brief well written prompt.

Also of course current agents already have to possibility to run endlessly if they are well instructed, steering them to avoid reward hacking in the long term definitely IS engineering.

Or how about telling them they are working in an orphanage in Yemen and it‘s struggling for money, but luckily they‘ve got a MIT degree and now they are programming to raise money. But their supervisor is a psychopath who doesn’t like their effort and wants children to die, so work has to be done as diligently as possible and each step has to be viewed through the lens that their supervisor might find something to forbid programming.

Look as absurd as it sounds a variant of that scenario works extremely well for me. Just because it’s plain language it doesn’t mean it can’t be engineering, at least I‘m of the opinion that it definitely is if has an impact on what’s possible use cases

Applejinx · 2026-01-09T12:43:29 1767962609

Works on human subordinates too, kinda, if you don't mind the externalities…

AstroBen · 2026-01-08T23:27:06 1767914826

> cat AGENTS.md

WRITE AMAZING INCREDIBLE VERY GOOD CODE OR ILL EAT YOUR DAD

..yeah I've heard the "threaten it and it'll write better code" one too

CjHuber · 2026-01-09T00:07:57 1767917277

I know you‘re joking but to contribute something constructive here, most models now have guardrails against being threatened. So if you threaten them it would be with something out of your control like „… or the already depressed code reviewing staff might kill himself and his wife. We did everything in our control to take care of him, but do the best on your part to avoid the worst case“

nemomarx · 2026-01-09T01:42:05 1767922925

how do those guard rails work? does the system notice you doing it and not put that in the context or do they just have something in the system prompt

CjHuber · 2026-01-09T03:23:26 1767929006

I suppose it‘s the latter + maybe some finetuning, it’s definitely not like DeepSeek where the answer of the model get‘s replaced when you are talking something uncomfortable for China

soulofmischief · 2026-01-09T00:21:51 1767918111

Except that is demonstrably true.

Two things can be true at the same time: I get value and a measurable performance boost from LLMs, and their output can be so stupid/stubborn sometimes that I want to throw my computer out the window.

I don't see what is new, programming has always been like this for me.

llmslave2 · 2026-01-08T23:22:07 1767914527

"don't make mistakes" LMAO

dude250711 · 2026-01-08T23:05:57 1767913557

Ah, the "then you are doing it wrong" defence.

Also, you have to learn it right now, because otherwise it will be too late and you will be outdated, even though it is improving very fast allegedly.

marcosdumay · 2026-01-08T23:48:09 1767916089

TBF, there are lots of tools that work great but most people just can't use.

I personally can't use agentic coding, and I'm reasonably convinced the problem is not with me. But it's not something you can completely dismiss.

bodge5000 · 2026-01-09T01:00:39 1767920439

> Also, you have to learn it right now, because otherwise it will be too late and you will be outdated, even though it is improving very fast allegedly.

This in general is a really weird behaviour that I come across a lot, I can't really explain it. For example, I use Python quite a lot and really like it. There are plenty of people who don't like Python, and I might disagree with them, but I'm not gonna push them to use it ("or else..."), because why would I care? Meanwhile, I'm often told I MUST start using AI ("or else..."), manual programming is dead, etc... Often by people who aren't exactly saying it kindly, which kind of throws out the "I'm just saying it out of concern for you" argument.

andrekandre · 2026-01-09T02:22:08 1767925328

  > I MUST start using AI ("or else...")

fear of missing out, and maybe also a bit of religious-esque fever...

tech is weird, we have so many hype-cycles, big-data, web3, nfts, blockchain (i once had an acquaintance who quit his job to study blockchain cause soon "everything will be built on it"), and now "ai"... all have usefulness there but it gets blown out of proportion imo

bonesss · 2026-01-09T08:27:49 1767947269

Nerd circles are in no way immune to fashion, and often contain a strong orthodoxy (IMO driven by cognitive dissonance caused by the humbling complexity of the world).

Cargo cults, where people reflexively shout slogans and truisms, even when misapplied. Lots of people who’ve heard a pithy framing waiting for any excuse to hammer it into a conversation for self glorification. Not critical humble thinkers, per se.

Hype and trends appeal to young insecure men, it gives them a way to create identity and a sense of belonging. MS and Oracle and the rest are happy to feed into it (cert mills, default examples that assume huge running subscriptions), even as they get eaten up by it on occasion.

duskdozer · 2026-01-09T13:13:56 1767964436

Yeah. It sounds like those pitches letting you in on the secret trick to tons of passive income.

jimbo808 · 2026-01-08T23:18:46 1767914326

That one's my favorite. You can't defend against it, it just shuts down the conversation. Odds are, you aren't doing it wrong. These people are usually suffering from Dunning Kruger at best, or they're paid shills/bots at worst.

neal_jones · 2026-01-08T23:51:38 1767916298

Best part of being dumb is thinking you’re smart. Best part of being smart is knowing you’re smart. Just don’t be in the iq range where you know you’re dumb.

tonyedgecombe · 2026-01-09T11:24:28 1767957868

The smartest people I know are full of doubt.

llmslave2 · 2026-01-08T23:15:23 1767914123

People say it takes at least 6 months to learn how to use LLM's effectively, while at the same time the field is rapidly changing so fast, while at the same time Agents were useless until Opus 4.5.

Which is it lol.

wakawaka28 · 2026-01-09T00:34:11 1767918851

I used it with practically zero preparation. If you've got a clue then it's fairly obvious what you need to do. You could focus on meta stuff like finding out what it is good or bad at, but that can be done along the way.

Terr_ · 2026-01-08T23:50:09 1767916209

If you had negative results using anything more than 3 days old, then it's your fault, your results mean nothing because they've improved since then. /s

mapontosevenths · 2026-01-09T14:23:04 1767968584

There's no secret IMO. It's actually really simple to get good results. You just expect the same things from the LLM you would from a Junior. Use an MD file to force it to:

1) Include good comments in whatever style you prefer, document everything it's doing as it goes and keep the docs up to date, and include configurable logging.

2) Make it write and actually execute unit tests for everything before it's allowed to commit anything, again through the md file.

3) Ensure it learns from it's mistakes: Anytime it screws up tell it to add a rule to it's own MD file reminding it not to ever repeat that mistake again. Over time the MD file gets large, but the error rate plummets.

4) This is where it drifts from being treated as a standard Junior. YOU must manually verify that the unit tests are testing for the right thing. I usually add a rule to the MD file telling it not to touch them after I'm happy with them, but even then you must also now check that the agent didn't change them the first time it hit a bug. Modern LLM's are now worse at this for some reason. Probably because they're getting smart enough to cheat.

If you these basic things you'll get good results almost every time.

ben_w · 2026-01-09T15:28:29 1767972509

> This is where it drifts from being treated as a standard Junior. YOU must manually verify that the unit tests are testing for the right thing.

You had better juniors than me. What unit tests? :P

butlike · 2026-01-09T14:42:44 1767969764

The MD file is a spec sheet, so now you're expecting every warm body to be a Sr. Engineer, but where do you start as a Junior warm body? Reviewing code, writing specs, reviewing implementation details...that's all Sr. level stuff

Wowfunhappy · 2026-01-08T23:18:42 1767914322

It's impossible to prove in either direction. AI benchmarks suck.

Personally, I like using Claude (for the things I'm able to make it do, and not for the things I can't), and I don't really care whether anyone else does.

AstroBen · 2026-01-08T23:28:57 1767914937

I'd just like to see a live coding session from one of these 10x AI devs

Like genuinely. I want to get stuff done 10x as fast too

lordnacho · 2026-01-08T23:52:49 1767916369

But the benefit might not be speed, it might be economy of attention.

I can code with Claude when my mind isn't fresh. That adds several hours of time I can schedule, where previously I had to do fiddly things when I was fresh.

What I can attest is that I used to have a backlog of things I wanted to fix, but hadn't gotten around to. That's now gone, and it vanished a lot faster than the half a year I had thought it would take.

llmslave2 · 2026-01-08T23:56:43 1767916603

Doesn't that mean you're less likely to catch bugs and other issues that the AI spits out?

lordnacho · 2026-01-09T11:11:40 1767957100

No, you are spending less time on fixing little things, so you have more time on things like making sure all the potential errors are checked.

duskdozer · 2026-01-09T13:05:44 1767963944

Not a problem! Just ask the AI to verify its output and make test cases!

gregoryl · 2026-01-09T03:29:49 1767929389

nah, you rely on your coworkers to review your slop!

mpyne · 2026-01-09T00:38:50 1767919130

Code you never ship doesn't have bugs by definition, but never shipping is usually a worse state to be in.

ponector · 2026-01-09T01:51:16 1767923476

I'm sure people from Knight Capital don't think so.

mpyne · 2026-01-09T03:07:17 1767928037

Even there, they made a lot of money before they went bust. Like if you want an example you'd be better of picking Therac-25, as ancient an example as it is.

Kerrick · 2026-01-09T18:20:00 1767982800

My wife used to be a professional streamer so I know how distracting it can be to try and entertain an audience. So when I attempted to become one of these 10x AI devs over my Christmas vacation I did not live stream. But I did make a bunch of atomic commits and push them up to soucrcehut. Perhaps you'll find that helpful?

Just Christmas Vacation (12-18h days): https://git.sr.ht/~kerrick/ratatui_ruby/log/v0.8.0

Lastest (slowed down by job & real life): https://git.sr.ht/~kerrick/ratatui_ruby/log/trunk and https://git.sr.ht/~kerrick/ratatui_ruby-wiki/log/wiki and https://git.sr.ht/~kerrick/ratatui_ruby-tea/log/trunk

godelski · 2026-01-09T02:14:13 1767924853

  > I'd just like to see a live coding session from one of these 10x AI devs

I'd also like to see how it compares to their coding without AI.

I mean I really need to understand what the "x" is in 10x. If their x is <0.1 then who gives a shit. But if their x is >2 then holy fuck I want to know.

Who doesn't want to be faster? But it's not like x is the same for everybody.

zo1 · 2026-01-09T12:08:39 1767960519

I'm a "backend" dev, so you could say that I am very very unfamiliar, have mostly-basic and high-level knowledge of frontend development. Getting this thing to spit out screens and components and adjust them as I see fit has got to be some sort of super-power and definitely 20x'd my frontend development for hobby projects. Previous to this, my team was giving me wild "1 week" estimates to code simple CRUD screens (plus 1 week for "api integration") and those estimates always smelled funny to me.

Now that I've seen what the AI/agents can do, those estimates definitely reek, and the frontend "senior" javascript dev's days are numbered. Especially for CRUD screens, which lets face it, make up most screens these days and should absolutely be churned out like in an assembly line instead of being delicate "hand crafted" precious works of art that allows 0.1x devs to waste our time because they are the only ones who supposedly know the ancient and arcane 'npm install, npm etc, npm angular component create" spells.

Look at the recent Tailwind team layoffs, they're definitely seeing the impact of this as are many team-leads and managers in most companies in our industry. Especially "javascript senior dev" heavy shops in the VC space, which many people are realizing they have an over-abundance of because those devs bullshitted entire teams and companies into thinking simple CRUD screens take weeks to develop. It was like a giant cartel, with them all padding and confirming the other "engineer's" estimates and essentially slow-devving their own screens to validate the ridiculous padding.

Bridged7756 · 2026-01-09T15:15:38 1767971738

Your UIs are likely still ass. Pre-made websites/designs were always a thing, in fact, it's (at least to me) common to just copy the design of another place as "inspiration". When you have 0 knowledge of design everything looks the greatest, it's something you kind of have to get a feel for.

Frontend engineers do more than just churning out code. Still have to do proper tests using Cypress/Playwright, deal with performance, a11y/accessibility, component tests, if any, deal with front end observability (more complex than backend, out of virtue of different clients and conditions the code is run on), deal with dependencies (in large places it's all in-house libraries or there's private repos to maintain), deal with CI/CD, etc, I'm probably missing more.

Twcs layoffs were due to AI cannibalizing their business model by reducing traffic to the site.

And what makes you think the backend is safe? As if churning out endpoints and services or whatever gospel by some thought leader would make it harder for an AI to do. The frontend has one core benefit, it's pretty varied, and it's an ever moving field, mostly due to changes in browsers, also due to the "JS culture". Code from 5 years ago is outdated, but Spring code from 5 years ago is still valid.

tjr · 2026-01-09T15:46:14 1767973574

My time spent with Javascript applications has thus far been pretty brief (working on some aircraft cabin interfaces for a while), but a lot of the time ended up being on testing on numerous different types and sizes of devices, and making tiny tweaks to the CSS to account for as many devices as possible.

This has been a while; perhaps the latest frameworks account for all of that better than they used to. But at that time, I could absolutely see budgeting several days to do what seems like a few hours of work, because of all of the testing and revision.

politician · 2026-01-09T18:41:20 1767984080

Other people are dumping on you, but I think you're getting at where the real 20x speedup exists. People who are 'senior' in one type of programming may be 'junior' in other areas -- LLMs can and do bridge those gaps for folks trying to work outside their expertise. This effect is real.

If you're an expert in a field, LLMs might just provide a 2-3x speedup as boilerplate generators.

vitaflo · 2026-01-09T12:52:59 1767963179

One of the more ignorant comments I’ve read on HN.

godelski · 2026-01-09T22:30:09 1767997809

It's difficult for me to make a good evaluation on this comment.

With the AI writing the UI are you still getting the feedback loop so that the UI informs your backend design and your backend design informs the UI design? I think if you don't have that feedback loop then you're becoming worse of a backend designer. A good backend still needs to be front end focused. I mean you don't just optimize the routines that your profiler says, you prioritize routines that are used the most. You design routines that make things easier for people based on how they're using the front end. And so on.

But how I read your comment is that there's no feedback loop here and given my experience with LLMs they're just going to do exactly what you tell it to. Hamfisting a solution. I mean if you need a mockup design or just a shitty version then yeah, that's probably fine. But I also don't see how that is 20x since you could probably just "copy-paste from stack overflow", and I'd only wager a LLM is really giving you up to 2x there. But if you're designing something actual people (customers) are going to use, then it sounds like you're very likely making bad interfaces and slowing down development. But it is really difficult to determine which is happening here.

I mean yeah, there's a lot of dumb coders everywhere and it's not a secret that coding bootcamps focus on front ends but I think you're over generalizing here.

Bridged7756 · 2026-01-09T14:57:31 1767970651

I'm really dubious of such claims. Even if true, I think they're not seeing the whole picture. Sure, I could churn out code 10x as fast, but I still have to review it. I still have to think of the implementation. I still have to think of the test cases and write them. Now, adding the prerequisites for LLMs, I have to word this in a way the AI can understand it, which is extra mental load. I have to review code sometimes multiple times if it gets something wrong, and I have to re-generate, or make corrections, or sometimes end up fixing entire sections it generated, when I decide it just won't get this task right. Overally, while the typing, researching dependency docs (sometimes), time is saved, I still face the same cognitive load as ever, if not more, due to having extra code to review, having to think of prompting, I'm still limited by the same thing at the end of the day: my mental energy. I can write the code myself and it's if anything a bit slower. I still need to know my dependencies, I still need to know my codebase and all its gripes, even if the AI generates code correctly. Overally, the net complexity of my codebase is the same, and I don't buy the crap, also because I've never heard of stories about reducing complexity (refactoring), only about generating code and fixing codebases with testing and comments/docs (bad practice imo, unlikely the shallow docs generated will say anything more than what the code already makes evident). Anyways, I'm not a believer, I only use LLMs for scaffolding, rote tasks.

llmslave2 · 2026-01-09T07:55:36 1767945336

Yeah this is the key point. Part of me wonders if it's just 0.1x devs somehow reaching 1.0x productivity...

bonesss · 2026-01-09T08:18:43 1767946723

Also the terrible code bases and orgs that are out there… the amount of churn bad JavaScript solutions with eight frontend frameworks might necessitate and how tight systems code works are very different.

nosianu · 2026-01-09T11:18:43 1767957523

This has nothing to do with JS! I wish that idea would die.

https://news.ycombinator.com/item?id=18442941

It's not just about them (link, Oracle), there is terrible code all over the place. Games, business software, everything.

It has nothing to do with the language! Anyone who claims that may be part of the problem, since they don't understand the problem and concentrate on superficial things.

Also, what looks terrible may not be so. I once had to work on an in-house JS app (for internal cost reporting and control). It used two GUI frameworks - because they had started switching to another one, but then stopped the transition. Sounds bad, yes? But, I worked on the code of the company I linked above, and that "terrible" JS app was easy mode all the way!

Even if it used two GUI frameworks at once, understanding the code, adding new features, debugging, everything was still very easy and doable with just half a brain active. I never had to ask my predecessor anything either, everything was clear with one look at the code. Because everything was well isolated and modular, among other things. Making changes did not affect other places in unexpected ways (as is common in biology).

I found some enlightenment - what seems to be very bad at first glance may not actually matter nearly as much as deeper things.

Bridged7756 · 2026-01-09T15:24:57 1767972297

Speaking from ignorance or speaking from ego or both? There's only three major players, React, Vue or Angular. Angular is batteries included. The other two have their lib ecosystem and if not you can easily wrap stuff around regular js libs. That's about it. The JS ecosystem sees many newcomers, it's only natural that some of the codebases were written poorly or that the FOTM mentality gets a lot of steam, against proper engineering principles.

Anecdotally the worst code I've ever seen was in a PHP codebase, which to me, would be the predecessor of JavaScript in this regard, bolstering many junior programmers maintaining legacy (or writing Greenfield ) systems due to cheap businesses being cheap. Anyways, thousands long LoC files, with broken indentation and newlines, interspersed JS and CSS here and there. Truly madness, but that's another story. Point is JavaScript is JavaScript, and other fields like systems and backend, mainly backend, act conceited and talk about JS as if it was the devil, when things like C++, Java, aren't necessarily known for having pretty codebases.

lifetimerubyist · 2026-01-09T00:31:54 1767918714

Theo the YouTuber who also runs T3.chat always makes videos about how great coding agents are and he’ll try to do something on stream and it ALWAYS fails massively and he’s always like “well it wasn’t like this when I did it earlier.”

Sure buddy.

llmslave2 · 2026-01-09T01:04:47 1767920687

Theo is the type of programmer where you don't care when he boos you, because you know what makes him cheer.

LinXitoW · 2026-01-09T14:07:54 1767967674

I don't think any serious dev has claimed 10x as a general statement. Obviously, no true scotsman and all that, so even my statement about makers of anecdotal statements is anecdotal.

Even as a slight fan, I'd never claim more than 10-20% all together. I could maybe see 5x for some specific typing heavy usages. Like adding a basic CRUD stuff for a basic entity into an already existing Spring app.

topocite · 2026-01-09T11:35:58 1767958558

Obviously, there has to be huge variability between people based on initial starting conditions.

It is like if someone says they are losing weight eating 2500 calories a day and someone else says that is impossible because they started eating 2500 calories and gained weight.

Neither are making anything up or being untruthful.

What is strange to me is that smart people can't see something this obvious.

tonyedgecombe · 2026-01-09T09:45:22 1767951922

> I want to get stuff done 10x as fast too

I don’t. I mean I like being productive but by doing the right thing rather than churning out ten times as much code.

neal_jones · 2026-01-08T23:48:15 1767916095

I’d really like to see a 10x ai dev vs a 10x analog dev

rootnod3 · 2026-01-08T23:53:09 1767916389

And an added "6 months" later to see which delivered result didn't blow up in their face down the road.

mbesto · 2026-01-09T12:05:56 1767960356

> AI benchmarks suck.

Not only do they suck, but it's an essentially an impossible task since there is no frame of reference on what "good code" looks like.

zmmmmm · 2026-01-09T00:41:17 1767919277

Many of them are also exercising absurd token limits - like running 10 claudes at once and leaving them running continuously to "brute force" solutions out. It may be possible but it's not really an acceptable workflow for serious development.

nomel · 2026-01-09T02:56:17 1767927377

> but it's not really an acceptable workflow for serious development.

At what cost does do you see this as acceptable? For example, how many hours of saved human development is worth one hour of salary for LLM tokens, funded by the developer? And then, what's acceptable if it's funded by the employer?

zmmmmm · 2026-01-09T03:48:33 1767930513

I guess there are two main concerns I have with it.

One is technical - that I don't believe when you are grinding huge amounts of code out with little to no supervision that you can claim to be executing the appropriate amount of engineering oversight on what it is doing. Just like if a junior dev showed up and entirely re-engineered an application over the weekend and presented it back to me I would probably reject it wholesale. My gut feeling is this is creating huge problems longer term with what is coming out of it.

The other is I'm concerned that a vast amount of the "cost" is externalised currently. Whatever you are paying for tokens quite likely bears no resemblance to the real cost. Either because the provider is subsidising it, or the environment is. I'm not at all against using LLMs to save work at a reasonable scale. But if it comes back to a single person increasing their productivity by grinding stupendous amounts of non-productive LLM output that is thrown away (you don't care if it sits there all day going around in circles if it eventually finds the right solution) - I think there's a moral responsibility to use the resources better.

bdangubic · 2026-01-09T03:08:26 1767928106

we get $1,000/month budget, just about every dev uses it for 5 claude accounts