I use the house analogy a lot these days. A colleague vibe-coded an app and it does what it is supposed to, but the code really is an unmaintainable hodgepodge of files. I compare this to a house that looks functional on the surface but has the toilet in the middle of the living room, an unsafe electrical system, water leaks, etc. I am afraid many will only care that the facade of the house is beautiful, only to realize later that they traded solid foundations for glittery paint.

I've been a loan officer for 20 years.

To extend your analogy: AI is effectively mass-producing 'Subprime Housing'. It has amazing curb appeal (glittering paint), but as a banker, I'd rate this as a 'Toxic Asset' with zero collateral value.

The scary part is that the 'interest rate' on this technical debt is variable. Eventually, it becomes cheaper to declare bankruptcy (rewrite from scratch) than to pay off the renovation costs.


My experience with it is that the code just wouldn't have existed in the first place otherwise. Nobody was going to pay thousands of dollars for it, and it just needs to work and be accurate. It's not backend code you give root access to on the company server; it's automation of the boring aspects of the job with a basic frontend.

I've been able to save people money and time. If someone comes in later with a more elegant solution for the same $60 of effort I spent, great! Otherwise I'll continue saving people money and time with my non-perfect code.


That's a fair point.

In banking terms, you are treating AI code as "OPEX" (Operating Expense) rather than "CAPEX" (Capital Expenditure). As long as we treat these $60 quick-fixes as "depreciating assets" (use it and throw it away), it’s great ROI.

My warning was specifically about the danger of mistaking these quick-fixes for "Long-term Capital Assets." As long as you know it's a disposable tool, not a foundation, we are on the same page.


I remember a very nice quote from an Amazon exec - “there is no compression algorithm for experience”. The LLM may well do the wrong things, and you still won’t know what you don’t know. But then, iterating with LLMs is a different kind of experience; and in the future people will likely do that more than grinding through failures like the missing semicolons Simon describes below. It’s a different paradigm really


Of course there is - if you write good tests, they compress your validation work, and stand in for your experience. Write tests with AI, but validate their quality and coverage yourself.

I think the whole discussion about coding agent reliability is missing the elephant in the room - it is not vibe coding, but vibe testing. That is when you run the code a few times and say LGTM - the best recipe for shooting yourself in the foot, no matter whether the code was hand-written or made with AI. Just put the screws to the agent: let it handle a heavy test harness.
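To make that concrete, here is a minimal sketch of the kind of harness I mean, assuming pytest; `slugify` and its module are hypothetical stand-ins for whatever the agent is building:

    # test_slugify.py - a tiny harness the agent can re-run after every change.
    # `slugify` is a hypothetical function under test; the cases encode your
    # validation work once, so you don't eyeball outputs on every run.
    import pytest
    from myapp.text import slugify  # hypothetical module under test

    @pytest.mark.parametrize("raw, expected", [
        ("Hello World", "hello-world"),
        ("  spaces  everywhere ", "spaces-everywhere"),
        ("repeat---dashes", "repeat-dashes"),
        ("", ""),  # edge case: empty input
    ])
    def test_slugify(raw, expected):
        assert slugify(raw) == expected

The agent runs `pytest -q` in a loop until everything is green; your job is to review the cases themselves for quality and coverage, not every line of the implementation.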


this is a very good point, however the risk of writing bad or insufficient tests is still there if you don’t know what good looks like! The grind will still need to be there, but it will be a different way of gaining experience


Starting to get it!

New skills, not no skills.

There will still be a wide spectrum of people who actually understand the stack - and people who don’t - and no matter how much easier or harder the tools get, those people aren’t going anywhere.


Compression algorithms for experience are of great interest to ML practitioners, and they have some practices that seem to work well: curriculum learning, and feedback from verifiable rewards. Solve problems that escalate in difficulty and sit near the boundary of your capability, and ideally get strong positive or negative feedback on actions sooner rather than later.
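A toy sketch of that loop, using nothing but the standard library - the "learner" and the task family are stand-ins, the point is the structure: tasks sampled just past the current capability boundary, with immediate verifiable feedback:

    import random

    # Toy curriculum: multiplication tasks graded by digit count.
    # The answer is exactly checkable, i.e. a verifiable reward.
    def make_task(difficulty):
        lo, hi = 10 ** (difficulty - 1), 10 ** difficulty - 1
        a, b = random.randint(lo, hi), random.randint(lo, hi)
        return (a, b), a * b

    def attempt(task, skill):
        # Stand-in learner: succeeds reliably at or below its skill level,
        # with decreasing odds above it.
        (a, b), _answer = task
        difficulty = len(str(a))
        return random.random() < 1.0 / (1.0 + max(0, difficulty - skill))

    skill = 1
    for step in range(1000):
        difficulty = skill + random.choice([0, 0, 1])  # hover near the boundary
        if attempt(make_task(difficulty), skill):      # immediate feedback
            if difficulty > skill:
                skill = difficulty                     # promote on boundary success
    print("final skill level:", skill)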


In all frankness, Morsi did not have much time to do anything - Sisi has now been there for 10+ years


I too started with your tutorial - thanks a million


While I don’t agree with you, I keep a healthily skeptical outlook and am trying to understand this too - what is the hard data? I saw a study a while ago showing drops in productivity when devs of OSS repos were AI-assisted, but the sample size was far too small and the repos were quite large. Are you referring to other studies or data supporting this? Thanks!


I, individually, am certainly much more productive in my side projects when using AI assistance (mostly Claude and ChatGPT). I attribute this to two main factors:

First, and most important, I have actually started a number of projects that have only lived in my head historically. Instead of getting weighed down in “ugh I don’t want to write a PDF parser to ingest that data” or whatever, my attitude has become “well, why not see if an AI assistant can do this?” Getting that sort of initial momentum for a project is huge.

Secondly, AI assistants have helped me stretch outside of my comfort zone. I don’t know SwiftUI, but it’s easy enough to ask an AI assistant to put things together and see what happens.

Both these cases refer almost necessarily to domains I’m not an expert in. And I think that’s a bigger factor in side projects than in day jobs, since in your day job, it’s more expected that you are working in an area of expertise.

Perhaps an exception is when your day job is at a startup, where everyone ends up getting stretched into domains they aren’t experts in.

Anyways, my story is, of course, just another anecdote. But I do think the step function of “would never have started without AI assistance” is a really important part of the equation.


I think there are also two factors to this.

1. Learning curve: Just like any skill there is a learning curve on how to get high quality output from an LLM.

2. The change in capabilities since recent papers were authored. I started using the agentic coding tools intensively in May. I had dabbled with them before that, but the Claude 3.7 release really changed the value proposition. Since May, with the various Claude 4, 4.1 and 4.5 (and GPT-5) releases, the utility of the agentic tools has exploded. You basically have to discard any utility measure from before that inflection point; it just isn't very informative.


is it also possible that one of the side effects of this is that people who drive recreationally sometimes become exceptionally good at it? see how many great F1/rally drivers Finland has produced. Clearly not good when it happens while drunk though


Yes, I think it's definitely a factor. Recreational driving is a favorite pastime in the countryside, and thanks to the forest industry there are lots of dirt roads which are perfect for rally driving, with many purpose-built race tracks around the country as well. So the barrier to entry is probably lower than in most places. It's also not too uncommon for kids whose parents own / have access to some land to have some old, unregistered car to practice with away from public roads.

There is even a popular racing class called "jokamiehenluokka", where drivers are obliged to sell their cars for 2000 euros if somebody makes an offer. That rule is designed to keep the barrier to entry low, as drivers then have no incentive to invest too much into their car. Apparently you can take the exam to join at the age of 15, which is 3 years before the normal minimum age for a driving license.

I recommend the game "My Summer Car" for those interested in all this culture.


I've been wondering when the inevitable My Summer Car reference would pop up :)


we tried to build something similar recently for outbound calls (simple reminders to partners) and faced massive issues using gpt-4o-realtime-audio: noise detection, turn detection, random telephony issues (we were using Twilio too), the prompt not holding together, and more.

We dropped the project because it would have resulted in a terrible experience for the person on the other side of the phone. Building these things is non-trivial.
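For reference, the telephony leg itself was the easy part - a minimal sketch with Twilio's Python helper library (credentials, phone numbers, and the webhook URL below are placeholders; the webhook has to return TwiML that wires the call audio to the model, which is where all the problems above live):

    # Minimal outbound-call skeleton using Twilio's Python helper library.
    # Credentials, numbers, and the webhook URL are placeholders.
    from twilio.rest import Client

    client = Client("ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "your_auth_token")

    call = client.calls.create(
        to="+15550001111",                # callee (placeholder)
        from_="+15552223333",             # your Twilio number (placeholder)
        url="https://example.com/voice",  # webhook returning TwiML for the call
    )
    print(call.sid)

Everything hard - noise handling, turn detection, keeping the prompt on the rails - happens downstream of that webhook.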

The plan would have been to A/B test and see what the response would have been (watching NPS and business metrics uplift). Human handoff was always the plan in case things got too tricky for the LLM to handle.

I see some hostility here towards this project and, while I share many concerns, it is very naive to think that these services won’t be massively leveraged going forward. An AI agent can handle things as well as humans (not in our case, but there are good services out there, e.g. Parloa) and the key elements are the same as in all the other agent-based workflows:

- narrow use cases

- human in the loop ready to pick up/steer/correct

we will see a lot more of this and as LLM capabilities improve, it will only get better - it is inevitable at this point and might (_might_) result in a better experience for customers in some cases.

Nevertheless I also see the possibility that we will go full circle and we will always reach for a human, maybe showing up in person in a physical office to make sure cases or requests are handled well… or not :-)


We experienced similar challenges. That’s why we made audio handling, turn detection, and LLM retries modular, so you can swap models or providers as needed.

GitHub: https://github.com/videosdk-live/agents

Blog: https://www.videosdk.live/blog/ai-telephony-agent-inbound-ou...


I've read this comment twice and I genuinely can't understand it.

Uh, so your own attempt at a similar project didn't work and was a terrible experience and the fundamentals of the system are specific and still require babysitting. But it's inevitable (???) that it'll get better... and this improvement only MIGHT make things better for people, only some of the time?

I'm not alone in being unimpressed by this, right? Nothing about what was written here sounds... good? Even the most optimistic part is "well, maybe it might be good, sometimes". Like, this sucks. This is a bad system that doesn't work and makes things worse.


Fair point. But when implemented properly, these agents can reliably handle narrow, production-grade tasks like appointment reminders or smart call routing.


what I mean is: building these systems is nontrivial, but if done well it can help. Imagine not being stuck in an endless queue when you have to handle a simple task through a call center, or getting a phone reminder with more information and less noise than a written notification. The fact that I failed at it (for lack of experience and resources) does not mean it should just be shrugged off as useless or impractical. Some companies offer this service and it works just fine for narrow use cases.


Only if misused. Our system supports human fallback, logging, and prompt tools to prevent poor user experiences. The key is thoughtful automation.

GitHub: https://github.com/videosdk-live/agents

Docs on HITL: https://docs.videosdk.live/ai_agents/human-in-the-loop


yesterday I read a paper about using GPT-4 as a tutor in Italian schools, with encouraging results - students are more engaged and get through homework by receiving immediate and precise feedback, resulting in non-negligible performance improvements:

https://arxiv.org/abs/2409.15981

it is definitely a great use case for LLMs, and it challenges the assumption that LLMs can only “increase brain rot”, so to speak.


please Jez, don’t talk about crack!!


The twins! The fucking twins. I’m always on about them


I'll never forgive Orange if they've wiped the twins!


A pill, a nipple, bit of fried halloumi, lovely ..


the Python ecosystem for AI is far ahead of R's, sadly (or not :-) )


Reticulate is a bridge that lets you use Python from R so that you don't have to suffer Python the language.

