I’ve found that experienced devs use agentic coding in a more “hands-on” way than beginners and pure vibe-coders.
Vibecoders are the best because they push the models in humorous and unexpected ways.
Junior devs are like “I automated the deploy process via an agent and this markdown file”
Seasoned devs will spend more time writing the prompt for a bug fix, or lazily paste the error and then make the 1-line change themselves.
The current crop of LLMs is more powerful than any of these use cases, and it’s exciting to see experienced devs start to figure that out (I’m not stanning Gas Town[0], but it’s a glimpse of the potential).
[0] https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...
Partially related: I really dislike the vibe of Gas Town, both the post and the tool. I really hope this isn't what the future looks like; it just feels disappointing.
To be fair, the author says: "Do not use Gas Town."
I started "fully vibecoding" 6 months ago, on a side-project, just to see if it was possible.
It was painful. The models kept breaking existing functionality, overcomplicating things, and generally just making spaghetti ("You're absolutely right! There are 4 helpers across 3 files that have overlapping logic").
A combination of adjusting my process (read: context management) and the models getting better has led me to prefer "fully vibecoding" for all new side-projects.
Note: I still read the code that gets merged for my "real" work, but it's no longer difficult for me to imagine a future where that's not the case.
I have noticed in just the past two weeks or so, a lot of the naysayers have changed their tunes. I expect over the next 2 months there will be another sea change as the network effect and new frameworks kick in.
No. If anything we are getting "new" models but hardly any improvements. Things are "improving" on scores, rankings, and whatever other metrics the AI industry has invented, but nothing is really materializing in real work.
I think we have crossed the chasm and the pragmatists have adopted these tools because they are actually useful now. They've thrown out a lot of their previously held principles and norms to do so and I doubt the more conservative crowd will be so quick to compromise.
2 years sounds more likely than 2 months since the established norms and practices need to mature a lot more than this to be worthy of the serious consideration of the considerably serious.
Curious what fidelity/precision the author finds necessary with Claude 4.5 Opus/GPT 5.2.
Looking at the screenshot of "Tracked Issues", it seems many of the "tasks" are likely overlapping in terms of code locality.
Based on my own experience, I've found the current crop of models to work well at a slightly higher level of complexity than the tasks listed there, and they often benefit from having a shared context vs. when I've tried to parallelize down to that level of work (individual schema changes/helper creation/etc.).
Maybe I'm still just unclear on the inner workings, but it's my understanding each of those tasks is passed to Claude Code and developed separately?
In either case, I think this project is a glimpse into the future of software development (albeit with a grungy desert punk tinted lens).
For context, I've been "full vibe-coding"[0] for the past 6 months, and though it started painfully, the models are now good enough that not reading the code isn't much of an issue anymore.
> why can't there be an LLM that would always give the exact same output for the exact same input
LLMs are inherently deterministic, but LLM providers add randomness through “temperature” and random seeds.
Without the random seed and variable randomness (temperature setting), LLMs will always produce the same output for the same input.
Of course, the context you pass to the LLM also affects the determinism in a production system.
Theoretically, with a detailed enough spec, the LLM would produce the same output, regardless of temp/seed.
Side note: A neat trick to force more “random” output for prompts (when temperature isn’t variable enough) is to add some “noise” data to the input (i.e. off-topic data that the LLM “ignores” in its response).
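As a rough illustration of the knobs being discussed, here is a minimal sketch using the OpenAI Python client (the model name and prompt are just placeholders); it asks for as much reproducibility as the API exposes, though as the replies below point out, this still isn't a hard guarantee:

    from openai import OpenAI

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Summarize RFC 2119 in one sentence."}],
        temperature=0,  # always take the highest-probability token
        seed=42,        # best-effort reproducibility hint, not a guarantee
    )
    print(resp.choices[0].message.content)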
No, setting the temperature to zero is still going to yield different results. One might think they add random seeds, but that makes no sense for temperature zero. One theory is that the distributed nature of their systems adds entropy and thus produces different results each time.
Random seeds might be a thing, but from what I see there's a lot of demand for reproducibility and yet no certain way to achieve it.
It's not really a mystery why it happens. LLM APIs are non-deterministic from the user's point of view because your request is going to get batched with other users' requests. The batch behavior is deterministic, but your batch is going to be different each time you send your request.
The size of the batch influences the order of atomic float operations. And because float operations are not associative, the results might be different.
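A quick way to see the non-associativity point in plain Python (nothing LLM-specific here):

    a, b, c = 0.1, 0.2, 0.3
    print((a + b) + c)                   # 0.6000000000000001
    print(a + (b + c))                   # 0.6
    print((a + b) + c == a + (b + c))    # False: summation order matters

If the batch size changes the order in which the same terms get reduced, the logits can shift by a tiny amount, which is occasionally enough to flip the argmax token even at temperature 0.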
> Without the random seed and variable randomness (temperature setting), LLMs will always produce the same output for the same input.
Except they won't.
Even at temperature 0, you will not always get the same output for the same input. And it's not because of random noise from inference providers.
There are papers that explore this subject because for some use-cases this is extremely important. Everything from floating-point precision to hardware timing differences makes this difficult.
> Text is the oldest and most stable communication technology
Minor nit: complex language (i.e. Zipf’s law) is the oldest and most stable communication technology.
Before text, we had oral storytelling. It allowed us to communicate one generation’s knowledge to the next, and so on.
Arguably this is present elsewhere in the animal kingdom (orcas, elephants, etc.), but human language proves to be the most complex.
Side note: one of my favorite examples is from the Gunditjmara (a group of Aboriginal Australians) who recall a volcanic eruption from 30k+ years ago [0].
Written language (i.e. text) is unique, in that it allows information to pass across multiple generations, without a man-in-the-middle telephone-like game of storytelling.
But both are similar: text requires you to read, in your own voice, the thoughts of another. Storytelling requires you to hear a story, and then communicate it to others.
In either case, the person is required to retell the knowledge, either as an internal monologue or as an external broadcast.
Well, the article had "assuming we treat speech/signing as natural phenomenon" but if you are including biological communication you'd probably have to go with genetic code written in RNA. Nature's way of writing down life's assembly instructions. Four billion years and going strong.
We basically do. The habitable-planet hunt is almost by definition "since we depend on water and complex organic molecules for 'life', we will hunt for this signature to decide whether we think we have found extrasolar life, radio signals aside".
> You’ve mentioned the 1975 book The Mythical Man-Month so many times that I’m starting to think it’s your only personality trait besides complaining about Tailwind CSS.
I think the future is likely one that mixes the kitchen-sink style MCP resources with custom skills.
Services can expose an MCP-like layer that provides semantic definitions of everything you can do with said service (API + docs).
Skills can then be built that combine some subset of the 3rd party interfaces, some bespoke code, etc. and then surface these more context-focused skills to the LLM/agent.
Couldn’t we just use APIs?
Yes, but not every API is documented in the same way. An “MCP-like” registry might be the right abstraction for 3rd parties to expose their services in a semantic-first way.
Agree. I'd add that an aha moment with skills is that AI agents are pretty good at writing skills. Let's say you have developed an involved prompt that explains how to hit an API (possibly with the complexity of reading credentials from an env var or config file) or run a tool locally to get some output you want the agent to analyze (for example, downloading two versions of a Python package and diffing them to analyze changes). Usually the agent reading the prompt is going to leverage local tools to do it (curl, shell + stdout, git, whatever) every single time. Every time you execute that prompt there is a lot of thinking spent on deciding to run these commands, and you are burning tokens (and time!). As an eng you know that this is a relatively consistent and deterministic process to fetch the data, and if you were consuming it yourself, you'd write a script to automate it.
So you read about skills (prompt + scripts) to make this more repeatable and reduce time spent thinking. At that point there are two paths you can go down -- write the skill and prompt yourself for the agent to execute -- or better -- just tell the agent to write the skill and prompt and then you lightly edit it and commit it.
This may seem obvious to some, but I've seen engineers create skills from scratch because they have a mental model of skills as something that people must build for the agent, whereas IMO skills are just you bridging a productivity gap the agent can't close itself (for now) by instructing it to write tools to automate its own day-to-day tedium.
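For the package-diffing example above, the kind of script an agent might write for itself could look something like this (a hedged sketch, not anyone's actual skill; package/version handling and paths are simplified):

    import subprocess, sys, tarfile, tempfile
    from pathlib import Path

    def fetch_sdist(package: str, version: str, dest: Path) -> Path:
        # --no-binary forces a source tarball we can unpack and diff
        subprocess.run(
            [sys.executable, "-m", "pip", "download", f"{package}=={version}",
             "--no-deps", "--no-binary", ":all:", "-d", str(dest)],
            check=True,
        )
        archive = next(dest.glob("*.tar.gz"))
        with tarfile.open(archive) as tar:
            tar.extractall(dest)
        # return the extracted source directory
        return next(p for p in dest.iterdir() if p.is_dir())

    if __name__ == "__main__":
        pkg, old, new = sys.argv[1:4]
        with tempfile.TemporaryDirectory() as tmp:
            a = fetch_sdist(pkg, old, Path(tmp) / "old")
            b = fetch_sdist(pkg, new, Path(tmp) / "new")
            # recursive diff; the agent reads this output instead of improvising the steps
            subprocess.run(["diff", "-ru", str(a), str(b)])

The point isn't this particular script; it's that the agent can author it once and reuse it, instead of re-deriving the curl/pip/diff dance on every run.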
The example Datasette plugin authoring skill I used in my article was entirely written by Claude Opus 4.5 - I uploaded a zip file of the Datasette repo to it (after it failed to clone that itself for some weird environment reason) and had it use its skill-writing skill to create the rest: https://claude.ai/share/0a9b369b-f868-4065-91d1-fd646c5db3f4
That's awesome and I have a few similar conversations with Claude. I wasn't quite an AI luddite a couple months ago, but close. I joined a new company recently that is all in on AI and I have a comically huge token budget so I jumped all the way in myself. I have my choice of tools I can use and once I tried Claude Code it all clicked. The topology they are creating for AI tooling and concepts is the best of all the big LLMs, by far. If they can figure out the remote/cloud agent piece with the level of thoughtfulness they have given to Code, it'd be amazing. Cursor Cloud has that area locked down right now, but I'm looking forward to how Anthropic approaches it.
Completely agree with both points. Skills replacing one-off microservices and agents writing their own skills feel like two sides of the same coin to me.
I’m a solo developer building a markdown-first slide editing app. The core format is just Markdown with --- slide separators, but it has custom HTML comment directives for layouts (<!-- layout: title -->, <!-- layout: split -->, etc.) and content-type detection for tables, code blocks, and Mermaid diagrams. It’s a small DSL, but enough that an LLM without context will generate slides that don’t render optimally.
Right now my app is designed for copy-paste from external LLMs, which means users have to manually include the format spec in their prompts every time. But your comment about agents writing skills made me realize the better path: I could just ask Claude Code to read my parser and layout components, then generate a Slide_Syntax_Guide skill for me. The agent already understands the codebase—it can write the definitive spec better than I could document it manually.
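For what it's worth, the format as described sounds like it parses down to something like this (a hypothetical sketch, not the app's actual code; parse_deck and the "default" fallback are made up, and the layout names are just the ones mentioned above):

    import re

    LAYOUT_RE = re.compile(r"<!--\s*layout:\s*(\w+)\s*-->")

    def parse_deck(markdown: str) -> list[dict]:
        slides = []
        # slides are separated by lines containing only "---"
        for chunk in re.split(r"^---\s*$", markdown, flags=re.MULTILINE):
            body = chunk.strip()
            if not body:
                continue
            match = LAYOUT_RE.search(body)
            layout = match.group(1) if match else "default"
            slides.append({"layout": layout, "body": LAYOUT_RE.sub("", body).strip()})
        return slides

A generated Slide_Syntax_Guide skill would essentially document these same rules (plus the table/code/Mermaid content-type detection) so external LLMs stop producing slides that don't render optimally.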
I like to think of LLMs as the internet's Librarian. They've read nearly all the books in the library, can't always cite the exact page, but can point you in the right direction most of the time.
Completely agree, and for me it is not just about the easier/quicker access to information, but the interactivity. I can ask Claude to spend half an hour to create a learning plan for me, then refine it by explaining what I already know and where I see my main gaps.
And then I can, in the same context, ask questions while reading the articles suggested for learning. There's also danger involved there, as the constant affirmation ("Great Point!", "You're absolutely right!") breeds overconfidence, but it has led me to learn quite a few things in a more formal capacity that I would have endlessly postponed before.
For example, I work quite a lot with k8s, but during the day, I'm always trying to solve a specific problem. I have never just sat down, and started reading about the architecture, design decisions, and underlying tech in a structured format. Now I have a detailed plan ready on how to fill my foundational gaps over the Christmas break, and this will hopefully save me time during the next big deployment/feature rollout.