
I can promise you no one is using it

you really don't need any of this crap. you just need Claude Code and CLAUDE.MD in directories where you need to direct it. complicated AI setups are mid curve


I refuse to learn all the complicated configuration because none of it will matter when they drop the next model.

Things that need special settings now won’t in the future and vice versa.

It’s not worth investing a bunch of time into learning features and prompting tricks that will be obsoleted soon


I wish that were true. Models don't feel like they've really had massive leaps.

They do get better, but not enough to change any of the configuration I have.

But you are correct, there is a real possibility that the time invested will be obsolete at some point.

For sure the work towards MCPs is basically obsolete via skills. These things happen.


It doesn't require any major improvement to the underlying model. As long as they tinker with system prompts and builtin tools/settings, the coding agent will evolve in unpredictable ways out of my control


That's a rational argument. In practice, what we're actually doing for the most part is managing context, and creating programs to run parts of tasks, so really the system prompts and builtin tools and settings have very little relevance.


i don't understand this mcp/skill distinction? one of the mcps i use indexes the runtime dependency of code modules so that claude can refactor without just blindly grepping.

how would that be a "skill"? just wrap the mcp in a cli?

fwiw this may be a skill issue, pun intended, but i can't seem to get claude to trigger skills, whereas it reaches for mcps more... i wonder if I'm missing something. I'm plenty productive in claude though.


So an MCP is a bunch of, essentially, skill-type objects. But it has to tell you about all of them, and give you information about all of them, up front.

So a Skill is just a smaller granularity level of that concept. It's just one of the individual things an MCP can do.

This is about context management at some level. When you need to do a single thing within that full list of potential things, you don't need the instructions about a ton of other unrelated things in the context.

So it's just not that deep. In your case it would be a python script or whatever that the skill calls, which returns the runtime dependencies and gives them back to the LLM so it can refactor without blindly grepping.
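Rough sketch of what I mean (the script name and the index format here are made up, not your actual MCP):

    # deps.py - lives next to the skill; Claude runs it instead of reading it
    import json, sys

    def deps_of(module, index_path="dep_index.json"):
        # look the module up in a prebuilt dependency index
        with open(index_path) as f:
            index = json.load(f)
        return index.get(module, [])

    if __name__ == "__main__":
        print(json.dumps(deps_of(sys.argv[1])))

Claude just runs it, and only the output ends up in context.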

Does that make sense?


no that makes no sense. the skill doesn't do anything by itself, the mcp (can be) attached to a deterministic oracle that can return correct information.


But the skill includes the scripts to do things.

So in my nano banana image generation skill, it contains a python script that does all the actual work. The skill just knows how to call the python script.

We're attaching tools to the md files. This is at the granular level of how to hammer a nail, how to use a screwdriver, etc. And then the agent, the handyman, has his toolbox of skills to call on depending on what he needs.
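The layout of that skill is roughly (simplified from memory, not the exact files):

    nano-banana-skill/
      SKILL.md       <- a short description and when to use it
      generate.py    <- the script that does the actual work

The SKILL.md part is tiny; the heavy lifting lives in the script.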


let's say i'm in erlang. you gonna include a script to unpack erlang bytecode across all active modules and look through them for a function call? oorrr... have that code running on localhost:4000 so that it's a single invocation away, versus having the llm copypasta the entire script you provided and pray for the best?

The LLM doesn't copy the script, it runs it.

But for sure, there are places it makes sense, and there are places it doesn't. I'm arguing to use it maximally in the places that make sense.

People are not doing this. They are leaving the LLM to do everything. I am arguing it is better to move everything you possibly can into tools, and have the LLM focus only on the bits that a program doesn't make sense for.


In our experience, a lot of it is feel and dev preference. After talking to quite a few developers, we've found the skill was the easiest to get started with, but we also have a CLI tool and an MCP server. You can check out the docs if you'd prefer to try those - feedback welcome: https://www.ensue-network.ai/docs#cli-tool


yeah but a skill without the mcp server is just going to be super inefficient at certain things.

again going to my example, a skill to do a dependency graph would have to do a complex search. and in some languages the dependencies might be hidden by macros/reflection etc, which would obscure a result obtained by grep

how would you do this with a skill, which is just a text file nudging the llm, whereas the MCP's server goes out and does things?


A skill is not just a text file nudging the llm. You bundle scripts and programs with the skill, and the skill calls them.

that seems token inefficient. why have the llm do a full round trip: load the skill, which contains potentially hundreds of lines of code, then copy and paste the code back into the compiler, when it could just run it?

not that i care too too much about small amounts of tokens but depleting your context rapidly seems bad. what is the positive tradeoff here?


I don't understand. The Skill runs the tools. In the cases where programs can replace the LLM, I think we should maximally do that.

That uses fewer tokens. The LLM is just calling the script, getting the response, and then using that to continue to reason.
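So the whole round trip in context is basically just something like this (made-up module name):

    $ python scripts/deps.py checkout_service
    ["payments", "inventory", "notifications"]

A couple of lines in, a couple of lines out.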

So I'm not exactly following.


what you are proposing is functionally equivalent to "wrapping an mcp in a cli" which is what I mentioned in my root comment.

It seems to mostly ignore Claude.md


You can test how often it is being used by having a line in there saying something like “You must start every non-code response with ‘Woohoo!’”


It’s told to only use it if relevant because most people write bad ones. Someone should write a tool to assess CLAUDE.md quality.


It does, Claude.md is the least effective way to communicate to it.

It's always interesting reading other people's approaches, because I just find them all so very different than my experience.

I need Agents and Skills to perform well.


This is classic engineer trying to build a business. Indie software is more of a business than it is software. Everyone wants to do the easy part (coding/tech), nobody wants to relentlessly service customers and do marketing/distribution.

Coding is easy. Building a business is hard, whether indie or VC backed.


>This is classic engineer trying to build a business.

My business has been profitable for 20 straight years, so I can't be that terrible at it. ;0)


this is the best lead generation form i've ever seen


how many changes (% of all changes) need an entire infra stack spun up? have you tried just having the changes deployed to dev with a locking mechanism?


Does anyone get actually insightful reviews from these code review tools? From most people I've spoken with, it catches things like code complexity, linting, etc., but nothing that actually relates to business logic, because there's no way it could know about the business logic of the product


I built an LLM that has access to documentation before doing code reviews and forces devs to update it with each PR.

Needless to say, most see it as an annoyance not a benefit, me included.

It's not like it's useless but... people tend to hate reviewing LLM output, especially on something like docs that requires proper review (nope, an article and a product are different, an order and a delivery note are as well, and those are the most obvious..).

Code can be subpar or even gross but still do the job, but docs cannot be subpar, as they compound confusion.

I've even built a glossary to make sure the correct terms are used, and kinda forced it, but LLMs getting 95% right are less useful than getting 0, as the 5% tends to be more difficult to spot and tends to compound inaccuracies over time.

It's difficult, it really is. There's everything involved, from behaviour to processes to human psychology to LLM instructing and tuning. Those are difficult problems to solve unless your team's budget allows hiring a functional analyst who could double as a technical and business writer, and those people are both rare and hard to sell to management. And at that point an LLM is hardly needed.


I have gotten code reviews from OpenAI's Codex integration that do point out meaningful issues, including across files and using significant context from the rest of the app.

Sometimes they are things I already know but was choosing to ignore for whatever reason. Sometimes it's like "I can see why you think this would be an issue, but actually it's not". But sometimes it's correct and I fix the issue.

I just looked through a couple of PRs to find a concrete example. I found a PR review comment from Codex pointing out a genuine bug where I was not handling a particular code path. I happened to know that no production data would trigger that code path as we had migrated away from it. It acted as a prompt to remove some dead code.


Graphite is a pull request management interface more than it is an AI code review tool.


How do you remediate failures?


in what context?


Summarizing pdfs


How is this different from Extend(Also YC)?


we're more focused on the core extraction layer itself rather than workflow tooling. we train our own vision models for layout detection, ocr, and table parsing from scratch. the key thing for us is determinism and auditability, so outputs are reproducible run over run, which matters a lot for regulated enterprises.


Are you still doing healthcare back office automation? Would love to learn why you pivoted out if not, happy to DM as well

