Sometimes I get the feeling that making super long and intricate prompts reduces the model's cognitive performance. It might give you a feeling of control and proper engineering, but I'm not sure it's a net win.
My usage has converged to making very simple and minimalistic prompts and doing minor adjustments after a few iterations.
That's exactly how I started using them as well. 1. Give it just enough context, the assumptions that hold, and the goal. 2. Review the answer and iterate on the initial prompt. It's also the economical way to use them. I've been burned one too many times by using agents (they just spin and spin, burn 30 dollars on one prompt, and either mess up the code base or converge on the previously written code).
I also feel the need to caution others: letting the AI write lots of code in your project makes it harder to advance it, evolve it, and move on with confidence (code you didn't think about and write yourself doesn't stick as well in your memory).
I'd have to hunt for it, but there is evidence that using the vocabulary of an expert versus a layman will produce better results. Which makes sense: spaces where people talk "normally" are more likely to be incorrect, whereas spaces where people speak in the professional vernacular are more likely to be correct. And training will associate the vocabulary with those spaces.
At their heart, these are still just document-completion machines. Very clever ones, but still inherently trying to find a continuation that matches the part that came before.
This seems right to me. I often ask questions in two phases to take advantage of this: (1) ask "How would a professional in the field ask this question?", then (2) paste that question into a new chat.
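The two-phase pattern above is just prompt plumbing, so it can be sketched in a few lines. Here `ask` is a hypothetical stand-in for whatever chat-completion call you use, not a real API:

```python
def two_phase_ask(ask, question: str) -> str:
    """Rephrase a question in expert vocabulary, then ask it fresh.

    `ask` is any callable that takes a prompt string and returns the
    model's reply -- substitute your own LLM client here.
    """
    # Phase 1: have the model restate the question as an expert would.
    expert_question = ask(
        "How would a professional in this field phrase the following "
        f"question? Reply with only the rephrased question.\n\n{question}"
    )
    # Phase 2: send the rephrased question on its own, as if in a new
    # chat, so the layman's wording never appears in the context.
    return ask(expert_question)
```

The point of passing `ask` in as a parameter is that phase 2 gets a clean context: only the expert phrasing is ever sent, mirroring the "paste into a new chat" step.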
For another kind of task, a colleague had written a very verbose prompt. Since I had to integrate it, I added some CRUD ops for prompts. As a test, I made a very short one, something like "analyze this as a <profession>". The output was pretty much comparable, except that the output from the longer prompt contained quite a few references to literal parts of that prompt. It wasn't incoherent, but it was as if the model (Gemini 2.5, btw) had a basic response for the task it extracted from the prompt and merged the superfluous bits in. It would seem that, at least for this particular task, the model cannot (easily) be made to "think" differently.
Yeah, I had this experience today: I had been running code review with a big, detailed prompt in CLAUDE.md, but then I ran it on a branch that didn't have that file yet and got better results.
> It might give you a feel of control and proper engineering
Maybe a super salty take, but I personally haven't ever thought anything involving an LLM as "proper engineering". "Flailing around", yes. "Trial and error", definitely. "Confidently wrong hallucinations", for sure. But "proper engineering" and "LLM" are two mutually exclusive concepts in my mind.
Same here: it starts with a relatively precise need, keeping a roadmap in mind rather than forcing one upfront. When it involves a technology I'm unfamiliar with, I also ask questions to understand what certain things mean before "copying and pasting".
I've found that with more advanced prompts, the generated code sometimes fails to compile, and tracing the issues backward can be more time-consuming than starting clean.
I use specs in markdown for the more advanced prompts. I ask the LLM to refine the markdown first and add implementation steps, so I can review what it will do. When it starts implementing, I can always ask it to "just implement step 1, and update the document when done". You can also ask it to verify that the spec has been implemented correctly.
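A minimal sketch of the bookkeeping side of that workflow, assuming the spec tracks implementation steps as markdown task-list checkboxes (my convention for illustration, not anything the model requires):

```python
import re

def mark_step_done(spec: str, step: int) -> str:
    """Check off the nth unchecked '- [ ]' step in a markdown spec.

    This is the "update the document when done" part: after the LLM
    implements step N, flip that step's checkbox to '- [x]'.
    """
    count = 0
    lines = []
    for line in spec.splitlines():
        if re.match(r"\s*- \[ \]", line):  # an unchecked step
            count += 1
            if count == step:
                line = line.replace("[ ]", "[x]", 1)
        lines.append(line)
    return "\n".join(lines)
```

Keeping the step state in the spec file itself (rather than in the chat) means a fresh session can pick up exactly where the last one left off.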
It already did. Programming languages are already very strict about syntax; professional jargon is the same way, and for the same reason: it eliminates ambiguity.