What do people think of Google's Gemini (Pro?) compared to Claude for code?
I really like a lot of what Google produces, but they can't seem to keep a product around without shutting it down, and they can be pretty ham-fisted, both with corporate control (Chrome and corrupt practices) and censorship.
Gemini is amazing for taking a merge file of your whole repo, dropping it in there, and chatting about stuff. The level of whole codebase understanding is unreal, and it can do some amazing architectural planning assistance. Claude is nowhere near able to do that.
My tactic is to work with Gemini to build a dense summary of the project and create a high-level plan of action, then take that to GPT-5 and have it try to improve the plan and convert it into a hyper-detailed workflow XML document laying out all the steps to implement the plan, which I then hand to Claude.
This avoids pretty much all of Claude's unplanned bumbling.
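For anyone who hasn't built a "merge file" before, here's a minimal sketch of one way to do it in Python. The function name, the extension list, the skipped directories, and the `===== path =====` separator are all my own assumptions, not any tool's standard format, so adjust to taste.

```python
from pathlib import Path

# Hypothetical defaults: which files to include and which directories to skip.
DEFAULT_EXTS = {".py", ".ts", ".svelte", ".sql", ".md"}
SKIP_DIRS = {".git", "node_modules", "__pycache__", "dist"}


def build_merge_file(repo_root, exts=DEFAULT_EXTS, skip_dirs=SKIP_DIRS):
    """Concatenate matching text files under repo_root into one string."""
    root = Path(repo_root)
    parts = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or path.suffix not in exts:
            continue
        # Skip anything that lives under an excluded directory.
        if any(part in skip_dirs for part in rel.parts):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # Header line so the model can tell where each file starts.
        parts.append(f"===== {rel.as_posix()} =====\n{text.rstrip()}\n")
    return "\n".join(parts)
```

Something like `Path("merged.txt").write_text(build_merge_file("."))` would then give you a single file to drop into the chat.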
I should mention I made that one for my research/stats workflow, so there's some specific stuff in there for that, but you can prompt ChatGPT to generalize it.
I mean, damn. Are terms like “executable oracles” and “hermetic boots” related to your domain, or are you using these as terms of art for an agent? Oracle being a source of truth, hermetic meaning no external dependencies or side effects - definitions in furtherance of your request for concise language. Would love to understand more.
This prompt is for scientific research. In general my goal is to instruct the agent to build as much validation scaffolding as possible, so rather than holding its hand I can just give it a series of concrete hurdles and tell it not to come back until they're met. I don't want it finishing the basic tasks and coming back to me saying the app is "production ready," I want to come back after a few hours to the agent having "proven a spec" with a demo or a paper that I can iterate on.
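To make the "concrete hurdles" idea a bit more tangible, here's a minimal sketch of a gate script the agent would have to get green before reporting back. The hurdle names and checks are made-up placeholders; in a real project each one would run a test suite, reproduce a figure, compare against a known result, and so on.

```python
def run_hurdles(hurdles):
    """Run each (name, check) pair; return the names of hurdles that failed.

    A check passes only if it returns True without raising.
    """
    failed = []
    for name, check in hurdles:
        try:
            ok = check() is True
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


# Hypothetical hurdles, purely for illustration.
def estimate_matches_reference():
    estimate, reference = 0.1 + 0.2, 0.3
    return abs(estimate - reference) < 1e-9


def demo_output_is_nonempty():
    return len("demo report") > 0


HURDLES = [
    ("estimate matches reference value", estimate_matches_reference),
    ("demo produces output", demo_output_is_nonempty),
]

failed = run_hurdles(HURDLES)
print("all hurdles cleared" if not failed else f"still failing: {failed}")
```

The point of treating an exception as a failure is that the agent can't "pass" a hurdle by breaking the check itself.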
I don't think Gemini Pro is necessarily worse at coding, but in my experience Claude is substantially better at "terminal" tasks (i.e. working with the model through a CLI in the terminal) and most of the CLIs use Claude, see https://www.tbench.ai/leaderboard.
Yeah, the main strength of gemini-cli is that it's open source, and it still needs a lot of polish. I ended up building my own web-based interactive agent based on gemini-cli [1] out of frustration.
In my recent tests I found it quite smart at analyzing the bigger picture (e.g. "hey, the test is failing not because of that, but because the whole assumption has changed, so let me rewrite this test from scratch"). But it also got stuck a few times: "I can't edit file, I'm stuck, let me try completely differently". The biggest difference so far is the communication style, though. It's a bit... snarky? E.g. comments like "yeah, tests are failing - as I suspected". Why the f did it suspect a failing test on a project it's seeing for the first time? :D
Pretty much every time Claude Code is stuck or more or less just coding in circles, I use Gemini Pro to analyze the code/data and feed the response into Claude to solve it. I also have much more success with Gemini when creating big SQL transformation scripts or similar. Both are quite bad on bigger tasks: they get you 60% of the way and then I spend days and days trying to get to 100%. It's such a time sink when I select the wrong task for the LLM.
It's doing rather well at thinking, but not at coding. When it codes, often enough it runs in circles and ignores input. Where I find it useful is reading through larger codebases and distilling what I need to find out from them. I even call Gemini from Claude to consult it on certain things. Opus is also like that, btw, but a bit better at coding. Sonnet, though, excels at coding, at least in my experience.
Personally, Gemini has been giving me better results. Claude keeps trying to generate React code even when the whole context and my command are Svelte, and constantly fails to give me something that can at least run. Gemini, on the other hand, has been pretty good with styling and useful with the business logic. I don't get all the hype around Claude.
The Gemini CLI tool is atrocious. It might work sometimes for analyzing code, but for modifying files, never. The inevitable conclusion of every session I've ever tried has been an infinite loop. Sometimes it's an infinite loop of self-deprecation, sometimes just repeating itself to failure, usually repeating the same tool failure until it catches it as an infinite loop. Tool usage frequently (we're talking 90% of the time) fails. It's also, frankly, just a bummer to talk to. The "personality" is depressed, self-deprecating, and just overall really weird.
That's been my experience, anyway. Maybe it hates me? I sure hate it.
This matches my experience with it. I won’t fire up Gemini until everything it might touch is safely checked in. It will commonly get into a death loop mid-session that can’t be recovered from.
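If you want to enforce the "everything checked in first" rule rather than rely on memory, a minimal sketch, assuming the project is a git repo and `git` is on your PATH. The function names are hypothetical; `git status --porcelain` prints nothing when there are no uncommitted or untracked changes.

```python
import subprocess


def is_clean(porcelain_output):
    """True if `git status --porcelain` output shows no changes."""
    return porcelain_output.strip() == ""


def working_tree_is_clean(repo_dir="."):
    """Ask git whether repo_dir has uncommitted or untracked changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return is_clean(result.stdout)
```

A wrapper script could call `working_tree_is_clean()` and refuse to launch the agent when it returns `False`, so a death loop can always be undone with a plain reset.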
This is so weird. I'm not getting the same experience at all: its tools work, and it changes TypeScript and Python confidently, makes mistakes, understands them, and fixes them. I had a case of it giving up and admitting failure, but not in the way you describe.