More

logicprog · 2025-09-09T02:37:25 1757385445

Yeah, I bought a used Mac Studio (an M1, to be fair, but still a Max and things haven't changed since) hoping to be able to run a decent LLM on it, and was sorely disappointed thanks to the prompt processing speed especially.

alt227 · 2025-09-09T13:32:12 1757424732

No offense to you personally, but I find it very funny when people hear marketing copy for a product and think it can do anything they said it can.

Apple silicon is still just a single consumer grade chip. It might be able to run certain end user software well, but it cannot replace a server rack of GPUs.

zargon · 2025-09-09T18:01:07 1757440867

I don’t think this is a fair take in this particular situation. My comment is in response to Simon Willison, who has a very popular blog in the LLM space. This isn’t company marketing copy; it’s trusted third parties spreading this misleading information.

alt227 · 2025-09-15T15:28:03 1757950083

Fair enough, apologies for assuming.

logicprog · 2025-09-07T01:14:46 1757207686

I wish there was a surgery like this for me...

logicprog · 2025-09-04T19:22:38 1757013758

Yeah, that is where things get real fun!

logicprog · 2025-09-02T10:59:30 1756810770

I really like those local AI features. It's tempting me yo move to vanilla Firefox over LibreWolf...

gucci-on-fleek · 2025-09-02T11:19:09 1756811949

I'm still ambivalent on the rest of the AI features, but the AI translation is absolutely amazing. The translation quality isn't perfect, but being able to seamlessly translate 20+ languages 100% locally is amazing.

deivid · 2025-09-02T14:02:07 1756821727

The models[0] are open and great, I find them so useful that I ended up making an Android app[1] just to have them available also outside of Firefox.

[0]: https://github.com/mozilla/firefox-translations-models/

[1]: https://github.com/davidventura/firefox-translator

neobrain · 2025-09-02T14:44:46 1756824286

That translation app is so cool, exactly what I've always been looking for (offline + camera integration + clean UI). Thanks for putting in the work and for putting it on F-Droid even!

diggan · 2025-09-02T12:08:47 1756814927

Agree, for us switching between languages all the time, with some of those languages being less known to us, it's a great tool!

My only wish was that I can force it to always allow me to try translating things, even if it doesn't identify it as some specific language. Sometimes what I want to translate is like 30% one language and 70% another language, and I still want to translate it to another language, but since the tool doesn't see it as "foreign enough" or something, I don't even get the choice of having it translated.

Besides that, it's a wonderful despite it not being perfect. Hopefully with time it'll only get better as they get more data. On that note, I'd be more than happy to contribute data if they added some way of giving "good translation / bad translation" feedback, but haven't seen that. I guess I had two wishes in the end.

gucci-on-fleek · 2025-09-02T12:50:01 1756817401

> My only wish was that I can force it to always allow me to try translating things, even if it doesn't identify it as some specific language.

You can, but it's somewhat hidden. Open the hamburger menu in the top right, then select “Translate page…”.

jffry · 2025-09-02T12:26:39 1756815999

If you select a chunk of text in the page and right click there should be a context-menu option to translate the text. It's a popup with a textarea and not in-situ, but it's the same local model as far as I can tell

zveyaeyv3sfye · 2025-09-02T16:07:33 1756829253

> in-situ

I believe you mean ex situ:

> By contrast, ex situ methods involve the removal or displacement of materials, specimens, or processes for study, preservation, or modification in a controlled setting, often at the cost of contextual integrity.

Might as well use the correct words if you want to talk above people's heads.

jffry · 2025-09-09T22:31:34 1757457094

First: No need to be rude, "in situ" is a very commonly used phrase among English speakers, as should be evident from the Wikipedia article [1] you yourself cited

Second: The normal Firefox translate feature replaces the text in the page with the translated text - retaining its styling, position, context w/ images, etc. The right click menu, does not. I described the right click menu as "not in situ" which is correct.

[1] https://en.wikipedia.org/wiki/In_situ

zveyaeyv3sfye · 2025-09-10T11:24:59 1757503499

> "in situ" is a very commonly used phrase among English speakers

I would challenge the "very common" claim.

But sure, the phrase is in use. Mostly not in tech, though.

And you are using it wrong.

If you wanted to express "in-place translation", you could have done so using your native tongue.

You're welcome.

NoboruWataya · 2025-09-02T13:13:41 1756818821

I agree, I'm generally sceptical of new AI "features" in the browser and will be turning most of them off. But the translation feature (which has been in Firefox for a while now) is great. The difference is that translation in a browser is something that is clearly useful and has always been AI-based to an extent, so shipping with a local model for translation is a strict improvement (leaving aside any difference in translation quality, which I have not noticed). The other AI features are not obviously useful IMO.

logicprog · 2025-09-02T13:19:04 1756819144

Yeah! I have it on my phone and its legitimately super helpful; especially since I go to a lot of obscure websites and personal blogs!

sorenjan · 2025-09-02T12:53:46 1756817626

I can't get it to always translate a page, it keeps asking me every time I visit.

robertlagrant · 2025-09-02T11:22:01 1756812121

Yes - agree. It's a lovely feature.

captainepoch · 2025-09-02T17:41:15 1756834875

I moved long ago, I don't regret anything.

logicprog · 2025-08-28T20:53:17 1756414397

I was literally just wishing there was something like this, this is perfect! Do you do prompt caching?

reissbaker · 2025-08-28T21:09:31 1756415371

Aw thanks! We don't currently, but from a cost perspective as a user it shouldn't matter much since it's all bundled into the same subscription (we rate-limit by requests, not by tokens — our request rate limits are set to "higher than the amount of messages per hour that Claude Code promises", haha). We might at some point just to save GPUs though!

logicprog · 2025-08-28T22:55:57 1756421757

Yeah I wasn't worried so much about costs to me, as sustainability of your own prices — don't want to run into a "we're lowering quotas" situation like CC did :P

reissbaker · 2025-08-28T23:24:04 1756423444

Lol fair! I think we're safe for now; our most popular model (and my personal favorite coding model) is GLM-4.5, which fits on a ~relatively small node compared to the rumored sizes of Anthropic's models. We can throw a lot of tokens at it before running into issues — it's kind of nice to launch without prompt caching, since it means if we're flying too close to the sun on tokens we still have some pretty large levers left to pull on the infra side before needing to do anything drastic with rate limits.

logicprog · 2025-08-28T23:59:12 1756425552

> I think we're safe for now; our most popular model (and my personal favorite coding model) is GLM-4.5,

That's funny, that's also my favorite coding model as well!

> the rumored sizes of Anthropic's models

Yeah. I've long had a hypothesis that their models are, like, average sized for a SOTA model, but fully dense, like that old llama 3.1 405b model, and that's why their per token inference costs are insane compared to the competition.

> it's kind of nice to launch without prompt caching, since it means if we're flying too close to the sun on tokens we still have some pretty large levers left to pull on the infra side before needing to do anything drastic with rate limits.

That makes sense.

I'm poor as dirt, and my job actually forbids AI code in the main codebase, so I can't justify even a $20 per month prescription right now (especially when, for experimenting with agentic coding, qwen code is currently free (if shitty)) but when or if it becomes financially responsible, you will be at the very top of my list.

reissbaker · 2025-08-29T00:37:33 1756427853

<3 thank you!

logicprog · 2025-08-28T15:36:41 1756395401

> Unless I'm missing something, you'll have a bunch of links laying around?

This is true, but it's better than files in that it's a single tap and everything is instantly merged into your existing local storage, instead of you having multiple files downloaded local; and next time you export a link, it will be that merged version, with the other person's comments plus your new ones. So there's a single linear stream of links where the latest link is the correct version for both people at all times, and there's only one version on each person's local hard drive, stored at local storage, instead of file V1, V2, V3, etc. Idk if that makes sense.

> Merge?

Yeah basically. It rolls everything up each time.

> Edit: oh, there's a resolve thing. So presumably you'd get links from other people and resolve them? Is that tracked anywhere?

I initially decided not to have resolution be tracked, but if you think it would help with the sharing process, then I could totally do that!

logicprog · 2025-08-25T01:36:48 1756085808

This is actually a really good point. It isn't thay GenAI is useless in business — just that businesses don't know how to build, deploy, and use it usefully at an organization level. Kind of an example of Karpathy's "Power to the people" thesis that individual productivity benefits more from GenAI than corporate.

tim333 · 2025-08-25T19:15:52 1756149352

Yeah it was interesting. A lot of people seem keen to see AI fail but "90% of employees regularly use personal AI tools for work" is a lot.

I can see why people wouldn't want to pay $50k for a chatgpt wrapper when they can get it for free or $20.

logicprog · 2025-08-23T16:54:59 1755968099

I like this a lot.

logicprog · 2025-08-23T11:43:38 1755949418

Mixture of Experts. Llama 3.1 405B is a dense model, so to evaluate the next token, the context has to go through literally every parameter in its neural network. Whereas with mixture of experts, it's usually like a sixth to a tenth or even less of the neural network parameters that actually get evaluated for every token. Also, they don't use GPT-4 or 4.5 anymore iirc, which may have been dense (and that's why they were so expensive), 4.1 and 4o are much different models.

logicprog · 2025-08-23T11:31:01 1755948661

You're grouping those words wrong. As another commenter pointed out to you, which you ignored, it's median (Gemini Apps) not (median Gemini) Apps. Gemini Apps is a highly specific thing — with a legal definition even iirc — that does not include search, and encompasses a list of models you can actually see and know.

esperent · 2025-08-24T07:31:42 1756020702

I didn't ignore it, I actually spent some time researching to find out what Google means by "Gemini Apps" (plural) and whether it includes search AI overview, and I can't get a clear answer anywhere.

Of course, Gemini App (singular) means the mobile app. But it seems that the term Gemini Apps (plural) is being used by Google to refer to any way in which users can access the Gemini models, and also they do clearly state that a version of Gemini isused to generate the search overviews.

So it still seems reasonably likely, until they confirm otherwise, that this median includes search overview.

simianwords · 2025-08-24T09:18:09 1756027089

"This section presents the environmental impact metrics for the Gemini Apps AI assistant" is this also not specific enough?

esperent · 2025-08-24T13:34:25 1756042465

No, because unless they state otherwise we should assume that they consider search overview to be an AI assistant (they definitely believe this) and also that it's one of the Gemini Apps.

Look, there's not enough information to answer this within the paper. I'm not willing to give Google the benefit of the doubt on vague language, and you are. I'm assuming they're a huge basicappy evil corporation whose every publication is gone over and reworded by marketing to make them look good, and you're assuming... whatever.

That's fine by me, we disagree. Let's stop here.