That's around 350,000 tokens in a day. I don't track my Claude/Codex usage, but Kilocode with the free Grok model does, and I'm using between 3.3M and 50M tokens in a day (plus additional usage in Claude + Codex + Mistral Vibe + Amp Coder).
I'm trying to imagine a use case where I'd want this. Maybe running some small coding task overnight? But it just doesn't seem very useful.
Essentially migrating codebases and implementing features, plus all the referencing of existing code, writing tests, and the various automation scripts needed to verify that the code changes are okay. Over 95% of those tokens are reads, since there's often a need for a lot of consistency and iteration.
It works pretty well if you’re not limited by a tight budget.
I only run small models (a 70B on my hardware gets me around 10-20 tokens/s) for random things (personal-assistant kind of thing), but not for coding tasks.
For coding-related tasks I consume 30-80M tokens per day, and I want something as fast as it gets.
Hard disagree. The difference in performance is not something you'll notice if you actually use these cards. In AI benchmarks the RTX 3090 beats the RTX 4080 SUPER, despite the latter having native BF16 support; memory bandwidth (936 GB/s on the 3090 vs 736 GB/s on the 4080 SUPER) plays a major role. Additionally, the 3090 is not only the last NVIDIA consumer card to support SLI.
It's also unbeatable in price-to-performance, as the next best 24 GB card would be the 4090, which, even used, is almost triple the price these days while only offering about 25-30% more performance in real-world AI workloads.
You can basically get an SLI-linked dual-3090 setup for less money than a single used 4090 and get about the same or even better performance, plus double the available VRAM.
If you run fp32, maybe, but no sane person does that. The tensor performance of the 3090 is also abysmal. If you run bf16 or fp8, stay away from obsolete cards. It's barely usable for LLMs and borderline garbage-tier for video and image gen.
> The tensor performance of the 3090 is also abysmal.
I for one compared my 50-series card's performance to my 3090 and didn't see "abysmal performance" on the older card at all. In fact, in actual real-world use (quantised models only, no one runs big fp32 models locally), the difference in performance isn't very noticeable at all. But I'm sure you'll be able to provide actual numbers (TTFT, TPS) to prove me wrong. I don't use diffusion models, so there might be a substantial difference there (I doubt it, though), but for LLMs I can tell you for a fact that you're just wrong.
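If anyone does want to put numbers on this, a rough way to measure TTFT and TPS is to stream a completion from a local OpenAI-compatible server (llama.cpp's server, vLLM, etc.) and time the chunks. The URL, model name, and the one-token-per-streamed-chunk approximation below are illustrative assumptions, not anything from this thread:

    # Rough TTFT/TPS measurement against a local OpenAI-compatible server.
    # Endpoint URL, model name, and "one SSE chunk ~= one token" are assumptions.
    import time
    import requests

    URL = "http://localhost:8080/v1/completions"   # hypothetical local endpoint
    PAYLOAD = {
        "model": "local-model",                    # placeholder model name
        "prompt": "Explain NVLink in one paragraph.",
        "max_tokens": 256,
        "stream": True,
    }

    start = time.perf_counter()
    first_token_at = None
    chunks = 0

    with requests.post(URL, json=PAYLOAD, stream=True, timeout=300) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line or not line.startswith(b"data: "):
                continue
            if line[6:] == b"[DONE]":
                break
            if first_token_at is None:
                first_token_at = time.perf_counter()
            chunks += 1
    end = time.perf_counter()

    if first_token_at is None or chunks < 2:
        print("no tokens received")
    else:
        print(f"TTFT: {(first_token_at - start) * 1000:.0f} ms")
        print(f"~TPS: {(chunks - 1) / (end - first_token_at):.1f}")

Run the same prompt with the same quantised model on both cards and the comparison is at least apples-to-apples.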
To be clear, we are not discussing small toy models, though to be fair I also don't use consumer cards. Benchmarks are out there (Phoronix, RunPod, Hugging Face, or NVIDIA's own presentations) and they show at least 2x at high precision and nearly 4x at low precision, which is comparable to the uplift I see on my 6000 cards. If you don't see the performance uplift everyone else sees, there is something wrong with your setup, and I don't have the time to debug it.
> To be clear, we are not discussing small toy models, though to be fair I also don't use consumer cards.
> If you don't see the performance uplift everyone else sees, there is something wrong with your setup, and I don't have the time to debug it.
Read these two statements and think about what might be the issue. I only run what you call "toy models" (good enough for my purposes), so of course your experience is fundamentally different from mine. Spending five figures on hardware just to run models locally is usually a bad investment. Repurposing old hardware, OTOH, is just fine for playing with local models and optimising them for specific applications and workflows.
Like others have mentioned, Blender has become quite the successful open-source story. They used to be riddled with bugs and UX issues, much like FreeCAD was. Yesterday FreeCAD released v1 of their software, and they seem to be on the same redemption path as Blender. It's too bad their v1 release didn't gain much traction on here, as more people ought to give FreeCAD another whirl. The improvements there are massive. And it's the only proper parametric CAD software available on Linux.
I would say avoid it. Blender is an excellent MESH modeler, but that puts it fundamentally at odds with being a good parametric modeler. A parametric modeler's base primitives are built from deformations of solid objects; mesh modelers are just vertices connected by line segments, where three form a face. Serviceable if you're just doing simple objects for a 3D printer, but disastrous if you need precision.
I don't understand why precision would be an issue? Is it not possible to fix the position of vertices to sub-micron precision?
I know that Blender is used more in the movie industry. But what if I wanted to make, say, an animation of some cartoon character that gets shredded in a gearbox? What program would I use?
A curve in a parametric CAD program has an internal representation which is perfectly smooth: rather than being a set of straight lines (edges) connected by vertices, it is a mathematical description of a curve with effectively infinite resolution.
For your animation example Blender would be the appropriate tool to use as you are doing stuff that requires flexibility of form rather than precision.
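To make the resolution point concrete, here's a toy sketch (plain math, not tied to Blender or any CAD kernel): a parametric cubic Bezier can be evaluated exactly at any parameter value after the fact, while a polyline baked from it can only linearly interpolate between its frozen vertices.

    # Toy comparison: a parametric cubic Bezier curve vs. a baked polyline.
    def bezier(p0, p1, p2, p3, t):
        """Evaluate a cubic Bezier curve exactly, for any t in [0, 1]."""
        u = 1.0 - t
        x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
        y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
        return (x, y)

    ctrl = [(0, 0), (1, 2), (3, 2), (4, 0)]

    # Parametric form: sample anywhere, at any precision, whenever you like.
    print(bezier(*ctrl, 0.123456789))

    # Mesh-style form: bake the curve into 8 straight segments once.
    polyline = [bezier(*ctrl, i / 8) for i in range(9)]

    # Between two baked vertices you can only interpolate linearly,
    # so the curvature that used to live there is gone for good.
    a, b = polyline[3], polyline[4]
    print("linear guess:", ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2))
    print("true curve:  ", bezier(*ctrl, 3.5 / 8))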
Yeah, somewhat. There's also the thing where mesh models can potentially have no thickness (e.g. a single polygon), as well as gaps in the mesh, whereas this is (usually) impossible in the case of a parametric model.
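The "gaps" part is easy to check mechanically: in a closed (watertight) triangle mesh every edge is shared by exactly two faces. A small illustrative check, with made-up face data:

    # Watertightness check: every edge of a closed triangle mesh must be
    # shared by exactly two faces. Faces are vertex-index triples; the
    # example data below is made up for illustration.
    from collections import Counter

    def open_edges(faces):
        edge_count = Counter()
        for a, b, c in faces:
            for e in ((a, b), (b, c), (c, a)):
                edge_count[tuple(sorted(e))] += 1
        return [e for e, n in edge_count.items() if n != 2]

    # A lone triangle: all three edges are boundary edges, so it has no
    # thickness and encloses no volume -- fine to render, useless as a solid.
    print(open_edges([(0, 1, 2)]))

    # A closed tetrahedron: no open edges, i.e. watertight.
    print(open_edges([(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]))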
Rigging. The assembly bits in FreeCAD just haven't been great historically, and the Ondsel assembly layer is very new. If you want to visually check for clashes I can see how someone might prefer to just import a bunch of STLs into Blender, rig them up, and wiggle them about.
I was thinking that since Blender has physics simulation and also does nice video renders, those would be two great reasons to use it for mechanical designs with moving parts, for example.
But I don't have much experience in designing parts. I like SolveSpace, but it becomes slow for medium/large designs. I know FreeCAD has a lot of problems with stability and UI consistency, so I avoided it.
FreeCAD has more rigorous simulation features - FEM/FEA, mechanical assembly, CAM path generation and simulation, and robotics to name a few - out of the box which makes sense as it’s for engineering rather than art, and there are additional addons for CFD and sheet metal available among many others.
The recent 1.0 update brought some major UI/UX improvements, though if you're coming from other software you'll find the Ribbon addon extremely helpful for getting comfortable. I think it gets a lot of over-the-top criticism given there are more people working on just the Autodesk CAD kernel than on the entirety of FreeCAD and its dependencies. The rate of improvement is gradually accelerating, and it's already a big jump from where it was a few years ago.
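For anyone curious what "parametric" buys you in practice, FreeCAD's built-in Python console makes it easy to see. A minimal sketch, assuming the stock Part workbench in FreeCAD 1.0 (the property names below are the standard ones, but worth double-checking against your version):

    # Minimal parametric-modelling sketch for FreeCAD's Python console.
    import FreeCAD as App

    doc = App.newDocument("demo")

    # A parametric box: geometry is driven by properties, not by vertices.
    box = doc.addObject("Part::Box", "Plate")
    box.Length = 100  # mm
    box.Width = 60
    box.Height = 5
    doc.recompute()

    # Editing a parameter regenerates the exact solid; there is no mesh to patch up.
    box.Height = 8
    doc.recompute()
    print(box.Shape.Volume)  # 100 * 60 * 8 = 48000 mm^3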
So Linux now officially supports RTOS capabilities, without patches, which is pretty cool. I wonder, realistically, how many applications that were originally designed around microcontrollers for real-time purposes could be migrated to Linux, which would vastly simplify development and lower its cost. And having the ability to use high-level languages like Python significantly lowers the barrier to entry. Obviously certain applications require the speed of an MCU without an operating system, but how many projects really don't need dedicated MCUs?
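For a sense of what the RT merge means for userspace, the usual pattern is to lock memory and request a SCHED_FIFO priority before entering a fixed-period loop. A hedged sketch in Python (a real control loop would more likely be C, and the priority, period, and ctypes call into libc are illustrative choices; this needs root or CAP_SYS_NICE):

    # Sketch of a user-space "real-time" task on a PREEMPT_RT kernel:
    # lock memory to avoid page faults, request SCHED_FIFO, run a periodic loop.
    import ctypes
    import os
    import time

    MCL_CURRENT, MCL_FUTURE = 1, 2  # from <sys/mman.h> on Linux

    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        raise OSError(ctypes.get_errno(), "mlockall failed")

    # FIFO real-time scheduling at priority 50 (valid range is 1-99 on Linux).
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))

    PERIOD_NS = 1_000_000  # 1 ms loop period (arbitrary example)
    next_wake = time.monotonic_ns()
    for _ in range(1000):
        # ... do the time-critical work here ...
        next_wake += PERIOD_NS
        delay = next_wake - time.monotonic_ns()
        if delay > 0:
            time.sleep(delay / 1e9)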
Unfortunately, migrating real-time stuff to Linux _doesn't_ necessarily reduce costs or simplify real-time development. I've been doing embedded development for 5+ years at a few companies, and embedded Linux is still a slog. I prefer a good MCU running Nim or another modern language. Heck, there's even MicroPython nowadays.
Especially for anything that needs to "just run" for multiple years. Linux means you must deal with a distro or something like Yocto or Buildroot, both of which have major pain points.
I would think the portability of, say, a Python application running on Linux is a nice benefit. Try switching from one MCU to a totally different one and you may have to start from scratch (e.g. try going from Microchip to STM.) Can you describe why embedded Linux is still a slog? And what do you think it would take for the issues to be addressed?
I thought we were talking about real-time applications, which I'm not sure Python is suited for (even with GC tuning). But if we're talking about the difficulty of changing MCU families (remember, STM32 is >1000 different chips), changing OS tooling is also difficult; even moving from Yocto to Buildroot can be a lot of pain on Linux.
MicroPython supports[1] PIC16 and SAMD21/SAMD51, STM32, ESP8266/ESP32 and more, but it also supports Zephyr as a target, and with it the platforms Zephyr supports[2].
So yeah not everything under the sun, but certainly not what I'd consider a "very small" number of MCUs.
Of course, the support level varies between platforms, but I imagine you're not going to be doing anything too fancy in MicroPython anyway.
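For reference, "not too fancy" on most of those ports looks something like this; the machine module is the common denominator, though the pin number is board-specific (2 is the onboard LED on many ESP32 dev boards):

    # Minimal MicroPython blink using the portable `machine` API.
    # Pin 2 is an assumption; adjust for your board.
    from machine import Pin
    import time

    led = Pin(2, Pin.OUT)

    while True:
        led.value(not led.value())  # toggle the LED
        time.sleep_ms(500)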
I think there's still a wide range of devices for which a bare-metal or smaller-RTOS approach is more cost-effective: anything simple enough that it doesn't need networking, a filesystem, or a display, for example. Especially considering bare-metal embedded work seems to pay less than development on Linux. But yes, embedded Linux can address a huge part of the market, and RT expands that a lot (though, of course, most people for whom that's a good option are already using it; it was a well-supported patchset for a long time).
An old-time hang glider pilot has been keeping his website up-to-date since the very early 2000s, complete with rotating gifs, hidden SEO keywords, index pages, photo albums, and more goodies!
Wow, the state of self-hosted photo libraries has gotten way better since I last settled on Photoprism a few years ago. Both Memories and Immich seem very polished. The timeline features look great. I may need to play around with these.