I got a personal Mac Studio M4 Max with 128GB RAM for a silent, relatively power-efficient yet powerful home server. It runs Ollama + Open WebUI with GPT-OSS 120b as well as GLM4.5-Air (default quantisations). I rarely ever use ChatGPT anymore. Love that all data stays at home. I connect remotely only via VPN (my phone enables this automatically via Tasker).
I'm 50% brainstorming ideas with it, asking critical questions and learning something new. The other half is actual development, where I describe very clearly what I know I'll need (usually as TODOs in comments) and it writes those snippets, which is my preferred way of AI assistance. I stay in the driver's seat; the model becomes the copilot. Human-in-the-loop and such. It's worked really well for my website development, other personal projects and even professionally (my work laptop has its own Open WebUI account for separation).
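To make that concrete, here's a hypothetical sketch (the function and its details are made up for illustration): the TODOs carry my pseudocode, and the model is asked to fill in only those marked spots.

```python
# Hypothetical example of the workflow: spell the logic out as TODOs first,
# then hand the function to the model and ask it to implement only the TODOs.

def import_posts(feed_url: str, last_seen: float) -> list[dict]:
    # TODO: fetch the RSS feed at feed_url and parse it into entries
    # TODO: drop entries with a published timestamp <= last_seen
    # TODO: normalise each entry to {"title": ..., "url": ..., "published": ...}
    # TODO: return the list sorted by "published", newest first
    raise NotImplementedError
```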
I like your method of adding TODOs to your code and then using a model – I'm going to try that. I only have a 32GB M2 Mac, so I have to use Ollama Cloud to run some of the larger models. That said, I'm surprised by what I can do ‘all local’, and it really is magical running everything on my own hardware when I can.
The TODOs really help me get my logic sorted out first in pseudocode. Glad to inspire someone else with it!
I've read that GPT-OSS:20b is still a very powerful model; I believe it fits in your Mac's RAM as well and should still be quite fast to output. For me personally, only the more complex questions require a better model than the local ones. And then I'm often wondering whether LLMs are the right tool to solve the complexity.
You might want to read "The Circle" if you haven't already. The reader gets to see an open-minded perspective of exactly this. Given your prior, I'd be curious what you think of it after reading.
I really like this analogy! Many real-world tasks that we'd like to use AI for seem infinitely more complex than can be captured in a simple question/prompt. The main challenge going forward, in my opinion, is how to let LLMs ask the right questions – query for the right information – given a task to perform. Tool use with MCPs might be a good start, though it still feels hacky to have to define custom tools for LLMs first, as opposed to how humans effectively browse and skim lots of documentation to find actually relevant bits.
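For what it's worth, this is roughly what defining one of those custom tools looks like with the Python MCP SDK's FastMCP helper, as far as I understand its API (the tool itself is a made-up example):

```python
# server.py - a minimal, hypothetical MCP server exposing one custom tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("docs-helper")

@mcp.tool()
def search_docs(query: str, max_results: int = 5) -> list[str]:
    """Return documentation snippets matching the query."""
    # Placeholder: a real server would query an index or the filesystem here.
    return [f"(stub result for {query!r})"] * max_results

if __name__ == "__main__":
    mcp.run()
```

Every tool has to be declared and typed up front like this, which is what makes it feel heavier than a human just skimming the docs for whatever turns out to be relevant.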
I've had this phone as my main device for half a year and am now sometimes using a Pixel 9 Pro Fold in ‘laptop’ folded mode. So far, neither device comes close to a proper keyboard for my typing speed. The F(x)tec was great, though, because you get all the special characters on tactile buttons; on the Fold I constantly need to check my keyboard to make sure I'm writing what I think I'm writing. And it's a shame that the space between the letters on the Gboard keyboard on the Fold remains unused, when it could've been a perfect mouse trackpad.
I think the ideal form factor for a proper development phone would be the Astro Slide (https://www.indiegogo.com/projects/astro-slide-5g-transforme...) – I haven't personally used it but I can imagine it's the smallest size possible for proper two-handed typing. The F(x)tec was a two-thumber instead.
You may be interested in the "binary step" activation function. This does what you're suggesting. In general, though, a network's capacity for complex behaviour really takes a hit when this is used as the activation function of a neuron (and I'm not sure which papers, if any, report metrics for it being used in transformer models).
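For reference, a minimal numpy sketch of the binary step (threshold at 0, the usual convention), plus the reason it tends to hurt training:

```python
import numpy as np

def binary_step(x: np.ndarray) -> np.ndarray:
    # 1 where the pre-activation is non-negative, 0 otherwise.
    return (x >= 0).astype(x.dtype)

x = np.array([-1.5, -0.1, 0.0, 0.2, 3.0])
print(binary_step(x))  # [0. 0. 1. 1. 1.]

# The derivative is 0 everywhere (and undefined at 0), so gradient-based
# training gets no signal through it - the usual explanation for the loss
# in expressive behaviour compared to ReLU-like activations.
```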
I've been in the same boat. Earlier this year I wanted to up my game as well, with an improved website and a blog to go along with it, while not even being sure which frameworks to use.
Using ChatGPT has been a hit-or-miss experience, helping me maybe 30% of the time as well. But when it did help, it helped me massively, especially for quickly setting up WordPress PHP designs and the accompanying CSS. I can honestly say I couldn't have done my web and blog redesign if it weren't for ChatGPT. Not because I wouldn't have been able to get the knowledge without it – more because I wouldn't have had the patience to figure all of this out in my spare time.
Using ChatGPT has certainly been a more fun experience than browsing documentation, but I did have to do the latter about half the time anyway.
This gives me the eerie feeling of a future in which we let AI do all the fun, creative things, freeing up the spare time we'd have spent learning to play guitar so that we have more hours to spend on work.
Shouldn't we aim for a future that is the exact opposite of this?
The short version (as I understand it) is that you use a neural network to weight pairs of inputs by their importance to each other. That lets the model downweight unimportant information while keeping what actually matters.
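If it helps, here's a minimal numpy sketch of scaled dot-product attention, which is the usual formalisation of that pairwise weighting (self-attention here, so queries, keys and values all come from the same inputs):

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    # Pairwise importance scores between every query and every key...
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # ...turned into weights that sum to 1 per query...
    weights = softmax(scores)
    # ...and used to mix the values, so unimportant inputs get tiny weight.
    return weights @ V

# Toy self-attention over 4 tokens with 8-dimensional representations.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # (4, 8)
```

In a real transformer, Q, K and V are learned linear projections of the inputs rather than the raw inputs themselves, but the weighting mechanism is the same.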