crvdgc's comments | Hacker News

Checking the validity of a given proof is deterministic, but filling in the proof in the first place is hard.

It's like chess: checking who wins for a given board state is easy, but coming up with the next move is hard.

Of course, one can try all possible moves and see what happens. Just as chess AI can be built on search methods (e.g. minimax), there are proof-search methods. See the related work section of the paper.
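For anyone who hasn't seen it spelled out, this is roughly what that kind of exhaustive search looks like; a minimal minimax sketch in Python over a toy game tree (my own illustration, nothing from the paper). Proof search replaces "moves" with inference steps and "score" with whether the goal gets closed.

    # Leaves carry scores for the maximizing player; internal nodes are lists of subtrees.
    def minimax(node, maximizing=True):
        if isinstance(node, (int, float)):   # leaf: static evaluation
            return node
        children = (minimax(child, not maximizing) for child in node)
        return max(children) if maximizing else min(children)

    # The maximizing player picks the branch with the best guaranteed outcome.
    tree = [[3, 5], [2, 9]]
    print(minimax(tree))  # -> 3: the best score achievable against optimal opposition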


who likely wins, fify


> imagine a folder full of skills that covers tasks like the following:

> Where to get US census data from and how to understand its structure

Reminds me of the first time I used Wolfram Alpha and got blown away by its ability to use actual structured tools to solve the problem, compared to a normal search engine.

In fact, I tried again just now and am still amazed: https://www.wolframalpha.com/input?i=what%27s+the+total+popu...

I think my mental model for Skills would be Wolfram Alpha with custom extensions.


Clicking your link opened the following query on Wolfram Alpha for me: `what%27s the total population of the United States%3F`

Funnily enough, this was the result: `6.1% mod 3 °F (degrees Fahrenheit) (2015-2019 American Community Survey 5-year estimates)`

I wonder how that was calculated...
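My guess is that the apostrophe and question mark stayed percent-encoded instead of being decoded before the query was parsed, which would explain the nonsense units. For reference (my own illustration, in Python):

    from urllib.parse import unquote

    print(unquote("what%27s the total population of the United States%3F"))
    # -> what's the total population of the United States?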


Wolfram Alpha never took input in such natural language. But something like population(USA) and many variations thereof work.


tbh Wolfram Alpha was the craziest thing ever. I haven't done much research on how it was implemented back in the day, but achieving what they did for such complex mathematical problems without AI was kind of nuts.


Wolfram Alpha is AI. It's just not an LLM. AI has been a thing since the 1950s. LLMs will probably also become "not AI" in a few years.


I doubt that anyone outside the industry or enthusiast circles would notice if the underlying parts changed. How many people know what kind of engine is in their car? I stomp on the floor of my Corolla and away we go! Others might know that their Dodge Challenger has a Hemi. What even is that? Thankfully we have the Internet these days, and someone who's interested can just select the word and right-click to Google for the Wikipedia article. AI is just such an undefined term colloquially that any attempt to define it will be wrong.


Not sure why you’re getting downvoted. The marketing that LLM=AI seems to have been interpreted as “_only_ LLM=AI”


I think the difference now is that traditional software ultimately comes down to a long series of if/then statements (as did the old AI systems like Wolfram), whereas the new AI (mainly LLMs) takes a fundamentally different approach.


Look into something like Prolog (~50 years old) to see how systems can be built from rules rather than if/else statements. It wasn't all imperative programming before LLMs.

If you mean that it all breaks down to if/else at some level then, yeah, but that goes for LLMs too. LLMs aren't the quantum leap people seem to think they are.
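To make the contrast concrete, here's a toy forward-chaining rule engine (my own illustration in Python, not how Prolog actually works under the hood): the program is a set of facts plus a rule, and the engine derives new facts until nothing changes, instead of the programmer spelling out the control flow.

    # Facts are tuples; the rule derives grandparent(X, Z) from parent(X, Y) and parent(Y, Z).
    facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

    def grandparent_rule(facts):
        parents = [f for f in facts if f[0] == "parent"]
        return {("grandparent", x, z)
                for (_, x, y1) in parents
                for (_, y2, z) in parents
                if y1 == y2}

    # Forward chaining: apply the rule until a fixed point is reached.
    while True:
        new = grandparent_rule(facts) - facts
        if not new:
            break
        facts |= new

    print(("grandparent", "alice", "carol") in facts)  # -> True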


They are from the user POV. Not necessarily in a good way.

The whole point of algorithmic AI was that it was deterministic and - if the algorithm was correct - reliable.

I don't think anyone expected that soft/statistical linguistic/dimensional reasoning would be used as a substitute for hard logic.

It has its uses, but it's still a poor fit for many problems.


Yeah, the result is pretty cool. It's probably how it felt to eat pizza for the first time. People had been grinding grass seeds into flour, mixing with water and putting it on hot stones for millennia. Meanwhile others had been boiling fruits into pulp and figuring out how to make milk curdle in just the right way. Bring all of that together and, boom, you have the most popular food in the world.

We're still at the stage of eating pizza for the first time. It'll take a little while to remember that you can do other things with bread and wheat, or even other foods entirely.


Maybe not on their own, but having enough computing power to use LLMs the way we do now, and actually using them, is quite a leap.


You're talking about non-deterministic algorithms, which, yes, are often associated with AI but existed way before LLMs.


It is basically another take on Lisp, and the development approach Lisp Machines had, repackaged in a more friendly syntax.

Lisp was the AI language until the first AI winter took place, which took Prolog down alongside it.

Wolfram Alpha basically builds on them, to put it in a very simplistic way.


It's one of the only M-expression versions of Lisp. All the weird stuff about Wolfram Language suddenly made sense when I saw it through that lens.


Would really like something self-hosted that does the basic Wolfram Alpha math things.

Doesn't need the craziest math capability, but standard symbolic math stuff like expression simplification, differentiation and integration of common functions, plotting, and unit wrangling.

All with an easy to use text interface that doesn't require learning.
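For the symbolic part, SymPy (mentioned further down the thread) already covers most of this from a plain Python prompt; a quick sketch of the kind of thing I mean, assuming a standard SymPy install:

    from sympy import symbols, simplify, diff, integrate, sin, exp

    x = symbols("x")

    print(simplify((x**2 - 1) / (x - 1)))  # expression simplification -> x + 1
    print(diff(sin(x) * exp(x), x))        # differentiation
    print(integrate(x * exp(x), x))        # integration -> (x - 1)*exp(x)
    # Plotting (sympy.plotting) and unit handling (sympy.physics.units) are also built in.

It's not the zero-learning-curve text interface you're describing, but the core capabilities are there and self-hostable.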


Try Maxima, it's open source:

https://maxima.sourceforge.io/

I used it when it was called Macsyma running on TOPS-20 (and a PDP-10 / Decsystem-20).

Text interface will require a little learning, but not much.


Maxima is amazing and has a GUI. My only beef with it is it doesn't show its work step by step.


That's Wolfram Mathematica.


Personal faves:

- Mathematica

- Maple

- MathStudio (mobile)

- TI-89 calculator (high school favorite)

Others:

- SageMath

- GNU Octave

- SymPy

- Maxima

- Mathcad


The TI-89 has surprisingly good symbolic tools and solvers for something that runs all year on a single set of AAA batteries. Feels like magic alien tech.


I used it a lot for calc since it would show you how it got the answer, if I remember right. I also liked how it understands symbols, which is obvious but cool: you can paste an integral sign right in there.


> without AI

We only call it AI until we understand it.

Once we understand LLMs more and there's a new promising poorly understood technology, we'll call our current AI something more computer sciency


My favorite definition of AI: "AI is whatever hasn't been done yet." - Larry Tesler, https://en.wikipedia.org/wiki/AI_effect


Thank you for being honest.


Some Chinese-language sources claim that it's a reaction to the Pakistan-US rare earth deal.

My pet theory is that this is intended as an attack on the concept of long-arm jurisdiction itself, because:

1. This is China's first-ever long-arm jurisdiction policy.

2. Diplomatically, China usually advocates for the total sovereignty of each country within its own borders.

3. The recent chip entity list has been a huge headache.

4. Notice how the language mirrors the US justification for the chip restriction: dual use, national security.


> Some Chinese-language sources claim that it's a reaction to the Pakistan-US rare earth deal.

Maybe they approached India with a deal that was too lopsided in favour of the US for India to accept, so the US did the show-and-tell cozying up to Pakistan to get a better deal while publicly shitting on India? Just follow the money?


Xi is getting China ready to attack Taiwan in 2026 or 2027, and the now-mutual unwinding of economic relations between the US and China is underway. Still frenemies at this point, but Trump is aiming for full enemy status sooner because it causes media drama and draws attention to him. The US will be screwed because domestic production takes years to ramp up, and it has lost most of its machine-tool suppliers, knowledge, and workers. Manufacturing productivity is essential for any sort of war, as evidenced by the history of the American Civil War and WWII.

If the US doesn't impeach and remove Trump and Vance, and get a real wartime leader who isn't a celebrity reality star ASAP, it will be doomed: China will rapidly seize Taiwan, disrupt Western chip production, plunge the West into an economic armageddon, and likely widen the conflict into a war with Japan, which would definitely intervene militarily to defend the economic and technological resources in Taiwan. No more incompetent, self-destructive, corrupt, ideologue chaos can be tolerated.


From the title I thought they had solved math! It turns out to be a framework for using SMT solvers as decision procedures in proofs. For additional types, you still need to write the bridging part. Interesting nonetheless.
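For anyone who hasn't used an SMT solver directly, here's a tiny Z3 example via its Python bindings (my own illustration, not the framework from the post): you hand the solver a decidable goal and it answers sat/unsat, instead of you constructing the proof by hand.

    from z3 import Ints, Solver, unsat

    x, y = Ints("x y")
    s = Solver()
    # To prove "x > 0 and y > 0 implies x + y > 0", assert its negation and check.
    s.add(x > 0, y > 0, x + y <= 0)
    print(s.check() == unsat)  # -> True: the negation is unsatisfiable, so the claim holds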


same


Nice. When using OpenAI Codex CLI, I find the /compact command very useful for large tasks. In a way it's similar to the context editing tool. Maybe I can ask it to use a dedicated directory to simulate the memory tool.


Claude Code already compacts automatically.


I believe Codex CLI also auto-compacts when the context limit is reached, but in addition to that, you can manually issue a /compact command at any time.


Claude Code has had this /compact command for a long time; you can even specify your preferences for compaction after the slash command. But it's quite limited, and to get the best results out of your agent you need to do more than rely on how the tool decides to prune your context. I ask it explicitly to write down the important parts of our conversation into an md file, and I review and iterate on the doc until I'm happy with it. Then I /clear the context and give it instructions to continue based on the md doc.


Codex was only explicit last time I checked


CC also has the same `/compact` command if you want to force it


/compact accepts parameters, so you can tell it to focus on something specific when compacting.


/clear too


Duolingo is useful, but not efficient. When people say "I want to learn a language", they often mean "I want to learn this language efficiently", e.g. to be able to write an essay, as the post describes, after a realistic period of time.

I personally don't believe its pedagogical deficiency is mere incompetence. The whole business model is to keep you on the platform as long as possible, so why would they make you learn faster rather than just enough to keep you there?

As a former long-time user, I observed a lot of mechanism changes that bear out this observation.


Yep, that's exactly what happened in China. Once the tech giants got away with 996 unpunished, every employer wanted it as well.


Very impressive that it can reproduce a Mandarin accent when speaking English and an English accent when speaking Mandarin.


A recent real-life example:

You can apply for an HSBC Global Money Account if you have: […] The HSBC UK Mobile Banking app (Global Money is only available via the app)

From https://www.hsbc.co.uk/current-accounts/products/global-mone...


> 10x productivity means ten times the outcomes, not ten times the lines of code. This means what you used to ship in a quarter you now ship in a week and a half.

This assumes the acceleration applies to all tasks. Amdahl's law says the overall speedup is limited by the fraction of the work that is actually accelerated. Probably it's just unclear whether "engineer" or "productivity" refers to the programming part or the overall process.
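A back-of-the-envelope sketch (the numbers are made up for illustration): if coding is half of the end-to-end work and gets 10x faster, the overall speedup is nowhere near 10x.

    # Amdahl's law: overall speedup = 1 / ((1 - p) + p / s)
    # p = fraction of the work that is accelerated, s = speedup on that fraction.
    p, s = 0.5, 10.0               # assume coding is 50% of the process and gets 10x faster
    print(1 / ((1 - p) + p / s))   # -> ~1.82x overall, not 10x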

