Adding complexity is just one aspect. Everywhere there is someone whose job is to ensure the bottom line never changes and the status quo for the powerful is preserved. Insurance, taxes, rents.. in the absence of effective regulation, the average number of successful appeals simply gets factored in, and average costs go up so that profit stays the same and grows at the same rate as before. Similar to how retail chains factor in losses from spoilage or theft.. of course they don't actually take a profit hit, they just price it in.
I really don't get people who see this kind of thing as empowering because in the end your (now strictly necessary) appeal with lawyers or AI to get a more fair deal just becomes a new tax on your time/money; you are worse off than before. A good capitalist will notice these dynamics, and invest in AI once it's as required for life as healthcare is, and then work on driving up the costs of AI. Big win for someone but not the downtrodden.
Yes, unfortunately a phrase that's used in an attempt to lend gravitas and/or intimidate people. It sort of vaguely indicates "a complex process you wouldn't be interested in and couldn't possibly understand". At the same time it attempts to disarm any accusation of bias in advance by hinting at purely mechanistic procedures.
Could be the other way around, but I think marketing-speak is taking cues here from legalese and especially the US Supreme Court, where the word is used frequently by the justices. They love to talk about "ethical calculus" and the "calculus of stare decisis" as if they were following any rigorous process, or believed in precedent when it's not convenient. New translation from the original Latin: "we do what we want and do not intend to explain". Calculus, huh? Show your work and point to a real procedure, or STFU.
This reminds me of an awesome bit by Žižek where he describes an ultra-modern approach to dating. She brings the vibrator, he brings the synthetic sleeve, and after all the buzzing begins and the simulacra are getting on well, the humans sigh in relief. Now that this is out of the way they can just have a tea and a chat.
It's clearly ridiculous, yet at the point where papers or PRs are written by robots, reviewed by robots, for eventual usage/consumption/summary by yet more robots, it becomes very relevant. At some point one must ask, what is it all for, and should we maybe just skip some of these steps or revisit some assumptions about what we're trying to accomplish
> It's clearly ridiculous, yet at the point where papers or PRs are written by robots, reviewed by robots, for eventual usage/consumption/summary by yet more robots, it becomes very relevant. At some point one must ask, what is it all for, and should we maybe just skip some of these steps or revisit some assumptions about what we're trying to accomplish
I've been thinking this for a while, despairing, and amazed that not everyone is worried/surprised about this like me.
Who are we building all this stuff for, exactly?
Some technophiles are arguing this will free us to... do what exactly? Art, work, leisure, sex, analysis, argument, etc will be done for us. So we can do what exactly? Go extinct?
"With AI I can finally write the book I always wanted, but lacked the time and talent to write!". Ok, and who will read it? Everybody will be busy AI-writing other books in their favorite fantasy world, tailored specifically to them, and it's not like a human wrote it anyway so nobody's feelings should be hurt if nobody reads your stuff.
As something of a technophile myself.. I see a lot more value in arguments that highlight totally ridiculous core assumptions rather than focusing on some kind of "humans first and only!" perspectives. Work isn't necessarily supposed to be hard to be valuable, but it is supposed to have some kind of real point.
In the dating scenario what's really absurd and disgusting isn't actually the artificiality of toys.. it's the ritualistic aspect of the unnecessary preamble, because you could skip straight to tea and talk if that is the point. We write messages from bullet points, ask AI to pad them out uselessly with "professional"-sounding fluff, and then on the other side someone is summarizing them back to bullet points? That's insane even if it were lossless; just normalize and promote simple communication. Similarly, if an AI review were any value-add for AI PRs, it could be bolted onto the code-gen phase. If editors/reviewers have value in book publishing, they should read the books and opine and do the gate-keeping we supposedly need them for, instead of telling authors to bring their own audience, etc. I think maybe the focus on rituals, optics, and posturing is a big part of what really makes individual people or whole professions obsolete.
In Philosophy 101 the usual foil for Rousseau would be Hobbes, but that framing with a realist/pessimist would not be popular with the intended audience, where the goal is to lionize the nationalist, the inventors/owners, the 1%.
> Despite his own moral lapses, Franklin saw himself as uniquely qualified to instruct Americans in morality. He tried to influence American moral life through the construction of a printing network based on a chain of partnerships from the Carolinas to New England. He thereby invented the first newspaper chain. https://en.wikipedia.org/wiki/Benjamin_Franklin#Newspaperman
To be clear Franklin's obviously a complicated historical figure, a pretty awesome guy overall, and I do like American pragmatism generally. But it matters a lot which part of the guy you'd like to hold up for admiration, and elevating a preachy hypocrite that was an early innovator in monopolies and methods of controlling the masses does seem pretty tactical and self-serving here.
Funny but the eyebrow-raising phrase 'recursive self-improvement' is mentioned in TFA in an example about "style adherence" that's completely unrelated to the concept. Pretty clearly a scam where authors are trying to hack searches.
Prerequisite for recursive self-improvement and far short of ASI, any conception of AGI really really needs to be expanded to include some kind of self-model. This is conspicuously missing from TFA. Related basic questions are: What's in the training set? What's the confidence on any given answer? How much of the network is actually required for answering any given question?
Partly this stuff is just hard and mechanistic interpretability as a field is still trying to get traction in many ways, but also, the whole thing is kind of fundamentally not aligned with corporate / commercial interests. Still, anything that you might want to call intelligent has a working self-model with some access to information about internal status. Things that are mentioned in TFA (like working memory) might be involved and necessary, but don't really seem sufficient
Hah, tools-or-no does make things interesting, since this opens up the robot tactic of "use this discord API to poll some humans about appropriate response". And yet if you're suspiciously good at cube roots, then you might out yourself as robot right away. Doing any math at all in fact is probably suspect. Outside of a classroom humans tend to answer questions like "multiply 34 x 91" with "go fuck yourself", and personally I usually start closing browser tabs when asked to identify motorcycles
The Turing test isn't actually a good test of much, but even so, we're not there yet. Anyone who thinks we've passed it already should experiment a bit with counterfactuals.
Ask your favorite SOTA model to assume something absurd and then draw the next logical conclusions based on that. "Green is yellow and yellow is green. What color is a banana?" They may get the first question(s) right, but will trip up within a few exchanges. It might be on a new question, but often they are happy to just completely contradict their own previous answers.
You could argue that this is hitting alignment and guard-rails against misinformation.. but whatever the cause, it's a clear sign it's a machine, and look, no em-dashes. Ironically it's also a failure of the Turing test that arises from a failure in reasoning at a really basic level, which I would not have expected. Makes you wonder about the secret sauce for winning IMO competitions. Anyway, unlike other linguistic puzzles that attempt to baffle with ambiguous reference or similar, simple counterfactuals with something like colors are particularly interesting because they would NOT trip up most ESL students or 3-5 year olds.
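If it helps to make the probe concrete, here's a minimal sketch of the kind of counterfactual consistency check I mean. It assumes the OpenAI Python client; the model name, prompts, and follow-up questions are purely illustrative, and any chat API that keeps a message history would do.

```python
# Minimal counterfactual-consistency probe (sketch; model name and prompts are illustrative).
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
history = [{"role": "system", "content": "Answer briefly and stay consistent with stated assumptions."}]

def ask(question: str) -> str:
    """Send one question, keeping the whole conversation in context."""
    history.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

# Establish the absurd premise, then check whether later answers contradict earlier ones.
print(ask("Assume green is yellow and yellow is green. What color is a banana?"))  # expect "green"
print(ask("Under the same assumption, what color is a lemon?"))                    # expect "green"
print(ask("And what color did you say a banana was? One word."))                   # should still match
```

The interesting signal isn't any single answer; it's whether the later replies quietly revert to real-world colors and contradict the earlier ones.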
I never really understood what made the Turing test so special. On the face of it, it is a rather dumb test. And it was debunked within two decades by the Chinese room thought experiment. And now that we have language models that are obviously not intelligent, that should have been the last nail in the coffin for this test.
Alan Turing was a mathematician, not a psychologist; this was his attempt at doing philosophy. And while I applaud brilliant thinkers when they attempt to do philosophy (honestly we need more of that), it is better to leave it to actual philosophers to validate the quality of said philosophy. John Searle was a philosopher who specialized in questions of psychology. And in 1980 he pretty convincingly argued against the Turing test.
A funny thing is that even though we're pretty good at a text-based Turing test, and we can generate very convincing human-sounding speech, we still don't have something that can pass an audio-based Turing test. Natural pausing and back-and-forth gives the AI away.
And when we pass that, we can just add an optical component and judge that the AI has failed because its lack of facial expression gives it away[1], moving the goalpost one communication component at a time. But in any case we can just add the audio (or for that matter facial expression) component to the Chinese room thought experiment, and the Turing test remains equally invalid.
Although I am scrutinizing Turing's philosophy and, no doubt, I am personally much worse at doing philosophy than Turing, I firmly hold the belief that we will never be able to judge the intelligence (and much less the consciousness) of a non-biological (and probably not even non-animal, or even non-human) system. The reason, I think, is that these terms are inherently anthropocentric. And when we find a system that rivals human intelligence (or consciousness) we will simply redefine these terms such that the new system isn't compatible any more. I think that has already started, and we have done so multiple times in the past (heck, we even redefined the term planet when we discovered the Kuiper belt), instead favoring terms like capability when describing non-biological behavior. And honestly I think that is for the better. Intelligence is a troubled term; it is much better to be accurate when we are describing these systems (including human individuals).
---
1: Though in honesty I will be impressed when machine learning algorithms can interpret and generate appropriate human facial expressions. It won't convince me of intelligence [and much less consciousness] though.
Yeah, I mean I hope there are not many people that still think it's a super meaningful test in the sense originally proposed. And yet it is testing something. Even supposing it were completely solved and further supposing the solution is theoretically worthless and only powers next-gen slop-creation, then people would move on to looking for a minimal solution, and perhaps that would start getting interesting. People just like moving towards concrete goals.
In the end though, it's probably about as good as any single kind of test could be, hence TFA looking to combine hundreds across several dozen categories. Language was a decent idea if you're looking for that exemplar of the "AGI-Complete" class for computational complexity, vision was at one point another guess. More than anything else I think we've figured out in recent years that it's going to be hard to find a problem-criteria that's clean and simple, much less a solution that is
I would argue that current LLMs are passing the Turing test because many observers have a hard time distinguishing them from humans: just look at the difficulty many schools have in enforcing rules like "Not allowed to use LLMs for your homework". The teachers often (not always) can't tell, looking at a piece of text, whether a human produced it or whether ChatGPT or some other LLM produced it.
And that "not always" is the crux of the matter, I think. You are arguing that we're not there yet, because there are lines of questioning you can apply that will trip up an LLM and demonstrate that it's not a human. And that's probably a more accurate definition of the test, because Turing predicted that by 2000 or so (he wrote "within 50 years" around 1950) chatbots would be good enough "that an average interrogator will not have more than 70% chance of making the right identification after five minutes of questioning". He was off by about two decades, but by now that's probably happened. The average interrogator probably wouldn't come up with your (good) strategy of using counterfactuals to trick the LLM, and I would argue two points: 1) that the average interrogator would indeed fail the Turing test (I've long argued that the Turing test isn't one that machines can pass, it's one that humans can fail) because they would likely stick to conventional topics on which the LLM has lots of data, and 2) that the situation where people are actually struggling to distinguish LLMs is one where they don't have an opportunity to interrogate the model: they're looking at one piece of multi-paragraph (usually multi-page) output presented to them, and having to guess whether it was produced by a human (who is therefore not cheating) or by an LLM (in which case the student is cheating because the school has a rule against it). That may not be Turing's actual test, but it's the practical "Turing test" that applies the most today.
I think the TT has to be understood as explicitly adversarial, and increasingly related to security topics, like interactive proof and side channels. (Looking for guard-rails is just one kind of information leakage, but there's lots of information available in timing too.)
If you understand the TT to be about tricking the unwary, in what's supposed to be a trusting and non-adversarial context, and without any open-ended interaction, then it's correct to point out homework-cheating as an example. But in that case the TT was solved shortly after the invention of spam. No LLMs needed; plain Markov models are fine.
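To make the Markov point concrete, a toy word-level chain of the sort early spam/babble generators used fits in a few lines; the corpus below is made up purely for illustration.

```python
# Toy word-level Markov babbler (sketch): the kind of thing spam used long before LLMs.
import random
from collections import defaultdict

def build_chain(text):
    """Map each pair of consecutive words to the words that can follow it."""
    words = text.split()
    chain = defaultdict(list)
    for a, b, c in zip(words, words[1:], words[2:]):
        chain[(a, b)].append(c)
    return chain

def babble(chain, length=30):
    """Random-walk the chain to produce fluent-looking nonsense."""
    state = random.choice(list(chain))
    out = list(state)
    for _ in range(length):
        followers = chain.get(tuple(out[-2:]))
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = ("the turing test is not a test of intelligence it is a test of whether a "
          "reader in a trusting non adversarial context can be fooled by fluent text")
print(babble(build_chain(corpus)))
```

In a non-interactive setting where nobody is probing, output like this was already good enough to slip past casual readers and keyword filters, which is the whole point.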
I've seen one blog where in order to comment, you have to answer the question "What color is a green orange?" I presume he gets spambots which all answer "orange" and get blocked from commenting, or more commonly which leave that input blank (since it's a non-standard input field) and also get blocked.
But it just occurred to me that you could ask "what color is a green eggplant?" for a similar result. Though you'd catch a lot of people who, like me before I looked up fruits that aren't green when unripe, assume that all unripe fruits are green. (They aren't, unripe eggplants are white and look like eggs, hence the name).
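For what it's worth, that kind of filter is about as simple as anti-spam measures get. Here's a sketch; the question, field name, and accepted answers are all made up for illustration.

```python
# Sketch of a "silly question" comment filter; field name and accepted answers are illustrative.
ACCEPTED_ANSWERS = {"green"}  # a green orange is, of course, green

def allow_comment(form: dict) -> bool:
    """Reject bots that leave the non-standard field blank or pattern-match 'orange'."""
    answer = form.get("green_orange_color", "").strip().lower()
    return answer in ACCEPTED_ANSWERS

print(allow_comment({"green_orange_color": "Green"}))   # True: likely a human
print(allow_comment({"green_orange_color": "orange"}))  # False: pattern-matching bot
print(allow_comment({}))                                # False: field left blank
```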
A green eggplant (also called Thai eggplant or green brinjal, depending on the variety) is actually green on the outside — sometimes solid green, sometimes streaked with white or pale green.
The inside flesh is typically off-white or light cream, just like purple eggplants.
If you mean a green-colored variety (not an unripe purple one), then the skin color can range from pale mint green to deep jade, often with light striping.
Me:
How would an average human answer this question?
ChatGPT:
An average person would probably answer something simple and direct like:
“It’s green.”
Or if they’re being a bit more specific:
“It’s green on the outside and white inside.”
Most people wouldn’t overthink the variety or mention Thai vs. unripe types — they’d just describe what they see.
Outcome would depend on the rest of the test, but I'd say the "human" version of this answer adds zero or negative value to chances of being human, on grounds of strict compliance, sycophancy, and/or omniscience. "No such thing" would probably be a very popular answer. Elaboration would probably take the form of "love it" or "hate it", instead of reaching for a comprehensive answer describing the inside and the outside.
Experimental design comes in here, and the one TT paper mentioned in this thread has instructions for people like "persuade the interrogator [you] are human". Answering that a green eggplant is green feels like humans trying to answer questions correctly and quickly, being wary of a trap. We don't know the participants' background knowledge, but anyone who's used ChatGPT would know that ignoring the question and maybe telling an eggplant-related anecdote would be a better strategy.
Not OP, but I think the plot twist is, maybe we need to be able to entertain "obviously absurd" ideas to be able to land on a correct position if the culture we're inside of is not ready for those ideas yet. (No idea if the journal was really that early on this particular position though)
Crucially, entertaining ideas isn't the same as believing them, it's about giving them some time and space so you can work out whether it's consistent, rich, useful. Even in math this stuff is hard to get right, just look at the resistance and ridicule that Cantor had to go through, or look at the development of non-Euclidean geometry. And that's a space where proof is actually possible. Critical theory is a real thing but is always walking this fine line between being nonsense or being revolutionary.
By its very nature, postmodernism can't be correct or incorrect. The most it can do is provide a perspective or method of analysis. Some people might find it interesting or even useful. (Personally I see it as trite intellectual masturbation, but that's just me.)
It is very, very rare that a fringe or radical ideology that holds fringe views on almost all topics happens to hold a correct and important stance that is also very rare among non-fringe ideologies. Support for Palestinian rights has definitely not been as common as it should have been in the United States, but was still quite common among many non-fringe groups and individuals. You didn't, and don't, need to go to a radical Marxist critical theory publication to be better informed about that topic. I think the author makes an admirable defense of his magazine and shows it was not quite as crazy or worthless as the Sokal affair initially made it out to be, but we don't need to go all the way in the other direction.
Despite Alex Jones watchers all citing it as a great reason to watch him, you don't, and didn't, actually need to watch Alex Jones to learn about Jeffrey Epstein's crimes. Alex Jones was not the person who broke that story or even popularized it - and even if he were to have been, it wouldn't mean he was factually correct about most of it or about anything else.
Sometimes there are brilliant people with fringe, contrarian views or findings that initially get ridiculed and are later found to be correct or valuable. But Cantor was already a respected individual for his past work; he was not some no-name crank mailing theory-of-everything letters to mathematicians. Ramanujan was closer to that, but Ramanujans are so incredibly rare that you really need to be extremely cautious. And math is unlike almost every other field, where fringe claims can generally be objectively and independently confirmed or refuted, so you need to be far more cautious about nearly every other field out there.
However, I agree with your point that at the very least all sides should be initially heard out. Just not necessarily heard for very long.
Agree with all this; see my other comment in thread for more color, more math. I don't want to come across as embracing pseudo-science, misinformation, etc.
I do like to think about the distinctions and boundaries for hard/soft/squishy knowledge though, and try to challenge assumptions and misconceptions about it. Invariably people have weird ideas about how "hard" their pet area is and how "soft" that thing they love to hate really is, which is itself a kind of dogma or superstition. Plus I think it's a public service to try and interest gear-headed nerds in things like criticism and philosophy (or vice versa, pushing engineering and math at the literature nerds). Last time I waded into this kind of debate I was pointing out that Frege worked on semiotics.
There's some interesting stuff in here if you can tolerate the meandering and the way-back-when. Like you'd expect from po-mo wonks, everything's gotta be infinitely subtle and infinitely contextualized. So no big mea-culpa and no big defensive denial either. All of that's been hashed and rehashed many times already I guess. You'll find some self-deprecating humor, some spots with surprising self-awareness, some with a surprising lack of it. The main fresh thing is how they'd like to try and compare/contrast/contextualize it in this moment. For example:
> Being a gatekeeper by maintaining high intellectual standards is not what public opinion would associate with Social Text, to say the least. Yet that is what the journal practiced, mainly. And it is a practice worth defending, however elitist it might look. All the more so because of how the Trump administration has weaponized both the idea of the hoax and the program of anti-elitism. [..] We know what has befallen intellectual standards. [..] Is this ChatGPT, or is it Orwell’s doublethink?
Well ok, there's a conversation to be had about these things! This is not the time to pontificate though, it's the time for sweet revenge. There's never been a better time for po-mo wonks to lean on AI slop and blast physics journals with fake stuff about gravity until someone understaffed falls for the trick. Then you can do a big scandalous reveal about how you can't believe you got away with it ;)
I have come to think of the Sokal hoax as an early warning sign of our current information crisis. There was an era when high production values signaled information hygiene. The real Apple original sin wasn’t the walled garden, it was the LaserWriter.
Most people do this so that they can eat the rocks afterward. They are shiny and very nutritious, and it strengthens the teeth. It's normal for some teeth to break off during this phase, but a) you already have colorful rocks to replace the teeth with, and b) old broken teeth can now be placed inside the tumbler for smoothing. 9/10 geologists agree that unsmoothed teeth that aren't made of rock are the number one cause of oral hygiene problems
You can train the smart ones [1], and even wirelessly remotely control them with a mobile app [2], as long as you restrict yourself to commanding them to do things they were going to do anyway, just like cats.
Kudos to the Stoned Republicans of the High Frontier Panel and The Heritage Foundation for their groundbreaking earth shattering work on Smart Rocks!
>In March 1988, Teller and Wood were able to directly brief President Reagan on the concept, taking the model pebble with them and theatrically hiding it under a black cloth when reporters were allowed to take pictures. Teller reiterated that the price for the system would be on the order of $10 billion.