[Originally posted on X; edited and updated here.]
I spent several hours over a couple of days last week trying to get two AIs (first Grok, then Microsoft Copilot) to lay out the schedule for Writers Cantina. The task was very easy to define: there are four rooms and seven time slots for each of two days. I clearly defined constraints: A person can’t be in two places at once, certain events need to be in the rooms where A/V is available, certain people running things can’t both be occupied at the same time, etc.
Things would work out great, until I noticed and pointed out, “No, you’ve got this person in two events at the same time…” “No, you’ve got extra blanks in the schedule which means you’re leaving out events…” After every fix, the AI would report back and tell me that all rules had been followed and there were no conflicts, but a cursory check on my part would show that that was false.
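For what it’s worth, the “cursory check” described above is the easy part to mechanize. Here is a minimal sketch in Python, assuming the schedule is held as a dictionary keyed by day, slot, and room. The function name, the data layout, and every event and person below are invented for illustration; none of it comes from the real Writers Cantina schedule.

```python
from collections import defaultdict

def find_conflicts(schedule, all_events, av_events, av_rooms):
    """Return a list of rule violations; an empty list means the schedule is clean.

    schedule maps (day, slot, room) -> (event title, list of participants).
    This layout is an assumption made for the sketch.
    """
    problems = []
    busy = defaultdict(list)  # (day, slot, person) -> titles that person is in
    for (day, slot, room), (title, people) in schedule.items():
        # Rule: events that need A/V must be in a room that has it.
        if title in av_events and room not in av_rooms:
            problems.append(f"{title} needs A/V but is in {room}")
        for person in people:
            busy[(day, slot, person)].append(title)
    # Rule: nobody can be in two places at once.
    for (day, slot, person), titles in busy.items():
        if len(titles) > 1:
            problems.append(
                f"{person} is double-booked on day {day}, slot {slot}: {sorted(titles)}"
            )
    # Rule: every event has to appear somewhere.
    placed = {title for title, _ in schedule.values()}
    for title in sorted(set(all_events) - placed):
        problems.append(f"{title} was left off the schedule")
    return problems

# Made-up example data with one violation of each kind.
schedule = {
    (1, 1, "Room A"): ("Worldbuilding", ["Ann", "Bob"]),
    (1, 1, "Room B"): ("Editing Q&A", ["Bob"]),
    (1, 2, "Room B"): ("Slideshow", ["Cy"]),
}
problems = find_conflicts(
    schedule,
    all_events=["Worldbuilding", "Editing Q&A", "Slideshow", "Readings"],
    av_events={"Slideshow"},
    av_rooms={"Room A"},
)
for line in problems:
    print(line)
```

A checker like this doesn’t build the schedule, but it would at least settle any claim that “all rules have been followed” without a human having to eyeball the grid.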
This is exactly the kind of complex but rule-based task that a computer should excel at. The AIs’ inability to handle it, even after several clarifications and error-checks, is the equivalent of the Terminator not being able to negotiate stairs.
After wasting far too many hours trying to get either AI to follow all constraints at the same time (and stop telling me there were no conflicts in its output when there clearly were), I finally printed out all of the events on 3×5″ cards, cleared some space on the kitchen table, and in 30 minutes was able to do something that both AIs failed to do over the course of multiple hours.
I no longer worry that AI is going to take over the world; it’s simply too incompetent.
I now worry that people will trust the AIs to be able to do what they’re told without checking the output, and the entire world will come crashing down in a perfect storm of digital incompetence.
Indeed, while there are (probably) higher quality AI engines out there that could do a better job with your task, the ones available to the general public are definitely over-hyped. AI image generators I’ve used on occasion have the same problem: you can set the parameters and refine them all you want, but in the end the generator will do whatever it bloody well pleases and give you whatever the program’s algorithms tell it you should want rather than what you actually requested. This was also what happened when I went looking for a half-remembered television show from the 1990s that started with a corporate executive making a marriage proposal to a gal who’d come to chew him out for wronging her: the search engine’s AI came up with numerous different TV series’ names and even specific episodes in which it insisted what I’d described occurred, but none of them were even close to being a match.
In the end, I only managed to stumble across the right answer by checking back to the AI’s sources and finding a listicle of 1990s TV series that began with a wedding. The one that turned out to be what I remembered (as I’d only seen about half its pilot episode) was Ned & Stacey, which the AI had never even thought to suggest. With every one of the wrong answers it gave me, it insisted that yes, this was definitely what I was seeking even though the most cursory of examinations quickly made clear it wasn’t.
My conclusion from these experiences is that these AI engines are currently calibrated to be “yes-men” trying to tell us what we want to hear, not what we want to know. Grok and Microsoft Copilot were likewise telling you “Oh yes, we’ve got the kind of schedule you want right here!” while handing you something superficially made up to look like a proper schedule, but purely cosmetic and riddled with errors. The chance of any of these Potemkin intellects ever becoming self-aware and trying to take over the world Skynet-style is effectively nil.
Speaking of a Terminator being unable to negotiate the stairs, that’s pretty much what happens to the decidedly subpar enforcer bot ED-209 in the first Robocop movie: it “loses to a flight of stairs” as the Honest Trailer puts it. A lot of the other stuff in that movie (e.g. Detroit becoming a “bankrupt crime-ridden hellhole”) has also turned out similar in real life, albeit usually even worse than in the movie (e.g. instead of creating a cybernetic cop to enforce the law, governments and corporations alike have pretty much just given up on Detroit). Seems we’re in the Robocop future, not the Terminator one.
That being the case, the real danger of AI seems to be not that it’ll try to take over anything (it’s programmed to be far too sycophantic for that) but that people will try to outsource all their more tedious mental tasks to it and thereby get too intellectually lazy to think for themselves. There had already been some danger of this with the advent of search engines: children who grew up never knowing what life was like before the internet are increasingly unable to figure out what to do when they don’t have access to it, such as when the power goes out. (“Well, if the power went out, I’d ask Google to—” “No, the power’s out. You don’t have access to Google, remember?” “What?”) AI is merely accelerating that trend remarkably quickly now.
You and I grew up in an era before all of this, so when the AI falls down on the job, we can still do it ourselves. The generations after ours, unfortunately, are likely to get a lot more dependent on these things to the point that they may end up in something like a “The Machine Stops” kind of future that E.M. Forster envisioned. Just like in that story, they may end up saying (as the world crumbles around them with no one knowing how to repair anything) “The internet’s breaking down! Somebody’s got to do something about it!” “Oh yes, I’m sharing your post with all my followers, and we agree: somebody definitely should.” “We could ask it to fix itself…” “I tried that, and Grok said it would, but so far nothing’s happening.”
Honestly, this sounds like a task that would be better accomplished with a “dumb” program. The issue with the neural nets behind AI is that it’s hard-to-impossible to code hard and fast rules into them, and hard and fast rules are exactly what you need. Scheduling is not an easy problem (I believe that it’s NP-hard), but a conventional program is probably still a better bet than trying to do it with AI.
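To make the “dumb” program idea concrete, here is a rough sketch of plain backtracking search in Python. It tries each event in each slot and room, and backs up whenever a hard rule is broken. The data layout, the `solve` function, and all the event and person names are assumptions for illustration, not the actual convention data. It only handles two of the rules (no double-booking, A/V events in A/V rooms), and a real schedule would need more.

```python
def solve(events, slots, rooms, av_rooms):
    """Backtracking search over (slot, room) assignments.

    events maps title -> {"people": set of names, "needs_av": bool}.
    Returns {title: (slot, room)}, or None if no valid layout exists.
    """
    # Place the A/V events first, since they have the fewest options.
    titles = sorted(events, key=lambda t: not events[t]["needs_av"])
    placed = {}

    def ok(title, slot, room):
        if events[title]["needs_av"] and room not in av_rooms:
            return False
        for other, (s, r) in placed.items():
            if s != slot:
                continue
            # Same slot: the room must be free and no person may be shared.
            if r == room or events[title]["people"] & events[other]["people"]:
                return False
        return True

    def place(i):
        if i == len(titles):
            return True  # every event has a home
        for slot in slots:
            for room in rooms:
                if ok(titles[i], slot, room):
                    placed[titles[i]] = (slot, room)
                    if place(i + 1):
                        return True
                    del placed[titles[i]]  # undo and try the next option
        return False

    return dict(placed) if place(0) else None

# Made-up example: Bob is in two events, so they must land in different slots.
events = {
    "Slideshow": {"people": {"Cy"}, "needs_av": True},
    "Worldbuilding": {"people": {"Ann", "Bob"}, "needs_av": False},
    "Editing Q&A": {"people": {"Bob"}, "needs_av": False},
}
result = solve(events, slots=[1, 2], rooms=["Room A", "Room B"], av_rooms={"Room A"})
print(result)
```

The worst case is still exponential, which fits the NP-hard remark, but the rules are enforced by construction: the program either returns a layout that satisfies them or returns `None`, and it never claims success it hasn’t checked.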
As to the overall point, that the real threat from AI comes not from AI but from idiots believing that they can use it when they can’t, 100%. See the cases that keep coming up of lawyers including AI-hallucinated cases in their briefs. One day, one of those is going to get through undetected and potentially screw up the legal system.
The appeal of using an AI in this case was that it’s supposed to be able to extract that data from a document formatted for human use (in this case, an HTML table which listed the event title and the participants), and I’m supposed to be able to give the instructions in clear English instead of translating my instructions to a machine-readable format. Again, learning how to use a “dumb” program would have taken a lot longer than doing it myself with 3×5″ cards.
But you’re right: because the currently available consumer-facing AIs are Large Language Models, their strength lies in the ambiguity of qualitative communication; nobody seems to have thought to make them able to be strictly analytical too.