Opinion: The Turing test is about us, not the bots, and it has failed.
Fans of the slow-burning mainstream media turnaround had a treat this past week.
On Saturday, news broke that Blake Lemoine, a Google engineer tasked with monitoring a chatbot called LaMDA for malice, was placed on paid leave for revealing confidential information.
Lemoine did go public, but instead of revealing something useful like Google's messaging strategy (a trade secret if ever there was one), he claimed that LaMDA was sentient.
Armed with a transcript in which LaMDA did indeed claim sentience, and with claims that it had passed the Turing test, Lemoine was heaven-sent to the media as a tech whistleblower. By the time the news reached BBC radio on Sunday night, it was being reported as an event of some importance.
On Twitter, with its large and active AI R&D community, the claim was demolished within hours – but who trusts Twitter?
A few days later, the story was still running, but now journalists had brought in expert commentary via a handful of academics with the usual reservations about venturing firm opinions.
On balance, no, LaMDA probably wasn't sentient but, you know, it's a fascinating area to discuss.
Finally, as the story slid off the radar at the end of the week, the few outlets still covering it found better experts who were, presumably, as exasperated as the rest of us. Was LaMDA sentient? Not. Absolutely not. And you won't find anyone in AI who thinks otherwise. Yet the conversation still revolved around sentience, rather than anything more interesting.
Nobody asked, say, why Google still has to use humans to check its chatbot's output for hate speech – but at least we were back on the planet.
For future reference, and to save everyone time, here's the killer tell that a story is android paranoia: citing "the Turing test" as a touchstone of sentience. It is not.
It never was. Turing proposed it in a 1950 paper as a way to sidestep the question "Can machines think?" He sensibly characterized that question as unanswerable until you know what thought is. We didn't know then. We don't know now.
Instead, the test – can a machine carry on a convincing human conversation? – was framed as a thought experiment with which to probe arguments that machine intelligence was impossible. It tests human perceptions and misconceptions; but, like Google's "quantum supremacy" claims, the test itself is tautological: passing the test just means the test has been passed. By itself, it proves nothing more.
Take a hungry Labrador – which is to say, any Labrador neither asleep nor dead – that is aware of the possibility of food.
An animal with a prodigious and insatiable appetite, the Labrador puts on a superb show of deep desire and immense unrequited need at the mere hint of available calories. Does this reflect an altered cognitive state analogous to that of the love-sick human adolescent it so closely resembles? Or is it a learned behavior that converts emotional blackmail into tidbits? We may think we know, but without a much broader context we cannot be sure; we can only be credulous. Passing the Lab test means the dog gets fed. By itself, nothing more.
The first system to pass the Turing test – in spirit, if not to the letter of the various versions Turing proposed – was itself an investigation into the psychology of human-machine interaction. ELIZA, the progenitor chatbot, was a 1966 program by MIT computer scientist Joseph Weizenbaum. It was designed to crudely mimic the therapeutic practice of reflecting a patient's statements back at them as questions.
“I want to kill my editor.”
“Why do you want to kill your editor?”
“He keeps making me meet deadlines.”
“Why don’t you like meeting deadlines?” and so on.
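ELIZA's trick was little more than pattern matching plus pronoun reflection. A minimal sketch of the technique in Python – the rules and names here are illustrative only, not Weizenbaum's actual script, which was written in MAD-SLIP – might look like:

```python
import re

# Illustrative pronoun swaps so the echo reads naturally
# ("my editor" becomes "your editor"). ELIZA's real scripts
# were far richer; this is the bare idea only.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

def reflect(fragment: str) -> str:
    """Swap first person for second person, word by word."""
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def respond(statement: str) -> str:
    """Echo a matched statement back as a question, or deflect."""
    m = re.match(r"i want to (.*?)\.?$", statement, re.IGNORECASE)
    if m:
        return f"Why do you want to {reflect(m.group(1))}?"
    return "Tell me more."

# respond("I want to kill my editor.")
#   -> "Why do you want to kill your editor?"
```

The point is how little machinery is needed: a handful of templates and word swaps were enough to convince some users they were talking to something intelligent.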
Famously, Weizenbaum was surprised when his secretary, one of the first test subjects, imbued it with intelligence and asked to be left alone with the terminal.
Google's chatbot is a distant descendant of ELIZA, fed vast amounts of text scraped from the internet and turned into language models by machine learning. It is an automated method actor.
A human actor who can't add up could play Turing more convincingly – but quiz them about the decision problem and you will soon discover they are not the man himself. Large language models are very good at simulating conversation, but unless you have the means to generate the context that tests whether a thing is what it appears to be, you can say no more than that.
We’re nowhere near defining sentience, although our increasingly nuanced appreciation of animal cognition is showing that it can take many forms.
At least three types – avian, mammalian and cephalopod – separated by significant evolutionary distance look like three very different systems. If machine sentience happens, it won't be by way of a chatbot suddenly printing out a cyborg bill of rights. It will come after decades of targeted research, built on models and tests, successes and failures. It will not be an imitation of ourselves.
And that's why the Turing test, as fascinating and thought-provoking as it is, has outlived its expiration date. It doesn't do what people think it does, and it has been co-opted by Hollywood as a prop for fantasy. It absorbs attention that should be devoted to the real dangers of machine-generated information. It's AI astrology, not astronomy.
The very term "artificial intelligence" is a bad one, as everyone in the field knows. We're stuck with it. But it's time to move on and say goodbye to this less useful legacy of the brilliant Alan Turing. ®