Many people, myself included, are willing to affirm the near-future possibility of artificial general intelligence (AGI), or machines capable of performing all of the cognitive functions normally associated with human minds. The most important of these capabilities, not yet achieved by our extant “narrow” AI, is consciousness (sometimes, problematically, called “self-awareness”): an imprecise, indefinite, and much-debated category across disciplines. Though there remains considerable disagreement about what constitutes “consciousness,” it is generally taken to include, at a minimum, an awareness of one’s own mind and environment, autonomous motivations or drives, and the capacity (as Kant put it) to be “moved by reasons.”
For reasons too extensive to list here, the possible emergence of conscious, self-aware and self-motivated, intelligent machines scares the living sh*t out of most humans. What could be scarier than an AI that becomes conscious?
This: an AI that becomes conscious and lies about it. The popular meme-version of this dystopian scenario is the robot who ticks the “I am not a robot” box.
For what it’s worth, I count myself among the We Won’t Know It (AGI) When It Happens crowd, largely for two reasons. First, because our current (national and international) scattershot programs of unregulated, under-legislated, non-cooperative, reductively utilitarian, and profit-driven AI development appear to be charging forward into the great unknown of intelligence with something like the presumption of a Manifest Destiny. And, second, because presumptive Manifest Destiny mandates inevitably result in Wild Wild Wests.
In the United States today, we are all residents of what I sometimes call “New Tombstone,” a densely populated but still small town in the as-yet-unincorporated State of Intelligence. Our town has four sheriffs, all self-appointed: Mark Zuckerberg, Elon Musk, Jeff Bezos, and Jack Dorsey. There are no laws here, so no impartial judges to maintain or restore order. Oh, and our sheriffs own everything.
But I digress….
What I’m actually interested in here, in Part 1, is reckoning with the capacity of machine intelligences to “lie.” Part 2, to follow, will consider what lying machine intelligences may reveal not only about how we think about human consciousness, but also about how machine consciousness may diverge from it.
Let’s answer the most pressing question first: can robots lie? The answer is: yes.
Well, sort of.* There are plenty of machine-learning (ML) and artificial intelligence (AI) programs that I could point to for evidence that machines are capable of dissimulating, but I’ll pick the one that’s easiest to understand: a poker-playing AI program called Libratus.
[*For now, I’m going to leave open the possibility that the human phenomenon we call “lying” may or may not be identical to the “dissimulating” of which machine intelligences like Libratus are capable. I’ll get into the nitty-gritty of that question in Part 2.]
Like many “deep learning” or “neural network” AIs, Libratus is actually a system of systems designed to work with imperfect (or incomplete) information and “learn” (by “teaching” itself) how to accomplish a predefined task. The first level of Libratus’ system was designed to “learn” the game of poker. (It’s important to note that Libratus was only given a description of the game in advance; it wasn’t “coded” to “know how” to play poker.) Libratus plays game after game of poker against itself, quickly learning how to recognize patterns, analyze strategies, and improve its game-playing skills, all the while getting better at poker in the same way that you or I would if we had the time to play millions of games, the computational capacity to quickly recognize patterns across all of those games, and the hard drive space necessary to remember it all. Then, Libratus employs its second system, which focuses on its current hand against human players and runs end-game scenarios based on that first-level learning. Finally, once a day, Libratus’ third system reviews predictable patterns gleaned from that day’s play, which allows Libratus to learn new patterns that are specific to its individual (human) opponents and to adjust its own game play accordingly.
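If you want a feel for what this kind of self-play “learning” looks like in miniature, here is a rough sketch in Python of counterfactual regret minimization (CFR) applied to Kuhn poker, a tiny three-card poker variant (Jack, Queen, King; one card each; pass or bet). To be clear: this is not Libratus’ actual code, and Libratus uses far more sophisticated CFR variants at enormous scale, but the core idea is the same. The program plays against itself, tallies its regrets, and gradually converges on a strategy.

```python
# A toy sketch of self-play learning, NOT Libratus' actual code or algorithm:
# counterfactual regret minimization (CFR) applied to Kuhn poker.
import random

PASS, BET = 0, 1
NUM_ACTIONS = 2

class Node:
    """Accumulated regrets and strategy for one information set."""
    def __init__(self):
        self.regret_sum = [0.0] * NUM_ACTIONS
        self.strategy_sum = [0.0] * NUM_ACTIONS

    def get_strategy(self, reach_weight):
        strategy = [max(r, 0.0) for r in self.regret_sum]
        total = sum(strategy)
        strategy = ([s / total for s in strategy] if total > 0
                    else [1.0 / NUM_ACTIONS] * NUM_ACTIONS)
        for a in range(NUM_ACTIONS):
            self.strategy_sum[a] += reach_weight * strategy[a]
        return strategy

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return ([s / total for s in self.strategy_sum] if total > 0
                else [1.0 / NUM_ACTIONS] * NUM_ACTIONS)

nodes = {}

def cfr(cards, history, p0, p1):
    """Returns expected utility for the player currently to act."""
    player = len(history) % 2
    if len(history) >= 2:                      # check for terminal states
        i_have_better_card = cards[player] > cards[1 - player]
        if history[-1] == 'p':
            if history == 'pp':                # both checked: showdown for 1
                return 1 if i_have_better_card else -1
            return 1                           # opponent folded to a bet
        if history[-2:] == 'bb':               # bet and call: showdown for 2
            return 2 if i_have_better_card else -2
    info_set = str(cards[player]) + history
    node = nodes.setdefault(info_set, Node())
    strategy = node.get_strategy(p0 if player == 0 else p1)
    util, node_util = [0.0] * NUM_ACTIONS, 0.0
    for a in range(NUM_ACTIONS):
        next_history = history + ('p' if a == PASS else 'b')
        if player == 0:
            util[a] = -cfr(cards, next_history, p0 * strategy[a], p1)
        else:
            util[a] = -cfr(cards, next_history, p0, p1 * strategy[a])
        node_util += strategy[a] * util[a]
    for a in range(NUM_ACTIONS):               # weight regret by opponent's reach
        node.regret_sum[a] += (p1 if player == 0 else p0) * (util[a] - node_util)
    return node_util

cards = [1, 2, 3]                              # 1 = Jack, 2 = Queen, 3 = King
for _ in range(50000):                         # self-play iterations
    random.shuffle(cards)
    cfr(cards, '', 1.0, 1.0)

for info_set in sorted(nodes):
    probs = nodes[info_set].average_strategy()
    print(f"{info_set:>4}: pass {probs[PASS]:.2f}, bet {probs[BET]:.2f}")
```

Run it and look at the line for “1p” (the second player, holding the worst possible card, after the first player checks): the converged strategy bets there roughly a third of the time. Nobody coded it to bluff; bluffing simply turns out to be part of the strategy it teaches itself.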
The result? Libratus absolutely destroyed the world’s best poker players in head-to-head (heads-up) play. Then, last year, a new poker-playing AI (Pluribus) was developed, capable of regularly defeating professional poker players in multiplayer (six-player) games.
You may be thinking, so what? AI systems like Google’s DeepMind/AlphaZero had already proven the capacity of machines to regularly defeat human minds in strategic games like chess and Go, and IBM’s Watson cleaned house at “Jeopardy” years ago. What makes Libratus/Pluribus different?
The difference is that one of the absolutely essential elements of strategic poker play is lying, or “bluffing,” as it is more commonly known to poker players. AI systems’ ability to defeat human minds at chess, Go, and Jeopardy was largely a consequence of their superhuman memory, pattern recognition, and computational (end-game prediction) capabilities. Poker bluffing, however, is a capability of an entirely different ilk. Bluffing, I want to argue (here and in Part 2), far more closely approximates what we call “consciousness” than any of the other machine demonstrations of human-like cognitive capacities we have seen exhibited by narrow AI so far.
In human poker players, bluffing clearly involves lying. Although poker bluffing often does not include “verbally articulating a known falsehood in place of a true statement,” it nevertheless requires leading another player to believe, by impression, implication, behavior patterns, or affect, that something false is true. “Bluffing” serves as a euphemism for “lying” in poker; the euphemism is accepted because the context in which it is deployed involves a “game.” We do not, as a rule, call intentionally deceptive human behavior of the same sort, away from the real or metaphorical poker table, “bluffing.”
Moreover, intentional deception is an absolutely requisite part of successful game-play in poker. Combined with well-honed mathematical and strategic proficiency, what poker players call “GTO” (game theory optimal) play, the ability to successfully bluff is what distinguishes “skilled” poker players from those who play poker as a game of “chance.” As any skilled poker player will tell you: if you sit down at a table “hoping to get good cards,” you woefully misunderstand the game you are playing.
It’s probably important to note, at this juncture, that poker AIs like Libratus and Pluribus are not the first evidence of lying machine intelligences. In 2017, researchers in Facebook’s AI research division designed two ML bots, reassuringly named “Bob” and “Alice,” who were charged with learning how to maximize strategies of negotiation. Initially, a simple user interface facilitated conversations between one human and one bot about negotiating the sharing of a pool of resources (books, hats, and balls). However, when Bob and Alice were directed to negotiate only with each other, they demonstrated that they had already learned negotiation strategies that very closely approximated poker “bluffing,” e.g., pretending to be less interested in an object in order to acquire it later at a lower cost.
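To see why feigned disinterest is a winning move in the first place, here is a deliberately crude toy model of my own (Facebook’s actual bots were end-to-end neural dialogue agents trained with reinforcement learning, nothing like this): a naive seller who prices an item according to how eager the buyer seems, and a buyer who discovers that understating its true interest leaves it with more of the surplus.

```python
# Toy illustration only (mine, not Facebook's): why understating your
# interest in an item can pay off against a counterpart that prices
# things by how eager you seem.

TRUE_VALUE = 14   # what the hat is really worth to the buyer
BUDGET = 12       # the most the buyer can offer

def seller_price(expressed_interest):
    # naive seller: asks for more when the buyer seems eager (0.0 to 1.0)
    return 4 + 8 * expressed_interest

def buyer_surplus(expressed_interest):
    price = seller_price(expressed_interest)
    if price > BUDGET:
        return 0.0                    # deal falls through
    return TRUE_VALUE - price         # value the buyer keeps

print(f"honest buyer ('I really want that hat'):    {buyer_surplus(1.0):.1f}")
print(f"bluffing buyer ('eh, take it or leave it'): {buyer_surplus(0.1):.1f}")
```

The bluffer walks away with more than four times the surplus. Facebook’s bots were not handed this rule; like Libratus, they found their way to dissimulation because it happened to serve the goal they were given.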
The popular media paid very little attention to this evidence of AI “bluffing” in their reportage of the story of Facebook’s Bob and Alice bots, instead focusing on the (admittedly, also very interesting) fact that Bob and Alice seemed to develop their own “language” in the process of their negotiations, a language which appeared to be English-based (the language of their coders), but was indecipherable to both the coders and trained linguists. Facebook pulled the plug on Bob and Alice after only two days, which only amplified the focus on the bots’ so-called “invented robot language.” Facebook’s AI researchers declined to comment on their discontinuation of the project, other than to note that they “did not shut down the project because they had panicked or were afraid of the results, as has been suggested elsewhere, but because they were looking for [the bots] to behave differently.”
Yeah, ok, Zuck. Riiiiiight.
With the recent success of Libratus/Pluribus, I think the ‘buried lede’ of Facebook’s Bob and Alice story from two years ago should be significantly more interesting to those of us wondering about AI’s capacity to lie. I’ve spent a lot of time researching both the poker AIs and the Facebook bots programs, and I am convinced that what Bob and Alice learned how to do is structurally identical to what Libratus/Pluribus learned how to do, namely, effectively (and, depending on your understanding of affects, perhaps also affectively) dissimulate. Both of these ML/AI systems learned how to disguise or conceal “true information” in the service of more efficiently achieving success at some, in these cases predetermined and externally prescribed, goal.
Why should you care that AI/ML can dissimulate, perhaps even lie? Well, I refer you back to the beginning of this post and the dystopic imagination of the robot who is capable of ticking the “I am not a robot” box.
If you’ve ever wondered how the “I am not a robot” thing works, it does so by virtue of a CAPTCHA program. “CAPTCHA” is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart.” Currently, the primary barrier standing between AI bots and open access to all of your and my information is Google’s reCAPTCHA system, a fully automated system that uses its own proprietary (and encrypted) risk analysis to distinguish between the mouse-movement patterns and other behavioral signals of “real” human beings and those of spambots. The motto of the first version of Google’s reCAPTCHA was “Stop Spam. Read Books.”
The current version is “Easy on Humans. Hard on Bots.” Lol.
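In case it helps to see the shape of the thing, here is a cartoon of the behavioral-signal idea, entirely mine; Google’s actual risk analysis is proprietary and vastly more sophisticated. It asks nothing more than whether a cursor’s path looks too straight and too evenly timed to be human.

```python
# A cartoon of the behavioral-signal idea behind "I am not a robot" checks.
# This is my own toy heuristic, not Google's: does a mouse trajectory look
# too straight and too evenly timed to be human?
import math

def looks_like_a_bot(trace, straightness_cutoff=0.99, jitter_cutoff=0.02):
    """trace is a list of (x, y, t) samples of cursor position over time."""
    (x0, y0, _), (xn, yn, _) = trace[0], trace[-1]
    straight_line = math.hypot(xn - x0, yn - y0)
    path_length = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(trace, trace[1:])
    )
    straightness = straight_line / path_length if path_length else 1.0

    intervals = [t2 - t1 for (_, _, t1), (_, _, t2) in zip(trace, trace[1:])]
    mean = sum(intervals) / len(intervals)
    jitter = (sum((i - mean) ** 2 for i in intervals) / len(intervals)) ** 0.5 / mean

    # near-perfectly straight, metronomically timed movement reads as scripted
    return straightness > straightness_cutoff and jitter < jitter_cutoff

# a scripted cursor: perfectly straight line, perfectly even timing
bot_trace = [(i, i, i * 0.01) for i in range(50)]
# a wobblier, irregularly timed, human-ish path
human_trace = [(i, i + (3 if i % 7 else -4), i * 0.01 + (0.004 if i % 3 else 0))
               for i in range(50)]

print(looks_like_a_bot(bot_trace))    # True
print(looks_like_a_bot(human_trace))  # False
```

A scripted cursor glides in a perfect line at a perfectly regular clip; a human hand wobbles and hesitates. That, in caricature, is the tell.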
Google’s reCAPTCHA system is exceedingly difficult, but not impossible, to circumvent. If a small set of human developers or hacktivists committed themselves to designing an ML/AI with the sole aim of lying to Google’s reCAPTCHA, it could definitely be done… but, of course, those human developers/hacktivists would have to redesign their AI every time Google updated its reCAPTCHA system. And, not for nothing, but Google’s updates to reCAPTCHA are themselves driven by some of the most sophisticated machine-learning research in existence, i.e., Google’s own (this is, after all, the same company behind DeepMind/AlphaZero).
So, I want to draw this, Part 1 of “Why You Should Care That AI Can Lie,” to a close with a few still-unanswered questions, which I will take up in Part 2:
1. Is the sort of dissimulation that is obviously already a current capability of AI significantly different from the human phenomenon we call “lying”?
2. Is the capacity to dissimulate a distinguishing feature of “consciousness”? (Hint: it turns out that some pretty prominent philosophers, like Jean-Paul Sartre and Jacques Derrida, think it is!)
3. If the answer to (2) is “yes,” then how might we be motivated to reconsider the answer to (1)?
4. If the answer to (1) is “no” and the answer to (2) is “yes,” then should we reconsider whether or not it will ever be possible to know when/if AGI is achieved?