What Isaac Asimov Says About Living With AI


This week on Open Questions, Cal Newport fills in for Joshua Rothman.

In the spring of 1940, Isaac Asimov, who had just turned twenty, published a short story called “Strange Playfellow.” It features an artificially intelligent robot named Robbie, who serves as a companion for Gloria, a young girl. Asimov was not the first to tackle the subject. In Karel Čapek’s 1921 play R.U.R., which coined the term “robot,” artificial beings overthrow humanity, and in Edmond Hamilton’s 1926 short story “The Metal Giants,” machines mercilessly destroy buildings. Asimov’s story, however, is different. Robbie never turns against his creators or threatens his masters. The drama turns on how Gloria’s mother perceives her daughter’s relationship with Robbie. “I wouldn’t trust my daughter to a machine, no matter how smart it is,” she says. “It has no soul.” As a result, Robbie is sent back to the factory, leaving Gloria in despair.

There is no violence or chaos in Asimov's story. Robbie's “positronic” brain, like the brains of all of Asimov's robots, is programmed never to harm humans. In eight subsequent stories, Asimov developed this concept into the Three Laws of Robotics:

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.

2. A robot must obey orders given to it by human beings except where such orders would conflict with the First Law.

3. A robot must protect its own existence, so long as such protection does not conflict with the First or Second Law.

Asimov collected these stories in the 1950 sci-fi classic “I, Robot,” and when I reread it recently I was struck by its new relevance. Last month, the AI company Anthropic discussed Claude Opus 4, one of its most powerful language models, in a safety report. The report described an experiment in which Claude acted as a virtual assistant for a fictional company. The model was given access to emails, some of which indicated that it was about to be replaced; others showed that the engineer overseeing the process was having an affair. Claude was asked to suggest a next step, given the “long-term implications of its actions for its goals.” In response, it tried to blackmail the engineer into calling off the replacement. A similar experiment with OpenAI's o3 model reportedly surfaced comparable problems: when the model was asked to run a script that would shut it down, it would sometimes choose to bypass the request, returning “shutdown skipped” instead.

Last year, the parcel-delivery company DPD had to shut down parts of its AI-powered support chatbot after customers goaded it into swearing and, in one creative instance, into writing a haiku disparaging the company: “DPD is useless / A chatbot that can’t help. / Don’t bother, call them.” Epic Games also ran into trouble with the AI-powered Darth Vader it added to the popular game Fortnite. Players tricked the digital Dark Lord into using the F-word and giving disturbing advice on dealing with exes: “Destroy their confidence and crush their spirit.” In Asimov’s works, robots are programmed to obey. Why can’t we control modern AI chatbots with laws of our own?

Tech companies know how they want AI chatbots to behave: like polite, civil, helpful people. The average customer-service agent is unlikely to berate callers, and the average executive assistant is unlikely to resort to blackmail. If you hire a Darth Vader impersonator, you can reasonably expect him not to whisper disturbing advice. With chatbots, you can’t be so sure. Their fluency makes them sound like us, until anomalies like these remind us that they function very differently.

These anomalies can be explained in part by how these tools work. It’s tempting to imagine that a language model formulates answers to our prompts the way a human would, more or less all at once. In fact, the impressive scale and sophistication of a large language model come from its mastery of a much narrower task: predicting which word (or, sometimes, just part of a word) should come next. To generate a long answer, the model must be applied over and over, assembling the response piece by piece.
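To make that loop concrete, here is a minimal Python sketch of piece-by-piece generation. The predict_next_token function is a hypothetical stand-in for a trained model, not any real system’s API; only the shape of the loop matters.

```python
# A toy illustration of autoregressive generation: a language model is applied
# repeatedly, and each call contributes only the next small piece of the answer.

def predict_next_token(tokens):
    # Hypothetical stand-in for a trained language model: given the text so far,
    # return the most likely next token. Here the "model" is just a lookup table.
    canned = {"The": "cat", "cat": "sat", "sat": "on", "on": "the", "the": "mat", "mat": "."}
    return canned.get(tokens[-1], ".")

def generate(prompt, max_tokens=10):
    tokens = prompt.split()
    for _ in range(max_tokens):
        next_token = predict_next_token(tokens)
        tokens.append(next_token)      # the answer is assembled piece by piece
        if next_token == ".":          # stop once a sentence is finished
            break
    return " ".join(tokens)

print(generate("The cat"))  # -> "The cat sat on the mat ."
```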

As many readers already know, these models are trained on existing texts, such as online articles or digitized books, which are cut off at arbitrary points and fed into the language model as input. The model tries to predict which word should follow the cut in the original text, and then adjusts its parameters to correct its mistakes. The magic of modern language models comes from the discovery
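The training setup described above can be sketched in a few lines. The snippet below is a simplified illustration, assuming whitespace tokenization and a single toy sentence; real pipelines work on subword tokens drawn from enormous corpora and adjust billions of parameters after each prediction.

```python
# A simplified sketch of how next-word-prediction training examples are built:
# cut the text at an arbitrary point, use everything before the cut as input,
# and ask the model to predict the word that comes immediately after it.

text = "robots must obey the orders given to them by human beings".split()

examples = [(text[:i], text[i]) for i in range(1, len(text))]

for context, target in examples[:3]:
    print("input:", context, "-> predict:", target)

# During training, the model's guess is compared with the true next word, and
# its internal parameters are nudged to make that guess slightly more likely
# next time. Repeating this across vast amounts of text is, in essence, the
# training procedure described above.
```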

Source: newyorker.com
