In this week's Open Questions column, Cal Newport fills in for Joshua Rothman.
Much of the excitement and anxiety surrounding modern AI dates back to January 2020, when a team of OpenAI researchers released a thirty-page paper titled “Scaling Laws for Neural Language Models.” The team was led by the AI researcher Jared Kaplan and included Dario Amodei, now the CEO of Anthropic. They were investigating a tricky question: How does the performance of language models change as their size and training intensity increase?
At the time, many machine-learning researchers believed that, once language models reached a certain size, they would begin to memorize the answers to their training questions, making them less effective once deployed. The OpenAI paper argued, however, that these models would only improve as they grew, and that the improvement could follow a power law, a steep curve reminiscent of a hockey stick. This meant that, if you kept building larger language models and training them on larger datasets, they would begin to show astonishing results. A few months after the paper was published, OpenAI seemed to confirm the scaling law by releasing GPT-3, which was more than a hundred times larger, and significantly better, than its predecessor, GPT-2.
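What the paper means by a power law can be stated more precisely. The equation below is a sketch of the Kaplan team's headline result, with the exponent as reported in their paper; the article itself does not reproduce it. For a model with N non-embedding parameters, trained on sufficient data, test loss falls roughly as

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad \alpha_N \approx 0.076
```

where \(N_c\) is a fitted constant. Because the relationship is a power law, every tenfold increase in model size cuts the loss by a roughly constant factor, which is why the authors expected continued gains from simply building bigger models.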
Suddenly, the concept of artificial general intelligence, a system capable of performing a wide range of tasks at or above human standards, seemed tantalizingly close. If the scaling law held, AI companies could reach AGI simply by investing more money and computing power in language models. Within a year, Sam Altman, the CEO of OpenAI, published a blog post titled “Moore’s Law for Everything,” in which he argued that AI would take over “an increasing share of the work that humans currently do,” generating unprecedented wealth for those who control the technology. “This technological revolution is unstoppable,” he wrote. “The world will change so rapidly and so radically that equally radical policy changes will be needed to distribute that wealth and enable more people to live the lives they want.”
It’s hard to overstate how confident the AI community became that it would inevitably reach AGI. In 2022, Gary Marcus, an AI entrepreneur and a professor emeritus of psychology and neuroscience at New York University, criticized the Kaplan paper, noting that “the so-called scaling laws are not universal principles like gravity, but rather just observations that may not hold indefinitely.” The backlash was swift and brutal. “No other essay I’ve ever written has received so much criticism from so many prominent people, from Sam Altman and Greg Brockman to Yann LeCun and Elon Musk,” Marcus later reflected. He recently told me that his remarks effectively “excommunicated” him from the world of machine learning. ChatGPT soon reached a hundred million users faster than any digital service in history; in March, 2023, OpenAI's next model, GPT-4, appeared to leap up the scaling curve, inspiring a Microsoft research paper titled “Sparks of Artificial General Intelligence.” Over the next year, venture-capital spending on AI increased by eighty percent.
But progress has since slowed. OpenAI hasn’t released a commercially successful new model in more than two years, focusing instead on niche releases that the general public has found difficult to follow. Some industry experts have begun to question whether the scaling law is faltering. “The 2010s were the era of scaling, and now we’re back to the era of wonder and discovery,” Ilya Sutskever, a co-founder of OpenAI, told Reuters in November. “Everyone is looking for something new.” Meanwhile, a TechCrunch article summed up the mood: “There seems to be agreement that you can’t just add more computing power and data to pre-train large language models and expect them to become all-knowing digital beings.” But these observations have been largely drowned out by the rhetoric of other AI leaders, fuelling a heated debate. “AI is starting to outperform humans at almost every intellectual task,” Amodei recently told Anderson Cooper. In an interview with Axios, he predicted that half of entry-level white-collar jobs could disappear within the next one to five years. This summer, both Altman and Meta’s Mark Zuckerberg said that their companies were close to creating superintelligence.
Last week, OpenAI finally unveiled GPT-5, which many hoped would mark a major leap in AI capabilities. Early reviewers noted several interesting features. When the popular tech YouTuber Mrwhosetheboss asked it to create a chess game that uses Pokémon as pieces, he got a significantly better result than he did with o4-mini-high, an industry-leading programming model; he also found that GPT-5 could write a more efficient script for his YouTube channel than GPT-4o. Mrwhosetheboss was particularly excited by the fact that
Source: newyorker.com