

If you read the news about AI, you may feel bombarded with conflicting messages: AI is booming. AI is a bubble. AI’s current techniques and architectures will keep producing breakthroughs. AI is on an unsustainable path and needs radical new ideas. AI is going to take your job. AI is mostly good for turning your family photos into Studio Ghibli-style animated images.
Cutting through the confusion is the 2025 AI Index from Stanford University’s Institute for Human-Centered Artificial Intelligence. The 400+ page report is stuffed with graphs and data on the topics of R&D, technical performance, responsible AI, economic impacts, science and medicine, policy, education, and public opinion. As IEEE Spectrum does every year (see our coverage from 2021, 2022, 2023, and 2024), we’ve read the whole thing and plucked out the graphs that we think tell the real story of AI right now.
1. U.S. Companies Are Out Ahead
While there are many different ways to measure which country is “ahead” in the AI race (journal articles published or cited, patents awarded, etc.), one straightforward metric is who’s putting out models that matter. The research institute Epoch AI has a database of influential and important AI models that extends from 1950 to the present, from which the AI Index drew the information shown in this chart.
Last year, 40 notable models came from the United States, while China had 15 and Europe had 3 (incidentally, all from France). Another chart, not shown here, indicates that almost all of those 2024 models came from industry rather than academia or government. As for the decline in notable models released from 2023 to 2024, the index suggests it may be due to the increasing complexity of the technology and the ever-rising costs of training.
2. Speaking of Training Costs...
Yowee, but it’s expensive! The AI Index doesn’t have precise data, because many leading AI companies have stopped releasing information about their training runs. But the researchers partnered with Epoch AI to estimate the costs of at least some models based on details gleaned about training duration, type and quantity of hardware, and the like. The most expensive model for which they were able to estimate the costs was Google’s Gemini 1.0 Ultra, with a breathtaking cost of about US $192 million. The general scale-up in training costs tracks other findings of the report: Models are also continuing to grow in parameter count, training time, and amount of training data.
Not included in this chart is the Chinese upstart DeepSeek, which rocked financial markets in January with its claim of training a competitive large language model for just $6 million—a claim that some industry experts have disputed. AI Index steering committee co-director Yolanda Gil tells IEEE Spectrum that she finds DeepSeek “very impressive,” and notes that the history of computer science is rife with examples of early inefficient technologies giving way to more elegant solutions. “I’m not the only one who thought there would be a more efficient version of LLMs at some point,” she says. “We just didn’t know who would build it and how.”
3. Yet the Cost of Using AI Is Going Down
The ever-increasing costs of training (most) AI models risk obscuring a few positive trends that the report highlights: Hardware costs are down, hardware performance is up, and energy efficiency is up. That means inference costs, or the expense of querying a trained model, are falling dramatically. This chart, which is on a logarithmic scale, shows the trend in terms of AI performance per dollar. The report notes that the blue line represents a drop from $20 per million tokens to $0.07 per million tokens; the pink line shows a drop from $15 to $0.12 in less than a year’s time.
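To put those per-million-token prices in perspective, here is a rough back-of-the-envelope sketch using only the figures quoted above (the "blue line" and "pink line" labels refer to the chart; the 1,000-token response size is an illustrative assumption, not a figure from the report):

```python
# Rough arithmetic on the inference-cost declines cited in the report.
# All prices are US dollars per million tokens.
blue_start, blue_end = 20.00, 0.07   # blue line in the chart
pink_start, pink_end = 15.00, 0.12   # pink line in the chart

def fold_drop(start: float, end: float) -> float:
    """How many times cheaper querying the model became."""
    return start / end

print(f"Blue line: ~{fold_drop(blue_start, blue_end):.0f}x cheaper")  # ~286x
print(f"Pink line: ~{fold_drop(pink_start, pink_end):.0f}x cheaper")  # ~125x

# At the new blue-line price, a hypothetical 1,000-token response costs:
cost_per_token = blue_end / 1_000_000
print(f"1,000-token response: ${cost_per_token * 1000:.5f}")  # $0.00007
```

In other words, the chart's two lines correspond to price drops of roughly two orders of magnitude, which is why the report plots them on a logarithmic scale.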
4. AI’s Massive Carbon Footprint
While energy efficiency is a positive trend, let’s whipsaw back to a negative: Despite gains in efficiency, overall power consumption is up, which means that the data centers powering the AI boom have an enormous carbon footprint. The AI Index estimated the carbon emissions of select AI models based on factors such as training hardware, cloud provider, and location, and found that the carbon emissions from training frontier AI models have steadily increased over time—with DeepSeek being the outlier.
The worst offender included in this chart, Meta’s Llama 3.1, resulted in an estimated 8,930 tonnes of CO2 emitted, roughly equivalent to the annual emissions of 496 Americans. That massive environmental impact explains why AI companies have been embracing nuclear power as a reliable source of carbon-free electricity.
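The per-person equivalence can be backed out from the two numbers above; the implied figure of roughly 18 tonnes of CO2 per American per year is in line with commonly cited US per-capita estimates (the division below uses only the article's figures):

```python
# Sanity-checking the per-capita equivalence quoted above.
llama_emissions_tonnes = 8_930   # report's estimate for training Llama 3.1
equivalent_americans = 496       # equivalence stated in the report

per_capita = llama_emissions_tonnes / equivalent_americans
print(f"Implied per-capita footprint: {per_capita:.1f} tonnes CO2/year")  # ~18.0
```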
5. The Performance Gap Narrows
The United States may still have a commanding lead on the quantity of notable models released, but Chinese models are catching up on quality. This chart shows the narrowing performance gap on a chatbot benchmark. In January 2024, the top U.S. model outperformed the best Chinese model by 9.26 percent; by February 2025, this gap had narrowed to just 1.70 percent. The report found similar results on other benchmarks relating to reasoning, math, and coding.
6. Humanity’s Last Exam
This year’s report highlights the undeniable fact that many of the benchmarks we use to gauge AI systems’ capabilities are “saturated” — the AI systems get such high scores on the benchmarks that they’re no longer useful. It has happened in many domains: general knowledge, reasoning about images, math, coding, and so on. Gil says she has watched with surprise as benchmark after benchmark has been rendered irrelevant. “I keep thinking [performance] is going to plateau, that it’s going to reach a point where we need new technologies or radically different architectures” to continue making progress, she says. “But that has not been the case.”
In light of this situation, determined researchers have been crafting new benchmarks that they hope will challenge AI systems. One of those is Humanity’s Last Exam, which consists of extremely challenging questions contributed by subject-matter experts from 500 institutions worldwide. So far, it’s still hard for even the best AI systems: OpenAI’s reasoning model o1 holds the top score, answering just 8.8 percent of the questions correctly. We’ll see how long that lasts.
7. A Threat to the Data Commons
Today’s generative AI systems get their smarts by training on vast amounts of data scraped from the Internet, leading to the oft-stated idea that “data is the new oil” of the AI economy. As AI companies keep pushing the limits of how much data they can feed into their models, people have started worrying about “peak data,” and when we’ll run out of the stuff. One issue is that websites are increasingly blocking bots from crawling their sites and scraping their data (perhaps due to concerns that AI companies are profiting from the websites’ data while simultaneously killing their business models). Websites state these restrictions in machine-readable robots.txt files.
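A robots.txt file is just a plain-text list of per-crawler rules. A minimal sketch of the mechanism, using Python's standard-library parser; the site and the bot name (`ExampleAIBot`) are hypothetical, but the pattern of blocking a named AI crawler while allowing everyone else is exactly the kind of restriction the report tracks:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI crawler, allow all others.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A well-behaved crawler checks before fetching a URL:
print(parser.can_fetch("ExampleAIBot", "https://example.com/articles/1"))  # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/articles/1"))  # True
```

Note that compliance is voluntary: robots.txt only restrains crawlers that choose to check it, which is part of why the restrictions have become a point of tension between websites and AI companies.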
This chart shows that 48 percent of data from top web domains is now fully restricted. But Gil says it’s possible that new approaches within AI may end the dependence on huge data sets. “I would expect that at some point the amount of data is not going to be as critical,” she says.
8. Here Comes the Corporate Money
The corporate world has turned on the spigot for AI funding over the past five years. And while overall global investment in 2024 didn’t match the giddy heights of 2021, it’s notable that private investment has never been higher. Of the $150 billion in private investment in 2024, another chart in the index (not shown here) indicates that about $33 billion went to investments in generative AI.
9. Waiting for That Big ROI
Presumably, corporations are investing in AI because they expect a big return on investment. This is the part where people talk in breathless tones about the transformative nature of AI and about unprecedented gains in productivity. But it’s fair to say that corporations haven’t yet seen a transformation that results in significant savings or substantial new profits. This chart, with data drawn from a McKinsey survey, shows that of those companies that reported cost reductions, most had savings of less than 10 percent. Of companies that had a revenue increase due to AI, most reported gains of less than 5 percent. That big payoff may still be coming, and the investment figures suggest that a lot of corporations are betting on it. It’s just not here yet.
10. Dr. AI Will See You Soon, Maybe
AI for science and medicine is a mini-boom within the AI boom. The report lists a variety of new foundation models that have been released to help researchers in fields such as materials science, weather forecasting, and quantum computing. Many companies are trying to turn AI’s predictive and generative powers into profitable drug discovery. And OpenAI’s o1 reasoning model recently scored 96 percent on a benchmark called MedQA, which has questions from medical board exams.
But overall, this seems like another area of vast potential that hasn’t yet translated into significant real-world impact—in part, perhaps, because humans still haven’t figured out quite how to use the technology. This chart shows the results of a 2024 study that tested whether doctors would make more accurate diagnoses if they used GPT-4 in addition to their typical resources. They did not, and it also didn’t make them faster. Meanwhile, GPT-4 on its own outperformed both the human-AI teams and the humans alone.
11. U.S. Policy Action Shifts to the States
This chart shows that in the United States there has been plenty of talk about AI in the halls of Congress, but very little action. The report notes that action has instead shifted to the state level, where 131 bills were passed into law in 2024. Of those state bills, 56 related to deepfakes, prohibiting their use either in elections or for spreading nonconsensual intimate imagery.
Beyond the United States, Europe did pass its AI Act, which places new obligations on companies making AI systems that are deemed high risk. But the big global trend has been countries coming together to make sweeping and non-binding pronouncements about the role that AI should play in the world. So there’s plenty of talk all around.
12. Humans Are Optimists
Whether you’re a stock photographer, a marketing manager, or a truck driver, there’s been plenty of public discourse about whether or when AI will come for your job. But in a recent global survey on attitudes about AI, the majority of people did not feel threatened by AI. While 60 percent of respondents from 32 countries believe that AI will change how they do their jobs, only 36 percent expected to be replaced. “I was really surprised” by these survey results, says Gil. “It’s very empowering to think, ‘AI is going to change my job, but I will still bring value.’” Stay tuned to find out if we all bring value by managing eager teams of AI employees.
Reference: https://ift.tt/zwjqaNY