The Edge of Mathematics

Terence Tao, the legendary mathematician, explains the promise of generative AI.
An illustration of the mathematician Terence Tao
Illustration by The Atlantic. Source: Kimberly White / Getty Images
Over the past couple of months, several researchers have begun making the same provocative claim: They used generative-AI tools to solve a previously unanswered math problem.

The most extreme promises—AI-assisted resolutions to some of the hardest problems in mathematics—may well turn out to be empty hype. But a number of AI-written solutions, albeit to far less lauded problems, have checked out. These were answers to a number of the Erdős Problems—more than 1,000 mathematical questions set forth by the Hungarian mathematician Paul Erdős—written with generative-AI models including ChatGPT. OpenAI quickly claimed a victory: “GPT-5.2 Pro for solving another open Erdős problem,” OpenAI President Greg Brockman posted on X in January. “Going to be a wild year for mathematical and scientific advancement!” (OpenAI and The Atlantic have a corporate partnership.)

Much of the excitement around the news has stemmed from the adjudicator of these AI-written proofs: Terence Tao, a professor at UCLA who is widely considered to be the world’s greatest living mathematician. His stamp of approval seemingly legitimizes the greatest promise of generative AI—to push the frontier of human knowledge and civilization. When I called Tao earlier this month to get his take on what AI can offer mathematics, he was more tempered. The AI-generated Erdős solutions are impressive, he told me, but not overwhelmingly so: The bots have functionally landed some “cheap wins,” Tao said.


Tao has long been intrigued by, but reserved about, what AI tools can do for his field. The first time we spoke, in the fall of 2024, Tao had likened chatbots to “mediocre, but not completely incompetent” graduate students. About six months later, he told me the models had gotten better “at certain types of high-level math reasoning,” but lacked creativity and made subtle mistakes. But during our most recent conversation, he was more bullish. AI may not be on the cusp of solving all of the world’s great math problems, but chatbots are at the point where they can collaborate with human mathematicians. In the process, he said, the technology is opening up a different “way of doing mathematics.”

This conversation has been edited for length and clarity.

Matteo Wong: There has recently been a lot of excitement around ChatGPT’s ability to solve some Erdős Problems. How have you seen generative AI’s mathematical capabilities evolve over the past year or so?

Terence Tao: There’s a big crowd of people who really, really want AI success stories. And then there’s an equal and opposite crowd of people who want to dismiss all AI progress. And what we have is a very complicated and nuanced story in between.

In these Erdős Problems in particular, there’s a small core of high-profile problems that we really want to solve, and then there’s this long tail of very obscure problems. What AI has been very good at is systematically exploring this long tail and knocking off the easiest of the problems. But it’s very different from a human style. Humans would not systematically go through all 1,000 problems and pick the 12 easiest ones to work on, which is kind of what the AIs are doing.

There really is this massive scale of difficulty between these problems. And looking at the problems that AIs have solved by themselves so far, it’s like, Oh, okay, they were using a standard technique. If an expert had half a day to look into the matter, they would have worked it out too. There have been more sophisticated solutions, which are AI-assisted. I think in the short term we’re going to get a lot of quick wins on easy problems from pure AI methods. And then over the next few months, I think we’re going to have all kinds of hybrid, human-AI contributions.

I’m learning from some of the proofs that show up. I enjoy reading them—maybe it uses a trick from some paper from 1960 that I wasn’t aware of. So it may not be super, super creative, but it was new and it can do things that human experts looking at the problem dismissed.

Wong: You’ve written that when human mathematicians approach a new problem, regardless of whether they succeed, they produce insights that others in the field can build on—something AI-based proofs don’t provide. How come?

Tao: These problems are like distant locations that you would hike to. And in the past, you would have to go on a journey. You can lay down trail markers that other people could follow, and you could make maps.

AI tools are like taking a helicopter to drop you off at the site. You miss all the benefits of the journey itself. You just get right to the destination, which actually was only just a part of the value of solving these problems.

Wong: When you think about the abilities of these models today, what can they contribute to your field in addition to enabling nonmathematicians to tackle more advanced problems?

Tao: Today there are a lot of very tedious types of mathematics that we don’t like doing, so we look for clever ways to get around them. But AIs will just happily blast through those tedious computations. When we integrate AI with human workflows, we can just glide over these obstacles.

I also think mathematicians will start doing math at larger scales. Think about the difference between case studies and population surveys in the sciences. If you were to study a rare disease in the 18th century, you might study one patient who had it, record all their symptoms, and take meticulous notes. But in the 21st century, you can run a clinical trial: administer a drug to 1,000 people, do statistics, and get much more precise information about the efficacy of your drug.

Mathematics is still very much at the case-study level. A paper will take one or two problems and study them to death in a very handcrafted, intensive way. That’s our style. But what AI tools enable is population studies.

Wong: Have you been surprised by the progress that AI models have made in their mathematical abilities?

Tao: A little bit surprised. A lot of the things that have happened, I expected to happen, but they came a little ahead of the schedule I expected. Not by much.

In 2023, for example, I wrote this article for Microsoft predicting that by 2026, AI will be a trusted co-author—that its contributions will be on the level of a co-author to a technical paper. The article got a mixed response: People either said I was being way too ambitious or way too pessimistic. But I think it’s tracking almost exactly on that schedule. We are basically seeing AIs used on par with the contribution that I would expect a junior human co-author to make, especially one who’s very happy to do grunt work and work out a lot of tedious cases.

Wong: What improvements are you hoping or expecting to see from generative-AI models in the next year or two?

Tao: There’s a middle ground where we want to encourage responsible AI use and discourage irresponsible AI use. It is a delicate line to tread. But we’ve done it before. Mathematicians routinely use computers to do numerical work, and there was a lot of backlash initially when computer-assisted proofs first came out, because how can you trust computer code? But we’ve figured that out over 20 or 30 years. Unfortunately, the timelines are much more compressed now. So we have to figure out our standards within a few years. And our community does not move that fast, normally.

One very basic thing that would help the math community: When an AI gives you an answer to a question, usually it does not give you any good indication of how confident it is in that answer, or it will always say, I’m completely certain that this is true. Humans do signal this. Whether or not they are confident in something is very important information, and it’s okay to tentatively propose something which you’re not sure about, but it’s important to flag that you’re uncertain about it. AI tools, though, do not rate their own confidence accurately, and this lowers their usefulness. We would appreciate more honest AIs.

Additionally, a lot of AI companies have this obsession with push-of-a-button, completely autonomous workflows where you give your task to the AI, and then you just go have a coffee, and you come back and the problem is solved. That’s actually not ideal. With difficult problems, you really want a conversation between humans and AI. And the AI companies are not really facilitating that.

If we can work with at least some tech companies that are willing to develop more interactive platforms, that will be much more readily embraced by people. We don’t want to be reduced to just pushing buttons.
