Thoughts and notes from the workshop on the Cognitive Science of Mathematical Understanding
Sneaking around ∞
Dedekind’s idea of infinite sets:
if you’re one of them you’re simply a set
admitting 1-to-1 correspondence
with—wait:— a proper subset
of yourself!. . . you’re a snake
that can gobble a snake your size
with room left over besides. . .
not worrying that in your meal
yet another same size snake resides;
stream upon stream
of mise en abîme.
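For readers who want the snake in symbols, a standard witness is the successor map, which matches the natural numbers one-to-one with a proper subset of themselves:
\[f : \mathbb{N} \to \mathbb{N}\setminus\{0\}, \qquad f(n) = n + 1.\]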
What does it mean to understand mathematics?
Maybe mathematical understanding is having factual knowledge of mathematical concepts.
Well, that could hardly be it. As Bethany Rittle-Johnson explained, very young children may know the number words from “one” to “five”, yet when asked to hand over five coins they fail: they have not yet grasped the concept of size (i.e. the cardinality of a set).
There is an interesting tension in these examples, and more broadly in maths education, argued Keith Weber: a tension between the internal representations we have of concepts (e.g. the relationship between number words and cardinality) and the external, institutional representations imposed on us (e.g. the ways the “equals sign” should be interpreted).
Such an internal-external representational conflict is one we also often see in artificial intelligence (AI). The reason a large language model (LLM) cannot answer the question “How many Rs are there in the word strawberry?” is that its internal representation of the word is not a sequence of letters at all: the model sees one or a few subword tokens, so the characters we count effortlessly are simply not there for it to inspect.
Given such subtle differences in representation, it is important that we stay mindful of the different ways humans and AI systems represent concepts.
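As a minimal sketch of this point (assuming the tiktoken package, which exposes the tokenizers used by several OpenAI models, is installed), one can look directly at what the model receives in place of letters:

```python
# Minimal sketch: how an LLM "sees" the word "strawberry".
# Assumes the `tiktoken` package is installed; the exact split into tokens
# depends on which tokenizer/model is used.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")       # tokenizer used by several OpenAI models
token_ids = enc.encode("strawberry")             # a list of integer ids, not characters
pieces = [enc.decode([t]) for t in token_ids]    # the subword strings behind those ids

print(token_ids)                  # a short list of opaque integers
print(pieces)                     # one or a few chunks; individual letters are not exposed
print("strawberry".count("r"))    # 3 -- trivial at the character level humans work with
```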
Perhaps mathematical understanding is the ability to generalize and interlink the mathematical concepts we know.
Drawing connections between related concepts can be a source of immense insight: just think of the Langlands Program, which draws deep connections between number theory, geometry, and the representation theory of groups.
Why do some people think 400 is more even than 326? And why does an equilateral triangle feel more “triangle-y” than a long, thin scalene one?
Curiously, when I prompted ChatGPT with “Is 400 more even than 326?”, it gave a very definite “No.” as an answer, and likewise for the triangle example. In either case the LLM is mathematically correct, but to me it seems something useful is lost in the answers’ definiteness. Going back to which triangle is more triangle-y, one mathematician in the workshop said that they always try to draw the least regular triangle they can when thinking about just any triangle. To mathematicians, it seems, attaching additional properties (e.g. symmetry) to a concept can be a dangerous distraction that leads down garden paths. Drawing connections between concepts, then, is not always helpful.
On the other hand, LLMs also seem ready to find ad-hoc concepts where none exist. When I asked ChatGPT to group a bunch of random words I had just typed out, it happily produced groupings for what was, by construction, nothing but noise.
In the end, if we are to use AI in maths, finding the right balance between exploiting these human-like biases and removing them from the mathematical research process will be crucial.
It seems mathematical understanding may have to do with having intuitions about a generalizable and connected network of mathematical concepts.
This certainly seems important. Knowing intuitively which theorem to apply in a proof, without searching through the vast space of all theorems, is enormously helpful; so is having an intuition for which theorems can be proven and which are out of reach. However, intuitions aren’t created equal, as explained by Silvia De Toffoli.
Some intuitions are plain: they provide an immediate justification for something, but nothing that would reveal why the intuition is true. This is like having a “bullshit detector” for AI slop: I know intuitively that a video is slop even though I cannot articulate why. The second kind of intuition is articulate, and it is amenable to epistemic inquiry. Articulate intuition, the kind that allows a mathematician to say Theorem X is more “provable” than Theorem Y because of this and that concept, seems much more like mathematical understanding.
Intuition, in turn, is closely related to the ability to actually use our mathematical understanding, that is, to mathematical inference. As Jeremy Avigad has remarked, I may have all the knowledge of the right mathematical tools and understand their connections, but if I can never apply those tools, or never learn anything from applying them, then it feels like I haven’t understood much after all. For example, I could have a sense that I understand how one may turn a sphere inside out (i.e. sphere eversion); maybe a cool animation has even shown me how the eversion is done. Yet I would not be able to tell you why eversion is useful and interesting, or when one could use it to prove something.
This is a crucial point when it comes to AI. There is broad recognition in computer science that AI tools are opaque, indecipherable systems. Using these tools repeatedly incurs cognitive debt: we offload the very inferential work that understanding seems to require, without being able to inspect how the answers were produced.
So mathematical understanding is having externalizable and intuitive knowledge of generalizable and interconnected concepts?
Well, the way we externalize our knowledge will depend on a range of dimensions along which we can formalize and informalize our thoughts. One such dimension is the degree of constraint imposed by the shared language used between interlocutors. For example, writing a strictly typed formal proof of Fermat’s Last Theorem in Lean4 sits at the maximally constrained end of that spectrum, where every symbol must be accepted by a proof checker, while sketching the idea of the same proof to a colleague at a whiteboard is far less constrained.
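To give a feel for that constrained end (with a trivial lemma, nothing remotely like Fermat’s Last Theorem), here is what a complete, kernel-checked Lean4 proof looks like:

```lean
-- Illustrative toy example only: a complete Lean4 proof of a trivial fact.
-- Every symbol below is checked by the kernel; the file fails to compile
-- if any step is left informal or implicit.
theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b   -- appeal to the core library lemma Nat.add_comm
```

Nothing is left to the reader: even the fact that a and b are natural numbers has to be written down explicitly.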
One might hope, here, that LLMs would excel at externalization. After all, their billions of parameters have stored all the data the internet has to offer, so they have the background knowledge. Unfortunately, LLMs are neither great at writing Lean4 code nor at communicating complex ideas about proofs. In our preliminary experiments with Simon DeDeo, we compared how mathematicians edit Lean4 code to how LLMs edit the same code, and found that LLMs differ drastically from humans. While humans worked with more or less stable ontologies, keeping the overall proof-dependency structure largely unaltered, LLMs had no qualms about disrupting the structure of a proof by introducing new lemmas or removing existing lines. This raises the important question of how the use of LLMs in mathematical research will steer downstream research.
Fortunately, LLMs don’t write maths unless we ask them to, and even when you ask, they often simply say: “It can’t be done.” Just try asking ChatGPT to prove (or to provide a next step towards proving) that $e\pi$ is irrational. It seems to me that mathematical understanding also requires a desire to understand, which is not something AI systems are equipped with. One such desire seems to stem from a sense of aesthetics in maths.
Maybe understanding is having externalizable and intuitive knowledge of generalizable and interconnected mathematical concepts in search of beauty.
After all, there is no more intrinsic truth value to Gauss’s pairing proof that $1 + 2 + … + N = N(N+1)/2$ than to a straightforward proof by induction. Similarly, a rote application of algebraic manipulations to show that the sum of consecutive odd numbers is a square:
\[\displaylines{(n+1)^2 = n^2 + (2n+1) = (n-1)^2 + (2n-1) + (2n+1) = … \\ = 1 + 3 + … + (2n-1) + (2n+1)}\]is just as correct as the proof without words below. So why is it that we each feel one proof may be more beautiful than the other?
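For reference, the pairing argument alluded to above simply writes the sum forwards and backwards and adds the two lines:
\[\displaylines{S = 1 + 2 + … + (N-1) + N \\ S = N + (N-1) + … + 2 + 1 \\ 2S = (N+1) + (N+1) + … + (N+1) = N(N+1), \quad \text{so } S = N(N+1)/2.}\]
Each of the $N$ column-wise pairs sums to $N+1$; that single observation is the whole trick.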
When I ask ChatGPT to pick which proof of the formula of the sum of integers (Gauss v. Induction) is more beautiful, it picks what most of us would expect: Gauss’s proof. It appeals to a sense of elegance in noticing a non-trivial pattern; it mentions that the proof uses a “quick, surprising, and deep pattern — the kind of simplicity and insight mathematicians often call beautiful.” It also evokes the contrast case of induction, which it refers to as “solid, rigorous, but routine”.
Of course, the above qualities are hugely proof-dependent. For example, the proof of Fermat’s Last Theorem is anything but quick and surprising, yet there is still much beauty to be found in it. The danger in using AI to make such choices for us is that we conflate the sycophancy of AI with aesthetic morality, or indeed with innate agency. LLM tools are just that: tools that we use in pursuit of our goals.
Maybe there is beauty to be found in using LLMs for maths, though. As Barry Mazur argued, historically much of our mathematical understanding improved when we took our existing set of tools and abstracted out their shared patterns, thereby opening up new domains for inspection. Perhaps we could come to understand AI systems as mathematical tools, much as we understand set theory as a tool for talking about collections and their relationships. In turn, we could formulate an abstract theory of LLM-maths, much as category theory abstracts sets and functions into objects and arrows.
So what is mathematical understanding?
I have dissected the question into yet more questions, about knowledge, connections, intuition, externalization, and aesthetics, and I am certain I have missed many more aspects. For each of these questions, AI systems have a role to play, but it is ever-so-crucial that we retain our humanity in doing mathematics. Doing that consciously seems like a grand challenge, and it seems we will have to revisit the question of what mathematical understanding is year after year, much like a doctor checks up on their patient every year.