Is the simplest answer always the best? Connecting Medieval monks to computational complexity, using the branch of mathematics known as category theory.
August 17, 2014
Sometimes we get a gut feeling that something is right, but we just can’t explain why. That’s how many scientists feel about Occam’s Razor, the notion of choosing the simplest theory to fit the data. A bright ball of light streaks through the sky: we could dream up the existence of aliens to explain it, but the simplest explanation is a meteorite.
The principle was first articulated in the 14th century by a monk called William from the village of Ockham
in southern England. It has guided scientists’ work for centuries, but it’s only in the last 50 years that we have developed the mathematical language and understanding to be able to attempt to pin-down why it seems right. Or perhaps our gut feelings are wrong and the simplest explanation isn’t always the best description of nature? Now, Noson Yanofsky
, who is based at the computer science department at Brooklyn College, New York, is hoping to make some concrete progress on these deep questions.
I’m meeting with Yanofsky at Churchill College, Cambridge; we escape the pavement heat to sit in the college’s quiet coffee lounge. He is in Cambridge for a conference on category theory
, the mathematical language he will use to tackle Occam’s Razor and other topics. Category theory
is an increasingly popular and important branch of modern mathematics that allows mathematicians to describe mathematical structures by their transformations, in other words, how they map from one to the other. Many parts of mathematics have been “categorified,” but it has also proved useful in computer science, particle physics, quantum mechanics, and even linguistics. (See, “The Quantum Linguist
.”) “I still find it fascinating that you have this language that can talk about so many different things,” he says.
Yanofsky charts his interest in category theory back to the 2nd year of his PhD at the City University of New York, when a course taught by Alex Heller
“woke” him up. Heller became his dissertation adviser, and the pair continued meeting twice a week to discuss philosophy, mathematics and politics long after Yanofsky graduated. “He was an amazing genius and his range of knowledge was astonishing,” says Yanofsky.
My first impression of Yanofsky is of someone friendly and generous, a brilliant ambassador for mathematics, but not just because of his personality—he has worked hard on writing two universally-lauded books, one about quantum computing
, and one more recently about the limits of knowledge, The Outer Limits of Reason
, which was published last year. He finds explaining ideas to other people very rewarding.
The notion of simplicity
of a scientific theory
does not have a
precise meaning unless
it is formalized in
- Olivia Caramello
”When my students say, “oh, now I understand it”—that’s a great feeling,” Yanofsky says. His students at Brooklyn College even helped him to write his books. “I wrote chapters and handed them out at each lecture, and the students complained anytime they found something too complicated,” says Yanosfsky. “There is nothing like a whole class of student editors—they got so excited about finding spelling mistakes or poor grammar, they’d circle it and underline it and come up to me after class and say incredulously, “how can you write that?!””
Yanofksy has a unique take on any problem that he tackles because his academic background bridges a number of disciplines. As a teenager his passion was computing, and he spent many hours programming Pac-Man and other games in the 1980s. Naturally he chose computer science for his first degree, but then veered into mathematics for his postgraduate studies. And writing his books has drawn him into physics, particularly into the foundations of quantum mechanics and quantum computing.
Mathematician Olivia Caramello
, of the Institut des Hautes Etudes Scientifiques in France, says Yanofsky’s broad perspective is one of his key strengths. “He’s a very original and eclectic researcher with an extremely broad knowledge,” she says. “His exquisitely inter-disciplinary attitude means he is able to combine ideas from different areas and conceive generalizations and unifying patterns across them.”
In his latest project, funded by an FQXi grant of nearly $50,000
, Yanofsky is taking a tool from computer science, called Kolmogorov complexity
, and applying it to categories. Kolmogorov complexity was pioneered in the 1960s, and defines how much information is contained in a string of letters or numbers in terms of the computing resources needed to reproduce it.
Simple or complex? Ask Kolmogorov
A computer program can generate a seemingly complicated image, such as
this portion of the Mandelbrot set fractal, relatively easily. Rather than
storing information about every pixel required, it simply uses the definition of
the Mandelbrot set to produce it.
Credit: Reguiieee, Wikimedia Commons
For instance, reproducing the string 0,0,0,0,0,0, is obviously much easier than a list of the first six primes, 2,3,5,7,11,13, because it contains less information. However a list of six random numbers, such as 1,5,3,9,8,2, has no pattern, so it cannot be compressed. In this sense the string of random numbers contains the most information. Kolmogorov defined randomness as a string that is incompressible, when the shortest description for something is the string itself. He also defined the information content of a string as the length of the most succinct computable description of that string.
But Yanofsky isn’t just interested in strings; he has a wider vision of determining the information content of structures across mathematics, computer science and physics. So he will generalise Kolmogorov complexity to find the information content of the categorical descriptions of these structures. He has created a programming language to generate the categorical descriptions, so the information content of each structure will be the length of the most succinct computer program that generates it.
If “simple” is taken to mean “contains the least information,” then quantifying the information content of different theories and comparing them should give insights into why Occam’s Razor seems to work. “If Yanofsky can describe the information content of a mathematical or scientific theory, then we can begin to think about these philosophical questions, like Occam’s Razor, in a more precise way,” says Jim Cox
, a close colleague of Yanofsky’s at Brooklyn College.
”It’s a very innovative and exciting idea,” agrees Caramello. “The notion of “simplicity” of a scientific theory does not have a precise meaning unless it is formalized in some way. Because of its broad range of applicability, the general notion of categorical complexity developed by Yanofsky should eventually significantly clarify these issues, and help us detect when a certain theory should be preferable to another.”
Yanofsky also has his eye on a problem that many consider the main open problem in computer science. Known as the P versus NP problem
, it concerns a whole set of problems called NP-Complete problems. There are billions of dollars at stake in being able to solve these hard problems because they crop up in scheduling, shipping routes, and even gene sequencing. Although NP-Complete problems were first defined in the 1970s, no one has yet managed to prove whether or not they can be solved in a reasonable timescale—even with proposed quantum computers
which in principle should one day be able to exploit subatomic physics to perform computational tasks far faster than today’s machines.
The difficulty with these hard problems is that a computer has to do a lot of searching to find the possible solutions. For example, the only way for a computer to work out the shortest route between ten cities is to go through all the possible combinations of routes, as there is no structure to point the way to the solution. But easy problems have a structured solution space, for example, if you are finding three times three, you could count up in threes, making it quicker to get to the answer.
”Computer scientists want to know what’s a hard problem and what’s an easy problem so that they know whether they should carry on searching for the
answer, or instead focus on finding an approximate
answer,” says Yanofsky. He hopes to use his Kolmogorov complexity to determine whether a problem’s solution space has a lot of structure or not, which could define whether it is an easy or a hard problem.
Cox agrees that Yanofsky’s categoric approach could shed light on the question, but cautions that the most difficult part of the project will be in the interpretation of the results, rather than in the technical details. Nevertheless Cox is optimistic that something philosophically interesting will come out of the work: “He’s thought deeply about the philosophical issues, so he’ll probably come up with a reasonable interpretation that we can argue about over coffee.”
Please read the important Introduction that governs your
participation in this community. Inappropriate language will not be tolerated and posts containing such language will be deleted. Otherwise, this is a free speech Forum and all are welcome!
HERB WIGGINS wrote on September 16, 2014
First, Occam's Razor is very likely a kind of least energy principle. That is, it suggests the least energy solution to a problem, viz., that the best solution to a problem creates the fewest new terms, hypotheses, etc., in explaining a phenomenon. In this case, it's thermodynamically efficient when compared to the other hypotheses which purport to explain a phenomenon. It's a process comparing the energy cost outcomes of the other hypotheses.
But what does it mean to "explain"? That is...
ECKARD BLUMSCHEIN wrote on September 11, 2014
"whether questions exist whose answer can be quickly checked, but which require an impossibly long time to solve by any direct procedure."
The question how large is the sum 1/2 + 1/4 + 1/8 + ... can be quickly and plausibly answered by means of common sense but definitely not by direct procedure.
While Kolgomorov's idea is convincing to me, I did not yet grasp its promised application.
Let me reiterate a seemingly quite different statement of mine: Addition of redundancy,...
ELLIOT MARTIN PINES wrote on August 28, 2014
read all article comments
Still the thought nags.
Is not Occum's Razor--be it Kolmogorov complexity/(Chaitin's) algorithmic information--merely the approximation of a rapidly decaying tail of an infinite distribution of possibilities.
In the greater scheme of things, will assuming such a decaying tail still prove valid?