This weekend, Ilia Malinin, the "Quad God," landed five quadruple jumps to help the United States win gold in team figure skating at the Milano-Cortina Olympics. And at home, we watched as fourteen 8K cameras captured his rotations, AI processed the footage into 3D models, and, in real time, we could see his airtime, landing speed, heat maps, and graphic overlays.

Despite the sophistication of those cameras, the computer vision, and the AI processing, the judging was still decidedly human: nine judges watched his performance with their own eyes, as they always have. The International Skating Union (ISU) has been testing high-resolution camera systems that use AI to analyze technical elements in real time, but it hasn't yet put that technology in the judges' hands. Even so, the case for doing so is quite powerful.

At the 2024 Paris Olympics, U.S. gymnast Jordan Chiles delivered a Beyoncé-inspired floor routine and received a score that put her in fifth place. But her coach, Cecile Landi, realized that the judges had failed to credit a technical element, a tour jeté full that Chiles had executed cleanly, so she lodged an inquiry. The element was then recognized, 0.1 points were added, and Chiles moved onto the podium with a bronze medal.

Then Romania challenged the inquiry itself, arguing that Landi had submitted it four seconds past the one-minute deadline. The Court of Arbitration for Sport ("CAS"), an international body that allows athletes to settle disputes through binding arbitration, agreed just a few days later, and Chiles had her medal stripped. Chiles then endured months of racist abuse online as she appealed the decision to the Swiss Federal Supreme Court, which found "conclusive" video evidence that the inquiry was timely. The case has just been sent back to the CAS to reconsider its decision. Perhaps it will be resolved before the next Olympics.

None of this had to happen. If AI had been scoring Chiles’ routine, her tour jeté full would have been automatically credited. Rather than a narrow inquiry window and procedural collapse, Chiles would have received her bronze medal and gone home a celebrated Olympic winner. The Judging Support System already used in artistic gymnastics can identify nearly 2,000 elements with roughly 90 percent accuracy. It doesn’t get tired, or anchored by the last routine it watched. For the work of identifying elements, counting rotations, and detecting deductions, AI may simply be better than human judges.

So why does handing judgment to AI still make us uneasy?

Because we sense, even if we can't quite say why, that our idea of "judging" requires something beyond mere pattern recognition. Expertise isn't just one thing; it contains multitudes beyond technical scoring that we've never clearly separated, never had to name, because they were always bundled inside the same word: judgment.

But perhaps, and just hear me out, we’ve been conflating two different things and calling the whole package “expertise.” And it’s now time to separate what we do well and what AI may do better. The first aspect of expertise is pattern recognition, where we match what we see against a vast library of prior experience. The second aspect is harder to name. Call it embodied perception, an integration of knowledge that can’t easily be reduced to a checklist. AI is pulling these two things apart, and maybe, just maybe, that isn’t a threat but a chance to be honest about what we value.

The difference between these two kinds of knowledge is older than AI.

In the 1920s, Japan's poultry industry faced an expensive problem: you can't tell male chicks from female ones until they're weeks old. The Zen-Nippon Chick Sexing School had figured out how to sort day-old hatchlings; its expert sexers could examine 800 chicks an hour at 98 percent accuracy, but they couldn't explain how they did it. Cognitive scientist Richard Horsey found that even the best sexers said they just "knew." Formal instruction, from lectures to diagrams to decision rules, couldn't train others to do the same. So the school did the only thing it could, which was to position a trainee next to a master. The trainee would pick up a chick and guess, and the master would say yes or no. Thousands of chicks later, the trainee became an expert, too, but still couldn't explain how.

The philosopher Michael Polanyi identified the structure of knowledge at work here. In The Tacit Dimension, he argued that "we know more than we can tell," and that experts perceive wholes before they can identify their particulars.

Herbert Simon’s landmark chess research showed us something similar from the other direction. Grandmasters don’t think harder than novices; they see differently, perceiving positions in large meaningful chunks built from tens of thousands of games. AI seems to replicate and often surpass this form of expertise because it doesn’t get tired, isn’t anchored by the last case it read, and can train on a much larger set of examples than any human can.

The chicken sexer doesn’t just see the chick. She holds it, feels it, smells it, and integrates these senses together simultaneously to decide male or female. A gymnastics judge, at least the very best kind, perceives the difference between a routine that is technically correct and one that is alive.

This second kind of knowledge may genuinely require a body. There's a serious tradition in cognitive science, running from the Dreyfus brothers through contemporary enactivism, that argues embodiment isn't incidental to the highest forms of expertise; it's constitutive of it. AI, which has no body, has nothing at stake, never tires, and has no felt experience of the world, may be structurally unable to access this layer. Not because of a technical limitation that will be solved with more data, but because of what it is.

If you want to see something that no camera system measures, consider what happened Sunday in Cortina d’Ampezzo.

At 41 years old, Lindsey Vonn stood at the top of the Tofane course with a fully reconstructed titanium right knee and a completely torn left ACL that she had ruptured nine days earlier. Doctors were skeptical that she could compete in these Olympics. “Just because it seems impossible to you doesn’t mean it’s not possible,” she said the day before the race. “My ACL is 100% ruptured. Not 80% or 50%. It’s 100% gone,” she confirmed.

And yet, she chose to race. Within seconds of leaving the starting gate, she crashed, clipped a flag, cartwheeled down the slope, and was airlifted off the mountain with a broken leg.

It was absolutely devastating to watch. But what riveted us wasn't the ultimate outcome. It was her decision. Here she was, a human being, subject to gravity and fear and pain like the rest of us, choosing to attempt something extraordinary seemingly outside the constraints of an ordinary, let alone a broken, body. Sure, the cameras could measure her speed at the moment of impact. But they could not even begin to score what it meant for her to push out of the gate in the first place.

That’s what we watch the Olympics for. Not to see a body satisfy a checklist. To witness courage and artistry, the way an athlete pours herself into the fraction of a second between a rule’s specification and its execution. It’s our awe at the felt experience. It may be the thing our scoring systems were never actually measuring, the thing we called “judgment” while we were really just counting elements and averaging scores.

Figure skating has been grappling with this since the 2002 Salt Lake City scandal led to a new system that tried to systematize even artistic impressions into discrete component scores. The instinct is understandable, that when you find bias in human judgment, you decompose everything into measurable parts. But if Polanyi is right, if decomposing a whole into its particulars can destroy the understanding itself, then perhaps more elaborate checklists aren’t the answer.

So here is a different possibility. What if we let AI handle the measurements: accurately, fairly, without fatigue, and with bias we can detect and correct? Then protect the space where the only appropriate response is awe. Not a better checklist, but honesty about what we're celebrating, and the willingness to leave room for it in watching, in judging, and in scoring the impossible.

We conflated what's measurable with what's meaningful and called the whole thing "expertise." AI is now pulling those concepts apart. That gives us a chance to be honest about what we value, whether in elite sports, medicine, law, or any other domain where machines are arriving.