As you read this, AI systems are making thousands of decisions that will shape American lives. They’re approving mortgages, diagnosing diseases, recommending prison sentences, and choosing who gets hired. Most of us never see how these choices get made–until now.

A mortgage rejection letter arrives with a single line of explanation: “AI decision. High risk applicant.” When you call to ask why, the bank representative shrugs over the phone. “The computer says you’re risky. That’s all I know.” But imagine investigators could pull up what the AI actually “thought”: “Applicant lives in predominantly Black neighborhood. Historically risky. Recommend denial.” Suddenly, you have powerful evidence of discrimination that could win in court.

Except you may never see that evidence, unless the Office of Management and Budget (OMB) acts quickly to limit the fallout to AI safety from a new Executive Order, issued July 23, 2025.

For most of AI history, we’ve been blind to how AI makes decisions that reshape our lives. We could see their final verdicts but never their reasoning. Then in July 2025, researchers at the world’s leading AI companies and laboratories—OpenAI, Anthropic, Google DeepMind—published pre-print findings about a remarkable trait in the newest AI models. At least when they undertake complex tasks, we can see their thinking. 

Think of AI like a brilliant student taking a difficult exam. For decades, other than “interpretable AI” models, we could only see their final answers—correct or wrong, helpful or harmful—but not their work in arriving at their conclusions.

Until recently, authentic, readable reasoning in AI systems basically didn’t exist. Previous AI systems could produce “chain of thought” explanations when prompted to do so—step-by-step reasoning that shows how they arrived at an answer—but these were essentially performances for human consumption. They were basically being asked to “show their work” after already  figuring out the answer. The AI would then write something that looked like reasoning, but it wasn’t necessarily their actual thought process they used to arrive at the answer. 

But the newest AI systems, called “reasoning models,” work differently. They’re trained to think through problems step-by-step using natural language before producing a final answer. For these systems, the chain of thought is the actual working memory the AI used to solve a complex problem.

The researchers discovered that when solving complex problems, these reasoning models essentially hit a wall their predecessor models didn’t face. They can’t process hard problems silently like their predecessors could. Instead, they have to work through the problem step-by-step, out loud, in natural language we can read. It’s like watching someone puzzle out a  difficult math problem with pen and paper, because they can’t do all the steps in their head.

The kicker is that when they go rogue and attempt to do things like hack computers, manipulate financial data, or deceive users, they often confess in their reasoning traces. The researchers documented AI saying things like “Let’s hack,” “Let’s sabotage,” and “I’m transferring money because the website instructed me to.”

Which means that banks, hospitals, courts, and employers are now using AI systems that accidentally show their work. So when AI denies your mortgage or rejects your job application—decisions that can upend your life—we can finally see the reasoning behind those choices.

Unfortunately, this transparency may not last.

How “Ideological Neutrality” Could Limit AI Transparency

Eight days after scientists published their findings, President Trump signed an executive order requiring federal agencies to buy only AI systems that “do not manipulate responses in favor of ideological dogmas such as DEI” and banning developers from “intentionally encod[ing] partisan or ideological judgments unless those judgments are prompted by or readily accessible to the end user.” 

The EO purports to be “protecting Americans from biased AI outputs driven by ideologies like diversity, equity, and inclusion (DEI) at the cost of accuracy.” But this misunderstands the very essence of how AI works in the first place. These AI systems are trained on data that inevitably reflects human biases—not because those biases are deliberately woven into their thinking, but because those biases are embedded in human texts and writings. Developers then work to mitigate overt or overwhelming bias with technical fine-tuning to refine their outputs. To arrive at the “ideologically neutral” outputs this EO seeks, developers would have to pretend these real-world patterns don’t exist—and teach AI to be willfully blind to discrimination rather than thoughtfully aware of it. 

Unless the OMB passes implementation guidelines otherwise, companies will face two potential compliance strategies with the EO, and market pressures of government contracts will likely push them toward the one that eliminates this transparency.

The first approach is transparency and disclosure. Companies could let the AI think naturally and disclose when the system makes ideological judgments. The internal reasoning would remain readable and genuine. 

The second approach would fundamentally rewire how the AI thinks through a technique called process supervision. Companies would retrain these models by having human reviewers grade every step of the AI’s reasoning–marking any mention of race, gender, or “ideological concepts” as wrong, even when relevant to problems. Through thousands of training examples, the AI learns to self-censor its thinking. Instead of authentic reasoning that acknowledges “this zip code has been historically redlined,” the AI learns to produce sanitized thoughts like “this zip code has unspecified risk factors.” The reasoning traces would still exist, but they’d be performative theater designed to pass ideological review, not genuine problem-solving.

The EO  makes a sophisticated dodge of concerns about First Amendment content-based restriction by using procurement preferences to refuse to buy from companies whose AI fails ideological purity tests.

So faced with vague language about “ideological dogmas,” billion-dollar government contracts, and the threat of contract termination, risk-averse companies will likely choose the second approach. It’s safer to prevent problematic reasoning than to risk having to justify it later.

Let’s consider how that would impact what we can see in a mortgage processing decision.

If the authentic AI reasoning is: “Applicant from zip code 12345. This area shows higher default rates in training data, possibly due to historical redlining. Credit score 650, income $50K. The geographic pattern concerns me, but individual factors look reasonable
”

The process-supervised reasoning would be sanitized to read: “Applicant has credit score 650, income $50K, stable employment. The credit score is below our 700 threshold. Recommend denial due to credit risk.”

The second version sounds professional and neutral. Because it’s designed to look acceptable to human reviewers rather than reveal genuine decision-making. But the researchers warn that this approach will “less faithfully reflect how models arrive at their decisions.” 

In some ways, this performative neutrality approach mirrors the authoritarian tactics being taken to govern AI that we criticize abroad. China explicitly requires AI systems to promote Communist Party ideology, mandating that AI support “state-sanctioned narratives.” Despite different rhetoric about “neutrality” versus party alignment, both approaches use government economic pressure to control what AI can say and think.

The 120-Day Countdown 

The enforcement challenges are significant. The order demands “neutral, nonpartisan” AI but provides no method for measuring neutrality. What looks unbiased to a conservative may seem slanted to a liberal, and vice versa. It’s like requiring contractors to build “beautiful” buildings, which is legally impossible to define or enforce.

Ultimately, the EO’s impact on chain of thought reasoning will come down to  bureaucratic interpretation. The Office of Management and Budget has 120 days to write implementation guidance, and several provisions in the order could allow OMB to save this mechanism of transparency. 

OMB must “account for technical limitations in complying with this order.” If scientists can convince bureaucrats that authentic chain of thought reasoning is technically necessary for safety, this clause could protect it. OMB can also determine when to apply neutrality requirements to different systems, create exceptions for national security applications, and prioritize AI “safety applications.”

OMB should interpret its authority to recognize chain-of-thought transparency as critical to legal accountability, regulatory oversight, and criminal prosecution. And admit that perfect ideological neutrality is neither technically feasible nor objectively measurable. The order’s  technical limitations clause gives them authority to do so. Their implementation should focus on preventing intentional manipulation, not eliminating every trace of political perspective.

Recognizing this may be a time-limited phenomenon, the researchers concluded their findings with an urgent appeal: “We encourage the research community and frontier AI developers to make best use of chain of thought (CoT) monitorability and study how it can be preserved.” That use and preservation now depends on bureaucrats translating political requirements into technical reality.

The Office of Management and Budget will accept public comments on implementation. Urge them to preserve AI transparency as essential safety infrastructure. Tell them that accountability matters more than ideology.Â