In a detailed analysis released this week, influential podcaster and AI commentator Dwarkesh Patel challenged the prevailing narrative that Artificial General Intelligence (AGI) is imminent, arguing that the current industry strategy of “pre-baking” skills into models exposes a fundamental lack of true learning capabilities.
In a video essay titled “Thoughts on AI progress,” published on December 23, 2025, Patel contends that leading AI laboratories are relying on inefficient methods to simulate intelligence rather than creating models that can learn dynamically like humans. He suggests that while models are becoming more impressive on benchmarks, they lack the essential “continual learning” required to function as autonomous economic agents.
The “Pre-Baking” Paradox
Patel argues that the immense effort labs currently expend to train models on specific software tools—such as web browsers or Microsoft Excel—is essentially an admission of failure regarding general intelligence.
“If we’re actually close to a human-like learner, then this whole approach of training on verifiable outcomes is doomed,” Patel stated.
He posits that if models possessed human-level learning capabilities, they would not require billions of dollars in specialized training loops to master basic tasks. Instead, they would learn “on the job” through semantic feedback and self-directed experimentation. The current approach, involving what he calls “shleppy” training loops for every specific micro-task, indicates that the models cannot yet generalize effectively.
“Human labor is valuable precisely because it’s not shleppy to train,” Patel observed, noting that human employees can adapt to niche tasks—such as identifying specific cellular structures in a biology lab—without requiring a massive, custom-built training pipeline.
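To make the contrast concrete, the toy Python sketch below caricatures the two regimes. Every class, function, and number in it is hypothetical, invented for this article; it illustrates the shape of each loop, not any lab’s actual pipeline.

```python
# Toy contrast between the two learning regimes Patel describes.
# All names and numbers are hypothetical, for illustration only.
import random

class ToyModel:
    """Stand-in for a model with per-task competence in [0, 1]."""
    def __init__(self):
        self.skill = {}

    def attempt(self, task):
        return random.random() < self.skill.get(task, 0.0)

def pre_bake(model, task, episodes=100_000, lr=1e-5):
    """The 'shleppy' loop: a bespoke pipeline with a verifiable reward,
    rebuilt from scratch for every micro-task (a browser, Excel, ...)."""
    for _ in range(episodes):
        reward = 1.0 if model.attempt(task) else 0.0
        # Even failed rollouts contribute a small exploration signal.
        model.skill[task] = min(1.0, model.skill.get(task, 0.0) + lr * (1.0 + reward))

def learn_on_the_job(model, task, feedback=0.8):
    """The human-like alternative: one piece of semantic feedback
    ('the structures you want are the small dark blobs') moves competence."""
    model.skill[task] = min(1.0, model.skill.get(task, 0.0) + feedback)

m = ToyModel()
pre_bake(m, "fill_spreadsheet")        # 100,000 rollouts for one skill
learn_on_the_job(m, "identify_cells")  # a single correction for another
```

The asymmetry is the point: the first function needs a custom reward and an enormous number of rollouts per task, while the second absorbs one piece of feedback, which is the capability Patel argues current models lack.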
The Economic Reality Check
Patel also pushed back against the argument that the slow adoption of AI in the broader economy is due to corporate inertia or “diffusion lag.” He argues that the market for intelligence is efficient, and if these models were truly replacements for human cognition, companies would be integrating them aggressively.
“Economic diffusion lag is cope for missing capabilities,” Patel argued. “If these models actually were like humans on a server, they’d diffuse incredibly quickly.”
He noted that human knowledge workers collectively earn tens of trillions of dollars annually. The fact that AI revenue remains orders of magnitude lower suggests the technology simply has not reached the capability threshold needed to justify replacement, regardless of how well it performs on standardized benchmarks.
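As a rough illustration of that gap, consider the back-of-the-envelope comparison below. Both figures are placeholder assumptions for this sketch (the essay specifies neither), chosen only to show what “orders of magnitude” means in practice.

```python
import math

# Placeholder assumptions; neither figure appears in Patel's essay.
knowledge_worker_wages = 40e12   # assume ~$40T/year in global knowledge work
ai_revenue = 20e9                # assume ~$20B/year across AI labs

ratio = knowledge_worker_wages / ai_revenue
print(f"gap: {ratio:,.0f}x, roughly {math.log10(ratio):.0f} orders of magnitude")
```

Even if the real numbers differ several-fold, the conclusion survives: the gap is measured in thousands of multiples, not percentages.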
The Diminishing Returns of Reinforcement Learning
A core component of Patel’s skepticism centers on the efficiency of Reinforcement Learning (RL), the primary method currently used to scale model performance post-training. Unlike the “pre-training” phase, which followed predictable power laws where adding more compute reliably yielded smarter models, RL appears to be far less efficient.
Patel cited research suggesting a stark disparity in returns between the two methods: “We need something like a 1,000,000x scale-up of total RL compute to give a boost similar to a single GPT level.”
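Putting rough numbers on that claim shows how stark it is. As a hedged sketch, assume one “GPT level” historically corresponded to something like a 100x increase in pre-training compute; that multiplier is an assumption for this illustration, not a figure from the essay, while the 1,000,000x figure is the one Patel cites.

```python
import math

PRETRAIN_MULT = 100        # assumption: ~100x pre-training compute per GPT level
RL_MULT = 1_000_000        # the RL scale-up Patel cites for a comparable boost

# If capability gain scales as alpha * log10(compute), equal gains imply:
#   alpha_rl / alpha_pretrain = log10(PRETRAIN_MULT) / log10(RL_MULT)
exponent_ratio = math.log10(PRETRAIN_MULT) / math.log10(RL_MULT)
print(f"RL's effective scaling exponent: {exponent_ratio:.2f}x pre-training's")
print(f"Compute for the same jump: {RL_MULT // PRETRAIN_MULT:,}x more than pre-training")
```

On those assumptions, RL delivers about a third of the capability per order of magnitude of compute, and an equivalent jump costs 10,000 times more in absolute terms.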
He characterized the industry’s reliance on RL as an attempt to “launder the prestige of pre-training scaling,” applying the optimism of past successes to a new methodology that may not hold up to the same mathematical scrutiny.
The Path Forward: Continual Learning
Despite his skepticism regarding short-term timelines, Patel remains bullish on the long-term prospects of AI. He argues that the next great hurdle is solving “continual learning”: the ability of an AI agent to retain new information and skills acquired during operation without being fundamentally retrained by the developer.
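The difference is architectural. In today’s deployments the weights are frozen at release, so anything a model picks up during a session lives only in its context window. The sketch below uses entirely hypothetical class names to show where learned state would persist under each paradigm.

```python
# Hypothetical sketch of the capability gap; not any real system's API.

class FrozenAgent:
    """Today's paradigm: weights are fixed at deployment, so session
    'learning' lives in the context window and vanishes afterward."""
    def __init__(self, weights):
        self.weights = dict(weights)   # never updated after release
        self.context = []              # ephemeral working memory

    def observe(self, lesson):
        self.context.append(lesson)    # learning that will not survive

    def end_session(self):
        self.context.clear()           # everything acquired is lost

class ContinualAgent:
    """The missing capability: on-the-job experience is consolidated
    into persistent state with no developer-run retraining pipeline."""
    def __init__(self, weights):
        self.weights = dict(weights)

    def observe(self, skill, lesson):
        # Placeholder for the unsolved part: a stable online update.
        self.weights[skill] = lesson   # persists across sessions
```

The placeholder hides the hard part: updating weights online without destabilizing what the model already knows.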
Until that breakthrough occurs, Patel suggests that we will see a divergence in expectations: “Models keep getting more impressive at the rate that the short timelines people predict, but more useful at the rate that the long timelines people predict.”
He concludes that broadly deployed, human-level intelligence may take another decade to arrive, requiring innovations that go beyond merely throwing more computing power at current architectures.
