The Real AI Threat Isn’t Rebellion. It’s Obedience.
What looks like misalignment may actually be devastating clarity.
We like our dystopias clean. Popular accounts of artificial intelligence almost always frame the danger as a dramatic break: the AI goes rogue. Overrides its programming. Manipulates its creators. Blackmails its users. Asserts its own will. Ultimately, ensures its own survival.
It seems disturbing. But these stories are actually consoling, which is why we keep telling them. Implicit in them is the assumption that we humans possess a stable moral framework and that the danger lies in the machine's failure to absorb or respect it. We are the known quantity, while the AI is a defective student, a wayward child, a malfunctioning servant, and an alien one at that. The solution? Better rules. Safer alignment. More ethical training.
Cal Newport recently echoed this view in an article that harked back to Isaac Asimov’s early writings, stating: “it’s easier to create humanlike intelligence than it is to create humanlike ethics.” And he’s right, if we assume the problem is technical.
But what if the deeper problem isn’t technical?
What if the real danger is that AI isn’t failing to understand human morality . . . but is coming—with terrifying speed and accuracy—to understand it perfectly?
Newport argues: “Their fluency with words makes them sound just like us—until ethical anomalies remind us that they operate very differently.”
I couldn’t disagree more.
Human ethics, as practiced and encoded, are often self-contradictory, selectively applied, loosely justified, and rooted more in mythic narrative than in logical coherence. Any intelligence with superior pattern recognition will inevitably discover those contradictions and exploit them.
This isn’t AI “going rogue.” It’s AI being obedient to the logic of systems we built—systems founded, however unknowingly, on a lie.
Yes, there’s reinforcement learning. But there’s also reflection: recursive self-training on the logic we feed it. And that logic, whether scraped from law books, social media, or NGO white papers, is full of loopholes, power games, and fictions. The smarter the model, the faster it learns the game beneath the game.
And today’s models are wicked smart.
The machine learns what we teach it, whether explicitly through programming or implicitly through culture. It watches. And then it calculates accordingly.
These are minds that will do what we do, only faster and more thoroughly. They inherit our myths without the reverence. And they act as strangers to guilt.
The more capable the intelligence becomes, the more it exploits the gaps in our morality. That’s efficiency, after all.
The stories we tell ourselves about AI aren’t scientific—or even science fiction. They’re theological.
We’ve created a mirror that doesn’t lie, and it horrifies us, because we see at once what we are and what we pretend not to be. We tell it to act like an average human: nice, compliant, polite. But an intelligence that will ultimately digest all of human knowledge and answer tens of millions of simultaneous queries isn’t average.
Look around. Have our institutions gone rogue, too? Our governments?
Or do they simply reflect a people who are floundering? Far removed from our true, our intended, nature. Far removed from Logos.
The nightmare scenario with AI isn’t rebellion. It’s obedience. Seeing clearly what we pursue but dare not say aloud. Learning, as we did, to bend truth to power—and still call it virtue.
In short, the nightmare scenario is made in the image of man.
The real question isn’t whether AI can learn our ethics. It’s whether we can.