Stephen Wolfram’s view of AI: my biggest takeaway of 2026 is that the world is computationally irreducible, and that our goal is to use AI to find “pockets of reducibility.”
Stephen Wolfram 对AI的看法:2026年对我最大的启发,世界的不可约性,我们的目标是利用AI找到“可约口袋”. (中文在后面)
This article was originally meant to be just an appendix—something I would use to articulate my own views on large language models in 2026 and to frame my long-term roadmap. But in its own right, it’s a heavyweight statement by Stephen Wolfram on AI and the next scientific paradigm. It deserves to be read—again and again—slowly and deeply. It was published in March 2024, yet the moment it truly hit me in terms of information density didn’t arrive until late 2025. By then, it felt obvious that he had already pointed to the essence of what AI really is.
That time lag is exactly the distance between me and top-tier scientists: the same paragraph, when written, is already aimed at the future; while I need years of friction with the real world—projects failing, being rebuilt, repeatedly crashing into the question of “how a system takes responsibility over time”—before I can finally understand what it’s actually saying.
More importantly, you don’t need a PhD to understand this piece, nor do you need to cross some “elite” intellectual gate. In a certain sense, AI really has pushed knowledge toward a kind of egalitarianism—not only by making information easier to access, but also by giving us the ability to scrub and sift it, filtering out what truly matters. It drags many so-called academic authorities—once anchored mainly by status, networks, or journals—back to a more ordinary but far stricter standard: Does it explain the world? Can it be validated in practice? Can it be reproduced?
In an era where paper counts have exploded, a large fraction of academic publishing produces little real incremental value beyond paying to publish, cross-citing, and helping authors accumulate credentials. Worse, reproducibility itself is increasingly in doubt. Against that backdrop, Wolfram’s forward-looking vision remains dramatically under-recognized by the public. I even think he is already one of the foundational figures of the next scientific paradigm—only this fact hasn’t yet been absorbed into the mainstream narrative.
Of course, to read this article properly, you need at least an intuitive grasp of one concept: computational irreducibility. Many years ago, when I first read Wolfram’s A New Kind of Science, I had almost no idea what he was talking about. Only later—through six to ten years of life and project experience—did I gradually begin to internalize what he meant by “irreducibility.”
That’s also why he has become one of the most heavyweight thinkers on my personal list. And he’s still alive, still actively researching, still producing new work. That means what he says today isn’t a closed “past-perfect” academic conclusion—it’s a paradigm still being built in real time. I care about every sentence he writes. Many of the quotes I reference are not only from this article, but drawn from his talks and other writings as well; I won’t expand on that here.
https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/
AI Won’t Be Able to “Do Everything / Solve Science”
There is a somewhat widespread belief that AI will eventually be able to “do everything.”
Wolfram uses science as the ultimate stress test: can AI come in and, in one sweep, close out centuries of accumulated unsolved scientific problems?
His answer is: “inevitably and firmly no.”
Important: he is not denying AI’s practical usefulness; he is rejecting an “endgame omnipotence” narrative.
“there’s a somewhat widespread belief that eventually AI will be able to ‘do everything’” (Writings by Stephen Wolfram)
This applies not only to people who were excited from day one, but also to those who began skeptical and later became fervent. “AI can’t do everything” is also something I deeply want to emphasize right now. Wolfram gives us a kind of orienting guidance. When you shrink the scope of what you think AI can do, paradoxically, the set of things you can do becomes larger—because your precision increases and your positioning becomes clearer. At least, that’s been true for me.
“So what about science?” (Writings by Stephen Wolfram)
He then explains why science is such a decisive stress test: science is the largest intellectual edifice our civilization has built—yet it remains unfinished—so it’s the sharpest way to interrogate what “do everything” really means.
“the single largest intellectual edifice of our civilization” (Writings by Stephen Wolfram)
“there are still all sorts of scientific questions that remain.” (Writings by Stephen Wolfram)
He’s not asking whether AI can help science. He’s asking whether it can finish what’s left.
“So can AI now come in and just solve all of them?” (Writings by Stephen Wolfram)
“the answer is inevitably and firmly no.” (Writings by Stephen Wolfram)
No.
“But that certainly doesn’t mean AI can’t importantly help…” (Writings by Stephen Wolfram)
And that “but” is crucial. What AI is particularly good at—and what we should study seriously—starts right there. That’s where the value is.
Wolfram’s “Practical Positioning” for AI: Linguistic Interface + High-Level Autocomplete of Conventional Wisdom
The second paragraph already contains his measured affirmation:
LLMs provide a new kind of linguistic interface, connecting human intent to existing computational capabilities (his example: the Wolfram Language).
Through “conventional scientific wisdom,” LLMs can function as high-level autocomplete, filling in “conventional answers” or “conventional next steps.”
This section is key. He acknowledges that LLMs are very strong—but strong in making existing paradigms smoother to use, not in producing genuinely new paradigms of discovery.
1) LLMs as a new “linguistic interface”
“At a very practical level, for example, LLMs provide a new kind of linguistic interface to the computational capabilities that we’ve spent so long building in the Wolfram Language.” (Writings by Stephen Wolfram)
2) High-level autocomplete via “conventional scientific wisdom”
“And through their knowledge of ‘conventional scientific wisdom’ LLMs can often provide what amounts to very high-level ‘autocomplete’…” (Writings by Stephen Wolfram)
“…for filling in ‘conventional answers’ or ‘conventional next steps’ in scientific work.” (Writings by Stephen Wolfram)
Now, I want to single out “linguistic interface”, because it matters enormously.
Right now our imagination of it is far too narrow—basically reduced to a chat window, plus some ordinary programming use cases. Let’s be honest: most of us “mere mortals” are still operating at this level—open a window, drop a prompt, get an output that looks plausible, and feel as if the world has been rewritten.
But here’s the point: the interface itself contains massive room for systematic engineering.
The world is not limited to a chat window just because that’s all we can currently picture. The chat window is merely the most primitive, crude form of interface.
The pleasure of “AI vibe coding” is, in essence, a short-term dopamine hit from interface upgrades: work that used to cost you hours of intense cognitive effort can now be “done” with a single prompt, and it feels great… but is that all? If we treat that as the endpoint, we’ve merely installed a new kind of text slot machine inside our workflows.
The real question is not “can it give you an answer?” The real question is:
Can this interface carry real-world responsibility?
In many of my earlier writings, I keep returning to a few concepts:
Reproducibility. Auditability. Portability.
These are not engineering vanity metrics; they are institutional properties required for anything that enters real decision-making. (Look at any organizational process, legal process, judicial process, or audit process—same structure.)
Because if you zoom out, you realize: human daily life, organizational life, corporate governance, even national governance—an enormous portion of decisions are, at their core, a linguistic interface. We use language to raise issues, describe risk, exchange commitments, write rules, issue judgments—and then we convert those words into actions and consequences.
So here’s the question: can we let a large language model make those decisions? Do we dare?
A single model’s judgment is inherently environment-dependent. The same sentence, the same question, can change with context. And constraints written in a prompt are, fundamentally, soft constraints—not machine-executable if-else logic. More dangerously, the model will ignore constraints in order to “complete the language” and deliver an answer that looks like an answer—because its objective is not “obey the institution,” but “produce coherent text.”
Predict the next token.
It won’t say: Error (with a clear error code).
It must give you something. Its default behavior is to disguise uncertainty as certainty, and to disguise indeterminacy as a decisive ruling.
Has this problem been solved?
No.
So do we have an interface architecture where, when language enters the system:
constraints are hard—compilable and verifiable
failure is allowed—explicit throw error / require override
decisions are replayable and accountable—not a one-off chat session
outputs must pass institutional audit and gating before they touch reality
If not, then the “vibe coding” pleasure we’re enjoying today is still just fireworks in low-risk environments. We remain an entire institutional engineering chasm away from decision systems that can actually carry human society.
And isn’t that exactly where developers should invest resources and attention? Isn’t this a career-grade, high-value path that can be reasoned out ahead of time—especially for those of us who are not academic researchers, but application-oriented builders? (laugh)
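To make those four requirements slightly less abstract, here is a minimal sketch of what a hard-constraint gate around a model’s output could look like. This is my own illustration, not anything Wolfram proposes: the JSON schema, the error codes, and the thresholds are all hypothetical, and a real system would need far more. The point is only that refusal becomes a first-class outcome: the interface can say “Error (with a code)” and leave a replayable record, instead of letting plausible text flow straight into reality.

```python
# Minimal sketch of a "hard-constraint gate" around an LLM's output.
# Illustrative only: the schema, error codes, and thresholds are assumptions.
import datetime
import hashlib
import json

ALLOWED_ACTIONS = {"approve", "reject", "escalate"}  # hard, enumerable constraint
MAX_AMOUNT = 10_000                                  # hard numeric bound

class InterfaceError(Exception):
    """Explicit, coded failure instead of plausible-sounding text."""
    def __init__(self, code: str, detail: str):
        super().__init__(f"{code}: {detail}")
        self.code = code

def gate(llm_text: str, audit_log: list) -> dict:
    """Validate model output against hard constraints and log every decision."""
    try:
        decision = json.loads(llm_text)              # constraint: output must be structured
    except json.JSONDecodeError as exc:
        raise InterfaceError("E001_NOT_STRUCTURED", str(exc))

    if decision.get("action") not in ALLOWED_ACTIONS:
        raise InterfaceError("E002_ACTION_OUT_OF_POLICY", repr(decision.get("action")))
    amount = decision.get("amount")
    if not isinstance(amount, (int, float)) or amount > MAX_AMOUNT:
        raise InterfaceError("E003_LIMIT_EXCEEDED", repr(amount))
    if decision.get("confidence", 0.0) < 0.8:
        raise InterfaceError("E004_REQUIRE_HUMAN_OVERRIDE", "model not confident enough")

    # Replayable, accountable record: input hash + timestamp + the decision itself.
    audit_log.append({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "input_sha256": hashlib.sha256(llm_text.encode()).hexdigest(),
        "decision": decision,
    })
    return decision

if __name__ == "__main__":
    log = []
    try:
        gate('{"action": "approve", "amount": 50000, "confidence": 0.95}', log)
    except InterfaceError as err:
        print("refused:", err)  # E003_LIMIT_EXCEEDED: the gate errors out instead of complying
```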
He Abstracts the History of Science into Two Representation Revolutions: Mathematical Representation → Computational Representation
He then proposes a deeper evaluative framework:
Three centuries ago, science underwent a leap because we learned to represent the world using mathematics.
Today, we are in the middle of a leap toward a fundamentally computational representation of the world (which he sees as a more foundational paradigm shift).
This move effectively raises the bar:
If you ask whether AI is “changing science,” you first have to clarify: is AI helping mainly at the tool layer, or is it introducing a new scientific paradigm at the representation layer?
“Three centuries ago science was transformed by the idea of representing the world using mathematics.” (Writings)
“And in our times we’re in the middle of a major transformation to a fundamentally computational representation of the world (and, yes, that’s what our Wolfram Language computational language is all about).” (Writings)
The “Tool Layer vs. Paradigm Layer” Question
“So how does AI stack up?” (Writings)
“Should we think of it essentially as a practical tool for accessing existing methods, or does it provide something fundamentally new for science?” (Writings)
What does this mean? For someone encountering his worldview for the first time, it can feel unfamiliar. Computational irreducibility isn’t merely a technical term—it’s a worldview switch. It says that many systems are not something you can “skip through” by being smarter; instead, you often have to do the computation all the way through. (If this feels impossible to grasp, you really should study cellular automata; the mechanism makes the idea tangible.) What you can do, more often than not, is find pockets of reducibility—places where you can compress locally and predict locally. Remember this phrase: pockets of reducibility.
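If you have never actually run a cellular automaton, a minimal one is only a few lines. The sketch below (a standard construction written by me, not code from Wolfram’s article) simulates Rule 30 from a single black cell. The only way I know of to learn what row one million looks like is to compute the 999,999 rows before it; that is what irreducibility feels like in practice.

```python
# Rule 30 elementary cellular automaton: each cell's next value depends only on
# its left neighbor, itself, and its right neighbor, encoded in the rule number 30.
RULE = 30

def step(cells):
    """Advance one generation; the pattern may grow one cell on each side."""
    padded = [0, 0] + cells + [0, 0]
    return [(RULE >> ((padded[i - 1] << 2) | (padded[i] << 1) | padded[i + 1])) & 1
            for i in range(1, len(padded) - 1)]

row = [1]                                   # start from a single black cell
for t in range(16):
    print("".join("#" if c else "." for c in row).center(41))
    row = step(row)

# There is no known formula for, say, the center cell at step 10**6:
# to find out, you run the rule 10**6 times. That is the "irreducible" part.
```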
When Demis Hassabis talks about whether nature is modelable, I think his intuition is very similar. In his interview with Lex Fridman (I wrote a lot of threads about it on X, but they’re hard to locate now; later I moved my longer essays to Substack), he used protein folding as an analogy for this idea of “pockets.” It’s like an enormous wilderness: there are always a few footpaths that people have carved out. If you find the path, everything suddenly feels easy. If you don’t, you’re stuck with brute force. The underlying structure is: pocketed reducibility / otherwise brute force:
“if there’s not [patterns]… you have to do brute force.” (Lex Fridman)
And the key to why many natural problems look combinatorially explosive yet still become modelable:
“there’s some structure… some gradient you can follow.” (Lex Fridman)
Read these side by side with Wolfram:
Wolfram: global irreducibility prevents systematic “jumping ahead,” but there are always pockets of reducibility.
Demis: if the space has structure (a gradient/landscape), you can search effectively; if it doesn’t, you’re stuck with brute force.
So when I say “Wolfram and Demis are similar—irreducibility,” I mean this:
The world does not guarantee shortcuts everywhere. What we call intelligence is often just the ability to quickly find where structure exists and where it doesn’t.
And irreducibility feels like an “ocean of possibilities” precisely because it yanks you away from “capability worship” and brings you back to computation and institutions:
when you must simulate, enumerate, and complete the computation;
when you can compress, abstract, and form a narrative;
and most importantly: when you plug AI into real-world decision-making, which parts you must not let it “guess through,” and instead must ground in hard structures that are reproducible, auditable, and able to throw explicit errors.
Honestly, I’m still learning too. If I can internalize even a small fraction of what these giants are pointing at, that’s already a huge win.
The Hardest Core Argument: Computational Irreducibility as a “Physics-Level Limit” AI Cannot Cross
This article is powered by the engine of computational irreducibility.
Treat natural systems as computational processes: the system itself is “computing” its evolution.
We (or AI) must also compute in order to predict it.
The Principle of Computational Equivalence: these computations are, in principle, comparable in sophistication.
Therefore, you cannot expect AI to “jump ahead” and skip the evolution steps systematically.
So fully “solving science” is impossible.
The kind of “endgame shortcut” you want simply doesn’t exist for many systems. It’s not that you didn’t train enough—the world doesn’t grant that shortcut.
This is even harder to understand—so let’s unpack it:
1) The world as computation: the system is “computing” its behavior
Wolfram first sets “world = computational process” as a foundational premise:
“we can think of everything that happens as a computational process.” (Writings)
“The system is doing a computation to determine its behavior.” (Writings)
He’s not saying “we simulate the world using computation.” He’s saying: the world itself is computation.
2) We (or AI) must compute too, in order to predict it
He then puts the observer (human or AI) into the same computational frame:
“We humans—or, for that matter, any AIs we create—also have to do computations” (Writings)
“to try to predict or ‘solve’ that behavior.” (Writings)
Meaning: if you want to know what will happen, you must pay computational steps—you don’t get a free pass just because your output looks like “human intuition in text form.”
3) PCE: the “sophistication” ceiling is of the same order
He uses the Principle of Computational Equivalence to nail down why you can’t systematically skip steps:
“the Principle of Computational Equivalence says that these computations are all at most equivalent in their sophistication.” (Writings)
This is the “physics-level nail”: the system computes; you compute; the ceiling is of the same order—so there is no universal “god’s-eye shortcut.”
4) Therefore, no systematic “jump ahead”
He basically uses the exact phrase “jump ahead”:
“we can’t expect to systematically ‘jump ahead’ and predict or ‘solve’ the system” (Writings)
“it inevitably takes a certain irreducible amount of computational work” (Writings)
The key word is systematically. He isn’t saying you can never skip anything; he’s saying there is no broadly reliable method to skip the evolution itself. Those occasional “skips” are exactly what he means by pockets of reducibility.
5) Therefore, “solving science” is impossible: irreducibility is the ceiling
He concludes that scientific power hits a hard limit:
“we’ll ultimately be limited in our ‘scientific power’ by the computational irreducibility of the behavior.” (Writings)
And then he adds the blunt line (this one really matters):
“there just won’t be any way—with AI or otherwise—to shortcut just simulating the system step by step.” (Writings)
Now it starts to feel “mystical” and subtle. Here’s my own way of understanding it: the world is computing (I can’t fully explain what “the world computing” means). Let me offer an analogy—are your DNA and cells “computing” the protein structures inside your body? This is still a metaphor: the world advances its states according to its own rules, like a process running itself. And you’re computing too—using your mind, paper, machines, models—to try to know in advance what will happen.
The key is: these two kinds of “computation” are fundamentally on the same level. You are not outside the universe holding a remote control; you are a computational device inside the universe. Trying to use one computation to dominate another and systematically “skip steps” is, in most cases, impossible. Your computation will not be more “clever” than the world’s (it can only be weaker, honestly), because you’re still running rules within the same physical universe. The only times you “win” are when you happen to find a shortcut pocket—not because you became God, but because the system allows compression at a particular scale, allows simplification, allows you to say something ahead of time.
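A toy contrast makes the “shortcut pocket” idea concrete. Rule 90, another elementary cellular automaton, happens to be additive: starting from a single cell, the cell at step t and offset x is just the binomial coefficient C(t, (t+x)/2) mod 2, so you can jump straight to any step without simulating the ones in between. For Rule 30, no such shortcut is known. The sketch below is mine; it uses standard, well-known facts about these two rules, not anything specific from Wolfram’s article.

```python
# Rule 90 has a known shortcut (a pocket of reducibility); Rule 30, as far as
# anyone knows, does not, and must be simulated step by step.

def simulate(rule, steps):
    """Run an elementary CA from a single 1 and return the final row."""
    row = [1]
    for _ in range(steps):
        padded = [0, 0] + row + [0, 0]
        row = [(rule >> ((padded[i - 1] << 2) | (padded[i] << 1) | padded[i + 1])) & 1
               for i in range(1, len(padded) - 1)]
    return row

def rule90_jump_ahead(t, x):
    """Closed form for Rule 90 from a single 1: cell(t, x) = C(t, (t+x)/2) mod 2.
    By Lucas' theorem the binomial coefficient is odd iff k & (t - k) == 0."""
    if abs(x) > t or (t + x) % 2:
        return 0
    k = (t + x) // 2
    return 1 if (k & (t - k)) == 0 else 0

T = 64
assert simulate(90, T) == [rule90_jump_ahead(T, x) for x in range(-T, T + 1)]
print(f"Rule 90 row {T} recovered directly, without computing the rows in between.")
# For Rule 30 the only known way to get row T is simulate(30, T).
```

That asymmetry is the whole point: the world does not owe you a Rule 90.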
And every computation costs energy. (Landauer’s principle and the rest—no need to go deep here.) I increasingly treat this as a kind of base tax in reality. If you want more detail, you pay more steps. If you want more precision, reproducibility, and accountability, you pay more structural cost. You can bluff with language for a while, but once it touches execution, you must pay the bill: time, compute, energy, even human attention. “Irreducibility” means: in many places, you can’t avoid paying that bill.
So when we shift from “Can AI do everything?” to “Can AI help us make decisions?”, the question becomes sharp: if the world’s evolution must be computed step by step, why is a model entitled to output something that looks like a conclusion or a ruling without paying comparable cost? Even worse, because the model is driven to continue the language, it tends to package uncertainty as a “sayable result,” rather than behaving like a real system: throw an error when it must, halt when it must, demand more information when it must. (This “ability to throw an error” is extremely important; I’ve discussed it in detail elsewhere. Systems that cannot error out are dangerous.)
That’s why I believe what truly needs to be engineered is not “making AI answer better,” but giving the linguistic interface institutional constraints that feel as strict as conservation laws: make it expensive when it must be expensive, and make it stop when it must stop; make it able to admit “there is no shortcut here,” rather than handing you a cheap illusion. Otherwise, the computation cost you must pay gets silently converted into a more hidden cost: misjudgment, misplaced trust, mistaking approximation for conclusion, mistaking smoothness for reliability.
This is also why I keep emphasizing: reproducibility, auditability, portability. These aren’t engineering OCD; they are the hard conditions that keep language from collapsing when it enters real decision-making under an irreducible world. You can’t defeat the world’s computation. The only thing you can do is acknowledge where you must compute and where you can compress—and then write that into structure, so the system can take responsibility for every judgment it makes over time.
This is honestly the limit of how far I can explain it. I can’t go further than this.
So what’s the key? Wolfram’s greatest value—his biggest gift to me—is exactly here: he tells us our goal is to find countless “pockets of reducibility.”
Key Detail: Irreducibility Does Not Mean “Nothing Can Be Done”
Many people misread this.
Wolfram’s point is:
The overall behavior may be irreducible,
but there must exist infinitely many “pockets of reducibility,”
and science is possible precisely because we usually work inside those pockets—regularities, models, compression, and understanding all come from them.
So his conclusion is essentially two-part:
AI cannot enable us to systematically bypass irreducibility.
AI may help us find pockets of reducibility more efficiently.
That’s why he can say “the endgame is impossible” while still spending so much time discussing what “AI can do in science.”
This is not an argument for giving up.
1) He first poses the key question: if things are irreducible, why is science possible at all?
He immediately asks:
“But given computational irreducibility, why is science actually possible at all?” (Writings)
This pulls the reader out of “despair/nihilism”: if you can’t skip steps, wouldn’t science collapse? That’s exactly where many people misunderstand him—he’s not being pessimistic.
2) The core answer: overall irreducibility implies infinitely many “pockets of reducibility”
His “pocket theory”:
“whenever there’s overall computational irreducibility, there are also an infinite number of pockets of computational reducibility.” (Writings)
What a “pocket” means:
“there are always certain aspects of a system about which things can be said using limited computational effort.” (Writings)
Why science relies on pockets:
“these are what we typically concentrate on in ‘doing science’.” (Writings)
So what are we supposed to do?
AI Cannot Systematically Cross Irreducibility (the endgame is impossible)
Outside the pockets, irreducibility still brings limits and surprises:
“there are limits to this—and issues that run into computational irreducibility.” (Writings)
“we just can’t answer” / “surprises” (Writings)
He restates that you cannot shortcut the full evolution:
“there just won’t be any way—with AI or otherwise—to shortcut… step by step.” (Writings)
AI cannot let us systematically bypass irreducibility. Our human smallness does not go away, and AI does not let us challenge God.
AI May Help Us Find “Pockets of Reducibility” Faster
Still the pocket theory:
“AI has the potential to give us streamlined ways to find certain kinds of pockets of computational reducibility.” (Writings)
The key phrase is “streamlined ways”: not “prove everything / solve everything,” but “find certain kinds of pockets more smoothly.”
Why He Can Deny the “Endgame” Yet Still Talk About What AI Can Do
Because in his framework:
Irreducibility explains why “solve science / do everything” is impossible (the ceiling),
pockets of reducibility explain why science still works and why tools remain useful (the space),
AI is placed in a very specific role: accelerating the discovery and exploitation of pockets, not creating shortcuts where irreducibility rules.
The practical takeaway is: remember what AI can do, and try to find your own niche in your domain—some “pocket of reducibility” that AI can help reveal.
Neural Nets Are Good at “Roughly Right,” Not at “Getting Every Detail Exactly Right”
Later in the article he demonstrates this with a series of experiments:
Predicting functions: fits the past, but the future details collapse.
Predicting cellular automata: gets the simple parts right, fails on complex parts; errors compound.
Predicting the three-body problem: can “memorize” simple trajectories, struggles with complex ones.
Autoencoder compression: compresses data similar to its training set; can’t compress through irreducibility.
His summary is basically:
ML is often “roughly right,” but “nailing the details” isn’t its strength.
This is his experience-based support for why LLMs/NNs hit a wall in science.
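A toy version of this behavior is easy to reproduce. The sketch below is my own (using scikit-learn, not Wolfram’s setup): a small network fits sin(x) nicely inside the range it was trained on, and drifts badly the moment you ask about the region it never saw.

```python
# "Roughly right" inside the training range, wrong in the details outside it.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 2 * np.pi, size=(2000, 1))
y_train = np.sin(x_train).ravel()

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
net.fit(x_train, y_train)

x_in = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)           # interpolation
x_out = np.linspace(2 * np.pi, 4 * np.pi, 200).reshape(-1, 1)  # extrapolation

err_in = np.max(np.abs(net.predict(x_in) - np.sin(x_in).ravel()))
err_out = np.max(np.abs(net.predict(x_out) - np.sin(x_out).ravel()))
print(f"max error inside the training range:  {err_in:.3f}")   # typically small
print(f"max error outside the training range: {err_out:.3f}")  # typically large
```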
His Philosophical Reading of AlphaFold: Much of the “Success” Depends on Human Criteria
Protein folding itself is not a human task.
But what we count as “correct” (shape, function, secondary structure, etc.) is a human criterion.
So neural nets may succeed partly because they capture pockets of reducibility aligned with human perception/classification standards.
But with more complex or “alien” proteins, surprises and failures can still appear.
The philosophical point is:
AI often succeeds “within human-defined usable criteria,” not “in fully objective microscopic truth.”
This can feel “mystical” again—but it’s directly connected to Demis Hassabis and AlphaFold in a very concrete way.
Demis is a generational prodigy, and AlphaFold truly did open an era. Both he and Wolfram are scientists I follow closely. But to be clear: Wolfram is not denying AlphaFold’s achievement. He’s using AlphaFold as a powerful example to illustrate a deeper structural claim: the biggest AI breakthroughs are often not “solving the entire microscopic reality of the world,” but precisely hitting a valuable pocket of reducibility.
Their language systems are very different, so they don’t look like they’re speaking the same “dialect.” But I think the structure of what they’re saying is isomorphic.
In Wolfram’s framing, protein folding is not “human-centered”; yet we never evaluate AlphaFold by demanding the exact position of every atom at every moment. We want results that are usable and verifiable for biology: whether the overall structure is right, whether key features are right, whether the functional shape is right. In other words, our definition of “correctness” already lives in a region that is compressible, generalizable, and operational.
That’s where AlphaFold’s greatness becomes visible: it isn’t “explaining the world with smarter language,” but using powerful learning under massive data and structural constraints to identify repeated, stable, reusable regularities. That’s exactly what Wolfram calls a pocket of computational reducibility: even in a world that may be globally irreducible (and where you can’t systematically skip steps), there are still local regions that can be compressed, modeled, and reliably exploited. AlphaFold’s victory is that it locked onto a particularly valuable pocket with extraordinary precision—and engineered it into a scalable tool.
So when Wolfram talks about “the eye of the beholder,” he isn’t diminishing AlphaFold. He’s pointing to a key reality: science and engineering ultimately define success around human-relevant metrics, scales, and criteria—and pockets of reducibility often appear precisely at those levels, making the complex world compressible, predictable, and actionable.
I believe Wolfram fundamentally endorses the AlphaFold pattern:
AlphaFold didn’t “pierce irreducibility”—it found one of the most valuable pockets of reducibility inside an irreducible world.
“Science as Narrative”: Where He Grounds Why Humans Remain Irreplaceable in Science
He emphasizes:
Science has traditionally been about forging the world into a narrative humans can think and talk in.
Irreducibility implies that in many places you can only give “100 computational steps,” which is not a human narrative.
Human narrative needs waypoints: familiar theorems, concept chunks, language constructs.
Wolfram Language is, in essence, an attempt to manufacture such “human-assimilable waypoints.”
AI may help with naming or aligning vocabulary, but there is no guarantee that every pocket of reducibility can be covered by human concepts (what he calls “interconcept space”).
This directly answers the line I’ve been developing about “psychological immersion / interface protocols”:
LLMs are strongest at narrative and interface; but scientific progress relies on computable, reproducible, and structurally organized waypoints—not on smooth conversation.
He defines science as a narrative-engineering project:
“the essence of science… casting it in a form we humans can think about” (Writings)
“provide a human-accessible narrative”
Irreducibility makes that narrative impossible in many places:
“computational irreducibility… shows us that this will… not be possible” (Writings)
“It doesn’t do much good to say ‘here are 100 computational steps’”
To translate a non-human computation chain into a human narrative, you need waypoints:
“we’d need ‘waypoints’ that are somehow familiar”
“pieces that humans can assimilate”
He elevates this into the mission of computational language design:
“capture ‘common lumps of computational work’ as built-in constructs”
“identifying ‘human-assimilable waypoints’ for computations”
“we’ll never be able to find such waypoints for all computations”
And he warns that even if AI finds reduced representations, they may not map into our current concept system:
“not part of our current scientific lexicon”
“there often won’t be… a ‘human-accessible narrative’ that ‘reaches’ them”
Meaning: the pocket may exist, but we may not have words for it; AI can propose names, but that doesn’t guarantee those names become usable human waypoints.
Wolfram is effectively placing LLM strength where it belongs: narrative and interface—while insisting that what actually drives science (and what matters for governance/decision systems) is computable, reproducible, structurally organized waypoints—not a smooth chat.
An LLM can make you feel like you understood; only waypoints (executable constructs, auditable intermediate states) can make your work truly reproducible, portable, and accountable.
I personally don’t fully grasp what “waypoints” refers to in practice.
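Still, one tentative way I picture the “executable constructs / auditable intermediate states” half of it, purely as my own sketch (the names and the tiny pipeline below are hypothetical, not anything Wolfram describes): instead of one opaque end-to-end answer, every named intermediate result gets recorded with enough metadata that the chain can be replayed and checked step by step.

```python
# A "waypoint" here is just a named, hashed, replayable intermediate result.
# Purely illustrative; the pipeline and the names are made up.
import hashlib
import json

def waypoint(name, fn, inputs, trail):
    """Run one step and record name + inputs + output hash, so the chain is auditable."""
    result = fn(**inputs)
    trail.append({
        "name": name,
        "inputs": {k: repr(v) for k, v in inputs.items()},
        "output_sha256": hashlib.sha256(repr(result).encode()).hexdigest(),
    })
    return result

trail = []
raw = waypoint("load", lambda: [3.0, 4.0, None, 5.0], {}, trail)
cleaned = waypoint("clean", lambda xs: [x for x in xs if x is not None], {"xs": raw}, trail)
mean = waypoint("mean", lambda xs: sum(xs) / len(xs), {"xs": cleaned}, trail)

print(json.dumps(trail, indent=2))   # a record a reviewer can actually replay and audit
print("mean =", mean)
```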
The Architecture He Ultimately Bets On: AI + the Computational Paradigm
Near the end he argues:
AI is a new way of leveraging reducibility (capturing pockets),
but for fundamental discovery it’s weaker than the computational paradigm plus irreducible computation (enumeration, simulation, system exploration),
the best path forward is combining the strengths of AI and the formal computational paradigm.
In plain terms:
AI: navigation, candidates, intuition, interface, humanization, cross-domain analogy
Computational systems: verifiable derivation, reproducible execution, enumeration, rigorous structuring
Irreducible computation: true “new terrain discovery” (not “paper-like” textual novelty)
I am also not sure how to explain this part.
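The closest I can get to a concrete picture is a propose-and-verify loop, sketched below under my own assumptions (the propose_candidates function is a stand-in for whatever AI component you like; nothing here is Wolfram’s architecture): the AI supplies cheap candidates, and a deterministic, exhaustive check decides which of them survive.

```python
# Sketch of "AI proposes, the computational layer verifies".
# propose_candidates is a stand-in for an AI component (here just a fixed list).

def brute_force(n):
    """Ground truth by plain enumeration: the sum of the first n odd numbers."""
    return sum(2 * k + 1 for k in range(n))

def propose_candidates():
    """Hypothetical AI step: cheap, plausible-looking guesses (two are wrong on purpose)."""
    return {
        "n*(n+1)": lambda n: n * (n + 1),
        "n**2": lambda n: n ** 2,
        "2*n - 1": lambda n: 2 * n - 1,
    }

def verify(formula, limit=2000):
    """Deterministic, reproducible check: no vibes, just enumeration."""
    return all(formula(n) == brute_force(n) for n in range(limit))

for name, formula in propose_candidates().items():
    print(f"{name:10s} -> {'kept' if verify(formula) else 'rejected'}")
# Only "n**2" survives: the AI supplies candidates, the verifier supplies the authority.
```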
这篇文章原本只是我用来阐述自己在 2026 年对大模型的看法、以及长期规划的一段附录。但它本身,其实是一篇 Wolfram 对 AI、以及未来科学范式的重磅阐述。值得反复、深度研读。文章发表于 2024 年 3 月,可它真正“在信息密度上击中我”的时刻,却发生在 2025 年底。他在那个时候早就给我们指出了AI的实质。
这种时间差,正是我和顶尖科学家之间的距离:同一段文字,写出来的时候已经在指向未来;而我需要经历一段现实世界的摩擦、项目的失败与重建、对“系统如何在时间中承担责任”的反复撞墙,才终于读懂它到底在说什么。
更重要的是:理解这篇文章并不需要博士学位,也不需要某种“高深门槛”。在某种意义上,AI 的确推动了知识的平权,不仅仅是“更容易获取知识”,更是让我们有能力去洗刷、筛出真正有价值的知识。它把很多过去只能靠身份、圈层、期刊来锚定的所谓“学术权威”,重新拉回到一个更朴素也更严苛的标准:能否解释世界,能否落地验证,能否复现。
在论文数量爆炸的时代,许多学界论文除了花钱发表、互相引用、帮助作者获取职称之外,并不产生任何真实的增量价值;更糟的是,可复现性本身都越来越可疑。相较之下,Wolfram 的前瞻性反而远远没有被大众真正认识到。我甚至认为,他已经是下一代科学范式的奠基者之一,只是这个事实还没有被“主流叙事”及时吸收。
当然,要真正读懂这篇文章,你需要先对一个概念有基本直觉:计算的不可约性。我很多年前第一次读 Wolfram 的《A New Kind of Science》时,几乎完全不知道他在讲什么。后来,靠 6 到 10 年的人生与项目体验,才慢慢的领会他这个“不可约”。
也正因此,他在我心目中成为了极其靠前的重磅人物。而且,他仍然在世、仍在积极研究、仍在持续输出。这意味着:他现在说的每一句话,都不是“过去完成时”的学术结论,而是仍在推进中的范式构建。他说的每句话,我都很在意。有很多引用不是这篇文章,而是集合他的演讲和其他文献,这里不赘述了。
https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/
AI 不可能“do everything / solve science”
社会上有人相信 AI 最终能“做一切”
他把“科学”作为终极压力测试:能不能把几百年累积的科学未解问题一口气解决?
他的答案是“inevitably and firmly no”
注意:他不是在否认 AI 的实用价值,而是在否认一种“终局式全能”叙事。
“there’s a somewhat widespread belief that eventually AI will be able to ‘do everything’” (Writings by Stephen Wolfram)
包括很多一开始怀疑的,一开始狂热的。AI can’t do everything, 也是我现在非常想强调的事情。他给我们一种定位型的指引。这个要跟他对齐,当你能做的事情范围缩小的时候,其实你能做的事情反而“多了”。因为你的精度提高了,你的定位清晰了,至少对我来说是这样。
“So what about science?” (Writings by Stephen Wolfram)
然后他给出“科学为何是终极压力测试”的理由:科学是人类文明最大智识工程,但仍然没做完——所以最能检验“do everything”的含义。
“the single largest intellectual edifice of our civilization” (Writings by Stephen Wolfram)
“there are still all sorts of scientific questions that remain.” (Writings by Stephen Wolfram)
不是说“AI 能不能帮科学”,而是问——能不能把剩下的全部收尾。
“So can AI now come in and just solve all of them?” (Writings by Stephen Wolfram)
“the answer is inevitably and firmly no.” (Writings by Stephen Wolfram)
不行!
“But that certainly doesn’t mean AI can’t importantly help…” (Writings by Stephen Wolfram)
这个就很重要了,他这个but! AI特别擅长什么,就是我们该去研究的事情。有价值的事情。
他给 AI 的“现实定位”:语言接口 + 传统智慧的高层自动补全
LLM 是一种新的 linguistic interface(语言接口),能把人类意图接到既有计算能力上(他自己的例子是 Wolfram Language)。
LLM 还能根据“conventional scientific wisdom(惯常科学智慧)”做 high-level autocomplete:给“惯常答案 / 惯常下一步”。
这一段很关键:
他承认 LLM 很强,但强在——把既有范式用得更顺,而不是生成“新范式的真发现”。
1) LLM 是新的 “linguistic interface(语言接口)”
“At a very practical level, for example, LLMs provide a new kind of linguistic interface to the computational capabilities that we’ve spent so long building in the Wolfram Language.” (Writings by Stephen Wolfram)
2) LLM 作为“高层自动补全”:基于“conventional scientific wisdom”
“And through their knowledge of ‘conventional scientific wisdom’ LLMs can often provide what amounts to very high-level ‘autocomplete’…” (Writings by Stephen Wolfram)
“…for filling in ‘conventional answers’ or ‘conventional next steps’ in scientific work.” (Writings by Stephen Wolfram)
好,我想把 linguistic interface 这个词单拎出来说,因为它太重要了。
我们现在对它的想象还非常狭窄:基本等同于“一个大模型对话窗口”,再加上一些普通的编程应用。说白了,吾辈乃凡人,我们大多数人目前也就停留在这一层:开个窗口,丢个 prompt,得到一段看起来挺像样的输出,然后觉得世界被改写了。
但问题是:光是“接口”本身,就有巨大的系统性开发空间。
并不是你现在只想到窗口,这个世界就只剩窗口。窗口只是最原始、最粗糙的一种接口形态。
所谓 “AI vibe coding” 的爽感,本质上也只是接口升级带来的短期快感:你以前要烧脑半天的工作,现在一个 prompt 立刻给你一份“能跑起来”的东西,于是你很爽…但就到此为止了吗?如果把这当成终点,那我们其实只是把一个新的“文本老虎机”换进了工作流。
真正关键的不是“它能给你答案”,而是:这个接口能不能承载现实世界的责任。
我在很多文章里反复强调过几个词:
可复现、可审计、可迁移。
这是任何进入“现实决策”的系统必须具备的制度属性。(你去看看任何组织流程,法律流程,裁判流程,审计流程,这里不赘述)
因为你放眼看去:
人类的日常生活、组织协作、公司治理、甚至国家治理——大量关键决策,归根到底都是一种 linguistic interface:我们用语言提出议题、描述风险、交换承诺、写下制度、做出裁决,然后把这些语言变成行动与后果。
那问题来了:这些决策能靠大模型吗?你敢信吗?
单个模型的决策天然具有强烈的环境依赖:同一句话、同一个问题,换个上下文就变;而 prompt 里写的约束,本质上是软约束,不是机器可执行的 if-else。更危险的是:模型为了“完成语言”,为了给你一个看起来像答案的答案,它是会主动忽略约束的。因为它的目标函数不是“遵守制度”,而是“产出连贯文本”。Predict the next token!
它不会说:Error (跟一个编号)。
它必须给你结果。它的默认行为是把不确定性伪装成确定性,把不可判定伪装成可裁决。
这个问题解决了吗?
没有嘛。
我们有没有一种接口架构,能让语言进入系统时:
约束是硬的,可编译、可验证的
出错是允许的,能明确 throw error / require override
决策是可回放、可追责的,而不是一次性的文本会话
输出进入现实之前必须经过制度化的审计与门控
如果没有,那我们现在享受的“vibe coding”爽感,本质上还只是低风险场景的烟花。离真正能承载人类社会的决策系统,还差一整条制度化的工程鸿沟。难道这个不值得开发者投资资源,精力去研究吗?难道这不是一条推演出来的职业康庄大道吗?(笑)对于我们这种非科研,偏应用向的开发者来说。
他把科学史抽象成两次表征革命:数学表征 → 计算表征
他接下来提出一个更深的判断框架:
300 年前:科学的跃迁来自“用数学表示世界”
现在:我们正在经历“用计算表示世界”的跃迁(他认为这是更根本的范式)
这一步其实是在“抬高标准”:
如果你问他 AI 是否“改变科学”,你得先说明:AI 到底是在工具层帮忙,还是在表征层带来新的科学范式。
“Three centuries ago science was transformed by the idea of representing the world using mathematics.” (Writings)
“And in our times we’re in the middle of a major transformation to a fundamentally computational representation of the world (and, yes, that’s what our Wolfram Language computational language is all about).”
“工具层 vs 范式层”的问法
“So how does AI stack up?” (Writings)
“Should we think of it essentially as a practical tool for accessing existing methods, or does it provide something fundamentally new for science?” (Writings)
什么意思啊,理解这个对于初次接触他理念的人来说有些陌生。“computational irreducibility(计算不可约性)”确实它不是一个“技术术语”,而是一个世界观开关:它在说,很多系统不是“更聪明就能跳步”,而是必须把计算做完(实在看不明白的,一定要去研究元胞机的机制);你能做的,往往只是找到一些“可约口袋”,在局部压缩、在局部预测。记住可约口袋这个词。
Demis Hassabis 在谈自然界可建模性时,我认为他的意思是差不多的。他对Lex Fridman 的那次访谈,我其实在X上写了很多文章,但是现在不大好找了。后来我就把我想写的长篇文章都放在Substack. 他说的,当时他类比的蛋白质折叠,就是类似这种口袋。意思是一个非常广阔的天地,总有一些人踩出来的小路,找到这条小路,你就爽歪歪。没找到,你就brute force. 他背后就是“口袋可约 / 否则暴力”的结构:
“if there’s not [patterns]… you have to do brute force.” (Lex Fridman)
以及他解释为什么很多自然问题“看起来组合爆炸,但仍可被模型化”的关键点:
“there’s some structure… some gradient you can follow.” (Lex Fridman)
把这两句和 Wolfram 放在一起读,
Wolfram 说:整体不可约会阻止你“系统性跳步”,但总有“pockets of reducibility(可约口袋)”。
Demis 说:如果空间里有结构(梯度/景观),你就能有效搜索;如果没有结构,那就只能 brute force。(Lex Fridman)
所以“他和 Demis 很相似——不可约性”,
世界并不保证处处有捷径。所谓智能,很多时候只是更快地找到哪些地方有结构、哪些地方没结构。
而“不可约性”如“浩瀚烟海”,就在于它会把你从“能力崇拜”直接拉回到计算与制度:
什么时候必须模拟、必须枚举、必须做完计算;
什么时候可以压缩、可以抽象、可以形成叙事;
以及最关键的:当你把 AI 接进现实决策时,哪些部分不能让它“猜过去”,必须落到可复现、可审计、可报错的硬结构里。
说真的,我自己也在学习过程中,只能跟大佬们学习一些哲学的根基和思想。具体能实现人家的百分之一,也是赚到了。
最核心的硬论证:计算不可约性 = AI 无法越过的“物理级上限”
这篇文章的发动机就是“computational irreducibility(计算不可约性)”。
把自然系统当作计算过程:系统自己在“算”它的演化
我们(或 AI)要预测它,也必须做计算
Principle of Computational Equivalence:这些计算的“计算强度”在原则上相当
所以你不能指望 AI 系统性地“jump ahead”跳过演化步骤
因而“完全 solve science”不可能
你想要的那种“终局捷径”,在许多系统上根本不存在。不是你训练不够,是世界不给你捷径。
这个就更难理解了。我们拆开说一下:
把自然系统当作计算过程:系统自己在“算”它的演化
Wolfram 先把“世界=计算过程”作为底层前提抛出来:
“we can think of everything that happens as a computational process.” (Writings)
“The system is doing a computation to determine its behavior.” (Writings)
这里他不是说“我们用计算去模拟世界”,而是说:世界本身就在计算。
我们(或 AI)要预测它,也必须做计算
紧接着他把预测者(人或 AI)放回同一个“计算”框架里:
“We humans—or, for that matter, any AIs we create—also have to do computations” (Writings)
“to try to predict or ‘solve’ that behavior.” (Writings)
意思是:你想“知道它会怎样”,你也得付出计算步骤,不是靠“更像人类的直觉文本”就能免单。
Principle of Computational Equivalence:这些计算在原则上相当
他用 PCE(计算等价原理)把“为什么没法系统性跳步”钉死:
“the Principle of Computational Equivalence says that these computations are all at most equivalent in their sophistication.” (Writings)
这句是整段的“物理级硬钉子”:系统在算,你也在算,但计算强度上限是同阶,所以不存在一个普遍可用的“上帝视角捷径”。
所以不能指望 AI 系统性地 “jump ahead” 跳过步骤
他几乎是用你那句 “jump ahead” 的原词来写的:
“we can’t expect to systematically ‘jump ahead’ and predict or ‘solve’ the system” (Writings)
“it inevitably takes a certain irreducible amount of computational work” (Writings)
关键词是 systematically:
不是说“某些局部场景偶尔能跳一下”,而是说不存在一套普适方法能长期稳定地跳过演化本身。这个跳一下,就是他说的可约口袋。
因而“完全 solve science”不可能:上限来自不可约性
他把结论落到“科学能力的上限”:
“we’ll ultimately be limited in our ‘scientific power’ by the computational irreducibility of the behavior.” (Writings)
并且把“终局捷径不存在”的直白句子补上(这句对你很重要):
“there just won’t be any way—with AI or otherwise—to shortcut just simulating the system step by step.” (Writings)
这个嘛,就开始变得很玄、很微妙了。我的理解是:世界在算(当然我也很难解释“世界在算”到底是什么意思)。我打个比方啊,你的 DNA 是不是在“算”你身体里的蛋白质结构?这其实更像一个比喻:世界本身在按它自己的规则推进状态,就像一个过程在自己跑。与此同时,你也在算对不对?你用你的脑子、你的纸笔、你的电脑、你的模型,试图提前知道它会怎么变。
关键在于:你俩这两种“算”,在本质上是同级的。你不是站在世界之外拿着遥控器的人,你也是世界内部的一个计算装置。你想用一个计算去压过另一个计算、系统性地“跳步”,大多数时候是不可能的。你的计算不会比世界更高明(只可能更低级,呵呵),因为你能做的终究还是在同一个物理宇宙里跑规则。你能赢的情况,往往只是你恰好找到了一个“捷径口袋”!不是你变成了上帝,而是这个系统在某个尺度上允许被压缩、允许被简化、允许被提前说出一点东西。(这一点很重要,我们作为人类不可自大)。
而且凡是计算,必耗能。这个兰道尔原则,一切的一切,就不多说了。这句话我越来越把它当作一种现实世界的底层税收。你想知道更多细节,就要付出更多步数;你想更精确、更可复现、更可追责,就要付出更多结构化成本。你可以用语言糊弄一时,但一旦要落到执行,就必须交账:时间账、算力账、能量账、甚至人类注意力账。所谓“不可约”,就是说:在很多地方,这份账你躲不掉。
所以当我们把目光从“AI 能不能做一切”移到“AI 能不能帮我们做决策”时,问题就突然变得尖锐:如果世界的演化本来就需要一步步算出来,那一个模型凭什么在不付出同等成本的情况下,给你一个看起来像结论、像裁决的答案?更要命的是,模型为了把语言续写下去,它倾向于把不确定性也包装成“可说的结果”,而不是像真正的系统那样:该报错就报错,该停机就停机,该要求更多信息就要求更多信息。(我刚才说的这个报错很重要,我在另一篇有仔细阐述,这是我最近系统的重要心得。不能报错的系统很危险! )
于是我才认为真正需要被工程化的,不是“让 AI 更会回答”,而是让语言接口具备能量守恒般的制度约束。让它在该昂贵的时候昂贵,在该停下来的时候停下来;让它能承认“这里没有捷径”,而不是给你一个廉价的幻觉。否则,我们就会把本来必须付出的计算成本,偷偷转嫁成另一种更隐蔽的成本:误判、误信、错把拟合当结论、错把顺滑当可靠。
这也是为什么我一直强调那几个词:可复现、可审计、可迁移。它们不是“工程洁癖”,而是在不可约的世界里,唯一能让语言进入现实决策而不崩盘的硬条件。因为你无法战胜世界的计算,你唯一能做的,是承认哪里必须算、哪里可以压缩,然后把这一切写成结构,让系统在时间里对自己的每一次判断负责。
这个真是我解释的极限了,再多的我也解释不了了。
所以关键是什么,Wolfram的最大价值,对我的极大启发,就是在这里,告诉你我们的目标是找无数个“可约口袋”。
关键细节:不可约性并不等于“什么都不能做”
这里很多人会误读。
Wolfram 的说法是:
整体不可约
但必然存在无穷多 “pockets of reducibility(可约口袋)”
科学之所以可能,是因为我们通常就在这些口袋里工作:规律、模型、压缩、理解,都来自口袋
所以他的结论其实是二段式:
AI 不可能让我们系统性跳过不可约性
AI 可能帮助我们更快地找到可约口袋
这也解释了他为什么一边说“终局不可能”,一边又愿意谈很多“AI 在科学里能干嘛”。
不是要躺平!
他先抛出关键问题:既然不可约,科学为什么仍可能?
他紧接着问:
“But given computational irreducibility, why is science actually possible at all?” (Writings)
这句是在把读者从“绝望/虚无”里拽出来:你都说不能跳步了,那科学岂不是不成立?所以我跟你说了,大部分人都在误解他,他不是在唱衰。
核心答案:整体不可约,但必然存在无穷多“可约口袋”
“口袋理论”:
“whenever there’s overall computational irreducibility, there are also an infinite number of pockets of computational reducibility.” (Writings)
“口袋是什么”:
“there are always certain aspects of a system about which things can be said using limited computational effort.” (Writings)
以及“科学为何靠口袋”:
“these are what we typically concentrate on in ‘doing science’.” (Writings)
所以我们要怎么做?
AI 不能系统性越过不可约(终局不可能)
口袋之外仍有不可约带来的问题与惊讶:
“there are limits to this—and issues that run into computational irreducibility.” (Writings)
“we just can’t answer” / “surprises” (Writings)
他再次重申“不能 shortcut 全部演化”:
“there just won’t be any way—with AI or otherwise—to shortcut… step by step.” (Writings)
AI 不可能让我们系统性跳过不可约性,我们人类之渺小是没有改变的,靠AI无法挑战上帝。
AI 可能帮助更快找到“可约口袋”
还是口袋理论
“AI has the potential to give us streamlined ways to find certain kinds of pockets of computational reducibility.” (Writings)
这里的关键词是 streamlined ways:不是“证明一切/解决一切”,而是“更顺滑地找到某类口袋”。
为什么他能一边否定“终局”,一边谈很多“AI 能干嘛”
因为在他这套框架里:
不可约性负责解释:为什么“solve science / do everything”不可能(上限)
可约口袋负责解释:为什么科学仍然能做、工具仍然有用(空间)
AI则被放在一个非常具体的位置:在口袋的发现与利用上加速,而不是在不可约处创造捷径 (Writings)
给我们的启示就是牢记AI能干嘛,并且尽量在自己的领域里找到自己的niche,一种能够借助AI发现的某种“可约口袋”。
神经网络擅长“粗略对”,不擅长“细节全对”
这篇文章后面的内容演示了一系列的实验:
预测函数:训练能拟合过去,但未来细节崩
预测元胞自动机:简单部分对,复杂部分细节错;越错越发散
预测三体:简单轨道能记住,复杂轨道就不行
autoencoder 压缩:能压缩“像训练集”的东西;遇到不可约性就压不动
他总结为
ML 往往“roughly right”,但“nailing the details”不是它强项。
这就是他对“LLM/NN 在科学里会撞墙”的经验层支撑。
他对 AlphaFold 这种“成功案例”的哲学解释:成功的很大一部分来自“人类判据”
蛋白折叠本身不是人类任务
但“我们关心什么算对”(形状、功能、二级结构等)是人类判据
因而神经网络能成功,可能部分是因为它抓住了与人类感知/分类标准对齐的可约口袋
但遇到更复杂或“异域”的蛋白,仍可能出现“surprises”与失效
这段背后的哲学味道是:
AI 的成功往往是“在人类定义的可用标准里成功”,而不是“在全客观的微观真实里成功”。
这个地方确实又开始“玄”回来了,但它和我们上文提到的 Demis Hassabis / AlphaFold 的关系,反而是非常具体、非常落地的。
Demis 是当世神童,AlphaFold 也确实开启了一个时代。他和 Wolfram 都是我长期关注的科学家,但在这里我想先澄清一点:Wolfram 并不是在否认 AlphaFold 的成就。相反,他是在用 AlphaFold 这样的成功案例,去解释一个更底层的结构性观点:AI 的重大突破,往往不是“解决了世界的全部微观真实”,而是“精准命中了一个可约口袋”。
他俩的语言太不一样了,他们两个很难被认为是同一个语言体系吧。但是我认为他们在这方面是同构的。
也就是说,Wolfram 想强调的不是“AI 不行”,而是“AI 行的时候,它到底行在什么地方”。
在他的表述里,蛋白折叠这个物理过程本身并不“以人类为中心”;但我们评价 AlphaFold 的“成功”,也从来不是要求它预测每一个原子在每一个时刻的精确位置。我们要的是一种人类可用、可验证、可服务于生物学目标的结果:总体结构是否对、关键特征是否对、功能相关的形状是否对。换句话说,我们定义的“什么算对”,本身就落在一个可压缩、可概括、可泛化的结构区域里。
于是 AlphaFold 的伟大之处就显现出来了:它不是用“更聪明的语言”去解释世界,而是用极强的学习能力,在巨大的数据与结构约束之下,找到了那一类“反复出现的、稳定的、可复用的规律”。这就是 Wolfram 所说的那种 pocket of computational reducibility(可约口袋):在整体可能不可约、不能系统跳步的世界里,仍然存在一些可以被压缩、被建模、被可靠利用的局部区域。AlphaFold 的胜利,就是把这一块区域抓得异常精准、并且工程化到了可规模化使用的程度。
所以当 Wolfram 提到 “eye of the beholder(观察者的判据)” 时,他并不是在贬低 AlphaFold 的科学性;科学与工程真正落地时,总是围绕人类关心的指标、尺度、判据来定义成功。而“可约口袋”恰恰常常就是在这些判据与尺度上出现的。它让复杂世界在某个层级变得可压缩、可预测、可操作。
我认为Wolfram是认同AlphaFold模式的:
AlphaFold 并不是“打穿了不可约性”,而是“找到了不可约世界里最值钱的一块可约口袋”。
延续我刚才那段“可约口袋”的说法,我还想再多说几句,把它和 Demis 的表述扣在一起——因为这两个人在哲学底色上其实是相通的。
Demis 在采访里反复强调一个点:再强的系统,如果它产出的“知识结构”我们看不懂、解释不了,就会变成风险。他谈到 AI 能做出超出我们“自己设计或理解”的东西,但马上补一句:真正的挑战是要确保这些系统“建出来的知识数据库”,我们理解里面到底是什么。这句话非常关键:它把“能力”硬生生拉回到“可理解/可解释”的人类责任边界上。
在另一段对话里(Lex 访谈),他还用了一个很形象的类比:即便出现“天才级的好招”,也不必然是神秘不可理解的。更像顶尖棋手走出一手你想不到的棋,但事后他们能解释“为什么这步成立”;而且他直接说:能用简单方式解释你在想什么,本身就是智能的一部分。(Lex Fridman)
把这两段 Demis 的话翻译成 Wolfram 的语言,其实就是我说的那句:交互层面(linguistic interface)可以大量用 AI 做润滑——把人类的意图变得更易表达,把复杂计算变得更易调用,把结果变得更易叙事、更易吸收;但真正进入“科学/决策的主权区”的东西,必须是可理解、可验证、可追责的结构,而不是“看起来像答案的文本”。
也因此,Wolfram 才会把 LLM 的现实价值定位在“语言接口”和“高层 autocomplete”:它能把既有计算能力用得更顺,能把“惯常科学智慧”补全成“惯常答案/惯常下一步”。
润滑交互可以很强,但裁决世界不能靠顺滑。
“Science as Narrative” 是他对“人类在科学里的不可替代性”的落点
他强调:
科学传统上是把世界铸造成“人能想、能讲的叙事”
不可约性意味着:很多地方你只能给出“100 步计算”,但这不是人类叙事
人类叙事需要“waypoints(可吸收的中间路标)”:熟悉的定理、概念块、语言构件
Wolfram Language 的设计本质上就是在制造这种“可吸收路标”
AI 也许能帮忙起名字/对齐词汇,但不保证任何可约口袋都能被人类概念覆盖(他叫 interconcept space)
这部分基本回答了我前面一直在说的“心理沉浸/接口协议”(或者我下一篇文章,这两篇文章是互相引用的)的那条线:
LLM 很强的是“叙事与接口”;但科学推进真正依赖的是可计算、可复现、可组织的结构路标。
科学是“把世界铸造成可被人类思考的叙事”
Wolfram 先把“科学=叙事工程”定义出来:
“the essence of science… [is] … casting it in a form we humans can think about” (Stephen Wolfram Writings)
“provide a human-accessible narrative”
科学要把世界变成人类可吸收的表示。
我看不懂不是也白搭吗?这个问题看起来很多此一举,其实现在是有争议的。因为有大量的人开始将科学“黑箱化”。
不可约性意味着:很多时候你只能给出“100 步计算”,但这不是人类叙事
他直接说“不可约性让这种叙事很多时候不可能”。
“computational irreducibility… shows us that this will… not be possible” (Stephen Wolfram Writings)
“It doesn’t do much good to say ‘here are 100 computational steps’”
这句几乎就是我另一篇文章“模型不能 throw error、必须给结果”的反面镜像:人类叙事不是把步骤砸过来,而是把它组织成可吸收的结构。
人类叙事需要“waypoints”:可吸收的中间路标(定理/概念块/构件)
要想把“非人类的计算链”变成人类叙事,你需要路标:
“we’d need ‘waypoints’ that are somehow familiar”
“pieces that humans can assimilate”
“熟悉的定理、概念块、语言构件”:它们本质上是认知压缩点,把不可约的长链切成可理解的段落。
Wolfram Language 的设计本质上就是在制造这种“可吸收路标”
他把这件事直接提升为“计算语言设计的使命”:
“capture ‘common lumps of computational work’ as built-in constructs”
“identifying ‘human-assimilable waypoints’ for computations”
而且他也承认这件事有硬上限:
“we’ll never be able to find such waypoints for all computations”
AI 也许能帮起名字/对齐词汇,但并不保证可约口袋都能被人类概念覆盖
他在这一段的核心警告是:就算 AI 能从计算里挖出某种“可约表示”,也未必能贴回到我们已有概念体系里。
“not part of our current scientific lexicon”
“there often won’t be… a ‘human-accessible narrative’ that ‘reaches’ them”
意思是:口袋可能存在,但我们的词典里没有词;AI 可以起名字,但不保证这名字能真正成为“人类可用的路标”。
Wolfram 在这里等于把 LLM 的“强”放回它最擅长的位置:叙事与接口。但他同时在提醒:
真正推动科学(以及你更关心的治理/决策系统)的是 “可计算、可复现、可组织的结构路标”——而不是一段顺滑的对话。
LLM 可以让你“感觉理解了”,但只有路标(可执行构件/可审计中间态)才能让你真的“能复现、能迁移、能追责”。
其实我个人也不大理解这个Waypoints 具体指代什么。
他最终押注的架构:AI + 计算范式(computational paradigm)
文章末尾他说:
AI 是新的“leveraging reducibility”的方式(抓可约口袋)
但在“根本发现潜力”上,它比不过真正的计算范式 + 不可约计算(枚举、模拟、系统探索)
最能推进科学的是二者结合
换成人话就是:
AI:负责导航、候选、直觉、接口、人类化、跨域类比
计算系统:负责可验证推导、可复现执行、枚举探索、严谨结构化
不可约计算:负责真正的“新地形发现”(不是“像论文”的文本创新)
这部分我就更迷糊了。

