Seeing the World Through Stephen Wolfram’s Eyes: Twenty Years of “A New Kind of Science”

Stephen Wolfram has been saying all along that he is doing a new kind of science; from where he stands, the paradigm shift began twenty years ago.

Jan 20, 2026

A Person Who Has Been Talking About “Paradigm Shifts” in Science for 20 Years

Over the past few weeks, I’ve deliberately given my brain a bit of a break. I mentioned in earlier pieces that when you interact with AI at very high density—especially across multiple languages (Chinese and English) and multiple language modes (programming languages, formal languages)—you start to feel something like information overload and information drift, a kind of cognitive and even physical fatigue.

I realized I needed to switch gears, to change what I was thinking about and what I was working on. Once you drill yourself too deeply into a single tunnel, it becomes very hard to generate genuinely good ideas.

So I decided to seriously revisit someone I’ve been thinking about a lot lately: Stephen Wolfram. Why him? Because it suddenly struck me that Wolfram may be the only person over the past twenty-plus years who has consistently and explicitly talked about a paradigm shift in science. His book is literally titled A New Kind of Science.

I recently bought a copy of this book online. Before that, I had already encountered Wolfram in various forms: his work on cellular automata, Mathematica, and the concepts he keeps emphasizing, namely computational irreducibility and ruliology (a term he coined himself). But to be honest, for a long time I didn’t really “get” him. I didn’t truly understand what he was doing, nor did I dig deeply into it. These ideas didn’t hit me at a fundamental, logical level; they didn’t trigger that feeling of “this is worth investing a serious amount of my limited time and mental energy into.”

I’ve also mentioned Thomas Kuhn’s The Structure of Scientific Revolutions before. Over the past few years, that book has become increasingly important to me. But what really turned “paradigm shift” into a widely discussed topic for my generation was the commercialization of AI in 2022–2023. And Stephen Wolfram had already recognized this reality very early on—and repeatedly articulated it in many of his essays, some written as early as 2005 or 2012, more than a decade ago.

In this piece, I want to share how I’ve started to re-approach Wolfram’s work. I want to try to understand computational irreducibility from his perspective, to get a feel for his way of thinking, while also using AI to track down relevant papers. Coincidentally, I also happen to have a children’s toy on my desk, brought to me by a friend from China—a code-breaking machine.

Behind this toy, there is actually an NP-complete problem. I want to study and understand this complexity problem using Wolfram’s language and worldview. Especially during these past months of high-intensity writing, I’ve often been reminded of Wittgenstein’s remarks on the relationship between language and the world. Only when you fully internalize a language system do you actually gain a new worldview and a new way of seeing problems.

Since the Day A New Kind of Science Was Published, Wolfram Has Been Surrounded by Intense Academic Controversy

In a sense, it helped that Wolfram is a commercial, entrepreneurial scientist. He largely funded himself, bypassed the traditional academic and publication systems, and didn’t need to survive on academic approval. This, in turn, has made me very interested in a certain type of person like him: the corporate scientist. For example, Elon Musk is a member of the U.S. National Academy of Engineering, yet he doesn’t really participate in academia either.

Precisely because such people don’t depend on the academic system for survival, and are in fact genuinely capable scientists, their evaluations and criticisms of academia are often worth reading carefully. This is especially true today, when universities are flooded with low-value papers, when the meaning of academic titles is itself becoming questionable, when AI is steadily eroding universities’ monopoly over knowledge interpretation, and when governments, especially the U.S. government, are cutting research funding and openly questioning the real-world value of academic systems. Against this backdrop, perspectives from outside the system form a valuable counterbalance.

Without further ado, let me show you a few examples.

Here is an important critique of Wolfram from the mathematical community: a formal book review published in the Bulletin of the American Mathematical Society. It criticizes aspects of his argumentation, the rigor of his exposition, and the relationship between his work and existing literature. In my paraphrase, it argues that some explanations are unclear, unsystematic, or lacking in testable detail.

https://www.ams.org/journals/bull/2003-40-01/S0273-0979-02-00970-9/S0273-0979-02-00970-9.pdf

In the complex systems and statistical physics communities, one of the sharpest criticisms can be summarized in a single phrase: “old wine in new bottles + overclaiming.” The core argument is that Wolfram repackaged many existing ideas from complexity science and computation theory, then declared a “paradigm shift” through a very strong narrative, while failing to adequately acknowledge or credit prior work. A representative voice here is Cosma Shalizi.

http://bactra.org/reviews/wolfram/

I’m not interested in adjudicating who is right or wrong. Whether it’s Wolfram or his critics, their level of scholarship is far beyond mine.

But after reading these critiques—especially after months of deliberate writing practice in both Chinese and English—I had two very strong intuitive impressions.

First, Wolfram and his critics are not even operating within the same language system. This isn’t just a disagreement of viewpoints; it’s a mismatch of language protocols, almost a case of people talking past each other entirely.

Second, these critical texts carry a very strong emotional charge. You don’t have to analyze them very deeply to sense a kind of dissatisfaction or resentment. That’s actually quite strange for a community that prides itself on being rational and dispassionate.

Over time, I started to understand this better. In the AI era, many people are labeled as “crackpot scientists” simply because they lack formal academic titles, even though they are seriously engaging with scientific problems. The emotional tone feels remarkably similar.

“Alright.” All I can say is this: in the age of AI, I’d much rather be a very wealthy “crackpot scientist” like Wolfram than a professor or lecturer struggling to secure funding and facing an uncertain career. 😂

How Does Wolfram Respond?

Thirteen years ago, Wolfram had already personally experienced the shock of a paradigm shift, and he described it with remarkable clarity. Calling him a “crackpot” is frankly absurd.

Let’s look at his own words, from a 2012 essay, where he describes this impact. By then, he had already largely left academia, arguably as early as 1987, when he began building software. He explains why he stopped publishing papers: academic publishing had become overly formalistic, with form outweighing substance, while the sheer amount of real content in his work made the paper format impractical. He preferred blogging. That’s why, today, the best way to understand his thinking is simply to read his blog, Stephen Wolfram Writings.

Why does he insist this is a “paradigm shift”?

When he recounts reactions to his work, many academics respond with unusually intense emotion:

“You’re destroying the heritage of mathematics…”

Why such anger? Would you get angry at a shaman or a yoga guru making scientific claims? No—you wouldn’t even engage. Anger only appears when something is taken seriously.

Wolfram’s response is blunt:

“This is what a paradigm shift sounds like—up close and personal.”

A paradigm shift is not primarily about new conclusions; it’s about a new evaluation system.

The emotion isn’t triggered because a theorem was overturned, but because:

The old standards of “what counts as science” are being threatened (papers, peer review, citations, academic networks).
Old “career investments” are being threatened (decades of training and accumulated prestige may suddenly depreciate).

He even breaks this down into two “core threatened groups,” drawing a sharp distinction between surface reasons and deeper reasons.

“There was a surface reason… and a deeper reason.”

A. Content-level fears (the career economics of paradigm conflict)

First group (mostly physicists):

“We’ve spent our whole careers barking up the wrong tree.”

If the computational perspective of NKS is right, then these researchers weren’t just “slightly wrong”—their entire research trajectory may have been a low-return investment. This is classic Kuhnian paradigm shift territory: what counted as success before may no longer count at all.

Second group (complexity researchers):

“It’ll overshadow everything we’ve done.”

This isn’t about truth; it’s about attention and authority structures: whoever defines the main narrative controls textbooks, funding, and the gateways into the discipline.

B. Form-level conflict: doing something “academic-like” without following “academic rules”

“Academic-like, but you haven’t played by academic rules.”

The implicit message is that academia has a legitimacy stack:

Peer review as gatekeeping
References as network visibility
Journals and publishers as distribution channels
Academic identity as speech authorization

Wolfram’s move was to bypass this entire stack. Hence his insistence:

“I wasn’t an academic…”

This is the core tension of his essay: a new paradigm combined with a new distribution and validation mechanism, posing a dual threat to the old system.

His full essay is here:

https://writings.stephenwolfram.com/2012/05/living-a-paradigm-shift-looking-back-on-reactions-to-a-new-kind-of-science/

Kuhn Loss

Even though Kuhn’s The Structure of Scientific Revolutions is more than sixty years old (it was published in 1962), it uncannily anticipates the precise situation we’re discussing here. I can’t resist quoting it. The full discussion is in the SEP entry linked below.

A paradigm revolution does not merely solve more problems; it also discards—or even declares illegitimate—problems and explanations that the old paradigm valued and successfully addressed by its own standards.
This loss of explanatory power, problem sets, and evaluation criteria is what Kuhn called Kuhn loss.

Kuhn used this concept to undermine the idea that scientific progress is simply a cumulative approximation to truth.

1) What Is Actually “Lost”?

Problem sets change: what counts as an “important problem” changes.
Standards change: what counts as a “good explanation” or “good science” changes.
Concepts and worldviews change: the same words refer to different things, different sentences become expressible.

So Kuhn loss isn’t just “one less derivation.” It’s that things which had to be explained in the old paradigm become, in the new one:

“Not necessary to explain”
“Meaningless”
“Metaphysical”
Or outright “pseudo-problems”

This is why Kuhn says revolutions change the very definition of science.

2) A Classic Example: Why Newton “Lacked Explanatory Power” Yet Won

In Aristotelian and Cartesian mechanics, the question “How is attraction possible?” was mandatory—you had to provide a contact mechanism or ontological explanation.

Newtonian gravity looked like action at a distance and failed this test, so it was initially rejected. But once Newton’s paradigm won, that question was kicked out of the scientific agenda as illegitimate. That’s Kuhn loss.

Later, general relativity reintroduced the issue in a different form, reinforcing Kuhn’s point: progress isn’t linear accumulation; it’s repeated agenda rewriting.

3) Kuhn Loss and Incommensurability

You can think of Kuhn loss as an observable symptom of incommensurability.

Incommensurability doesn’t mean “incomparable,” but rather:

No shared metric: different conceptual nets, problem lists, and evaluation standards.
Hence no single unified measure of “closer to truth.”

Kuhn loss tells us that even if a new paradigm is stronger in some respects, it may be weaker in dimensions valued by the old paradigm—and whether it’s “weaker” at all depends entirely on which metric you use.

4) Why Kuhn Loss Shocks Scientific Rationality

This is why Kuhn and Feyerabend were once accused of being “anti-science”:

If revolutions rewrite problems and standards, does rational comparison collapse into politics?
If old successes are declared illegitimate, does science stop “approaching truth”?

Kuhn later clarified: incommensurability ≠ incomparability; Kuhn loss ≠ irrationality.

Paradigm choice lacks a neutral algorithm, but it can still be guided by values (accuracy, scope, simplicity, fruitfulness), allowing for rational disagreement.

Suddenly, that Cantonese saying, “a chicken talking to a duck” (鸡同鸭讲), meaning two sides talking past each other, feels very apt.

SEP link:

https://plato.stanford.edu/archives/fall2019/entries/incommensurability/#RevParThoKuhInc

Experiencing Irreducibility with a Decoder / Mastermind Toy

I picked up the Decoder/Mastermind toy simply because it happened to be on my desk when I was thinking about this.

A seemingly simple children’s toy hides an NP-complete problem. Originally called Mastermind, it’s a two-player board game: one player sets a secret code, the other tries to guess it.

Each round, the guesser submits a guess; the code-maker returns feedback:

Black pegs: correct color and correct position
White pegs: correct color, wrong position

This feedback channel must be perfectly correct and noise-free; otherwise, the entire reasoning chain collapses.

My electronic Decoder replaces the human code-maker with a hidden algorithm and adds feedback variants: green lights (correct color and position), white lights (correct color, wrong position), and no light (color absent). In the fully indirect hint mode, positional information is hidden, so each guess reveals less and the candidate set shrinks more slowly; scaled up to more colors and slots, the state space quickly grows beyond what naive brute force can handle.

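Theoretical work backs this up: in 2005, Jeff Stuckman and Guo-Qiang Zhang proved that the Mastermind Satisfiability Problem, that is, deciding whether any secret code is consistent with a given set of guesses and their feedback, is NP-complete.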

https://arxiv.org/abs/cs/0512049

I spent a full day solving all 800 levels of this toy using a minimax strategy—choosing guesses that maximally reduce the candidate space in the worst case. The product itself is impressively reliable: a consumer-grade children’s toy with a completely noise-free feedback channel.

➜  giiker_super_decoder python3 minimax_generic_gwn.py --Y 7 --P 4 --R 7 --interactive-hints

=== Generic Minimax Bucket Solver (unique colors, tri-feedback) ===
Parameters: Y=7 colors, P=4 slots, R=7 total rows
Feedback: (G,W,N) with G+W+N=P
Initial candidates: |C0| = P(Y,P) = 840

--- Preset hints input (interactive) ---
Enter one hint per line as:
  guess:  a b c d    (P ints, unique, in [0..Y-1])
  fb:     G W N      (3 ints, G+W+N=P)
Type empty guess to stop.

hint guess (or empty to stop): 0 1 2 3
hint feedback (G W N): 2 1 1 
hint guess (or empty to stop): 0 6 2 1
hint feedback (G W N): 3 0 1
hint guess (or empty to stop): 

Applying preset hints: n=2  => remaining query budget T=R-n = 5
  hint#1: guess=0 1 2 3 fb=(G,W,N)=(2,1,1)  |C| 840 -> 36
  hint#2: guess=0 6 2 1 fb=(G,W,N)=(3,0,1)  |C| 36 -> 2

Step 1 (remaining queries: 5/5)
Suggested guess: 0 1 2 4
Current |C| = 2   worst_bucket=1   expected_remaining≈1.00
Enter feedback as 'G W N' (sum=4), or 'q' to quit: 2 2 0
Filtered candidates: 2 -> 1

✅ Unique solution determined without further queries: 0 4 2 1
➜  giiker_super_decoder
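The session above can be approximated with a short Python sketch. This is not the code from the repository; the function names and the tie-breaking rule (take the first guess in lexicographic order with the smallest worst-case bucket) are illustrative assumptions. It only assumes what the log states: codes use unique colors, and feedback is a triple (G, W, N) with G+W+N=P.

```python
from collections import Counter
from itertools import permutations


def feedback(guess, secret):
    """Return (G, W, N) for codes with unique colors.

    G: right color, right slot; W: right color, wrong slot; N: color absent.
    """
    g = sum(a == b for a, b in zip(guess, secret))
    w = len(set(guess) & set(secret)) - g
    return g, w, len(guess) - g - w


def all_codes(num_colors, num_slots):
    """Every ordered selection of distinct colors, in lexicographic order."""
    return list(permutations(range(num_colors), num_slots))


def filter_candidates(candidates, guess, fb):
    """Keep only the secrets that would have produced this feedback."""
    return [c for c in candidates if feedback(guess, c) == fb]


def worst_bucket(guess, candidates):
    """Size of the largest group of candidates sharing one feedback value."""
    return max(Counter(feedback(guess, c) for c in candidates).values())


def minimax_guess(codes, candidates):
    """Guess whose worst-case bucket is smallest (first one wins ties)."""
    return min(codes, key=lambda g: worst_bucket(g, candidates))


codes = all_codes(7, 4)                                      # Y=7, P=4
cands = filter_candidates(codes, (0, 1, 2, 3), (2, 1, 1))    # hint 1
cands = filter_candidates(cands, (0, 6, 2, 1), (3, 0, 1))    # hint 2
print(len(codes), cands, minimax_guess(codes, cands))
```

With the two hints from the log, this sketch goes from 840 candidates to 36 and then to 2, and its next guess splits the two survivors into separate feedback buckets, which is the same shape of reasoning the session shows.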

All solution code is available here, including extensible solvers that go beyond the device’s original parameters:

https://github.com/STEMMOM/giiker_super_decoder

By this point, there’s little “puzzle-solving” left to say. My goal was never to solve a toy, but to revisit complexity through Wolfram’s lens.

The complexity isn’t in the rules, which are trivial, but in the structure: a hidden truth s, accessible only through queries that yield limited feedback. You can’t compute the answer directly; you must extract it through interaction.

From an information-theoretic view, this is a pure black-box query model. Information leaks from the hidden truth through a constrained channel. This structure pervades reality:

Cryptography and security
Medical diagnosis
Scientific experimentation
Engineering parameter tuning

The toy is kind because it promises a noise-free oracle. Reality makes no such guarantee.

Of course, I’m fully aware that I’m still at a very early stage, and that this class of problems is by no means unexplored. It has long been studied systematically in academia. In theoretical computer science and in research on “puzzle complexity,” Mastermind is typically formulated as a constraint satisfaction or consistency decision problem, known in its standard form as the Mastermind Satisfiability Problem (MSP), and it has been rigorously proven NP-complete, as mentioned earlier. For the classic board-game parameters (4 positions, 6 colors, repetition allowed), the entire state space contains only 6^4 = 1296 codes; as early as 1977, Knuth gave a strategy that guarantees a win in at most five guesses. My own implementation operates at essentially the same level on the toy’s parameters: no more than five steps in the worst case, and the averages I have seen reported in the literature are a little over four guesses.
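The arithmetic for the classic game is easy to check. The sketch below is a minimal illustration, not Knuth’s full algorithm: it enumerates the 6^4 codes, scores black and white pegs with repeated colors allowed, and measures how well the opening guess 1122 (the one used in Knuth’s strategy) splits the space. The function name peg_feedback is an illustrative choice.

```python
from collections import Counter
from itertools import product


def peg_feedback(guess, secret):
    """Classic Mastermind scoring with repeated colors allowed.

    black: right color in the right slot
    white: right color in the wrong slot
    """
    black = sum(g == s for g, s in zip(guess, secret))
    # Counter & Counter keeps, per color, the smaller of the two counts,
    # i.e. how many pegs of each color the two codes have in common.
    common = sum((Counter(guess) & Counter(secret)).values())
    return black, common - black


codes = list(product(range(1, 7), repeat=4))        # 6 colors, 4 slots
partition = Counter(peg_feedback((1, 1, 2, 2), s) for s in codes)

print(len(codes))                 # size of the state space
print(max(partition.values()))    # worst case left after the first guess
```

After a single guess the worst case already drops from 1296 to 256 codes, which gives a feel for why five guesses can be enough at this scale.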

So within this particular scale and framework, the aspects that can be formalized, proven, and optimized have already been studied quite thoroughly. If I push further, I’m more likely to approach the problem from an information-theoretic perspective, recasting it in terms of how much effective information each query can extract. Coincidentally, I came across a very interesting recent paper today: Gür (2025), “Weighted Entropy Approach,” which uses weighted entropy as a heuristic to approach the theoretically optimal average number of steps. At its core, it treats the strategy itself as a kind of measurement instrument, an idea I find highly valuable as a reference. I won’t go into details here; interested readers can consult the original papers directly.

https://arxiv.org/abs/cs/0512049

https://www.cs.bu.edu/fac/best/res/papers/alybull86.pdf

https://arxiv.org/abs/2511.19446
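As a rough illustration of the “how much information does each query extract” framing, here is a minimal Python sketch that scores a guess by the plain Shannon entropy of the feedback it would produce. This is ordinary unweighted entropy, not the weighted-entropy heuristic from the paper above, and the function names and the uniform prior over candidates are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import permutations


def feedback(guess, secret):
    """(G, W, N) feedback for codes with unique colors."""
    g = sum(a == b for a, b in zip(guess, secret))
    w = len(set(guess) & set(secret)) - g
    return g, w, len(guess) - g - w


def query_entropy_bits(guess, candidates):
    """Expected information (in bits) gained from one guess,
    assuming every remaining candidate is equally likely."""
    buckets = Counter(feedback(guess, c) for c in candidates)
    total = len(candidates)
    return -sum((n / total) * math.log2(n / total) for n in buckets.values())


candidates = list(permutations(range(7), 4))   # 7 colors, 4 slots, no repeats
print(math.log2(len(candidates)))              # bits needed to pin down the code
print(query_entropy_bits((0, 1, 2, 3), candidates))  # bits one guess yields on average
```

The gap between the two printed numbers is the point: the code holds nearly ten bits of uncertainty, a single query returns only a fraction of that, and a strategy is a plan for spending a small budget of such queries.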

What I’m Really Interested In: How Wolfram Would See This Problem

A more realistic plan for me this year is to devote time to cellular automata—the true starting point of Wolfram’s worldview, beginning with A New Kind of Science. He treats them as a minimal, clean computational universe for studying complexity, irreducibility, and the fact that simple rules can generate extreme behavior.

Wolfram’s systematic, large-scale, decades-long exploration of cellular automata is essentially unmatched globally. From exhaustive rule-space scans to behavioral classification and the development of ruliology, it’s hard to find a true second example.
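To make “simple rules can generate extreme behavior” concrete, here is a minimal Python sketch of an elementary cellular automaton. It is an illustration under stated assumptions rather than anything taken from Wolfram’s own code: the function names, the fixed width, and the wrap-around boundary are illustrative choices. The rule number follows the standard convention in which the 8 bits of the number give the new cell value for each of the 8 possible (left, center, right) neighborhoods.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    Each cell looks at (left, center, right), reads them as a 3-bit
    number, and takes its new value from that bit of `rule`.
    Edges wrap around (an assumption made to keep the sketch short).
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def evolve(rule, width, steps):
    """Start from a single live cell in the middle and collect every row."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


def render(rows):
    """Draw live cells as '#' and dead cells as '.'."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)


print(render(evolve(30, width=31, steps=15)))
```

Changing the first argument of evolve from 30 to 90 or 110 is a quick way to see how much the behavior varies across rule space even though the update function stays the same.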

What continues to impress me is that one person proposed such a sweeping new scientific narrative. The key isn’t his conclusions, but his questions. When faced with Mastermind, my instinct is to solve it. Wolfram’s instinct is to ask: what kind of system is this?

Does it exhibit universal computation? Which behavior class does it belong to? Is it computationally irreducible? How common is this behavior in rule space?

Here’s the chilling thought: some cellular automata, Rule 110 among them, are Turing complete. You’re not dealing with a solver, but with a computational universe. From this comes PCE, the Principle of Computational Equivalence. I’m still digesting it, but roughly: once a system’s behavior passes a very low complexity threshold, its computational power is equivalent to that of most other such systems. The real differences lie in predictability, compressibility, and whether step-by-step simulation is unavoidable.

So Wolfram isn’t anti–problem-solving. He’s reshaping the problem itself. He’s less interested in “what is the answer to this instance?” and more in “what are the behavioral laws of this system?”

In the PCE view, the “unique solution” isn’t central. What matters is whether different systems share the same class of computational capability and behavior. This worldview—almost orthogonal to the solver’s instinct—is what I’m only beginning to grasp.

This series will continue. There’s a lot here worth learning, and I’m still very early in the process.

A Person Who Has Been Talking About “Paradigm Shifts” in Science for 20 Years

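Over the past few weeks I have actually given my brain some time off. I mentioned in earlier pieces that after very dense interaction with AI, across multiple natural languages (Chinese and English) and multiple language modes (programming and formal languages), you clearly feel a state of information overload and information drift, a kind of shock and fatigue in both body and mind.

I realized I had to switch gears, to give myself a different idea and a different thing to work on. Once you have drilled yourself into a dead end, it is very hard to come up with good ideas.

So I decided to seriously study someone I have cared about a great deal lately: Stephen Wolfram. Why him? Because it suddenly struck me that Stephen Wolfram may be the only person over the past twenty-plus years who has kept talking about a “paradigm shift” in science. His book is literally titled A New Kind of Science.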

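I recently bought the book online. Before that I had already encountered Stephen Wolfram: his cellular automata research, Mathematica, and the concepts he keeps returning to, computational irreducibility and ruliology (a word he coined himself). But to be honest, I never really understood him, and I never dug deeper. At the time these concepts did not hit me at the level of underlying logic; they did not give me the urge to seriously invest my precious time and mental energy in figuring him out.

I have also mentioned Kuhn’s The Structure of Scientific Revolutions before; the book has kept growing in importance for me over the past few years. But what really made “paradigm shift” a widely discussed topic for my generation was the commercialization of AI in 2022–2023. Stephen Wolfram recognized this reality very early and articulated it again and again in his many essays, quite a few of them written in 2005 or 2012, more than a decade ago.

In this piece I want to share how I am trying to understand him afresh. I want to stand in his position, to understand computational irreducibility and get a feel for his way of thinking, while using AI to look up relevant papers. As it happens, I also have a children’s toy at hand that a friend brought me from China: a code-breaking machine.

Behind this machine there is actually an NP-complete problem. I want to study and understand this complexity problem from Wolfram’s perspective and view of language. Especially during these months of high-intensity writing, I have often thought of Wittgenstein’s remarks on the relationship between language and the world. Only when you fully accept a language system do you really gain a new worldview and a new angle on problems.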

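Since the Day A New Kind of Science Was Published, Wolfram Has Faced Intense Controversy in the Academic Community

Fortunately he is a commercial, entrepreneurial scientist: he was almost entirely self-funded, bypassed the traditional academic and publication systems, and does not need the academic community to make a living. This has made me pay close attention to a type of person similar to him, the corporate scientist. Elon Musk, for example, is a member of the U.S. National Academy of Engineering, yet he does not operate in academia either.

Precisely because they do not depend on the academic system to survive, and are in fact scientists of real ability, their assessments and criticisms of academia are often well worth reading. This is especially true today: universities are flooded with padded papers, the meaning of the academic title system itself is becoming questionable, AI is gradually eroding universities’ authority over the interpretation of knowledge, and governments, led by the United States, are cutting academic funding and openly questioning the real-world value of the academic system. Against this backdrop, a view from outside the system is a good counterbalance in itself.

Without further ado, let me show you a few articles.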

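Here is an important critique of Wolfram from the mathematical community: a formal book review in the Bulletin of the AMS, published by the American Mathematical Society. It raises a number of criticisms of his specific style of argument, the rigor of his exposition, and the relationship of his work to the existing literature; for example, that some explanations are unclear, unsystematic, or lacking in testable detail.

https://www.ams.org/journals/bull/2003-40-01/S0273-0979-02-00970-9/S0273-0979-02-00970-9.pdf

In circles around complex systems and statistical physics, the sharpest line of criticism comes down to one phrase: “old wine in new bottles, plus overclaiming.” The core view is that Wolfram repackaged many ideas that already existed in complex systems and the theory of computation and then declared a “paradigm leap” through a very strong narrative, while his acknowledgment and crediting of prior research was problematic. A representative voice here is Cosma Shalizi.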

http://bactra.org/reviews/wolfram/

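I do not want to discuss who is right or wrong here. Whether it is Wolfram or the scholars who wrote these critiques, their learning is far beyond mine.

But after reading these critiques, especially having built up a feel for both Chinese and English through months of deliberate writing practice, I had two very strong intuitive impressions:

First, in terms of language systems, Wolfram and these critics are simply not in the same system. It is not a difference of opinion but a difference of language protocol, almost a state of “a chicken talking to a duck” (鸡同鸭讲, as we Cantonese say: both sides talking to themselves, with no effective communication).

Second, these critical texts carry a very obvious emotional charge. You hardly need to analyze them closely to read a kind of dissatisfaction. For a community of scientists that prides itself on being rational and calm, that is actually rather strange.

Over time I came to understand this feeling. Since the arrival of the AI era, many people have been accused of being minke (民科, “folk scientists,” roughly “crackpots”), and the feeling is very similar. The label is applied broadly to people who have no professorial title yet are seriously working on scientific problems.

“Alright.”

All I can say is that in the AI era I would rather end up as a super-rich folk scientist like Wolfram than as one of today’s professors and lecturers who cannot get funding from above and whose career prospects are precarious. 😂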

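How does Wolfram respond? Thirteen years ago he had already personally experienced the impact of a paradigm shift, and he described it in a very orderly way.

Calling him a “folk scientist” is frankly laughable.

Let us now quote his own 2012 essay to see how he describes this impact. He had in fact left academia long before; I think he already had this inclination in 1987, when he started building software. Later he stopped publishing papers, which he also describes in detail in this essay: in short, papers follow a fixed format, with form outweighing substance, while his science had so much real substance that papers could not keep up. He preferred to blog. So the way I now understand his thinking is to read his own blog, Stephen Wolfram Writings.

Why he insists that this is a “paradigm shift”

When he talks about the reactions to his work, what he describes from most scholars is an academic scene of unusually intense emotion: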

“You’re destroying the heritage of mathematics…”

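As I said, these critics are highly emotional. Why would a scholar lose their temper? Would you lose your temper at scientific claims from someone outside serious science, say a shaman or a yoga teacher? No, because you would not engage with them at all.

Wolfram’s answer is: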

“This is what a paradigm shift sounds like—up close and personal.”

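A paradigm shift is first of all not a new conclusion but a new system of evaluation.

So the other side’s emotion is not because some theorem was overturned, but because:

The old standards of “what counts as science” are threatened (papers, peer review, citations, academic networks).
The old “career-long investments” are threatened (decades of training paths and reputational capital may lose their value).

He later puts this deeper reason very bluntly, even dividing it into two “core threatened groups.”

He explicitly distinguishes a “surface reason” from a “deeper reason.”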

“there was a surface reason… and a deeper reason.”

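A. Two kinds of fear at the level of content (this is the “career economics” core of a paradigm conflict)

The first group (mostly physicists):

“we’ve spent our whole careers barking up the wrong tree.”

What this sentence implies: if the computational perspective of NKS holds up, these people were not “slightly wrong”; their whole direction of research investment may be judged a low-return path.

This is the most common thing in a Kuhnian paradigm shift: the success metrics of the old paradigm are no longer worth anything in the new one.

The second group (people in complexity research):

“it’ll overshadow everything we’ve done.”

This is not anxiety about what is true; it is anxiety about attention and structures of authority: whoever defines the main narrative holds the say over the entry points to the discipline, the textbooks, and grant review.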

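B. Conflict at the level of form: you did something “academic-like” but not by the “academic rules”

“academic-like, but you haven’t played by academic rules.”

The subtext here is that academia has a “legitimacy protocol stack,” which includes:

Peer review as admission
References as visibility of the relationship network
The big publishers and journal system as distribution channels
Academic identity as the license to speak

Wolfram’s route was: I went around this protocol stack. So you see him stress:

I am not an academic, and I am not bound by it:

“I wasn’t an academic…”

This is the core conflict of his essay: a new paradigm plus a new distribution and validation mechanism is a double threat to the old system.

He writes books and a blog…

For the details, see his original essay.

https://writings.stephenwolfram.com/2012/05/living-a-paradigm-shift-looking-back-on-reactions-to-a-new-kind-of-science/

I will not expand on it further here.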

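Kuhn Loss

Let me say a word about Kuhn. Even though his book The Structure of Scientific Revolutions was published more than sixty years ago, it really does anticipate, with precision, the transitional situation this article has been describing up to this point. I cannot resist quoting it. For the details you still need to read the original SEP entry; I have posted the link as well.

A paradigm revolution does not merely “solve a few more problems”; it also “discards (or even declares illegitimate) problems and explanations that the old paradigm valued highly and that counted as successes by the old standards.”
This “discarded explanatory power, problem set, and evaluation standard” is Kuhn loss.

It is one of the key wedges Kuhn used to knock down the idea that “scientific progress = cumulative approximation to truth.”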

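1) What exactly is “lost”

The problem set changes: what counts as an “important problem” changes.
The standards change: what counts as a “qualified explanation” or “good science” changes.
Concepts and world pictures change: what the same word refers to in the two paradigms, what one is allowed to say, and the set of sentences that can be expressed all change.

So Kuhn loss is not just “one derivation fewer”; it is more like this:

What “had to be explained” in the old paradigm becomes, in the new one:
- “not in need of explanation,”
- “meaningless,”
- “metaphysical,”
- or even a “pseudo-problem.”

This is why he says revolutions change “the definition of science.”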

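2) The classic example in the entry: why Newton “lacked explanatory power” yet won

The example SEP uses is very Kuhn:

In the Aristotelian and Cartesian traditions of mechanics:
“How is attraction possible?” was a hard requirement (you had to give a contact mechanism or an ontological explanation).
Newton’s universal gravitation looked at the time like “action at a distance,” which failed the old standard, so it was rejected.

But once the Newtonian paradigm won, the new community kicked this kind of question off the scientific agenda (calling it “illegitimate” or “unscientific”).

This is Kuhn loss: by the old paradigm’s scorecard, the new paradigm’s “explanatory power got worse”; by the new paradigm’s scorecard, the question “should never have been attempted” in the first place.

SEP then adds a further twist: the question later “reappeared and was dealt with” in a different way within the framework of general relativity. This brings out Kuhn’s point even better: not straight-line accumulation, but repeated rewriting of the agenda.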

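3) How Kuhn loss relates to incommensurability

You can think of Kuhn loss as an observable symptom of incommensurability.

For Kuhn, incommensurability does not mean “completely incomparable,” but rather:

No common measure: the two sides use different conceptual nets, different problem lists, and different evaluation standards.
So you cannot use one “unified metric” to say that A is closer to the truth than B, or “better overall.”

What Kuhn loss tells you is:

Even if the new paradigm is stronger in some respects, it may be weaker in dimensions where the old paradigm excelled; but whether “weaker” even holds depends on which paradigm’s system of measurement you are standing in.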

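4) Why Kuhn loss is a provocation to “scientific rationality”

This is one of the reasons Kuhn and Feyerabend were denounced as “anti-science” at the time:

If revolutions rewrite problems and standards, does “rational comparison” turn into political struggle?
If old successes are declared illegitimate by the new paradigm, does science stop “approaching truth”?

SEP also stresses Kuhn’s later clarification:

Incommensurability ≠ incomparability; Kuhn loss ≠ irrationality.

Kuhn’s position is more like this:

Paradigm choice has no “neutral algorithm,” but a set of values (accuracy, scope, simplicity, fruitfulness…) can still be used to conduct a “reasoned argument.”
Different people weight these values differently, which allows for “rational disagreement.”

Which brings us back to that Cantonese saying, “a chicken talking to a duck”: doesn’t it fit even better now?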

https://plato.stanford.edu/archives/fall2019/entries/incommensurability/#RevParThoKuhInc

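I Casually Picked Up a Decoder/Mastermind Toy to Experience Irreducibility

It really was just because the toy happened to be on my desk when this thought occurred to me.

A seemingly simple children’s toy actually hides an NP-complete problem. Its original name is Mastermind, a board game for two people: one “sets the code” and the other “guesses the code.”

The rules are straightforward: the code-maker first chooses a code made up of colors and positions and hides it; the guesser must guess it within a limited number of rounds (that is, the number of slots or rounds the board allows). If the guesser gets it within the limit, the guesser wins; otherwise the code-maker wins.

Each round, the guesser submits a guess and the code-maker must give feedback. The classic form of feedback is black and white pegs:

Black peg: correct color and correct position;
White peg: correct color but wrong position.

In essence this is a “feedback channel,” and it must strictly follow the rules, completely correct and noise-free; otherwise the whole chain of reasoning collapses.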

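The electronic Decoder I have can be seen as an upgraded Mastermind: it replaces the person who sets the code with a hidden algorithm inside the device, and it adds more feedback variants, for example:

A green light means both color and position are correct,
A white light means the color is correct but the position is wrong,
No light means the color is not in the code at all (or, equivalently, it counts the colors that are not correct either).

In the fully indirect-hint single-player mode, the player cannot even see positional information, which makes the state space balloon far beyond what direct brute-force enumeration can handle. And in theoretical terms, the complexity behind this is not just a matter of “it feels hard”: in 2005, a paper by Jeff Stuckman and Guo-Qiang Zhang proved that Mastermind is NP-complete.

https://arxiv.org/abs/cs/0512049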

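So, I spent a full day and solved all 800 levels within this toy’s range. I used a minimax strategy: at each step, choose the guess that shrinks the candidate space the most in the worst case, rather than chasing a local optimum that merely “looks clever.”

I have to give serious credit here: this toy from the Chinese company 计科 is very solidly made. Its feedback channel is highly reliable and completely noise-free, and it strictly follows the stated rules. Achieving this level of feedback consistency and rule enforcement in a consumer toy aimed at children is actually impressive.

I have released the repo with all the solutions:

It includes not only a solver for this device’s own configuration but also an extensible general solver that supports a larger color space and more rule variants than the original machine. You can reuse it directly; just change the parameters and run.

Consider it a small bonus for subscribers. My articles are usually long and not very “friendly,” but this toy is a good way in: you can buy one for a child and let them play seriously on their own for a few days; when they are truly stuck and starting to question everything, you pull out this algorithm and breeze through a few hundred levels.

They will most likely think you are amazing.

In practical terms, this may also be a way to make a child suddenly like you a lot.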

https://github.com/STEMMOM/giiker_super_decoder

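At this stage there is really not much “puzzle-solving significance” left to unpack: a children’s toy, completely solved, every level cleared. In theory it is of course NP-complete, but within this strictly limited scale it is always solvable within five steps. That, however, was never my real goal. I am not trying to prove that I solved a children’s toy; I want to re-examine complexity itself from Wolfram’s perspective.

The complexity here does not come from the rules, which are extremely simple, but from a structure like this: there is a hidden truth s (the code, the answer); each time you can submit only one guess, the system returns one piece of feedback (G/W/N), and you must recover s within a limited number of rounds. You want to compute the answer directly, but you are always missing key information, and the only way to obtain information is to keep issuing queries and receiving feedback. The answer is not derived from the rules; it has to be dug out of the interaction.

From an information-theoretic point of view, this model is unusually pure: information leaks one way, outward from the hidden truth, and the difficulty lies not in complicated rules but in the information being locked behind an oracle (the feedback mechanism). In essence this is a standard black-box query problem: at each step you spend one query in exchange for a very limited number of bits. This structure does not exist only in toys; it is pervasive in the real world, where the complexity is often amplified exponentially. The toy’s “kindness” is that you are explicitly told the feedback channel is completely noise-free, the rules are strictly enforced, and the oracle does not lie. In reality, who says the feedback is truthful, or noise-free? So the object really worth thinking about becomes this black box whose inner mechanism you cannot see: you can only put queries to it, it returns only a piece of feedback, and the whole complexity emerges naturally from this restricted, one-way, obscured exchange of information.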

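Security and cryptography: the internal state of the system is unknown to you, and you can only probe it (queries).
Diagnosis and medicine: you do not know the cause of the illness, and you can only run tests (queries) to narrow the candidates.
Scientific experiments: you do not know the law, and you can only run experiments (queries) to obtain observations and converge step by step.
Engineering parameter tuning: you do not know the best configuration, and you can only do trial runs (queries), get feedback, and update.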

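Of course, I am well aware that I am still at a very early stage, and that this kind of problem is hardly untouched territory. In fact it has long been studied systematically in academia. In theoretical computer science and in research on “puzzle complexity,” Mastermind is usually formulated as a constraint satisfaction or consistency decision problem, whose standard form is called the Mastermind Satisfiability Problem (MSP), and it has been rigorously proven NP-complete, as already noted above. For the classic board-game parameters (4 positions, 6 colors, repetition allowed), the whole state space has only 6^4 = 1296 possibilities; as early as 1977, Knuth gave a strategy that is guaranteed to win within 5 steps in the worst case. My implementation is essentially at this level too: no more than 5 steps in the worst case, and the statistics given in the literature I have seen put the average at a little over 4.

So within this scale and framework, the parts that can be formalized, proven, and optimized have already been studied quite thoroughly. If I keep going, I am more likely to start from information entropy and restate the problem as “how much effective information can each query squeeze out.” As it happens, today I also came across a rather interesting new paper: Gür (2025), Weighted Entropy Approach, which uses weighted entropy as a heuristic to approach the theoretically optimal average number of steps. In essence it also designs the “strategy” as a kind of measurement instrument, an idea that is very valuable to me as a reference. I will not expand on it here; anyone interested can go straight to the original papers.

https://arxiv.org/abs/cs/0512049

https://www.cs.bu.edu/fac/best/res/papers/alybull86.pdf

https://arxiv.org/abs/2511.19446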

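What I’m Interested In Is How Wolfram Would See This Kind of Problem

Of course, the more realistic approach, and the one that better fits my pace this year, is to put a share of my time into cellular automata themselves. As I see it, this thread is the starting point of almost everything, and it starts in A New Kind of Science. Whether in the book itself or in the large body of essays, talks, and supplementary material Wolfram has kept publishing for more than twenty years since, the core object begins with cellular automata. My sense is that he treats them as a minimal, clean computational universe for directly studying complexity, irreducibility, and the very fact that “the rules are extremely simple but the behavior is extremely complex.” Along this line, Stephen Wolfram’s systematic, large-scale, long-term investment in cellular automata is almost unique in the world: whether it is the comprehensive scanning of rule space, the classification of evolution, the formulation of computational irreducibility, or the later ruliology perspective, it is hard to find a true “second” anywhere. So I will certainly keep studying along this line.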

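I have always felt that for Wolfram to put forward such a grand narrative as a “new scientific paradigm,” as one individual, was already remarkable in itself, and I am certainly not the only one who has been strongly shaken by it. The key is not what “conclusions” he stated, but what questions he is actually asking. Look: handed the same Mastermind/Decoder game, my first reaction is of course to solve it. That is natural enough; isn’t solving one of the purposes of computation? The world seems to contain a “truth,” a unique secret code (or a very small candidate set), and the success metric is perfectly clear: pin down the answer in as few steps as possible, pursuing convergence, closure, the shortest path. This way of thinking could not be more familiar in complexity theory: CSP, SAT, search problems, plus a bit of strategy optimization, with every metric revolving around “locating the unique solution as fast as possible.” (A sudden aside: what a classic exam-taker’s mindset…)

But from Stephen Wolfram’s perspective, the center of gravity of the problem shifts noticeably. What he cares about is often not “what is the answer to this instance?” but what the rule system itself grows into. For example: does it give rise to universal computation? Which class does its overall behavior belong to (simple, periodic, chaotic, complex)? Does it have “shortcuts,” or is it computationally irreducible in essence? Across the whole rule space, is this behavior rare or widespread? In this framework the success metric is no longer “locking in the unique solution” but discovering structural phenomena, classifying them, identifying the generating mechanisms, and answering “why is this phenomenon so common in the universe of rules?”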

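There is a thought that sends a slight chill down the spine: cellular automata can be Turing complete. This means that what you are facing is not a “solver” but a potential computational universe. Around this point Wolfram proposed PCE (the Principle of Computational Equivalence). To be honest, I am still digesting this principle myself, but roughly it says: once the behavior of systems passes a very low complexity threshold, they tend to be equivalent in computational power. In other words, “everything that can compute can compute about equally well,” and the difference lies not in how powerful a system is but in whether it is predictable, whether it is compressible, and whether it must be simulated step by step.

So Wolfram is not refusing to “solve”; he has completely changed the shape of the problem. He is not asking, “What is the answer to this specific instance?” He more often asks, “What are the laws of behavior of this system?” Given the same premise of “rule plus initial state,” he may not care whether you push the system to step N and obtain one definite picture; he may care more about whether certain statistics can be predicted (density, entropy rate, structural compression ratio), whether one can prove or judge empirically that “only simulation can tell you the result,” and whether a seemingly unrelated problem can be compiled into this system, thereby showing its universality. In the PCE view, the “unique solution” is never the central question; what sits at the center is whether different systems share the same class of computational capability and the same type of behavior. This is exactly the worldview, almost orthogonal to the “solver’s instinct,” that I am slowly beginning to recognize.

I will keep updating this series; it is a topic worth studying and discussing over the long term. To be honest, my understanding is still quite shallow at this point.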

Susan STEM’s Entropy Control Theory
