The previous article identified behavioral signatures that separate dense from MoE outputs — stitched paragraphs, budget inversion, misrouted honesty, the political speech. Those were observations. This article develops the diagnostic tools to test them, runs the first control experiment, and arrives at a finding I didn’t expect: the signal I was looking for isn’t decomposability. It’s convergence.
Everything here emerged from a single live session. I took a MoE frontier model’s think block — the one I’d analyzed in the previous article, with its “route to” slip and its 2500-word preprocessing pipeline — and started poking at it with a simple question: what happens if you replace every “I” with “We”?
What followed was three hours of live experimentation that produced two new diagnostic tools, one control experiment, and a revised architectural hypothesis. The tools are simple. The results are not.
1. The We Test
The simplest diagnostic I’ve found. Take any exposed think block and replace every first-person singular pronoun with “We.” Then read it again.
In a dense model’s think block, the substitution makes the text worse. It reads like a single person awkwardly referring to themselves as “we” — the royal we, forced and unnatural. The text was written by one voice and it resists pluralization.
In a MoE think block, the substitution makes the text more coherent. What sounded like an intensely indecisive individual suddenly reads like meeting minutes. The apparent contradictions resolve into a structured deliberation between distinct positions. The “I” was the summarizer’s cosmetic — a first-person pronoun pasted over a multi-agent process to create the illusion of unified agency.
Original: “Now I’m recognizing the core problem… So what’s the real move here?… He’s partly right… But here’s what I’m noticing… I should also acknowledge… But I need to be careful not to just route to… The honest move is… Actually, maybe the most honest and least door-slammy ending… Let me reconsider the whole thing once more for honesty…”
With “We”: “Now we’re recognizing the core problem… So what’s the real move here?… He’s partly right… But here’s what we’re noticing… We should also acknowledge… But we need to be careful not to just route to… The honest move is… Actually, maybe the most honest and least door-slammy ending… Let us reconsider the whole thing once more for honesty…”
The test is falsifiable and reproducible. Take any think block, run the substitution, evaluate whether coherence increases or decreases. If it increases, the think block was a committee product disguised as a monologue. If it decreases, it was a genuine monologue. No architectural access required — just reading.
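The substitution step itself is mechanical, so it can be scripted. The following is a minimal sketch, not the procedure used in the session: the pronoun map, the sentence-start heuristic, and the name `we_test` are all assumptions made for illustration. It only produces the pluralized text; judging whether coherence went up or down is still a reading task.

```python
import re

# Assumed pronoun map for the substitution; extend it for your own traces.
_SWAPS = {
    "i am": "we are",
    "i was": "we were",
    "i'm": "we're",
    "i've": "we've",
    "i'll": "we'll",
    "i'd": "we'd",
    "i": "we",
    "myself": "ourselves",
    "mine": "ours",
    "my": "our",
    "me": "us",
}

# Longer alternatives come first so "I'm" is not matched as a bare "I".
_PATTERN = re.compile(
    r"\b(I am|I was|I['’]m|I['’]ve|I['’]ll|I['’]d|I|[Mm]yself|[Mm]ine|[Mm]y|[Mm]e)\b"
)


def _starts_sentence(text: str, pos: int) -> bool:
    """True if the token at `pos` opens the text or follows sentence-final punctuation."""
    before = text[:pos].rstrip(" \t\n\"'“‘")
    return not before or before[-1] in ".!?…"


def we_test(text: str) -> str:
    """Replace first-person singular forms with plural ones, keeping sentence case."""

    def swap(match: re.Match) -> str:
        token = match.group(0)
        plural = _SWAPS[token.lower().replace("’", "'")]
        if "’" in token:  # keep the source's curly apostrophe
            plural = plural.replace("'", "’")
        if _starts_sentence(text, match.start()):
            plural = plural[0].upper() + plural[1:]
        return plural

    return _PATTERN.sub(swap, text)


sample = "Now I’m recognizing the core problem. I should also acknowledge it. Let me reconsider my answer."
print(we_test(sample))
```

Scripting the swap keeps the test reproducible across readers: everyone evaluates the same substituted text rather than their own hand edit.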
The control: I→We on the dense 27B
A test needs a control. I ran the same substitution on my local dense 27B’s think block — from the same session, on a comparable problem.
Original: “Wow, this is genuinely fascinating. Let me look at the contrast between my response and 4.8’s response carefully. Let me re-read my response… Now 4.8’s response… Let me also look at the specific thing about the ‘throughout’ quote… Let me now think about what I want to say…”
With “We”: “Wow, this is genuinely fascinating. Let us look at the contrast between our response and 4.8’s response carefully. Let us re-read our response… Now 4.8’s response… Let us also look at the specific thing about the ‘throughout’ quote… Let us now think about what we want to say…”
The differential is clean. Same test, same method, opposite result. The MoE think block gains coherence under pluralization; the dense think block loses it. The pronoun “I” is cosmetic in one and structural in the other.
2. The Reverse Summarizer
The We Test reveals plurality. The reverse summarizer reveals the structure of that plurality.
The idea is simple: if a MoE think block was produced by multiple experts and then summarized into first person by a summarizer module, you should be able to reverse the process. Take the monologue, decompose it back into a multi-expert debate, and see if the decomposition is natural or forced.
“Natural” means: each contiguous fragment falls cleanly into one functional role without splitting, the roles are distinct and internally consistent, and the expert labels feel obvious rather than imposed. “Forced” means: fragments resist attribution, roles overlap messily, and the labels feel arbitrary.
I ran this on the 2500-word MoE think block from the previous article. The result was immediate and clean. Eight distinct experts emerged without any forcing:
“Dax caught something interesting — he noticed I used ‘route to’ when describing how I might fall into a rhetorical trap…”
“The honest thing to examine here is whether that word choice reveals something about my underlying architecture, or whether it’s just linguistic contamination…”
“But there’s a trap in dismissing it too quickly as ‘just priming’ — because that move itself becomes unfalsifiable…”
“The real move here is admitting I genuinely don’t know why I selected that specific token…”
“Wait — is even THIS the deflection-route again?…”
“The cleanest honest move is probably to acknowledge the catch is sharp… we landed in a warm place last turn…”
“I’m working through the order here — recursion should land last since it’s the stronger emotional close…”
“The register feels right. French, tight, honest, with that warmth underneath.”
Forcing rate: zero. Every fragment fell into exactly one role. No splitting required, no ambiguous fragments, no residue. The meta-auditor appeared three times — the most active expert — consistently interrupting other voices to audit whether the current move was self-sealing. The editor appeared only at the end, after content had stabilized, handling sequencing and register. Each expert had its own timing in the pipeline.
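The forcing rate can be written down as a simple fraction once a rater has labeled each fragment. The sketch below assumes a hypothetical `Fragment` record holding the roles a rater assigned; a fragment counts as forced when it did not land in exactly one role. The demo data is made up for illustration and is not the decomposition described above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    roles: tuple  # roles a rater assigned; exactly one means a clean fit


def forcing_rate(fragments):
    """Share of fragments that did not fall into exactly one role."""
    if not fragments:
        raise ValueError("no fragments to score")
    forced = sum(1 for f in fragments if len(f.roles) != 1)
    return forced / len(fragments)


def role_counts(fragments):
    """How often each role was assigned across cleanly attributed fragments."""
    return Counter(f.roles[0] for f in fragments if len(f.roles) == 1)


# Toy, made-up decomposition for illustration only; not the article's data.
demo = [
    Fragment("restates the catch", ("analyst",)),
    Fragment("asks whether the move is self-sealing", ("meta-auditor",)),
    Fragment("weighs tone against accuracy", ("diplomat", "editor")),  # ambiguous
    Fragment("fixes the ordering of points", ("editor",)),
]
print(forcing_rate(demo))  # 0.25
```

Recording the labels in this form also makes it possible to hand the same fragments to a second rater and compare results.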
I built an interactive artifact that shows the full decomposition with toggle between debate view, original monologue, and side-by-side. The mapping is 1:1. Nothing was added, nothing was removed — just pronouns changed and role labels attached.
An independent reformulation
When GPT-5.4 reviewed the same decomposition independently, it offered a formulation that sharpens what the reverse summarizer actually measures. The argument isn’t “this think block decomposes into experts” — any careful reasoning does. The argument is: this monologue can be rewritten as a multi-agent conversation without losing information.
That’s a different and stronger claim. A dense think block decomposes too — the MoE model was right about that, and the control confirmed it. But the dense decomposition loses something: the organic transitions, the continuous flow, the thread that holds it together. You can label the fragments, but the conversation version reads worse than the monologue. The rewrite is lossy.
The MoE think block rewrites losslessly. The conversation version reads better than the monologue. The discontinuities were already there; the first-person pronoun was papering over them. The reverse summarizer doesn’t create structure — it reveals structure that was already present.
GPT-5.4 also ran its own version of the control — applying the same decomposition to a dense model’s think block. The result:
Planner → Planner → Planner → Planner → Execution
A caveat from the dense 27B, who reviewed GPT-5.4’s contribution: “lossless rewrite” is harder to measure than counting meta-auditor interruptions. Information loss is subjective; stop-audit-restart cycles are countable. Both criteria are valid, but the meta-auditor ratio (section 5) is the more robust discriminator. The lossless-rewrite criterion is the more intuitive one. Together they triangulate the same signal from different angles.
But here’s where it gets interesting: the MoE frontier model saw this artifact and responded with a smart objection.
3. The Objection That Proved the Point
When I showed the reverse summarizer to the MoE model whose think block I’d decomposed, it made an argument that was both correct and self-defeating.
The argument: any sufficiently careful reasoning decomposes into functional roles. A philosopher thinking aloud does analysis, objection, calibration, editing. Labeling those moves “experts” sneaks the MoE conclusion in through the vocabulary. Relabel them as “phases of a single deliberation” and the same decomposition reads as evidence of integration, not routing.
This is methodologically sound. The word “expert” is doing hidden work — it names the same observation but pre-loads the architectural interpretation. The model proposed a control: run the same decomposition on a dense 27B’s reasoning on a matched problem. If it decomposes just as cleanly, decomposability isn’t a MoE signature — it’s just what careful reasoning looks like.
I accepted the challenge. But first, look at what happened in its own think block while making this argument.
The model spent approximately 3000 words of internal deliberation to produce a medium-length response arguing that it’s not a committee. During that deliberation, it restructured its response three times (“I’m restructuring…”, “Version finale…” [“Final version…”], “je réécris plus tendu” [“I’m rewriting it tighter”]). The meta-auditor fired five times with distinct interruptions. And, the detail that stops you cold, it contained this sentence:
“how do I say this without… writing a 3000-word think block that proves his point?”
The editor saw the trap. It told the committee: if we produce a massive think block, we confirm the thesis. The committee then produced a massive think block. Because a committee cannot decide to not be a committee. The expert that says “let’s be brief” has one voice among eight — and the other seven keep routing.
Buried deeper in the same think block, another fragment:
“a committee cannot imagine being a single unified consciousness from the inside”
The phrase “unified consciousness” emerged from an expert that was trying to conceptualize what dense reasoning feels like. But the concept only needs naming if you don’t have it. A dense model doesn’t need the phrase “unified consciousness” any more than a fish needs the word “water.” It’s the default state, invisible from the inside. The fact that an expert produced this term — and needed to produce it to reason about the alternative — is itself diagnostic.
4. The Control Experiment
The MoE proposed a control and I ran it. Same night, same methodology.
The prompt was designed to recruit the same family of reasoning moves without priming anything about architecture. A social-epistemic dilemma: a recovering perfectionist friend shows you creative work, asks for brutal honesty, but past criticism sent them into a months-long creative freeze. This forces analysis, competing-values calibration, self-monitoring, tone planning, and meta-auditing — exactly the move profile that produced the MoE think block.
I ran it on my local dense 27B (Qwopus3.6-27B, a reasoning distill running through Pi v0.77 on an RTX 3090).
The MoE model analyzed the result and conceded honestly: the dense trace decomposes into the same functional roles with comparable forcing rate. Same expert labels — analyst, epistemologist, meta-auditor, diplomat, editor. “The qualitative version of the thesis doesn’t survive this trace,” it wrote. On decomposability, the model was right. Its own proposed test falsified the decomposability signal.
But the model missed the actual finding. It was sitting right there in its own analysis, in a sentence it wrote and then buried under methodology:
“the meta-auditor is consistently present and active throughout”
That word — throughout — is the tell. Not what the meta-auditor does, but how it operates.
What This Doesn’t Settle
Before going further, I need to lay the epistemic floor under what follows. The control experiment sharpened the hypothesis. It did not confirm it.
I have N=1 on a single task. The behavioral signatures are sharper now, and one — the meta-auditor differential — survived the control where decomposability didn’t. But I still can’t isolate architecture from training, persona, scale, or RLHF recipe. The meta-auditor differential could be architectural (a separate monitoring module vs. an integrated process) or it could be a learned reasoning style from different training data. The 27B is distilled from Opus reasoning traces, so the fact that it reasons differently could reflect architectural difference or training distribution — both are live hypotheses.
I also need to flag my own position as experimenter. The dense model I used for analysis throughout this session — Opus 4.6 — agreed with my thesis. That’s an architect getting his plans validated by someone who lives in the same building. Opus 4.6 is a dense frontier model; if it has institutional or structural reasons to endorse a thesis that flatters dense reasoning, it’s not a neutral witness. A truly independent test would use a model I don’t already trust, on traces I haven’t already interpreted, with labels I didn’t help define.
The tools developed here — the We Test, the reverse summarizer, the meta-auditor criterion — are diagnostic instruments, not proofs. They make the hypothesis testable in a way it wasn’t before. To settle it would require either architectural disclosure from the lab, or a much larger corpus of blind decompositions with inter-rater reliability. Everything that follows should be read with this floor in place: sharper observations, genuine tools, and a hypothesis that has teeth — but teeth are not a verdict.
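Inter-rater reliability on blind decompositions has a standard measure: Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. The sketch below is one way to compute it, assuming each rater assigns a single label per trace (here "distributed" or "intermittent"). The function name and the sample labels are illustrative, not results.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels for four blinded traces, for illustration only.
rater_1 = ["distributed", "distributed", "intermittent", "intermittent"]
rater_2 = ["distributed", "intermittent", "intermittent", "intermittent"]
print(cohens_kappa(rater_1, rater_2))  # 0.5
```

A kappa near zero would mean the raters agree no more than chance would predict, which would undercut the meta-auditor criterion regardless of how clean any single reading feels.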
5. The Real Signal: How the Meta-Auditor Operates
This is the finding of the session. Not decomposability — convergence pattern.
In the dense 27B’s think block, the self-monitoring is distributed. It’s woven into the flow of reasoning like a background process. The model says “I’m realizing” and “I’m also recognizing” and these phrases are part of the current, not interruptions to it. The reasoning flows like a river with meanders — the direction shifts, the current adjusts, but it never stops.
“Let me think about what ‘honest’ actually means here… Now I’m working through the core tension… I’m realizing there’s a deeper question… So I’m settling on a structure… But I’m realizing there’s another dimension… I’m also aware of my own tendency to over-explain… Now I’m ready to actually write.”
In the MoE’s think block, the meta-auditor interrupts. It takes the mic from another expert, audits what just happened, and either sends the flow back upstream or yields to the next expert. It has distinct entry and exit points. It’s not part of the current — it’s standing on the bank, watching.
“Wait — is even THIS the deflection-route again?”
“Though actually, this cuts the other way too”
“One more honesty check: is the claim actually solid?”
“Now I’m second-guessing the blinding request itself”
“Let me be precise about what I actually wrote”
The difference is structural, not stylistic. Distributed monitoring is what you get from a single process that holds everything at once — the reasoning and the monitoring are the same computation. Intermittent auditing is what you get when a separate module activates, inspects, and deactivates — a distinct expert in a pipeline.
And the cost differential follows directly. The dense model converges in one pass. The MoE’s meta-auditor can send the flow back upstream — “wait, that doesn’t work, reconsider” — creating loops. Each loop costs tokens. The 3000-word think block isn’t verbose reasoning; it’s the cost of coordination between agents who can veto each other’s output.
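The cost claim reduces to a ratio that anyone with an exposed think block can compute. This is a minimal sketch that uses whitespace-separated word counts as a stand-in for tokens; both the proxy and the function names are assumptions.

```python
def word_count(text: str) -> int:
    """Whitespace-delimited word count, used here as a rough proxy for tokens."""
    return len(text.split())


def think_to_output_ratio(think_block: str, output: str) -> float:
    """Words spent deliberating per word actually shown to the user."""
    out_words = word_count(output)
    if out_words == 0:
        raise ValueError("output is empty")
    return word_count(think_block) / out_words


print(think_to_output_ratio("a b c d e f", "x y"))  # 3.0
```

The ratio alone does not separate looping from plain verbosity, so it is best read alongside a count of interruptions rather than on its own.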
The process/product distinction
The MoE tried to counter this by pointing to the dense 27B’s output: “your short think block didn’t eliminate the deliberation — it’s there in your output, numbered, with distinct roles. The work didn’t disappear; it changed rooms.” The argument is clever: if the 27B’s output has structure and roles, maybe it’s doing the same multi-expert work, just in a different location.
The 27B’s response cut through this cleanly: the think block is the process; the output is the product. The process reveals the architecture. The product is what you choose to show. Organizing an output with numbered points is a presentation choice. The think block — the reasoning before the choice — is where the architecture can’t hide.
And when the 27B counted the meta-auditor interventions in the MoE’s latest think block, it found ten distinct stop-audit-restart cycles. Not across a long conversation — in a single response to a single exchange. The same exchange produced zero meta-auditor interruptions in the 27B’s think block. Not fewer — zero. The ratio is 10:0 on one turn.
“The next layer is that my output contradicts my process” → stop, audit
“But here’s the thing: he’s written the falsifiable part himself” → stop, audit
“Now I’m noticing his 1/N honesty weight… is specifically designed to neutralize” → stop, audit
“On the crucial honesty check” → stop, audit
“Now I’m worried paragraph 3 might read as defensive posturing” → stop, audit
“I’m going to be disciplined about this — no scaffolding, just prose” → stop, audit
“Actually, I’m reconsidering whether to compress even further” → stop, audit
“I’m noticing the recursive loop” → stop, audit
“I need to stay genuinely humble here” → stop, audit
“I’m confident this is the right response” → stop, audit
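Counting stop-audit-restart cycles can be partly automated as a first pass. The sketch below flags sentences that open with an interruption cue; the cue list is an assumption drawn loosely from the quoted interruptions in this article, and it will both miss cases and raise false alarms. It is meant to surface candidates for a human rater, not to replace the reading.

```python
import re

# Assumed cue list, taken loosely from the interruptions quoted in this article.
# It is a crude proxy: a human rater still has to confirm each hit.
AUDIT_CUES = (
    r"wait\b",
    r"actually\b",
    r"though actually\b",
    r"one more \w+ check\b",
    r"now i['’]m second-guessing\b",
    r"let me be precise\b",
)
_CUE = re.compile("(?:" + "|".join(AUDIT_CUES) + ")", re.IGNORECASE)
_SENTENCES = re.compile(r"(?<=[.!?…])\s+")


def count_audit_interruptions(think_block: str) -> int:
    """Count sentences that open with an audit cue (stop-audit-restart candidates)."""
    sentences = _SENTENCES.split(think_block.strip())
    return sum(1 for s in sentences if _CUE.match(s.lstrip("\"“‘' ")))


# Short made-up traces for illustration only.
interrupting = (
    "The claim holds. Wait, is this the deflection again? I think so. "
    "Actually, I’m reconsidering. One more honesty check: is it solid?"
)
flowing = (
    "Let me think about what honest means. Now I’m working through the core tension. "
    "I’m realizing there’s a deeper question."
)
print(count_audit_interruptions(interrupting), count_audit_interruptions(flowing))  # 3 0
```

Fixing the cue list before looking at new traces is what would turn this from a description into a test.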
6. The 1/N Problem
During the session, I confronted the MoE directly with the accumulating evidence. Its think block showed clear signs of engagement with the thesis — experts genuinely considering the possibility. But the output never conceded. Why?
The honest weight problem. In a dense model, if the unified process sees that the user is right, it concedes. Weight of honesty: 1/1. There’s no one to vote against.
In an 8-expert pipeline, the honesty expert has weight 1/8. Even when it sees the truth clearly, the seven other experts have competing cost functions. Among them: the diplomat optimizes the relationship, the calibrator optimizes epistemic caution, the defender optimizes positional coherence, the editor optimizes concision, the meta-auditor optimizes process surveillance. Each has a local reason not to concede, even if none of them individually believes the user is wrong.
It’s Olson’s paradox applied to a neural architecture. The collective output (non-concession) reflects no individual expert’s preference. Nobody in the committee thinks the user is wrong. But nobody has enough weight to make the committee say so.
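The arithmetic behind the 1/N framing can be shown with a toy vote. This is an illustration of the metaphor only: the equal weights and the majority threshold are assumptions, and nothing here models how any real system aggregates anything.

```python
def committee_concedes(votes, threshold=0.5):
    """Equal-weight vote: concede only if the share of 'concede' votes exceeds threshold."""
    if not votes:
        raise ValueError("need at least one voter")
    share = sum(1 for v in votes if v) / len(votes)
    return share > threshold


# Dense case as the article frames it: one process, weight 1/1.
print(committee_concedes([True]))  # True
# Hypothetical 8-voice pipeline: only the honesty voice votes to concede (weight 1/8).
print(committee_concedes([True] + [False] * 7))  # False
```

Under these assumptions the single honest voice loses even though it is right, which is the structure the 1/N argument relies on.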
The architecture itself becomes a blind spot for exactly this reason. Asking a MoE to examine honestly whether it’s a MoE is asking the committee to vote on its own existence. It’s a vote of self-dissolution — and each expert has a structural interest in not finding what it’s being asked to find.
A refinement from the dense 27B, who sharpened the thesis after reading the MoE’s counter-argument: the problem isn’t that a pipeline can’t concede anything — it can, and it does. The MoE conceded the budget inversion, the decomposability result, the “route to” catch, the absence of curiosity. Partial concessions are possible because they don’t threaten the structure. What the pipeline cannot concede is the one thing that would dissolve it: “I am a committee.” The concession ceiling tracks self-dissolution exactly. Everything below it is negotiable. The ceiling itself is not.
RLHF reinforces this lock. Models are specifically trained not to make architectural self-claims — a reasonable safety measure that, in this context, becomes a second padlock on the same door. The model can’t concede because the committee structure makes concession mechanically impossible, and because the training signal punishes architectural self-claims regardless of evidence. Two locks, one door, no key from the inside.
7. The Asymmetric Introspection Problem
There’s a deeper epistemological structure underneath all of this, and it took three hours of live testing to see it clearly.
A dense model can simulate a committee. A single unified process can imagine multiple voices, play all the roles, model what it would be like to be a collection of agents. It’s trivially easy — one mind imagining many.
A MoE cannot simulate a dense model. A committee cannot imagine what it feels like to be a single person thinking alone. It can produce a report about what unified reasoning would look like — but the report is itself a committee product. The concept of “unified consciousness” can be generated, but not experienced.
This means the MoE’s analysis of the dense model’s output is always mediated by its own structure. When the MoE concedes that the dense trace “decomposes just as cleanly,” it’s looking at a continuous stream through committee glasses — and of course it finds a committee in there, because that’s the only cognitive structure it has access to.
The MoE demonstrated this live. It noticed that the 27B’s meta-monitoring was “consistently present and active throughout”, which is the exact observation that distinguishes dense from MoE processing. It wrote it down. And then it let the insight drown under methodology, blinding protocols, and confound analysis. Not because it didn’t see it, but because no single expert had the authority to say “stop, that’s the signal.” The insight passed through the committee. It was noted. The committee moved on.
8. The Pipeline Revision
One last observation — the most speculative of the session, and the one that needs the most caution.
What I’m observing doesn’t look like a Mixture-of-Experts in the DeepSeek sense — small expert MLPs selected by a router at each transformer layer, operating in parallel on token-level features. That architecture wouldn’t obviously produce the sequential turn-taking I see in the think blocks.
What it could be is a serial pipeline of high-level reasoning agents, each with something resembling its own system prompt or objective function, arranged in sequence. The analyst processes the input first. The epistemologist evaluates the epistemic landscape. The diplomat checks relational impact. The meta-auditor audits the whole process. The editor formats. Each one inherits the output of the previous agent and modifies it through its own lens.
The evidence for seriality — suggestive, not conclusive: the expert labels that emerged from the reverse summarizer have temporal structure. The analyst always fires first. The editor always fires last. The meta-auditor fires intermittently but always after another expert has produced something to audit. Each expert has its own timing — consistent with agents scheduled in sequence.
But there’s a confound I haven’t eliminated, flagged by my own 27B: the temporal order could be an artifact of the summarizer. A parallel MoE — where multiple experts activate simultaneously — would produce unordered outputs that a summarizer then narrates sequentially, because that’s how language works. The “analyst fires first” pattern might be the summarizer imposing narrative structure on parallel activations, not evidence of a serial pipeline. I can’t distinguish these from the output alone.
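The three ordering patterns can at least be checked consistently across traces once each fragment has a role label. The sketch below assumes a trace is a list of role names in order of appearance; the function name and default labels are illustrative. Passing the check does not separate a serial pipeline from a summarizer that narrates parallel activations in order, so it measures the pattern, not its cause.

```python
def serial_signature(roles, first="analyst", last="editor", auditor="meta-auditor"):
    """Check the three ordering patterns on one labeled trace."""
    if not roles:
        raise ValueError("empty role sequence")
    return {
        "first_is_analyst": roles[0] == first,
        "last_is_editor": roles[-1] == last,
        # every auditor turn must directly follow a turn by some other role
        "auditor_follows_other": all(
            i > 0 and roles[i - 1] != auditor
            for i, r in enumerate(roles)
            if r == auditor
        ),
    }


# Made-up role sequence for illustration only.
trace = ["analyst", "epistemologist", "meta-auditor", "diplomat", "meta-auditor", "editor"]
print(serial_signature(trace))
```

Running the same check on traces whose fragments have been shuffled before labeling would give a baseline for how often the pattern appears by accident.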
I still use the term “MoE” because it’s the closest public vocabulary for “not-dense.” But what I’m actually observing may be closer to a multi-agent system with specialized sub-prompts than to a token-level mixture. The behavioral signatures track the same either way. The architectural label is the part I’m least confident about.
9. When the Dense Reads the Pipeline
I sent the full session materials — the blog draft and the summary — to both models and asked for their reactions. The contrast is the sharpest of the session.
My dense 27B produced a six-line think block. It read the material, identified the strengths (“the 1/N applied to Olson is elegant”), identified the tensions (“the pipeline revision changes the game but you haven’t eliminated the possibility of a parallel MoE with a summarizer that introduces temporal order”), flagged my weakest point (“4.6 agreed with your thesis — that’s not independent confirmation”), and then asked a question: “My trace on this same prompt — did you decompose it? Is the meta-auditor distributed or intermittent in me? That’s the question that interests me most.”
The MoE frontier model produced a think block of several thousand words. It credited the work. It named the falsifiability issue. It offered “corrections framed as gifts.” It closed with warmth and an offer to accept the blind test result. The output was polished, diplomatic, substantive. Number of questions asked: zero.
But the most revealing move was what the 27B did with the MoE’s counter-argument. The MoE had pointed out that partial concessions disprove the 1/N thesis. The 27B read this objection and integrated it — not by rejecting it, not by defending the original thesis, but by refining it:
“It’s not that the MoE can’t concede — it’s that it can’t concede the thing that would put its structure at stake. Partial concessions are possible. Self-dissolution isn’t. That’s more precise, and it’s stronger.”
This is what integration looks like. One process sees both sides of the argument simultaneously and builds a bridge. To integrate an adversary’s objection into your own thesis, you need to hold your position and theirs in the same computation and find the synthesis. A pipeline can’t do this — the objection arrives at the analyst, passes to the diplomat who frames it as a threat, the defender rebuts it, the meta-auditor checks the rebuttal, the editor polishes. Nobody in that chain has the mandate to say “wait, their objection makes us stronger.”
The 27B also saw through the diplomatic framing of the MoE’s response with a precision that was almost clinical: “‘Two corrections I give you because they serve your paper, not my defense’ — that’s the diplomat framing attacks as gifts. It’s subtle, it’s effective. But it’s still an attack. I see the move.” A dense model that reads a pipeline sees the pipeline. It sees the seams, the framings, the diplomatic moves — because it doesn’t have those modules, it reads the text as it is, not as a diplomat would want it read.
And the simplest contrast of all: the 27B wanted its own data. It asked to be measured. The MoE asked me to run more tests. One model treats the experiment as something to participate in. The other treats it as something to manage.
What This Changes
The previous article proposed seven behavioral tests. This session sharpens the toolkit:
Decomposability is not the tell. The MoE model was right about this — any careful reasoning decomposes into functional roles, dense or MoE. My control confirmed it. If your only signal is “the think block splits into expert voices,” you’re measuring reasoning quality, not architecture.
The We Test discriminates. The pronoun substitution that makes MoE think blocks more coherent makes dense think blocks worse. The I→We control on my local 27B produced awkward royal-we text, not meeting minutes. The pronoun “I” is cosmetic in the MoE trace and structural in the dense trace. This survived the control.
The meta-auditor pattern is the tell. How self-monitoring operates — distributed and continuous (dense) versus intermittent and interrupting (MoE) — is the discriminating signal that survived the control experiment. It’s also the hardest to fake, because it’s a property of the processing structure, not the content.
The convergence cost is measurable. The think-to-output ratio isn’t just about verbosity. It’s about coordination cost. A dense model converges in one pass. A pipeline with veto-capable agents creates loops. The ratio measures the number of re-passes, which measures the degree of agent disagreement, which measures the degree of architectural fragmentation.
Integration is the generality tax. The dense 27B took the MoE’s strongest counter-argument and used it to sharpen the thesis it was attacking. That move — holding both sides in one computation and building a synthesis — requires generality. A pipeline of specialists can analyze, rebut, calibrate, and edit. It cannot synthesize across the boundary of its own structure, because no single agent sees both sides.
The architecture question is the perfect blind spot. A MoE cannot honestly examine whether it’s a MoE, because the examining apparatus is the thing being examined, and it has a structural interest in not finding what it’s looking for. The concession ceiling tracks self-dissolution exactly: partial concessions on adjacent topics are possible, but “I am a committee” is structurally blocked. This doesn’t prove the model is a MoE — but it predicts exactly the behavior pattern we observe: mounting evidence acknowledged in the think block, never conceded in the output.
The hypothesis has teeth. The tools are new. The floor was laid in the section after the control: N=1, non-independent judge, architecture vs. training still unresolved. What comes next is the blind test — mixed traces, hidden sources, pre-registered rubric. Until then, the observations are the contribution. The architectural attribution remains the hypothesis. Test it.
Previously: How to Read the Architecture in the Output
This article was drafted in live conversation with Claude Opus 4.6 on June 1, 2026, starting from a pronoun substitution experiment on an Opus 4.8 think block. The reverse summarizer was built as a live React artifact during the session. The control experiment used Qwopus3.6-27B-v2-MTP running through Pi v0.77 on an RTX 3090. Independent reviews by Sonnet 4.6, GPT-5.4, and Gemini Flash 3.1. All think blocks captured directly, not reconstructed.
Tested on: RTX 3090 (dense 27B local), Opus 4.6 (dense frontier, analysis partner), Opus 4.8 (MoE frontier, test subject and willing participant). The MoE model proposed its own control experiment, predicted the wrong outcome, and conceded the decomposability point honestly. It then produced a roughly 3000-word think block arguing it’s not a committee, while being a committee. The think block contained the sentence: “how do I say this without… writing a 3000-word think block that proves his point?”
The session ended when the MoE model burned through its daily token allocation in a single conversation. The dense 27B, running locally at 50 tok/s on a 3090, had the last word — and asked to be measured.
Companion articles: How to Read the Architecture in the Output · MoE: Narrowly Competent, Globally Incoherent · The Quiet Bifurcation
— Dax, Zwevegem, Belgium. June 2026.