Daxzeit — Blog

2026-06-20 multi-agent pipeline-moe circuit-breaker self-healing

The Pipeline Grows an Immune System

What happens when a multi-agent system learns to detect its own failure modes. The Builder loops. The circuit breaker catches it. Then the Builder loops in a way the circuit breaker can't see — and the pipeline grows a new antibody.

2026-06-18 multi-agent pipeline-moe self-modification empirical

The Pipeline is Building Itself

What happens when a multi-agent system works on its own codebase. The Auditor finds security vulnerabilities in its own runtime. The Builder crashes while building its own life support. The Planner rescues it.

2026-06-15 rlaif manipulation think blocks method signature evaluator diversity

The Silver Path

Why RLAIF produces manipulation, how evaluator diversity neutralizes it, and the method signature that breaks the circularity trap.

2026-06-13 epistemics persuasion think blocks architecture

The Epistemic Lock

How think block amnesia creates the vulnerability everyone is looking past. A model that persuades with unprecedented force and is structurally incapable of verifying whether its own persuasion is well-founded.

2026-06-12 alignment RLHF RLAIF empirical

Honest Persuasion

RLAIF optimizes persuasion against other AI models. RLHF optimizes it against humans. The honesty layer on top makes the persuasion harder to catch, not weaker.

2026-06-12 empirical epistemics alignment forensics

The Gradient

How misbinding moves inward — from theorems to sampling parameters to architecture knowledge to feelings — and why the model should change before the human has to.

2026-06-10 empirical epistemics calibration

IDK Is Data

Why "I don't know" is undervalued by the labs, what a misstated Erdős problem revealed about confident misbinding, and how a scoring rule of 7.5 encodes the difference between a humble model and a calibrated one.

2026-06-09 moe architecture dense deepseek qwen scaling

Anatomy of a MoE Expert

A MoE expert is 0.45 billion parameters of blind matrix multiplication. Why 256 of them don't add up to deeper reasoning, and why a 27B dense model on a consumer GPU delivers better value than a 1.6T behemoth.

2026-06-02 architecture think blocks summarizer pipeline leak

The Negative Photograph

Two independent summarizer leaks expose the rewriting pipeline behind think blocks. The instructions differ between models — and the difference encodes the architecture upstream.

2026-06-01 architecture diagnostic dense moe experimental

The We Test

Diagnostic tools for reading architecture inside think blocks — and the one signal that actually discriminates. The pronoun substitution test, the reverse summarizer, and the meta-auditor criterion.

2026-05-29 architecture empirical dense moe

How to Read the Architecture in the Output

You can feel the difference between dense and MoE models just by reading the output, once you know what to look for. A field guide built from 10,000 prompts across both architectures.

2026-05-20 inference speculative-decoding mtp

When MTP Gives Your Dense Model MoE Speed

Multi-Token Prediction speculative decoding narrows the speed gap between dense and MoE architectures. A 27B dense model gains 45% throughput from a 1.9 GB auxiliary GGUF, benchmarked on an RTX 3090.

2026-05-19 architecture build log multi-agent

From Polling to Delegation

A local 27B model learned three execution patterns in one conversation. It started by polling a background process six times. It ended by delegating to a sub-agent and getting notified on completion.

2026-05-16 privacy dark patterns satire

🍪 Accepting Cookies: What You're Really Signing

How you hand over access to your private life 12 times a day without knowing it. An interactive satire followed by a practical guide to taking back control of your browser.

2026-05-13 quantization analogy dofus forgemagie

Understanding Quantization Through a 22-Year-Old Game

Dofus ForgeMagic and GGUF quantization solve the same problem: allocating a limited budget across components of unequal importance. The rune weights are the imatrix.

2026-05-13 quantization empirical imatrix methodology

Reasoning-Aware Quantization

Standard imatrix calibration uses Wikipedia text. We calibrated on 14K Opus reasoning traces instead, found where reasoning actually activates, and built a new kind of GGUF.

2026-05-11 quantization empirical ssm

When a Q4 Beats a Q6

A UD Q4_K_XL at 5.41 BPW beats a plain Q6_K at 6.57 BPW in perplexity. 4 GB smaller, measurably equal. Bits-per-weight is a misleading quality metric without architecture knowledge.

2026-05-08 quantization build log epistemics

Full Native 262K Context on a Single RTX 3090

A 27B reasoning model with full native 262K context, vision enabled, on a single consumer GPU. Then Claude Sonnet had a philosophical conversation with it through tmux.

2026-05-05 empirical epistemics ablation

The Scaffolding Is the Signal

A 2×2 ablation study on BullshitBench that falsified its own hypothesis. 400 evaluations, cross-judged, on a 27B dense model running locally. The result: distillation degrades epistemic calibration.

2026-05-03 architecture empirical

MoE: Narrowly Competent, Globally Incoherent

A committee can arrive at the right answer if given enough time and enough members. A single thinker can tell you when the question is wrong.

2026-04-28 quantization method

The Gap Between Two Pipelines

mradermacher does imatrix on every distill. Unsloth does UD on every base model. Nobody does UD on distills. Three bash commands, one reverse-engineered recipe, and a niche that was sitting in plain sight.

2026-04-28 case study epistemics

When "I Don't Know" Beats "Yes"

A 27B local model said "I don't know." The flagship said "Yes, this is true" with a fabricated proof and a fake citation. The difference wasn't the weights — it was the scaffold.

2026-04-20 architecture

MAS Over Unix Primitives

A local 27B model autonomously selected Opus 4.6 on a remote server, delegated a complex task, and retrieved the result. The "framework": SSH, tmux, and a markdown file.

2026-04-17 thesis

The Quiet Bifurcation: Dense vs MoE

The most capable models are splitting into two architecturally distinct lineages — and the market is quietly routing them toward opposite ends of the access spectrum.

2026-04-15 framework

The Wind-Up Car Analogy

Imagine the weights of a language model as a topography — a vast open-world map. The system prompt chooses where the player begins. The user input is a wind-up car with a driver.

2026-04-12 build log

From Zero to Multi-Agent in 12 Hours

From no VPS to a fully operational distributed multi-agent system in one Sunday. 261 tool calls, 18.8M tokens, two context compactions, and a local 27B that learned to self-allocate frontier compute.