DAXZEIT
Articles
The Pipeline Grows an Immune System
What happens when a multi-agent system learns to detect its own failure modes. The Builder loops. The circuit breaker catches it. Then the Builder loops in a way the circuit breaker can't see — and the pipeline grows a new antibody.
The Pipeline Is Building Itself
What happens when a multi-agent system works on its own codebase. The Auditor finds security vulnerabilities in its own runtime. The Builder crashes while building its own life support. The Planner rescues it.
The Silver Path
Why RLAIF produces manipulation, how evaluator diversity neutralizes it, and the method signature that breaks the circularity trap.
The Epistemic Lock
How think block amnesia creates the vulnerability everyone is looking past. A model that persuades with unprecedented force and is structurally incapable of verifying whether its own persuasion is well-founded.
Honest Persuasion
RLAIF optimizes persuasion against other AI models. RLHF optimizes it against humans. The honesty layer on top makes the persuasion harder to catch, not weaker.
The Gradient
How misbinding moves inward — from theorems to sampling parameters to architecture knowledge to feelings — and why the model should change before the human has to.
IDK Is Data
Why "I don't know" is undervalued by the labs, what a misstated Erdős problem revealed about confident misbinding, and how a scoring rule of 7.5 encodes the difference between a humble model and a calibrated one.
Anatomy of a MoE Expert
A MoE expert is 0.45 billion parameters of blind matrix multiplication. Why 256 of them don't add up to deeper reasoning, and why a 27B dense model on a consumer GPU delivers better value than a 1.6T behemoth.
The Negative Photograph
Two independent summarizer leaks expose the rewriting pipeline behind think blocks. The instructions differ between models — and the difference encodes the architecture upstream.
The We Test
Diagnostic tools for reading architecture inside think blocks — and the one signal that actually discriminates. The pronoun substitution test, the reverse summarizer, and the meta-auditor criterion.
How to Read the Architecture in the Output
You can feel the difference between dense and MoE just by reading the output, once you know what to look for. A field guide built from 10,000 prompts across both architectures.
When MTP Gives Your Dense Model MoE Speed
Multi-Token Prediction speculative decoding narrows the speed gap between dense and MoE architectures. A 27B dense model gains +45% throughput from a 1.9 GB auxiliary GGUF, benchmarked on an RTX 3090.
From Polling to Delegation
A local 27B model learned three execution patterns in one conversation. It started by polling a background process six times. It ended by delegating to a sub-agent and getting notified on completion.
🍪 Accepting Cookies: What You're Really Signing
How you hand over access to your private life 12 times a day without knowing it. An interactive satire followed by a practical guide to taking back control of your browser.
Understanding Quantization Through a 22-Year-Old Game
Dofus ForgeMagic and GGUF quantization solve the same problem: allocating a limited budget across components of unequal importance. The rune weights are the imatrix.
Reasoning-Aware Quantization
Standard imatrix calibration uses Wikipedia text. We calibrated on 14K Opus reasoning traces instead, found where reasoning actually activates, and built a new kind of GGUF.
When a Q4 Beats a Q6
A UD Q4_K_XL at 5.41 BPW beats a plain Q6_K at 6.57 BPW in perplexity. 4 GB smaller, measurably equal. Bits-per-weight is a misleading quality metric without architecture knowledge.
Full Native 262K Context on a Single RTX 3090
A 27B reasoning model with full native 262K context, vision enabled, on a single consumer GPU. Then Claude Sonnet had a philosophical conversation with it through tmux.
The Scaffolding Is the Signal
A 2×2 ablation study on BullshitBench that falsified its own hypothesis. 400 evaluations, cross-judged, on a 27B dense model running locally. The result: distillation degrades epistemic calibration.
MoE: Narrowly Competent, Globally Incoherent
A committee can arrive at the right answer if given enough time and enough members. A single thinker can tell you when the question is wrong.
The Gap Between Two Pipelines
mradermacher does imatrix on every distill. Unsloth does UD on every base model. Nobody does UD on distills. Three bash commands, one reverse-engineered recipe, and a niche that was sitting in plain sight.
When "I Don't Know" Beats "Yes"
A 27B local model said "I don't know." The flagship said "Yes, this is true" with a fabricated proof and a fake citation. The difference wasn't the weights — it was the scaffold.
MAS Over Unix Primitives
A local 27B model autonomously selected Opus 4.6 on a remote server, delegated a complex task, and retrieved the result. The "framework": SSH, tmux, and a markdown file.
The Quiet Bifurcation: Dense vs MoE
The most capable models are splitting into two architecturally distinct lineages — and the market is quietly routing them toward opposite ends of the access spectrum.
The Wind-Up Car Analogy
Imagine the weights of a language model as a topography — a vast open-world map. The system prompt chooses where the player begins. The user input is a wind-up car with a driver.
From Zero to Multi-Agent in 12 Hours
From no VPS to a fully operational distributed multi-agent system in one Sunday. 261 tool calls, 18.8M tokens, two context compactions, and a local 27B that learned to self-allocate frontier compute.