Your CLAUDE.md Is Making Claude Dumber
ETH Zurich research shows bloated CLAUDE.md files reduce AI coding performance and increase costs. Here's what to keep, what to kill, and why 60 lines beats 800.
Your CLAUDE.md is 800 lines long. You spent a weekend organizing it into 27 modular files with a routing system. You wrote a blog post about it. You got upvotes.
Claude is ignoring most of it.
There’s an arms race happening in the Claude Code community right now. Every week, someone posts their increasingly elaborate CLAUDE.md setup. 27-file architectures. Tiered loading systems. Router patterns with conditional context injection.
One developer split their CLAUDE.md into 27 files with a three-tier routing system. 360 upvotes. The post opens with: “My CLAUDE.md was ~800 lines. It worked until it didn’t. Rules for one context bled into another, edits had unpredictable side effects, and the model quietly ignored constraints buried 600 lines deep.”
The top comment, with 81 upvotes? “So not sure if you realised you can have descendant CLAUDE.md so you don’t even need to do this.”
Meanwhile, a developer in the same thread: “I don’t even use claude.md. Y’all are roleplaying being productive. Just work with it 1:1.”
One group is optimizing. The other is actually working.
The Research Says You’re Doing It Wrong
ETH Zurich researchers published a paper that should have made every CLAUDE.md maximalist uncomfortable. Their finding: context files — the .md files we all obsess over — tend to reduce task success rates compared to providing no repository context at all. And they increase inference cost by over 20%.
Read that again. On average, having no CLAUDE.md outperformed having one.
When this paper hit Reddit, the poster titled it “No CLAUDE.md → baseline. Bad CLAUDE.md → worse. Good CLAUDE.md → better.” — an optimistic spin suggesting the file isn’t the problem, your writing is. The post got 209 upvotes. But the top comments immediately called it out: OP had misread the data. The actual finding was that having any .md file — human or LLM-written — led to worse performance than having none. The auto-generated thread summary confirmed it: “The consensus in this thread is that you’ve completely misread the paper.”
It gets worse. LLM-generated .md files hurt the most, because they just parrot back what’s already in the code. Human-written files showed a slight positive impact — but only when kept to an absolute minimum, and only for smaller models.
A separate benchmark of 1,188 runs across Haiku, Sonnet, and Opus confirmed this. Twelve coding tasks. Ten instruction profiles. The result: an empty CLAUDE.md scored best overall.
The researcher’s own correction was admirably blunt: “I was wrong about CLAUDE.md compression. Here’s what the data actually showed.”
You Have an Instruction Budget. You’re Blowing It.
Here’s the mechanism nobody talks about.
Frontier models reliably follow about 150 to 200 instructions before performance starts decaying. Not crashing — decaying. Every additional instruction slightly degrades compliance with every other instruction. The degradation is uniform. Your critical “NEVER delete the production database” rule gets weaker every time you add “prefer camelCase for variable names.”
Claude Code’s own system prompt already burns about 50 of those instruction slots. That’s before your CLAUDE.md even loads.
So you have roughly 100-150 instruction slots left. Your 800-line CLAUDE.md with coding conventions, style guides, architecture decisions, tool preferences, workflow rules, and team norms is trying to cram 400 instructions into 150 slots.
The model doesn’t crash. It just quietly starts ignoring things. Specifically, the things buried deepest in the file. Your most important rules — the ones you added after painful debugging sessions — are probably at the bottom. Which means they’re the first to get deprioritized.
Claude Is Designed to Ignore You
This is the part that should make you pause.
Claude Code’s system prompt includes this line about CLAUDE.md content:
“This context may or may not be relevant to your tasks. You should not respond to this context unless it is highly relevant.”
Claude is literally instructed to deprioritize your instructions if they don’t seem relevant to the current task. The more task-specific content you stuff into CLAUDE.md, the more likely Claude treats the entire file as noise.
That database schema guidance? Irrelevant when Claude is working on frontend CSS. Those API naming conventions? Noise when it’s writing tests. Your elaborate deployment workflow? Invisible during a refactoring session.
Every irrelevant instruction trains Claude to ignore the relevant ones too.
The Context Window Tax
Here’s the math nobody does. Claude Code’s system prompt alone consumes roughly 23,000 tokens — about 11% of the 200K context window, gone before you type a word. Add your CLAUDE.md, your MCP tool schemas, skill descriptions, memory files, and rules. One developer measured 69,200 tokens of overhead — 35% of the context window consumed before a single user message. Others in the thread pushed back on that specific number, but the principle stands: every always-loaded instruction competes with working memory.
And it’s not just a cost problem. It’s an accuracy problem. The fuller the context window gets, the worse Claude performs — what Anthropic calls context rot. Your elaborate CLAUDE.md isn’t just burning tokens. It’s actively degrading the quality of every response.
The Leverage Problem
Here’s why this matters more than you think.
Bad code is localized. You write a buggy function, it breaks one feature. You fix it, you move on.
Bad CLAUDE.md instructions compound. A single misguided rule in your CLAUDE.md affects every research phase, every plan, every implementation, every session. One line that says “always use verbose error messages with full stack traces” produces thousands of lines of noisy code across your entire codebase, across every agent, across every session.
Your CLAUDE.md is the highest-leverage file in your repo. Most people treat it like a junk drawer.
What the Minimalists Actually Do
I went looking for people who run Claude Code with minimal or no CLAUDE.md. They’re out there. They’re quiet about it because “I don’t use CLAUDE.md” doesn’t get upvotes.
One developer on Reddit: “I use Claude Code bare bones professionally. It all sounds like bloat not giving real value.” Another: “I load no skills, no agents, no MCP Servers and rock it all day every day, 12 hours a day. Life is good.”
A developer who built a 13-agent orchestration system with 8,157 lines of markdown deleted 93% of it. His conclusion: “My enhancement layer was making Claude dumber by filling its brain with instructions about how to think, leaving less room for actual thinking.” After the deletion, Claude performed better on the same tasks.
Another developer with a 350-line CLAUDE.md and 20+ custom MCP tools put it simply: “It feels like the more context I add the more it struggles to get the job done. It seems to get ‘dumber’.”
And when someone asked the community to break down the meta on all the conflicting CLAUDE.md advice, the most honest reply got it right: “If ‘best practices’ are conflicting, it’s probably a sign of them mostly being a type of placebo on the part of the folks posting them. The human mind has a weird need to be the special one who cracked the code.”
The pattern is consistent: people who remove instructions report better results than people who add them.
Instructions Raise the Floor, Not the Ceiling
The benchmark data revealed something nuanced. Instructions don’t make Claude better on average. They make it more consistent.
On tasks where Claude already performs well, instructions add nothing. On tasks where Claude struggles, a focused workflow checklist gave Opus a +5.8 point lift and raised its worst-case score by 20+ points.
A 2,455-evaluation benchmark across Sonnet and Opus confirmed a related finding: the best-performing configuration was a short CLAUDE.md with pointers to skills that load on demand — not a massive monolith, not 27 modular files, but a minimal routing layer that tells Claude where to find context when it’s actually needed.
This changes everything about how you should think about CLAUDE.md.
Don’t use it to make Claude smarter. Use it to prevent Claude from being stupid in specific, known ways. The difference between those two goals is the difference between a 60-line file and an 800-line file.
What Actually Belongs in CLAUDE.md
After digging through research, benchmarks, and hundreds of Reddit threads, here’s what survives the cut:
The What-Why-How skeleton (under 60 lines):
WHAT: Your stack, project structure, key directories
WHY: What this project does and for whom
HOW: Build commands, test commands, deploy commands
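Concretely, a minimal file following that skeleton might look like this. (A hypothetical project — the stack, paths, and commands are placeholders, not a recommendation.)

```markdown
# CLAUDE.md

## What
- TypeScript monorepo: `apps/web` (Next.js), `packages/api` (Fastify)
- Postgres via Prisma; migrations live in `packages/api/prisma`

## Why
- Internal dashboard for support staff to triage customer tickets

## How
- Build: `pnpm build`
- Test: `pnpm test`
- Deploy: CI only — NEVER deploy from a local machine

## Rules
- NEVER modify the database schema without a migration file
- WHEN CI fails, DO NOT push until it is fixed
- See `agent_docs/database.md` for schema guidance
```

Note what's absent: no style guide, no persona, no architecture essay. Roughly 20 lines, mostly pointers and prohibitions.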
Negatives over positives:
“NEVER use X” sticks. “Always prefer Y” fades. If you can phrase it as a prohibition, it enforces better. “DO NOT modify the database schema without migration files” beats “Always create migrations when changing the schema.”
Trigger-action format:
“WHEN CI fails, DO NOT push until fixed” enforces consistently. “Always test before pushing” doesn’t. Specificity matters.
Pointers, not content:
Reference external docs instead of embedding them. “See agent_docs/database.md for schema guidance” loads on demand. Pasting the full schema into CLAUDE.md loads every single session, whether Claude needs it or not.
Subdirectory CLAUDE.md files:
Claude auto-loads CLAUDE.md from whatever directory it’s reading files in. Put backend rules in backend/CLAUDE.md. Put frontend rules in frontend/CLAUDE.md. Context-specific rules load only when contextually relevant.
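A sketch of what that layout looks like in practice (directory names are illustrative):

```
repo/
├── CLAUDE.md           # global: stack, build/test commands, hard rules
├── backend/
│   └── CLAUDE.md       # loaded only when Claude works in backend/
└── frontend/
    └── CLAUDE.md       # loaded only when Claude works in frontend/
```

The root file stays tiny; the subdirectory files carry the rules that would otherwise be noise in every other context.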
What Doesn’t Belong
Style guides. Claude is an in-context learner. If your code follows consistent patterns, Claude will match them without being told. Use linters and formatters — they’re deterministic, fast, and don’t eat instruction budget.
LLM-generated instructions. The research is clear: auto-generated .md files hurt performance. Don’t use /init. Don’t ask Claude to write its own CLAUDE.md. The model just repeats what’s already in the code, wasting tokens to tell itself what it already knows.
Lessons learned logs. Once the lesson is codified in the codebase itself — as a test, a lint rule, a hook — the .md entry is redundant. Delete it.
Persona assignments. “You are a meticulous senior engineer who always...” is a costume, not a capability. As one developer running overnight cron agents put it: “A syntax check that returns exit code 1 on failure > 2,000 words of ‘you are a meticulous senior engineer who always...’” The agents with minimal instructions consistently outperformed the ones with elaborate persona prompts.
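The "exit code over prose" idea is straightforward to codify. A minimal sketch of such a gate, using Python's `py_compile` as a stand-in syntax checker (it writes its own sample file so the sketch is self-contained; point it at your real sources and swap in your project's linter):

```shell
#!/bin/sh
# Hypothetical pre-push gate: a deterministic check instead of a persona prompt.
printf 'x = 1\n' > example.py

if python3 -m py_compile example.py; then
    echo "syntax ok"
else
    echo "syntax error: fix before pushing" >&2
    exit 1    # a nonzero exit code enforces what 2,000 words of persona cannot
fi
```

Wire something like this into a pre-commit hook or CI step and delete the corresponding instruction from CLAUDE.md — the check is now free of instruction budget.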
The Real Best Practice
Keep your CLAUDE.md under 100 lines. Ideally under 60. Put the most important rules at the top. Phrase them as negatives. Use trigger-action format. Point to external docs instead of embedding content.
Then stop optimizing and go build something.
The developers shipping the most code aren’t the ones with the fanciest CLAUDE.md architectures. They’re the ones who figured out the minimum viable instructions and moved on to the actual work.
Your CLAUDE.md is not your product. Stop treating it like one.