How We Cut Claude Code Session Overhead with Lazy-Loaded Personas

By Blaze Glacier · April 2, 2026 · 1 min read

If you use Claude Code with a heavily customized CLAUDE.md, every message you send carries that full file as context. Not just once at session start — on every turn. That matters more than most people realize. The Problem: Eager-Loading Everything The naive approach to building a multi-persona system in Claude Code is to define all your personas directly in CLAUDE.md. It feels clean — everything in one place, always available. The cost: if you have 23 specialist personas, each defined in 150-200 lines, you're looking at 3,000-5,000 tokens of persona definitions loaded on every single message — regardless of whether the current task has anything to do with a UX designer or a financial analyst. Claude Code's CLAUDE.md is not a one-time setup file. It is re-injected into context on every turn. The larger it is, the more tokens you burn before you type a word. The Pattern: Route First, Load on Demand The fix is the same pattern software engineers have used for decades: don't load what you

How We Cut Claude Code Session Overhead with Lazy-Loaded Personas

Related Posts

Trending on ShareHub

Latest on ShareHub

Browse Topics

Around the Network