Two different walls. Two different reasons you hit them. Two different doors out. A working guide for teams who want to stop hitting either.
Every limit conversation begins with a confusion. People say "I hit the limit" as if there were one wall, when there are two, and the fix for each is the opposite of the fix for the other. Get this distinction right and most of the rest follows.
Quantity over time. How much Claude you get to use before the meter resets.
A budget spanning all surfaces — claude.ai, Claude Code, Claude Desktop share one allowance. Counts are shaped by message length, attachment size, tool use, model choice, and artifact activity. Resets on a 5-hour session cadence plus a weekly ceiling.
Depth of one conversation. How much Claude can hold in mind at once.
The context window. 200K tokens on every paid plan; 500K on some Enterprise models. Includes everything in the conversation: your messages, Claude's replies, attached files, project knowledge, tool definitions, and extended thinking.
The context window is not just "your messages so far." Every tool you enable, every file you attach, every project instruction, and every extended-thinking step consumes tokens from the same budget, and much of it is consumed before you've typed a single character.
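A rough way to picture this is as fixed overhead subtracted from the window before the conversation starts. The sketch below is illustrative only: the feature names and every per-feature token cost are made-up placeholders, not measured or published figures. Only the 200K window size comes from the text above.

```python
# Illustrative only: every per-feature cost here is a hypothetical placeholder.
CONTEXT_WINDOW = 200_000  # tokens, the paid-plan figure quoted above

HYPOTHETICAL_OVERHEAD = {
    "project_instructions": 1_500,
    "project_knowledge": 40_000,
    "web_search_tool": 2_000,
    "mcp_connector_a": 3_000,
    "mcp_connector_b": 3_000,
}


def remaining_window(enabled, window=CONTEXT_WINDOW, costs=HYPOTHETICAL_OVERHEAD):
    """Return the tokens left for the conversation after fixed overhead."""
    used = sum(costs[name] for name in enabled)
    return window - used


# Everything switched on versus a lean setup with only short instructions.
print(remaining_window(HYPOTHETICAL_OVERHEAD))     # 150500
print(remaining_window(["project_instructions"]))  # 198500
```

Swapping in your own estimates for the placeholder costs gives a quick sense of which features are worth leaving on by default.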
Most of what applies to individual plans still applies here, but a few things change. The context window is larger on some Enterprise tiers, billing shifts from a flat allowance to consumption, and admins become responsible for how per-seat usage shapes the whole org's spend.
Token-intensive features (Research, web search, extended thinking, MCP connectors) are always on unless users disable them. If half your org leaves three connectors enabled for every conversation, that appears on the invoice. The optimization tactics in § 04 become financial controls, not just productivity tips.
Eight things your team can do this week to stretch both the context window and the usage allowance. Each one is worth doing on its own. Together they're the difference between running out on Thursday and having slack to spare.
Most over-spend is a thousand small clarifications. Decide what you need, what context Claude must have, and whether adjacent questions belong together — before sending turn one.
Three related questions in one message beat three separate messages. Caching handles the rest: similar prompts are partially cached across your usage.
Every clarifying round-trip burns through your allowance twice: your follow-up and Claude's reply. Front-load the specifics — audience, length, format, constraints — and it's one exchange, not four.
Content in project knowledge is cached. Repeated references don't keep costing tokens. This is the single highest-leverage tactic for teams that work on the same materials regularly.
Projects' RAG mode pulls only the relevant chunks into context instead of the whole corpus. It becomes essential once project knowledge grows too large to fit in the window all at once.
Project instructions load with every chat in that project. Pages of standing orders tax every conversation. Keep them to role, scope, and non-negotiable rules. Put task-specific detail in the chat itself.
Old reference docs loiter. If you haven't touched a file in a project for two months, remove it. Every chat in that project is quietly carrying its weight.
Web search, Research, extended thinking, and MCP connectors all add tokens to a conversation even when you never call on them. For conversations that don't need them, flip them off under Search and tools.
Each follow-up costs tokens both ways and stretches the context window with junk turns that don't add information.
Specification density paid once. Claude does the work on turn one; you iterate on output, not on scope.
Give full environment context up front: language, framework, constraints. Paste the entire relevant file in one message instead of dribbling snippets across five turns.
Declare audience, length, tone, and key points in the opening message. Send the whole draft at once when asking for edits — chunking it loses cross-cutting feedback.
State the research question clearly. Provide all the relevant data in one well-structured message. Use projects to hold reference material across multiple research sessions.
Every pattern here burns usage, context, or both, and every one is common. Most are habits from chat apps where tokens don't exist as a concept.
Keeping one conversation running for weeks because it "remembers" your project. The older turns stay in every future request, filling the window and slowing responses. At some point you're paying tokens for pleasantries from last Tuesday.
Web search, Research, and every connected MCP server each carry a tool-definition tax that loads into every request. Multiply that by your whole team, and on a usage-based Enterprise plan it's a line item on your invoice.
Attachments in a one-off chat are charged to that chat. The same three docs uploaded to forty chats costs you forty attachments' worth of tokens. This is the single most common waste in teams that haven't adopted projects.
Four pages of "always do this, never do that, here's how we name variables, here's a history of the project, here's my communication preferences..." loaded into every single chat in the project. Every turn pays for all of it.
Extended thinking is powerful for hard reasoning problems. Enabling it for "write a friendly reminder email" is pure overhead: reasoning tokens you never needed eating your context window.
"Here's our 80-page contract, can you check clause 14?" — pastes all 80 pages. Claude burns tokens reading 79 pages it doesn't need.
Defaulting to a weaker model to save usage is intuitive but often a false economy. A weaker model may take three tries to produce what a stronger one gets right first time. Three mediocre generations cost more total tokens than one good one.
Teams that don't use projects end up re-pasting context paragraphs into every new conversation. That's pure waste — the same tokens paid over and over across the week.
On usage-based Enterprise plans, consumption only gets attention after it spikes. By then you've taught the org habits that are hard to unwind. The dashboard exists for a reason.
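Two of the patterns above, the marathon conversation and the re-uploaded attachments, lend themselves to back-of-envelope arithmetic. The sketch below is a simplification under stated assumptions: it treats every request as carrying all earlier turns at a flat per-turn size, it ignores caching entirely, and every token count is a hypothetical placeholder rather than a measured value.

```python
# All token counts are hypothetical placeholders chosen for illustration.

def marathon_context_tokens(turns, tokens_per_turn):
    """Total context carried when each request includes all earlier turns."""
    return sum(n * tokens_per_turn for n in range(1, turns + 1))


def reupload_tokens(docs, tokens_per_doc, chats):
    """Tokens spent attaching the same documents to each chat separately."""
    return docs * tokens_per_doc * chats


# One 60-turn conversation versus the same 60 turns split across three chats.
one_long_chat = marathon_context_tokens(turns=60, tokens_per_turn=500)
three_short_chats = 3 * marathon_context_tokens(turns=20, tokens_per_turn=500)

# Three documents attached to forty separate chats.
forty_uploads = reupload_tokens(docs=3, tokens_per_doc=8_000, chats=40)

print(one_long_chat)      # 915000
print(three_short_chats)  # 315000
print(forty_uploads)      # 960000
```

The absolute numbers mean nothing. What matters is the shape: carried context grows roughly with the square of conversation length, while re-uploads grow linearly with the number of chats.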
Everything you can't see, you can't manage. The Usage settings panel is the single most-ignored surface in a Team or Enterprise account. Here's what lives there and what to watch.
Session limits renew on a rolling five-hour window. The bar shows what you've consumed this session and how much time remains before it resets.
If the session bar is already full in the first hour, expect to reach the weekly limit early as well.
Separate caps for Opus (power model) and all other models. When Opus runs out, other models often still have capacity — switch tasks accordingly rather than waiting for the weekly reset.
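The five-hour session window described above can be reasoned about with simple date arithmetic. This is a minimal sketch, assuming the window is measured from a known session start time. How the product actually defines that start is not stated here, so treat `session_start` as a hypothetical input and rely on the Usage panel for the real figure.

```python
from datetime import datetime, timedelta

SESSION_LENGTH = timedelta(hours=5)  # the five-hour window from the text


def time_until_reset(session_start, now):
    """Return the time left in the session window, never less than zero."""
    reset_at = session_start + SESSION_LENGTH
    return max(reset_at - now, timedelta(0))


# Hypothetical timestamps: session began at 09:00, checked at 12:30.
start = datetime(2025, 1, 6, 9, 0)
print(time_until_reset(start, datetime(2025, 1, 6, 12, 30)))  # 1:30:00
print(time_until_reset(start, datetime(2025, 1, 6, 15, 0)))   # 0:00:00
```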
A decision tree for the moment you can't send another message. The right move depends entirely on which wall you've hit — usage or length. Diagnose first, then act.
Read the error. If it mentions a reset time, it's usage. If it mentions conversation length, it's length.
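The diagnosis rule above amounts to a two-branch keyword check. The sketch below is illustrative only: the sample messages are invented for the example and are not the product's actual error text, and the keyword matching is deliberately naive.

```python
def diagnose(error_text):
    """Classify a limit message as 'usage', 'length', or 'unknown'."""
    text = error_text.lower()
    if "reset" in text:
        return "usage"   # mentions a reset time
    if "length" in text:
        return "length"  # mentions conversation length
    return "unknown"


# Hypothetical messages, written for this example only.
print(diagnose("You've reached your limit. It resets at 3:00 PM."))   # usage
print(diagnose("This conversation has reached its maximum length."))  # length
print(diagnose("Something went wrong."))                              # unknown
```

The order of the checks mirrors the rule as written: look for a reset time first, then for a mention of length.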