Edition for Team and Enterprise admins Vol. 01 · Apr 2026

The Field Manual to Usage and Length Limits.

Two different walls. Two different reasons you hit them. Two different doors out. A working guide for teams who want to stop hitting either.

Audience

Team / Enterprise

Context window

200K · 500K (Ent.)

Reset cadence

5h session / weekly

Reading time

~ 12 minutes

§ 01

The two limits, side by side

Every limit conversation begins with a confusion. People say "I hit the limit" as if there were one wall, when there are two, and the fix for each is the opposite of the fix for the other. Get this distinction right and most of the rest follows.

Limit — type A

Usage limits

Quantity over time. How much Claude you get to use before the meter resets.

A budget spanning all surfaces: claude.ai, Claude Code, and Claude Desktop share one allowance. Consumption is shaped by message length, attachment size, tool use, model choice, and artifact activity. Resets on a 5-hour session cadence plus a weekly ceiling.

Mental model: a prepaid fuel tank. When empty, you wait for the refill (session or weekly reset), top up with extra usage, or change plans.

vs.

Limit — type B

Length limits

Depth of one conversation. How much Claude can hold in mind at once.

The context window. 200K tokens on every paid plan; 500K on some Enterprise models. Includes everything in the conversation: your messages, Claude's replies, attached files, project knowledge, tool definitions, and extended thinking.

Mental model: a whiteboard of fixed size. When full, you wipe (new chat) or work smarter (projects, RAG) — you can't buy a bigger board.

Dimension

Usage limit

Length limit

What it measures

Your total activity across a window of time

The size of a single conversation

Felt as

"You have reached your usage limit. Try again at 4:00 PM."

"This conversation is getting long." / truncation / slow replies

Remedy

Wait for reset, purchase extra usage, or upgrade plan

Start a new chat, use projects with RAG, or trim tools and files

Counts toward

Shared across claude.ai, Claude Code, and Claude Desktop

Per conversation; each new chat is a fresh window

Plan dependency

Scales up with Pro, Max, Team, Enterprise

Flat 200K tokens on paid plans; 500K on select Enterprise

§ 02

Anatomy of a context window

The context window is not just "your messages so far." Every tool you enable, every file you attach, every project instruction, and every extended-thinking step consumes tokens from the same budget. The illustrative budget below shows how quickly the window fills.

Budget simulator

Each feature listed below consumes part of the window before you've typed a single character.

USED: 4,000 tokens · FREE: 196,000 tokens

BASELINE · 200,000 TOKENS

Near-empty slate. Claude has almost the full window to work with. Each item below adds context.

Long chat history

~12K tokens of prior turns

Three attached PDFs

~35K tokens of direct attachments

Web search + connectors

~18K tokens of tool definitions

Extended thinking

~22K tokens of reasoning trace

Numbers are illustrative; actual token counts vary by content. The shape of the problem is what matters.
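To make the arithmetic concrete, here is a minimal Python sketch that adds up the illustrative figures above. The 4,000-token baseline and the per-feature sizes are copied from the simulator; the names `FEATURE_TOKENS` and `window_budget` are invented for this example and are not part of any Claude API.

```python
# Illustrative figures copied from the simulator above; real counts vary by content.
WINDOW_TOKENS = 200_000
BASELINE_TOKENS = 4_000  # shown as USED before any feature is added

FEATURE_TOKENS = {
    "long_chat_history": 12_000,
    "three_attached_pdfs": 35_000,
    "web_search_and_connectors": 18_000,
    "extended_thinking": 22_000,
}


def window_budget(enabled: set[str]) -> tuple[int, int]:
    """Return (used, free) tokens for the given set of enabled features."""
    used = BASELINE_TOKENS + sum(FEATURE_TOKENS[name] for name in enabled)
    return used, WINDOW_TOKENS - used


used, free = window_budget(set(FEATURE_TOKENS))
print(f"used={used:,} free={free:,}")  # used=91,000 free=109,000
```

With all four items present, nearly half the window is gone before the first question is asked.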

RULE OF THUMB

1,000 tokens is roughly 750 English words, or about a page and a half of dense prose.
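The rule of thumb can be turned into a rough estimator. This is only a sketch: it counts whitespace-separated words and applies the 750-words-per-1,000-tokens ratio stated above. `estimate_tokens` is a hypothetical helper, and real tokenizers will give different counts, especially for code, tables, and non-English text.

```python
WORDS_PER_1000_TOKENS = 750  # rule of thumb from the text; English prose only


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count."""
    words = len(text.split())
    return round(words * 1000 / WORDS_PER_1000_TOKENS)


# A 60,000-word manual comes out at roughly 80,000 tokens under this ratio.
print(estimate_tokens("word " * 60_000))  # 80000
```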

LEVERAGE

Content cached in a project doesn't count against your usage when reused. Projects are the strongest cost-saver you have.

SAFETY NET

With code execution enabled, Claude auto-summarizes older turns as the window fills. You'll see "organizing thoughts" when it happens.

§ § §

§ 03

Team and Enterprise, by the numbers

Most of what applies to individual plans still applies here, but a few things change. The context window is larger for select models on Enterprise, billing can shift from a flat allowance to consumption, and admins are responsible for how per-seat usage shapes the whole org's spend.

Team plan

Per-seat allowance

Context window

200,000 tokens

Same as Pro and Max

Usage structure

Seat-based allowance

Each member has their own 5h session and weekly limits

Extra usage

Available

Purchase top-ups when seats run dry mid-week

Visibility

Settings › Usage

Progress bars for session, weekly, and Opus-specific caps

Enterprise plan

Scaled for organizations

Context window

Up to 500,000 tokens

Available on select models; 200K on others

Usage structure

Seat-based or usage-based

Seat-based gets allowances; usage-based pays per consumption

Extra usage

Seat-based only

Usage-based plans simply bill more as consumption grows

Visibility

Admin Usage dashboard

Org-wide consumption tracking for billing forecasting

FOR ADMINS · READ TWICE

On a usage-based Enterprise plan, there is no "limit" to hit — you're billed by consumption. That makes poor token hygiene across your org directly expensive.

Token-intensive features (Research, web search, extended thinking, MCP connectors) stay on until users disable them. If half your org leaves three connectors enabled for every conversation, that cost appears on the invoice. The optimization tactics in § 04 become financial controls, not just productivity tips.
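A back-of-the-envelope sketch of that effect, in tokens rather than dollars. Every input here is a hypothetical parameter: the seat count and conversations per seat are made up, and the 18,000-token tool-definition size is borrowed from the illustrative figure in § 02. `idle_tool_overhead` is a name invented for this example.

```python
def idle_tool_overhead(
    seats: int,
    share_with_tools_on: float,
    conversations_per_seat: int,
    tool_definition_tokens: int,
) -> int:
    """Tokens spent on tool definitions alone, counted once per conversation.

    This is a floor: definitions load with every request, so multi-turn
    conversations pay more than this estimate.
    """
    affected_seats = round(seats * share_with_tools_on)
    return affected_seats * conversations_per_seat * tool_definition_tokens


# Hypothetical org: 200 seats, half leave tools on, 40 conversations each.
print(f"{idle_tool_overhead(200, 0.5, 40, 18_000):,}")  # 72,000,000
```

Swap in your own figures from the admin dashboard; the point of the sketch is the multiplication, not the specific total.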

§ 04

Optimization tactics

Eight things your team can do this week to stretch both the context window and the usage allowance. Each one is worth doing on its own. Together they're the difference between running out on Thursday and having slack to spare.

Before typing

Plan the whole conversation first

Most over-spend is a thousand small clarifications. Decide what you need, what context Claude must have, and whether adjacent questions belong together — before sending turn one.

Instead of

"Help me with this report." → then "oh also include regional data" → then "actually make it a memo"

Per message

Batch related asks

Asking three related questions in one message beats sending three separate messages. Caching handles the rest: similar prompts are partially cached across your use.

Good

"Summarize this paper, list the three key counterarguments, and draft a 200-word response to the strongest one."

Per message

Be specific, skip the vague

Every clarifying round-trip burns through your allowance twice: your follow-up and Claude's reply. Front-load the specifics — audience, length, format, constraints — and it's one exchange, not four.

Front-load this

"For a technical Slack audience, 150 words max, with one code example using Python 3.11..."

Structure

Put reference material in a project

Content in project knowledge is cached. Repeated references don't keep costing tokens. This is the single highest-leverage tactic for teams that work on the same materials regularly.

Pattern

Upload your style guide, product spec, and onboarding docs once → every chat in that project gets them for free.

Structure

Use RAG for big knowledge bases

Projects' RAG mode pulls only the relevant chunks into context instead of the whole corpus. Essential once project knowledge crosses the size where it wouldn't all fit in the window anyway.

When to flip it on

200-page knowledge base. A dozen docs. Anything where you only ever need a slice at a time.
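The idea of pulling only the relevant chunks can be illustrated with a toy retriever. This sketch scores chunks by simple keyword overlap with the question and keeps the best matches. It is not how Projects implements RAG, which this guide does not describe; `tokenize`, `retrieve`, and the sample `CHUNKS` are all invented for illustration.

```python
import re

# Hypothetical knowledge-base chunks, one per document section.
CHUNKS = [
    "Refund policy: customers may request a refund within 30 days.",
    "Style guide: use sentence case for headings.",
    "Onboarding: new hires get laptop access on day one.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for real embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k chunks sharing the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    # Highest overlap first; sorted() is stable, so ties keep document order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in ranked[:top_k] if score > 0]


print(retrieve("What is the refund window for customers?", CHUNKS, top_k=1))
```

Only the refund chunk enters the context; the other two stay out of the window until a question actually needs them.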

Hygiene

Keep project instructions lean

Project instructions load with every chat in that project. Pages of standing orders tax every conversation. Keep them to role, scope, and non-negotiable rules. Put task-specific detail in the chat itself.

Instructions

role + domain + three hard rules. Not: every preference you've ever had.

Hygiene

Retire unused project files

Old reference docs loiter. If you haven't touched a file in a project for two months, remove it. Every chat in that project is quietly carrying its weight.

Quarterly ritual

Audit project files. Remove what no one references. Archive what's historical.

Toggle discipline

Turn off what you don't need

Web search, Research, extended thinking, and MCP connectors all consume tokens even when idle. For conversations that don't need them, flip them off under Search and tools.

Default posture

On when the task demands it. Off otherwise. Don't leave them on "just in case."

× Anti-example · six messages

The drip-feed

Each follow-up costs tokens both ways and stretches the context window with junk turns that don't add information.

> help me write marketing copy
< sure, what product?
> our new analytics tool
< who's the audience?
> enterprise PMs
< what's the tone?
> professional but not stiff
< what channel?
> LinkedIn post
< great, how long?

✓ Pattern · one message

The loaded opener

Specification density paid once. Claude does the work on turn one; you iterate on output, not on scope.

> Write a 180-word LinkedIn post
  for enterprise PMs about our
  new analytics tool [link]. Tone:
  professional, not stiff. Lead with
  a concrete pain point. End with a
  soft CTA to request a demo.
< [delivers usable copy]

By use case

CODING

Give full environment context up front: language, framework, constraints. Paste the entire relevant file in one message instead of dribbling snippets across five turns.

WRITING

Declare audience, length, tone, and key points in the opening message. Send the whole draft at once when asking for edits — chunking it loses cross-cutting feedback.

RESEARCH

State the research question clearly. Provide all the relevant data in one well-structured message. Use projects to hold reference material across multiple research sessions.

§ 05

Anti-patterns to unlearn

Every pattern here burns usage, context, or both, and every one is common. Most are habits from chat apps where tokens don't exist as a concept.

A.01 The eternal chat

Keeping one conversation running for weeks because it "remembers" your project. The older turns stay in every future request, dragging the window down and slowing responses. At some point you're paying tokens for pleasantries from last Tuesday.

Fix · Use a project for persistent context. Start new chats inside the project for each workstream. The project holds memory; the chats stay short.
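Because older turns stay in every future request, the input cost of one long chat grows much faster than its length. The sketch below models that under simplifying assumptions: every turn (message plus reply) is the same size, and caching, attachments, and project content are ignored. `cumulative_input_tokens` is a hypothetical helper, and the 500-token turn size is made up.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every request resends all prior turns.

    Request k carries k turns' worth of tokens, so the total is
    tokens_per_turn * (1 + 2 + ... + turns).
    """
    return tokens_per_turn * turns * (turns + 1) // 2


one_long_chat = cumulative_input_tokens(turns=100, tokens_per_turn=500)
five_short_chats = 5 * cumulative_input_tokens(turns=20, tokens_per_turn=500)
print(one_long_chat, five_short_chats)  # 2525000 525000
```

Under these assumptions, the same 100 turns cost roughly five times as much in one eternal chat as in five short ones.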

A.02 Leaving every tool and connector enabled "just in case"

Web search, Research, and every connected MCP server each carry a tool-definition tax that loads into every request. Multiply that by your whole team, and on a usage-based Enterprise plan it's a line item on your invoice.

Fix · Default off. Turn on per-task. Make "check your tool toggles" part of onboarding.

A.03 Uploading the same PDFs into every new chat

Attachments in a one-off chat are charged to that chat. The same three docs uploaded to forty chats costs you forty attachments' worth of tokens. This is the single most common waste in teams that haven't adopted projects.

Fix · Upload once to a project. The project caches them. Every future chat references them free.

A.04 Kitchen-sink project instructions

Four pages of "always do this, never do that, here's how we name variables, here's a history of the project, here's my communication preferences..." loaded into every single chat in the project. Every turn pays for all of it.

Fix · Project instructions: role, scope, three non-negotiables. Everything else goes in the specific chat where it's relevant, or in a knowledge file you reference as needed.

A.05 Extended thinking left on for trivia

Extended thinking is powerful for hard reasoning problems. Enabling it for "write a friendly reminder email" is pure overhead: reasoning tokens you never needed eating your context window.

Fix · Reserve extended thinking for genuinely hard tasks: multi-step analysis, complex code, ambiguous decisions. Turn it off for drafting, summarizing, and routine work.

A.06 Copy-pasting the whole document when you need a paragraph

"Here's our 80-page contract, can you check clause 14?" — pastes all 80 pages. Claude burns tokens reading 79 pages it doesn't need.

Fix · Paste the relevant clause plus a sentence of surrounding context. For anything larger, put the document in a project and use RAG to pull only the relevant sections.

A.07 Retreating to a weaker model to save usage

Intuitive, but often a false economy. A weaker model may take three tries to produce what a stronger one gets right the first time, and three mediocre generations can cost more total tokens than one good one.

Fix · Use the strongest model appropriate for the task. If you're rewriting responses more than once, you're probably saving the wrong thing.

A.08 Hitting reset and re-explaining your project every chat

Teams that don't use projects end up re-pasting context paragraphs into every new conversation. That's pure waste — the same tokens paid over and over across the week.

Fix · Paid plans can search past conversations. Phrases like "as we discussed earlier" or "continue from the auth flow chat" let Claude retrieve it rather than re-derive it from a pasted primer.

A.09 No team-level monitoring until the surprise bill

On usage-based Enterprise plans, consumption only gets attention after it spikes. By then you've taught the org habits that are hard to unwind. The dashboard exists for a reason.

Fix · Check Settings › Usage weekly. Identify who consumes most and what they're doing. Share patterns and anti-patterns back to the team before they cement.

§ § §

§ 06

Monitoring consumption

Everything you can't see, you can't manage. The Usage settings panel is the single most-ignored surface in a Team or Enterprise account. Here's what lives there and what to watch.

Current session 5 HOURS

Session limits renew on a rolling five-hour window. The bar shows what you've consumed this session and how much time remains before it resets.

A session bar that fills within the first hour is an early sign you'll reach the weekly limit quickly.

SESSION USED 68%

Weekly limits WEEKLY

Separate caps for Opus (power model) and all other models. When Opus runs out, other models often still have capacity — switch tasks accordingly rather than waiting for the weekly reset.

OPUS WEEKLY 91%

OTHER MODELS 42%

What to check, and how often

DAILY

Glance at current session bar before starting heavy work. If it's above 50%, plan your session lean.

WEEKLY

Review weekly progress mid-week. If you're tracking past 60% by Wednesday, adjust — throttle, batch, or buy extra usage.

MONTHLY

Admins: look at org-level consumption patterns. Who's high? What are they doing? Is it legitimate heavy use or an anti-pattern?

QUARTERLY

Audit projects org-wide. Archive stale ones. Review project instructions for bloat. Revisit plan sizing.
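The mid-week check can be expressed as a simple pace projection. This is a sketch under one loud assumption: usage continues at the same average rate for the rest of the cycle. The function names are invented here, and the inputs are whatever percentage and elapsed time the Usage panel shows you.

```python
def projected_weekly_usage(
    percent_used: float, days_elapsed: float, days_in_cycle: float = 7.0
) -> float:
    """Linear projection of end-of-cycle usage from the pace so far."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return percent_used * days_in_cycle / days_elapsed


def needs_adjustment(percent_used: float, days_elapsed: float) -> bool:
    """True when the current pace would exceed 100% before the weekly reset."""
    return projected_weekly_usage(percent_used, days_elapsed) > 100.0


print(projected_weekly_usage(60, 3))  # 140.0
print(needs_adjustment(42, 3.5))      # False
```

At 60% after three days the projection lands at 140%, which is why that threshold is the signal to throttle, batch, or buy extra usage.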

§ 07

When the wall hits

A decision tree for the moment you can't send another message. The right move depends entirely on which wall you've hit — usage or length. Diagnose first, then act.

You cannot continue. Which wall did you hit?

Read the error. If it mentions a reset time, it's usage. If it mentions conversation length, it's length.

↓ Start here

If it's a usage limit

Check the reset. Settings › Usage shows how long until session or weekly reset. Sometimes it's 30 minutes and you can just wait.
Switch model if Opus is capped. Other models may still have headroom. Non-Opus limits reset on the same weekly cycle but usually aren't exhausted at the same time.
Buy extra usage. Pro, Max, Team, and seat-based Enterprise plans can purchase top-ups. Useful for crunch weeks; not for chronic shortage.
Upgrade the plan if you're regularly running out. Max over Pro, Team over Max for collaboration, Enterprise for org-level needs.
Look upstream. Chronic usage exhaustion often points to the anti-patterns in § 05. Fix the habits, not just the plan.

If it's a length limit

Start a new chat. Simplest fix. If the project already holds the context, you won't lose anything important.
Enable code execution if it's off. That unlocks automatic context management; Claude summarizes older turns to keep going.
Move the context into a project. Attachments and reference material that are currently in-chat can live in project knowledge instead — and then they're cached.
Switch large attachments to RAG. If your project knowledge is big, RAG mode pulls only relevant chunks rather than the whole corpus.
Trim the conversation. Disable unused tools and connectors. Turn off extended thinking if the task doesn't need it. Remove stale project files.

One subtle trap: don't buy extra usage as a response to a length limit. They are different walls. Extra usage gives you more of the first and changes nothing about the second.
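The diagnosis step at the top of this section can be sketched as a small classifier. It keys on the example messages quoted in § 01; the exact wording shown in the product may differ, so treat the matched phrases as assumptions. `which_wall` is a hypothetical helper, not part of any Claude tooling.

```python
def which_wall(error_message: str) -> str:
    """Classify a limit message as 'usage', 'length', or 'unknown'."""
    text = error_message.lower()
    # Usage limits mention the allowance or a reset time.
    if "usage limit" in text or "try again at" in text:
        return "usage"
    # Length limits mention the size of the conversation.
    if "conversation" in text and "long" in text:
        return "length"
    return "unknown"


print(which_wall("You have reached your usage limit. Try again at 4:00 PM."))  # usage
print(which_wall("This conversation is getting long."))  # length
```

A helper like this could sit in a team runbook or support bot so that the first reply routes people to the right list of fixes.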