How to Track and Reduce Your Claude Code Token Usage

2026-02-28

If you've hit Claude Code's usage limits, you're not alone. Token consumption is one of the most common frustrations for new users. Here's how to understand what's eating your tokens and how to use fewer of them.

How tokens work in Claude Code

Every interaction with Claude Code consumes tokens. This includes your messages, Claude's responses, and — critically — every file Claude reads, every tool result, and every piece of context in the conversation.

A rough guide:

1 token ≈ 4 characters of English text
A 200-line source file ≈ 2,000-4,000 tokens
A large test output ≈ 5,000-15,000 tokens
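To make the rough guide concrete, here is a minimal sketch of a character-based estimator. It assumes the 4-characters-per-token ratio above, which is only an approximation and not a real tokenizer. The function name and the synthetic sample file are made up for illustration.

```python
import math

# Rough ratio from the guide above. Real tokenizers vary by language and content.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate a token count as characters / 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


# A synthetic 200-line "file" with 50 characters per line, joined by newlines.
sample_file = "\n".join("x" * 50 for _ in range(200))

print(estimate_tokens(sample_file))  # 2550, inside the 2,000-4,000 range above
```

Code tends to tokenize less efficiently than prose, so treat the result as a lower-bound guess rather than a measurement.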

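The context window accumulates: everything read or returned earlier in the session stays there until the conversation is compacted or cleared. By turn 30 of a session, you might have well over 100k tokens of context just from file reads and tool results.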
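The effect of accumulation is easier to see with some arithmetic. The sketch below assumes every turn adds the same number of tokens and that the whole accumulated context is sent as input on each turn. It ignores prompt caching and compaction, which change the real numbers, so read it as an illustration of the growth pattern only. The 5,000-tokens-per-turn figure is a hypothetical input.

```python
def context_growth(turns: int, tokens_added_per_turn: int) -> tuple[int, int]:
    """Return (context size at the last turn, total input tokens processed).

    Assumes a constant number of new tokens per turn and that the full
    accumulated context is re-sent as input on every turn.
    """
    context = 0
    total_input = 0
    for _ in range(turns):
        context += tokens_added_per_turn  # new reads and tool results pile up
        total_input += context            # the whole context is input again
    return context, total_input


final_context, total_input = context_growth(turns=30, tokens_added_per_turn=5_000)
print(final_context)  # 150000
print(total_input)    # 2325000
```

Context grows linearly with turns, but the total input processed grows roughly with the square of the turn count, which is why long sessions feel disproportionately expensive.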

How to check your usage

Run /cost in Claude Code to see token usage for the current session. This shows input tokens, output tokens, and estimated cost.

For monitoring across sessions, check your Anthropic dashboard or use the API usage endpoint if you're on API-based billing.

Why you're burning tokens faster than expected

1. Large file reads

Every time Claude reads a file, the contents it reads go into context. If Claude reads a 1,000-line file to find a 5-line function, you just spent roughly 10,000-20,000 tokens by the rough guide above.

Fix: Be specific. Instead of "look at the user service," say "read the createUser function in src/services/user.ts, lines 45-80."
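As a rough illustration of why a line range helps, the sketch below compares the estimated token cost of a whole file against lines 45-80 of it. The file is synthetic, the helper names are hypothetical, and the estimate uses the 4-characters-per-token rule of thumb.

```python
def estimate_tokens(text: str) -> int:
    """Approximate tokens as characters / 4, rounded up."""
    return (len(text) + 3) // 4


def line_range(text: str, start: int, end: int) -> str:
    """Return lines start..end of text (1-indexed, inclusive)."""
    lines = text.splitlines()
    return "\n".join(lines[start - 1:end])


# Synthetic 1,000-line file with 40 characters per line.
source = "\n".join(f"line {i:04d} " + "x" * 30 for i in range(1, 1001))

whole = estimate_tokens(source)
snippet = estimate_tokens(line_range(source, 45, 80))

print(whole)    # 10250
print(snippet)  # 369
```

In this toy case the targeted read costs under 4% of the full read.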

2. Repeated context

If you keep asking about the same files across many turns, Claude may re-read them. The earlier reads stay in context too. Compaction shrinks them over time, but they still take up space.

Fix: For long sessions, start fresh with /clear when switching to a new task. Carry over only what's needed by stating it in your first message.

3. Verbose tool output

Bash commands that dump lots of output (test suites, build logs, large diffs) consume massive token counts.

Fix: Pipe through head, tail, or grep to limit output. "Run the tests and show me only failures" is cheaper than "run the tests."
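The same idea as piping through grep or head can be sketched in Python: keep only the lines that matter and cap how many are returned. The test-output format and the marker strings here are assumptions for illustration, so adjust them to whatever your test runner prints.

```python
def keep_failures(output: str, markers=("FAIL", "ERROR"), max_lines: int = 50) -> str:
    """Keep only lines containing one of the markers, capped at max_lines."""
    kept = [
        line for line in output.splitlines()
        if any(marker in line for marker in markers)
    ]
    return "\n".join(kept[:max_lines])


# Hypothetical test-suite output: one failure buried among passing tests.
raw = "\n".join(
    ["PASS test_login", "PASS test_logout", "FAIL test_refresh: expected 200, got 401"]
    + [f"PASS test_case_{i}" for i in range(500)]
)

trimmed = keep_failures(raw)
print(trimmed)
print(len(raw), "->", len(trimmed), "characters")
```

Only the single failing line survives, so the part that reaches the context is a small fraction of the original output.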

4. Not using plan mode

If Claude starts coding immediately on a complex task, it might go in the wrong direction, undo its work, and try again. Each attempt costs tokens.

Fix: Use plan mode (press Shift+Tab to cycle to it) for anything non-trivial. A plan that costs 5k tokens can save 50k tokens of wasted edits.

5. Overly broad exploration

Asking "find all the places we handle errors" sends Claude on a multi-file grep across your entire codebase. Every result goes into context.

Fix: Scope your requests. "Find error handling in the payment module" is far cheaper than searching the whole project.

Token-efficient habits

Start sessions with clear, scoped tasks. "Fix the race condition in the WebSocket reconnect logic in src/lib/ws.ts" is token-efficient. "Fix the WebSocket bugs" is not.

Use /clear between tasks. Don't let a 50-turn conversation carry forward when you're starting something new.

Reference files by path and line number. Claude won't need to search for them.

Ask Claude to be concise. Add "keep your responses brief" to your CLAUDE.md or say it in the conversation. Claude's default verbosity costs tokens.

Use subagents for exploration. Subagent results get summarized before returning to the parent, which is more token-efficient than the parent reading everything directly.

Pro vs Max: what you get

Anthropic's Pro plan includes Claude Code with usage limits that reset periodically. The Max plan significantly increases these limits.

The exact token budgets aren't published as fixed numbers — they depend on the model and can change — but the general pattern is:

Pro: Enough for moderate daily use. You'll hit limits with heavy, multi-hour sessions.
Max: 5x or 20x the Pro limits, depending on the Max tier you choose. Designed for power users and professional developers.

If you're consistently hitting Pro limits, the Max plan can pay for itself in productivity. See our full breakdown of usage limits for details on what each tier includes. But first, make sure you're not wasting tokens on the patterns above. Upgrading doesn't help if your workflow is inefficient.

The nuclear option: API key

If you need more usage than the subscription limits allow and don't mind pay-per-token pricing, you can use Claude Code with your own API key. Add it to your environment:

export ANTHROPIC_API_KEY=sk-ant-...

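This bypasses subscription limits entirely, though API rate limits still apply. You pay per token at API rates. Depending on your volume, this can be cheaper or more expensive than Max, so compare your /cost totals against the plan price before switching. For occasional users, the subscription is usually better value.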
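A quick way to sanity-check the trade-off is to estimate a month of API spend from your own token counts. In the sketch below, every number is a placeholder: the prices per million tokens, the monthly token volumes, and the subscription price are made up for illustration and are not actual rates. Substitute figures from the current pricing page and from your own usage.

```python
def api_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate pay-per-token cost from token counts and per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_mtok
    output_cost = (output_tokens / 1_000_000) * output_price_per_mtok
    return input_cost + output_cost


# Placeholder values only. Replace with real prices and your own usage.
monthly_api = api_cost_usd(
    input_tokens=20_000_000,
    output_tokens=2_000_000,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
subscription_price = 100.0  # hypothetical monthly plan price

print(monthly_api)  # 90.0
print("API is cheaper" if monthly_api < subscription_price else "Subscription is cheaper")
```

This ignores discounts such as prompt caching, so treat the result as a rough upper bound on API spend for the same workload.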