Themata.AI
#claude #anthropic #developer-tools #bug-reports

Pro Max 5x quota exhausted in 1.5 hours despite moderate usage

[BUG] Pro Max 5x Quota Exhausted in 1.5 Hours Despite Moderate Usage · Issue #45756 · anthropics/claude-code

github.com

April 12, 2026

6 min read

🔥🔥🔥🔥🔥

67/100

Summary

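A bug report on the anthropics/claude-code repository (Issue #45756) says a Pro Max 5x plan's Claude Code quota was exhausted in just 1.5 hours despite moderate usage. The report attributes the unexpected depletion to how tokens, and cache read tokens in particular, are counted against the limit.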

Key Takeaways

  • The Pro Max 5x plan's quota was exhausted in 1.5 hours despite moderate usage, raising concerns about token accounting.
  • Cache read tokens appear to count at full rate against the rate limit, negating the benefits of prompt caching for quota purposes.
  • According to the report, heavy development consumed 24.4M effective tokens per hour, while moderate usage unexpectedly consumed 70.5M tokens per hour, nearly three times as much.
  • Each API call sends the full context as input, resulting in significant quota consumption, especially when cache read tokens are not discounted as expected.
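The accounting concern in the takeaways above can be illustrated with a rough sketch. It simulates a session in which every call resends the whole context, then totals the tokens twice: once with cache reads counted at full weight, and once with a discounted weight. All names and numbers here (`Call`, `simulate_session`, the session sizes, and the 0.1 weight) are hypothetical values chosen for illustration; they are not taken from the issue and do not describe Anthropic's actual quota rules.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Token counts for one API call (all values are illustrative)."""
    fresh_input: int  # input tokens not served from the prompt cache
    cache_read: int   # input tokens served from the prompt cache
    output: int       # generated tokens


def effective_tokens(calls, cache_read_weight):
    """Sum tokens, counting each cache-read token at the given weight."""
    return sum(
        c.fresh_input + cache_read_weight * c.cache_read + c.output
        for c in calls
    )


def simulate_session(turns, base_context, growth_per_turn, output_per_turn):
    """Build a session where each call resends the whole context so far."""
    calls = []
    context = base_context
    for _ in range(turns):
        # Assumption: everything seen before this turn hits the cache,
        # and only this turn's new material counts as fresh input.
        calls.append(
            Call(fresh_input=growth_per_turn,
                 cache_read=context,
                 output=output_per_turn)
        )
        context += growth_per_turn + output_per_turn
    return calls


# Hypothetical session: 100 calls starting from a 50k-token context.
session = simulate_session(
    turns=100, base_context=50_000,
    growth_per_turn=2_000, output_per_turn=1_000,
)

full_rate = effective_tokens(session, cache_read_weight=1.0)
discounted = effective_tokens(session, cache_read_weight=0.1)  # assumed weight

print(f"cache reads at full weight: {full_rate:,.0f} tokens")
print(f"cache reads at 0.1 weight:  {discounted:,.0f} tokens")
print(f"ratio: {full_rate / discounted:.1f}x")
```

Because the resent context grows with every turn, cache reads dominate the total in this toy session, so the weight applied to them largely determines how fast a quota would drain.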

Community Sentiment

Negative

Positives

  • The recent UX improvements to prompt caching are a step in the right direction, potentially reducing the frequency of costly cache misses during long sessions.
  • Some users report that switching to the Codex plan gave them a more generous quota and better accuracy on their specific tasks.
  • Despite some issues of its own, Codex is perceived as a better deal than Claude Code, particularly in how far its quota goes.

Concerns

  • Many users are experiencing significant performance regressions with Claude Code, including prolonged exploration loops that lead to rapid quota exhaustion.
  • The recent change in prompt caching behavior has frustrated users, who say it increases token usage and reduces overall efficiency.
  • There are concerns that the quota limits are becoming increasingly restrictive, making it difficult for users to utilize the service effectively for their work.

Related Articles

Cache TTL silently regressed from 1h to 5m around early March 2026, causing quota and cost inflation · Issue #46829 · anthropics/claude-code

Anthropic silently downgraded cache TTL from 1h → 5m on March 6th

Apr 12, 2026

[Bug] Excessive token usage in Claude Code 2.1.1 - 4x+ faster rate consumption than previous versions · Issue #16856 · anthropics/claude-code

Excessive token usage in Claude Code

Feb 21, 2026

GitHub - liorwn/claudetop: htop for your Claude Code sessions — real-time cost, cache efficiency, model comparison, and smart alerts

Claudetop – htop for Claude Code sessions (see your AI spend in real-time)

Mar 14, 2026

Anthropic admits Claude Code quotas running out too fast

Claude Code users hitting usage limits 'way faster than expected'

Mar 31, 2026

GitHub - drona23/claude-token-efficient: Universal CLAUDE.md - cut Claude output tokens by 63%. Drop-in. No code changes.

Universal Claude.md – cut Claude output tokens

Mar 31, 2026