Have you ever been stopped mid-task in Claude Code by an error like this?
API Error: 529 {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}
# or
API Error: 500 Internal server error.
529 Overloaded means Anthropic's API is temporarily over capacity, and 500 means an unexpected error occurred inside the server. Both are server-side. Most importantly, neither is a mistake in your request or settings, and neither means your usage has run out. The official docs state plainly that "a 529 is not your usage limit and does not count against your quota." In other words, this is the kind of error that usually clears if you wait a moment and retry.
Key points up front. (1) 529/500 are server-side — not your fault (and do not consume your quota). (2) Claude Code already auto-retries up to 10 times with exponential backoff before showing you anything — when the friendly message appears, those retries are already exhausted. (3) The fix is "check the status page → wait → switch model with /model." Capacity is tracked per model, so even when Opus is busy, Sonnet often goes through.
This is server-side, not your fault. Claude Code is already retrying before it shows you anything, so the fix is "wait and retry / switch with /model / check status.claude.com." There is essentially no code or setting to fix.
1. What this error is telling you
HTTP 529 (overloaded_error, message "Overloaded") signals that Anthropic's API is temporarily over capacity. The official description is literally "the API is temporarily overloaded" and "can occur when APIs experience high traffic across all users." It is not any one person's fault; overall demand briefly exceeded supply.
HTTP 500 (api_error) is an unexpected internal error on Anthropic's side. The docs say it is "not caused by your prompt, settings, or account." A related error is 504 (timeout_error), which appears when a long request times out. Anthropic documents 504, while 502/503 usually come from upstream infrastructure such as gateways.
The crucial point: 529 and 500 are server-side problems and do not consume your usage quota. They are entirely different from the plan-quota "usage limit reached" message and from your own rate limit (429); §4 tells them apart. So there is no need to go hunting for code or settings to fix. The default is "wait and retry."
2. Claude Code is already retrying for you
In fact, before you ever see the error message, Claude Code has been retrying behind the scenes. The official docs describe it as follows.
The auto-retry behavior
Server errors, overloaded responses, request timeouts, temporary 429 throttles, and dropped connections are all retried up to 10 times with exponential backoff. While retrying, the spinner shows a Retrying in Ns · attempt x/y countdown. By the time the friendly API Error: string appears, those 10 retries are exhausted.
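To get a feel for what "up to 10 times with exponential backoff" means in wall-clock terms, here is a minimal sketch of a doubling schedule. The 1-second base and 32-second cap are assumptions chosen for illustration; the docs quoted here give the retry count but not the actual delays Claude Code uses, and `backoff_delays` is a hypothetical helper, not part of any tool.

```python
def backoff_delays(max_retries=10, base_seconds=1.0, cap_seconds=32.0):
    """Return the wait before each retry: base * 2**n, capped at cap_seconds.

    base_seconds and cap_seconds are illustrative assumptions, not
    documented Claude Code values.
    """
    return [min(cap_seconds, base_seconds * 2 ** n) for n in range(max_retries)]


delays = backoff_delays()
total_wait = sum(delays)  # seconds spent waiting, excluding request time
```

Under these assumed numbers, ten retries add up to roughly three minutes of waiting before the friendly error appears, which is why a brief spike often passes without you noticing anything beyond the spinner.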
So "a 529 flashed but it kept going" is normal — the auto-retry absorbed it. Conversely, if you reach the friendly message ("Repeated 529 Overloaded errors … try again in a moment. If it persists, check https://status.claude.com"), it is a sign the load is bad enough that even retries did not recover. You can tune retries with CLAUDE_CODE_MAX_RETRIES (default 10) and the per-request cap with API_TIMEOUT_MS (default 600000 ms = 10 minutes) — lower the count to fail fast in scripts, raise it to wait through a longer incident.
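If you launch Claude Code from a script, one way to apply those two settings is to build the child process environment explicitly. The sketch below only prepares the environment; `claude_env` is a hypothetical helper, the example values (2 retries, a 2-minute timeout) are arbitrary, and how you invoke Claude Code with that environment depends on your setup.

```python
import os


def claude_env(max_retries, timeout_ms, base_env=None):
    """Copy an environment and set the two retry-related variables.

    CLAUDE_CODE_MAX_RETRIES and API_TIMEOUT_MS are the variable names given
    in the text above; the values passed in are up to the caller.
    """
    if max_retries < 0 or timeout_ms <= 0:
        raise ValueError("max_retries must be >= 0 and timeout_ms must be > 0")
    env = dict(os.environ if base_env is None else base_env)
    env["CLAUDE_CODE_MAX_RETRIES"] = str(max_retries)
    env["API_TIMEOUT_MS"] = str(timeout_ms)
    return env


# Fail fast in a CI script: few retries, shorter per-request cap (example values).
fail_fast = claude_env(max_retries=2, timeout_ms=120_000, base_env={})
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting the tool. Copying the environment rather than mutating `os.environ` keeps the override scoped to that one child process.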
3. What you can do
The moves for a 529/500 are actually very simple. Try them in order.
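Wait, switch, check
1) Wait a moment and retry.
2) Switch models with /model (capacity is tracked per model).
3) Check status.claude.com for a posted incident.
4) If a 500 persists with no posted incident, report it via /feedback (include the request_id to speed up investigation).
Unsure? 1) wait → 2) switch with /model → 3) check status. Shifting to off-peak hours helps too. There is essentially no setting to fix.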
Note: the message "Server is temporarily limiting requests" is also officially described as "a short-lived server-side throttle unrelated to your usage limit." It too clears with a short wait, and is a different thing from the plan-quota usage limit.
4. Telling it apart from similar errors
The "it stopped" family can have opposite causes. First split by "server-side or your side?"
| Error | Whose problem | Uses quota? | Main fix |
|---|---|---|---|
| 529 Overloaded | Server-side (capacity, affects everyone) | No | Wait & retry, /model, status check |
| 500 / 504 | Server-side (internal error / timeout) | No | Retry; if persistent, /feedback |
| 429 Rate limit | Your side (your API key rate limit) | Yes (your rate) | Slow down, raise tier, wait the retry-after |
| usage limit reached | Your side (Pro/Max plan allowance) | Yes (plan) | Wait for reset; see usage limit fixes |
| 400 Invalid request | Your side (a bad request) | No | Fix the request body |
A mnemonic: 5xx (including 529) is server-side = it clears if you wait. 429 and usage limit are about your "amount" = adjust rate or plan. 400 is about your "content" = fix the request. 429 and 529 are especially easy to confuse, but 429 carries a retry-after header and consumes quota, whereas 529 has no header and consumes no quota — different things. For other common Claude Code errors, see the error roundup.
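The mnemonic can be written down as a small lookup, which is handy if you triage errors in logs. This is a simplified sketch of the table above, not an official classification: `classify_status` is a hypothetical helper, it looks only at the HTTP status, and it does not cover the plan-level "usage limit reached" message or the short-lived server-side throttle mentioned in §3.

```python
def classify_status(status):
    """Map an HTTP status to (whose problem, suggested move) per the mnemonic."""
    if status == 429:
        # About your "amount": your own rate limit.
        return ("your side", "slow down, raise tier, or wait for retry-after")
    if status == 400:
        # About your "content": a bad request.
        return ("your side", "fix the request body")
    if 500 <= status <= 599:
        # 5xx, including 529: clears if you wait.
        return ("server side", "wait and retry, switch with /model, check status.claude.com")
    return ("unknown", "not covered by this mnemonic")


side, move = classify_status(529)
```

Checking 429 before the 5xx range is not strictly necessary, but it keeps the two easily confused codes visibly separate in the code.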
5. For developers (API/SDK)
If you run your own app on the API/SDK, the right design treats 529/500 as "a transient event that can normally happen."
(1) The official SDKs raise typed exceptions (OverloadedError, InternalServerError, etc.) and auto-retry transient errors with exponential backoff — catch the exception classes, not string matches. (2) If you retry yourself, use "exponential backoff + jitter." (3) The retry-after header is present on 429 but NOT on 529, so on a 529 wait with your own backoff, not header-driven timing. (4) Have a fallback model (Claude Code has --fallback-model). (5) Ramp traffic gradually to avoid the 429 "acceleration limit" after a usage spike. If you need steady availability, Priority Tier and the Message Batches API are also options. For the basics, see What is an AI API.
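Points (2) through (4) can be combined into one retry loop. The following is a minimal sketch under stated assumptions, not the SDK's implementation: `ApiStatusError` is a hypothetical stand-in for whatever typed exception your client raises, `send` is any callable you supply that performs one request for a given model, and the retry count, base delay, and cap are illustrative. It uses "full jitter" (a random wait between zero and the exponential ceiling), honors `retry_after` only on a 429, and moves to the next model once retries for the current one are exhausted.

```python
import random
import time


class ApiStatusError(Exception):
    """Hypothetical stand-in for an SDK's typed HTTP status exception."""

    def __init__(self, status_code, retry_after=None):
        super().__init__(f"API error {status_code}")
        self.status_code = status_code
        self.retry_after = retry_after  # seconds; expected only on 429


RETRYABLE = {429, 500, 504, 529}


def call_with_retry(send, models, max_retries=5, base=0.5, cap=30.0,
                    sleep=time.sleep, rng=random.random):
    """Try each model in order, retrying transient errors with backoff + jitter."""
    if not models:
        raise ValueError("at least one model is required")
    last_error = None
    for model in models:
        for attempt in range(max_retries + 1):
            try:
                return send(model)
            except ApiStatusError as err:
                last_error = err
                if err.status_code not in RETRYABLE:
                    raise  # e.g. 400: retrying will not help
                if attempt == max_retries:
                    break  # retries spent; fall back to the next model
                if err.status_code == 429 and err.retry_after is not None:
                    delay = err.retry_after  # header-driven wait
                else:
                    # 529/500 carry no retry-after: full-jitter backoff
                    delay = rng() * min(cap, base * 2 ** attempt)
                sleep(delay)
    raise last_error
```

A call might look like `call_with_retry(my_send, ["primary-model", "fallback-model"])`, where both names are placeholders. Injecting `sleep` and `rng` keeps the loop testable without real waiting. If your SDK already retries internally, consider lowering `max_retries` here so the two layers do not multiply.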
6. Transient spike or an incident?
The same 529/500 means different things depending on whether it is "a spike that vanishes instantly" or "a continuous outage that repeats."
A transient spike (one or a few that clear on retry) is within the normal range of demand fluctuation. The auto-retry usually absorbs it, and there is nothing to fix on your side. On the other hand, "Repeated 529," or a 500 that survives retries, is a sign to suspect an active incident — first check status.claude.com, and if an outage is posted, waiting for recovery is the only right move. If a 500 persists with no posted incident, report via /feedback with the request_id. Either way, all a user can do for 529/500 is "retry, switch with /model, check status, and report" — and that is genuinely enough.
Summary
Claude Code's "API Error: 529 Overloaded" and "500 Internal server error" are server-side events where Anthropic's API is temporarily overloaded or hit an internal error. They are not a mistake in your request or settings, not your usage running out, and they consume no quota. Claude Code auto-retries up to 10 times with exponential backoff before showing you anything; the friendly message means those retries are spent.
The fix is simple: (1) wait and retry -> (2) switch model with /model (capacity is per model) -> (3) check status.claude.com -> (4) /feedback if a 500 persists. They are different from 429 (your rate) and usage limit (your plan), and 529 carries no retry-after. Developers should design around it with the SDK's auto-retry, exponential backoff + jitter, and a fallback model. If it repeats, suspect an incident and check the status page — either way, there is essentially no code or setting to fix. Related: usage limit fixes, Opus/Sonnet/Haiku comparison, Claude Code error roundup.
FAQ
Q. Is "529 Overloaded" caused by something I did wrong, or my code?
A. No — it is a server-side problem. A 529 means Anthropic's API is temporarily over capacity (congestion across all users); your request, settings, and account are not involved. The docs state plainly that "a 529 is not your usage limit and does not count against your quota." It usually clears if you wait a moment and retry.
Q. It keeps telling me to retry — should I spam it myself?
A. Generally no. Claude Code already auto-retries up to 10 times with exponential backoff before showing the error (Retrying in Ns · attempt x/y). The friendly message appeared because those 10 retries were used up. Wait a little, and for a long prompt just type "try again" to re-run with the original context. You can tune the count with CLAUDE_CODE_MAX_RETRIES.
Q. What is the difference between 529 and 429?
A. 529 is server-side overload (affects everyone; consumes no quota of yours), while 429 is your own rate limit (you exceeded your API key's RPM, etc. — about your rate allowance). A tell: 429 carries a retry-after header, while 529 does not. A 429 needs your-side adjustment (slow down, raise tier); a 529 just needs wait-and-retry or a /model switch.
Q. Why does switching with /model sometimes work?
A. Because capacity (congestion) is tracked per model. Even if Opus is under high load, Sonnet may go through instantly if it has headroom. Claude Code itself sometimes prompts a switch under load ("Opus is experiencing high load, please use /model to switch to Sonnet"). When you are in a hurry, switching to a lighter or different model with /model is a quick workaround.
Q. I keep getting 529/500 nonstop. What should I do?
A. Suspect an active incident and check status.claude.com. If an outage is posted, all you can do is wait for recovery. If a 500 persists with no posted incident, report it via /feedback with the request_id so Anthropic can investigate. Since 529/500 are server-side events, there is essentially no code or setting for you to fix.