palcu Mar 27, 2026, 5:38 PM

Hey folks, I'm Alex from the reliability engineering team at Anthropic. We've just posted the retrospective for this incident:

> On March 26–27, 2026, customers experienced elevated error rates when using Claude Opus 4.6 and Claude Sonnet 4.6. The issue was caused by a networking performance degradation within our cloud infrastructure that disrupted communication between components of our serving stack. We resolved the incident by migrating the affected workloads to healthy infrastructure, restoring normal service by 9:30 AM PT on March 27.

https://status.claude.com/incidents/b9802k1zb5l2

dogleash Mar 27, 2026, 8:29 PM

Is someone hurting you? Should we send help? Post more mealymouthed corporate nothingness if you need help.

halJordan Mar 27, 2026, 6:33 PM

Is it really an answer to say "network disruption" with a bunch of $10 words? Certainly it doesn't belong here of all places.

yread Mar 27, 2026, 3:24 PM

At this point you can stop worrying about downtime-free deployments so the devops becomes easier

michaelcampbell Mar 27, 2026, 3:17 PM

> Our uptime has a '9' in it! -- Anthropic

adgjlsfhk1 Mar 27, 2026, 3:51 PM

GitHub this month is very close to having 0 9s reliability. (unless they want to argue that 89% has a 9 in it)

marcosdumay Mar 27, 2026, 3:56 PM

The comment you are replying to is carefully written in a way that allows 23.19%

littlestymaar Mar 27, 2026, 4:10 PM

I'm not sure I've had a day without GitHub hiccups this month, so that feels right.

ACCount37 Mar 27, 2026, 3:54 PM

By now, I'm nearly certain that they'd be down to 0 9s of uptime if they counted it conservatively.

leosanchez Mar 27, 2026, 3:41 PM

Or as the British would say "9 innit ?"

bwb Mar 27, 2026, 3:19 PM

We had a ton of traffic coming in to check them: https://downforeveryoneorjustme.com/anthropic

Not one of the usual ones that has service problems :)

timpera Mar 27, 2026, 2:40 PM

https://status.claude.com/

steveBK123 Mar 27, 2026, 3:15 PM

Remember when putting your entire life & business into the cloud was good because they were all offering 5 9s of uptime?

Very few cases these days.. feels like we are lucky to get 2 9s anymore.

dehrmann Mar 27, 2026, 3:41 PM

I wonder how much is due to supply constraints, how much is standard growing pains, and if over-reliance on AI was the cause for any outages.

tracker1 Mar 27, 2026, 6:07 PM

I know they tend to get much slower early evenings in the Western US... Not sure if this is everyone on the west coast going home and working on stuff, or the early people in the Asia region coming online.

Trufa Mar 27, 2026, 3:02 PM

I honestly feel like it's a more honest status measure than many status pages I know.

yomismoaqui Mar 27, 2026, 4:44 PM

Maybe they are gunning for 5 nines (9.9999%)

aubanel Mar 27, 2026, 4:03 PM

I wouldn't be too harsh, scaling x10 YoY is a bit hard on the infra!

timpera Mar 27, 2026, 4:17 PM

OpenAI managed it way better, but we might have Microsoft to thank for that.

BoredPositron Mar 27, 2026, 8:28 PM

We don't know any numbers.

gherkinnn Mar 27, 2026, 5:41 PM

But isn't GitHub's perpetual demise Microsoft's fault?

whateveracct Mar 27, 2026, 6:13 PM

isn't serving Claude embarrassingly parallel tho?

verdverm Mar 27, 2026, 3:13 PM

You can access Claude models with Google Cloud reliability via VertexAI. The caveat is that you cannot use your subscription, per-token pricing only.

I personally prefer per-token, it makes you more thoughtful about your setup and usage, instead of spray and pray.

You can also access the notable open weight models with VertexAI, only need to change the model id string.

Scene_Cast2 Mar 27, 2026, 3:26 PM

I also use them per-token (and strongly prefer that due to a lack of lock-in).

However, from a game theory perspective, when there's a subscription, the model makers are incentivized to maximize problem solving in the minimum amount of tokens. With per-token pricing, the incentive is to maximize problem solving while increasing token usage.

verdverm Mar 27, 2026, 3:32 PM

I don't think this is quite right because it's the same model underneath. This problem can manifest more through the tooling on top, but still largely hard to separate without people catching you.

I do agree that Big AI has misaligned incentives with users, generally speaking. This is why I pay per-token with a custom agent stack.

I suspect the game theoretic aspects come into play more with the quantizing. I have not (anecdotally) experienced this in my API based, per-token usage. I.e. I'm getting what I pay for.

lima Mar 27, 2026, 6:10 PM

We tried this, but the quota for Opus models defaults to 0 on VertexAI and quota increase requests are auto-rejected.

Any tips?

perfmode Mar 27, 2026, 3:42 PM

You can use your subscription for Anthropic-hosted Claude models?

lima Mar 27, 2026, 6:09 PM

No, unless you count tricks which are explicitly against ToS

verdverm Mar 27, 2026, 3:43 PM

Don't know. I tried Anthropic directly a long time ago and was frustrated by their uptime issues. Seems it has not improved in the years since.

chewbacha Mar 27, 2026, 3:35 PM

You mean Google Chaos Services as we call them?

joe_mamba Mar 27, 2026, 3:18 PM

I saw a funny skit where if the free Claude instance was down for you, you could just ask Rufus, Amazon's shopping AI assistant, your math/coding question phrased as a question about a product, and it would just answer lol.

Tade0 Mar 27, 2026, 3:35 PM

In my region a certain small bank had an AI assistant which someone neglected to limit, so you could put whatever there and not even phrase it as a question about a product.

scuff3d Mar 27, 2026, 3:56 PM

Probably vibe-coded their infrastructure

littlestymaar Mar 27, 2026, 4:07 PM

If you don't pay attention, 99% may sound high, but it means up to about 21.6 hours of downtime over the quarter (1% of 90 days).

Anthropic has had more than that.

Yikes.
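
For anyone who wants to check the arithmetic behind the "nines" in this thread, here is a minimal sketch. It assumes a 90-day quarter (Q1 2026 is 90 days) and treats availability as a simple fraction of wall-clock time; the function name `downtime_budget_hours` is made up for illustration and is not how any provider necessarily computes its status-page numbers. Under those assumptions, 99% over 90 days works out to 21.6 hours.

```python
def downtime_budget_hours(availability_pct: float, period_days: float = 90.0) -> float:
    """Return the hours of downtime allowed by an availability target.

    availability_pct: target such as 99.0 or 99.999 (percent of wall-clock time).
    period_days: length of the measurement window; 90 days is an assumed quarter.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    period_hours = period_days * 24.0
    # The downtime budget is the unavailable share of the period.
    return period_hours * (100.0 - availability_pct) / 100.0


if __name__ == "__main__":
    # Compare one nine through five nines over the assumed 90-day quarter.
    for target in (90.0, 99.0, 99.9, 99.99, 99.999):
        hours = downtime_budget_hours(target)
        print(f"{target}% -> {hours:.4f} hours ({hours * 60:.1f} minutes)")
```

Each extra nine cuts the budget by a factor of ten, so five nines leaves only about a minute and a bit per quarter.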

seneca Mar 27, 2026, 3:07 PM

They seem to be a victim of their own success. Their response times are quite bad, and it's widely believed they are doing something to degrade service quality (quantizing?) in order to stretch resources. They just announced that they're cutting their usage limits down during peak hours as well.

They're at serious risk of losing their lead with this sort of performance.

sva_ Mar 27, 2026, 3:13 PM

It can't be worse than gemini-cli using a Pro account.

seneca Mar 27, 2026, 3:23 PM

Oh really? Do they have availability problems too?

nsingh2 Mar 27, 2026, 3:26 PM

Gemini CLI has been broken for the past 2-3 days, with no response from Google. Really embarrassing for a multi-trillion dollar company. At this point Codex is the only reliable CLI app, out of the big three.

https://www.reddit.com/r/GeminiCLI/comments/1s49pag/this_is_...

sva_ Mar 27, 2026, 6:39 PM

Last time I tried it a single prompt ran for over an hour, mostly doing nothing/waiting on availability.

ACCount37 Mar 27, 2026, 4:03 PM

> it's widely believed they are doing something to degrade service quality (quantizing?) in order to stretch resources

God, I wish this inane bullshit would just fucking die already.

Models are not "degrading". They're not being "secretly quantized". And no one is swapping out your 1.2T frontier behemoth for a cheap 120B toy and hoping you wouldn't notice!

It's just that humans are completely full of shit, and can't be trusted to measure LLM performance objectively!

Every time you use an LLM, you learn its capability profile better. You start using it more aggressively at what it's "good" at, until you find the limits and expose the flaws. You start paying attention to the more subtle issues you overlooked at first. Your honeymoon period wears off and you see that "the model got dumber". It didn't. You got better at pushing it to its limits, exposing the ways in which it was always dumb.

Now, will the likes of Anthropic just "API error: overloaded" you on any day of the week that ends in Y? Will they reduce your usage quotas and hope that you don't notice because they never gave you a number anyway? Oh, definitely. But that "they're making the models WORSE" bullshit lives in people's heads way more than in any reality.

internetter Mar 27, 2026, 3:15 PM

I can't speak on Gemini but OpenAI is far worse for free accounts at least

danelski Mar 27, 2026, 3:31 PM

GeminiCLI is absolutely terrible, nothing comparable to the browser access. I've started using the 'AI Pro' tier lately and I get 15-minute response times from Gemini 3 'Flash' on a regular basis.

orphea Mar 27, 2026, 3:12 PM

  > this sort of performance

They've been very proud of it.

faangguyindia Mar 27, 2026, 3:46 PM

i just use gemini 3 flash via api with custom agent.

only people who do not even look at code anymore need anything more than that.

ramesh31 Mar 27, 2026, 3:24 PM

>"They're in serious risk of losing their lead with this sort of performance."

Nobody goes there anymore, it's too crowded.

seneca Mar 27, 2026, 4:59 PM

You'll notice I specifically said "victims of their own success". Obviously these problems are induced by the fact that they have so many users. Blowing a lead due to inability to handle the demands of success is still a path to losing the lead.

no_shadowban_3 Mar 27, 2026, 3:15 PM

[dead]

3yr-i-frew-up Mar 27, 2026, 3:33 PM

Victim of success.

They are the best.

ChatGPT is walmart.

Gemini is kroger.

Claude is... idk your local grocer that is always amazing and costs more?

quentindanjou Mar 27, 2026, 3:42 PM

The local grocer that isn't amazing and costs more and actually isn't really that local in the sense that none of the products sold are from local businesses/producers?

3yr-i-frew-up Mar 27, 2026, 4:26 PM

No bud, Opus is the best model at this current moment.

GPT4.5 + COT would have been the best, but OpenAI got cheap.

claudiug Mar 27, 2026, 3:49 PM

MAKE NO MISTAKES! DO NOT HALLUCINATE! FIX IT!

maplethorpe Mar 27, 2026, 3:56 PM

I find it's more reliable if you write "you are a highly experienced software engineer".

nurettin Mar 27, 2026, 7:08 PM

I start every prompt with "we have been going in circles". It is the shibboleth for anthropic to A/B test you with their secret new model.

yubainu Mar 27, 2026, 3:57 PM

[dead]

boxingdog Mar 27, 2026, 5:25 PM

[dead]

mastabadtomm Mar 27, 2026, 2:45 PM

[dead]

rvz Mar 27, 2026, 3:06 PM

This is not an outage, Claude just gets lazier on Fridays.

Sometimes Claude wants more lunch breaks, takes a half day and leaves the desk early just like any human would. (since AI boosters like comparing LLMs to humans all the time) /s

sebastiennight Mar 27, 2026, 3:15 PM

If you're concerned about humans anthropomorphizing AI models, you might want to steer well clear of Anthropic, as their entire positioning (starting with the product name and continuing with UX choices and model releases) is built to attract the kind of researchers who are prone to believe in sentient machines.

They are going in the "Claude is alive" direction already and that line of communication is likely going full throttle in the near future.

SpicyLemonZest Mar 27, 2026, 3:08 PM

You joke, but I think that's a fair summary of why people don't mind one 9 of uptime in a key component of their development workflow.

Claude loses its >99% uptime in Q1 2026