But to your point, that is exactly how American companies like to play now. No one is stopping them from screwing over the consumer.
I have a Micron near me and they are building another chip facility but we are years away still so I suspect China will beat them to the punch.
China doesn't have EUV fabs... They've pushed DUV impressively far... but until they get EUV working industrially (and reasonable timelines are at least 2-4 years for that) it shouldn't be possible for them to compete for that market.
> China is the great equalizer of the world.
China is hardly an egalitarian society...
Another way China is a great equalizer is their willingness to do business with anyone that can pay.
Nations often impose trade barriers for various reasons. This is a very old tactic.
Chinese per-capita emissions have peaked at lower level than US and are already falling.
drop the bs
Does this not count as soon? How often do you buy new computers? That seems pretty soon. I remember a year or few ago being told it'd never happen so they're already infinity years ahead of schedule if we accept that as reasonable. The rate they're pulling ahead of expectations appears to be so sharp there is a risk they leapfrog EUV to go on to the next big thing.
Would’ve been nice if the United States had built a rail system north to Alaska, or even one south to Chile.
I guess things like that are hard to do when you’re busy fighting multiple wars since the early 1950s.
If you think that South Africa is absolutely egalitarian, you're wrong.
If you think that Norway is absolutely egalitarian, you're also wrong.
But if you think that "South Africa is egalitarian" is as wrong as "Norway is egalitarian", then your views are more wrong than both of them combined.
To state that no country is absolutely egalitarian does not mean that "China is hardly egalitarian" has to be wrong. And even if some other country (say Norway) were to be as hierarchical as China, that would not disprove the claim that China is hardly egalitarian. It would just mean there exist other inegalitarian countries too.
India is not even trying despite its size, and we as Germans do not push the EU as a union.
The big three memory makers are probably facing their last big payday. I hope they enjoy it, because their short-term greed will let China dominate the global memory market in three to five years.
Apple will likely bring memory in-house, like they did with CPUs and GPUs. Anyone questioning the time it took to replace Intel and Qualcomm should consider the Chinese expansion in the memory market, which makes it a long-term necessity.
Apple has the money, and while its competitors have spent/squandered $1 trillion on the AI data center fiasco, Apple made a decision to stay away from the blast crater.
Meanwhile Apple, which also has the expertise in engineering and chip design, can do what is necessary and bring memory in-house. Note: Apple has also replaced Nvidia and Broadcom along the way.
Who knows maybe Intel will condescend to do memory too?
There is a certain amount of capacity to produce memory. They are building new facilities but it takes a long time. They have been burned going down this route many times in the past (e.g., losing money, firms that are no longer in business).
What would you have them do instead?
SK Hynix and Samsung are South Korean.
The Korean memory makers are playing the same game as Micron and simply moving existing capacity up-market.
GP was referring to upstart Chinese memory manufacturers like ChangXin, who - if their yields manage to catch the wave - could not have asked for a more favorable market after the big 3 have abandoned the consumer segment. Consumers who would have otherwise turned up their noses at CXMT will not have the luxury.
hm interesting
https://www.tomshardware.com/pc-components/ddr5/chinese-memo...
However, with Corsair giving it their blessing, their technology having matured a bit (a lot?), and more reviews showing good stability (longevity, I suppose, is TBD), they're definitely worth recommending these days.
Thanks, please give my regards to Kash Patel.
And I know you don’t care about it. It’s understandable, because you have your own concerns to attend to. Never mind. But please don’t put all of this in such simple terms. Please remember that the problems beneath those achievements are far greater, and have made more people suffer than have benefited from those achievements.
I guess we might have more similarities than disagreements if those issues are abstracted out the country level.
Normal people who have to wait for an event where doctors provide free health care in a sports gym for a handful of days.
Up until 2005, roughly 10% of the population couldn't even access healthcare, at which point the PRC built out more care centers and invested in training more doctors, but there's still a significant shortage, such that scalpers sell outpatient appointment tickets for 10-15x markup over the actual appointment cost.
There's plenty of ways the two countries are different, but healthcare seems like an odd choice to try to "one-up" the US on, even if its programs like medicare, medicaid, social security disability and others still leave gaps.
First off, even per capita, the USA is in 8th place while China is 74th.
For sure China has a problem reaching all of its people, given its gigantic size and population, but its health care costs are nowhere near as high as in the USA. The USA is actually the country that spends the highest % of GDP on health care.
Just check out a YT video of a US American going to a normal Chinese hospital and then compare the bill.
And in parallel the USA is dismantling Medicare, Medicaid and co.
This is also directly reflected in life expectancy: US Americans are getting less old than Chinese people.
China and the US have the same life expectancy of 79 years, which is a very recent phenomenon due to the 2005-2018 changes I mentioned earlier. Obesity, lack of exercise and other cultural factors weigh down the US life expectancy compared to all other Western nations. China's use of abortion during the one child policy era also prevented a lot of people who would have had chronic medical conditions and disabilities from being born.
It is not yet true, however, that Americans are getting "less old", though it may soon be, depending on how China manages its own growing obesity problem and tobacco use.
Feel free to use Europe then, or Germany. We pay less and more people are better off than US Americans.
US Americans pay 2.5 times more than the avg high income nation: https://www.commonwealthfund.org/international-health-policy...
Where is the benefit of paying that much more if you are not getting older than others?
My main argument, which is in my first comment, is that healthcare is a bad way to show the difference between China and the US, since they actually have a lot of similarities, especially in access at the lower end of the income spectrum.
There's literally no reason to bring other countries into the conversation other than to say "US is bad", which does nothing to change the reality of healthcare access in China.
India needs to first figure out the absolute basics
At least we have the CIA to blame on religious fundamentalism.
The Indian Government is heavily pushing for domestic capabilities.
To understand why India failed to replicate the Chinese or East Asian model, I recommend A Sixth of Humanity by Devesh Kapur and Arvind Subramanian.
I had a similar view to you ~2 weeks ago. Spending some time there very quickly made me realize that there’s a lot of other things that are much more pressing.
You really can’t expect the same bureaucratic setup to think in terms of the decade+ it would take to be competent at something like chips
The whole system needs to be dismantled while an alternative system gets built; given the nature of Indian politics (freebies/jobs/reservations to certain groups; monthly stipends to certain groups by borrowing money while at the same time looting public funds), it is impossible.
The highly educated want to move away or distance themselves from the poor and the dirty, or it is simply caste separation.
If you have the feeling that certain things are not your problem, you are not rising.
Basically:
China floods the market with cheaper but less QA'd parts, makes a gazillion dollars, is able to spend said money to fix yields / QA issues and streamline operations, by the time that happens Micron and maybe a few other existing players will have new memory production, and then we'll have a flood of cheap, reliable memory. 4yr, maybe?
Seems it's mostly useful for LPDDR modules which are predominantly used in battery-powered devices and to improve margins.
[1]: https://www.tomshardware.com/pc-components/dram/micron-sampl...
https://aeon.co/essays/what-chinese-corner-cutting-reveals-a...
The US did it when it was a bigger steel supplier, good steel was sold domestically, crappy steel was sold elsewhere. If you got crappy steel in Africa at the time you might have thought US steel was garbage with poor QA, but in reality US steel was great and they just shipped the crappy stuff because people still kept buying it.
It's quite difficult to make general statements at such a gargantuan scale encompassing every single sector.
China has an abundance of terrific QA in electronics and advanced technologies as much as it has an abundance of the opposite, just simply due to its sheer size.
My endlessly excellent Chinese gear (Dahua cameras, XikeStor switches, etc) doesn't know what you're referring to.
I remember reading about it in Linux contexts decades ago, and these days it's something that Windows does automatically.
When can I expect this flood of cheaper RAM with less QA? I'd like to contribute to the gazillion dollar pile as soon as possible.
China is very far away from flooding the DRAM market.
I'm guessing you are also probably unfamiliar with the terms like "chicken game" which refers to the cutthroat, high-stakes price wars where dominant semiconductor manufacturers intentionally overproduce and slash prices. This is literally how the industry went from dozens to just three majors today since the 80's.
I'm not sure what world we live in when the scheming capitalists are all hunched around their table working out how to dodge selling their products into an enormous price boom. Do they not like money all of a sudden?
The industry is so naturally prone to oversupply that the only stable equilibrium is undersupply. Aggressive expansion kicks off a price war, which immediately undercuts the logic of the expansion.
This only changes with new entrants, which will come, especially from China. But it takes time to build fab capacity, so the medium-term modal outcome is consistent undersupply.
Corsair DDR5 DIMM modules with CXMT RAM started appearing on Friday.
The DRAM fabs have been on a roundabout for 40 years going from getting accused of price fixing and cartel behavior, to struggling to keep the lights on.
And imo it's not really their fault, it's all the lead time of advanced semiconductors, combined with the commodity dynamics of oil. And the goal is to match that supply to the demand of everything from consumer electronics to more datacenters than you can shake a stick at.
It's maddening to try and solve that, so at this point I really don't fault them for prioritizing survival.
"Accused" makes it sound like these things may still be up in the air, when they very much are not. I would choose instead the much clearer "A number of those involved in DRAM production have a proven history of cartel behavior and price fixing."
For those who may not be familiar with some of the history in this area:
For all I know, maybe they are dumb enough to try to actually coordinate again. My hunch would be no, or else they've tried something new and inventive, like the case Matt Levine talked about where so many landlords were using the same software to set prices. That one was pretty shady.
But it is interesting where it is popping up at the moment, like power transformers is another area. These companies have lived through these cycles before, and know there is no one to save them if they overleverage and get it wrong.
What?! If they did an anti competitive agreement sure. Otherwise no as each supplier is incentivized to produce more than its competitor and less than the demand, while divesting just enough to survive the oversupply risk.
Memory in particular ... https://en.wikipedia.org/wiki/DRAM_price_fixing_scandal
The entry-cost to getting into memory is on the order of $billions and years - you can do just about anything...
Increasing the availability doesn't mean decreasing the price ... people think those are intrinsically related - not so much.
You can get a Prada shirt for $2,000 ... as many as you'd like, for $2,000 a piece. No problem. They'll make the factories go burr all night long. Still $2,000.
There's a bunch of things like this. $100 bills for instance ...
a new entrant might yield a price drop, or, it might not.
> why not? i'm sure they can jump into the hustle.
Not so quick. The critical difference is the relationship between enterprises and the state. In China, the state owns the enterprise, in one way or another. High memory costs are a threat to the established Chinese electronics manufacturers. The Chinese state can optimize returns at a higher level than the one some petty chip manufacturer operates at, especially if doing so means it could gain coercive geopolitical strength, aka blackmailing.
In the past, when memory supply was short and then rebounded, many companies went out of business because making memory was no longer profitable.
The companies have two choices. They either produce RAM cheaply and in large quantities, or they get replaced by someone who will produce RAM cheaply and in large quantities. Current incumbents are free to pick which of those two scenarios they prefer.
If it costs you $1B and five years to build out new supply and you think demand will not sustain for more than three years, it does not make sense to expand supply.
Instead you will maintain your margins currently and await demand to decrease back to your current supply.
This is pretty common and as others have pointed out is even more common in markets where competition is slow and lead times are long.
Ammunition is a great example over the last decade or so as political turnover caused relatively short lived demand spikes and manufacturers didn't expand supply because they knew once political winds shift, demand would decrease.
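To put rough numbers on the "build or wait" reasoning above, here is a minimal sketch. Everything in it is an assumption for illustration: the function name, the margin figures, and the simplification that there is no discounting. It only shows why a five-year build does not pay off if the boom is expected to last three years.

```python
def expansion_payoff(capex, build_years, boom_years,
                     margin_boom, margin_after, horizon_years):
    """Undiscounted cumulative payoff of a new fab (hypothetical model).

    capex         -- up-front cost, e.g. in $B
    build_years   -- years before the fab produces anything
    boom_years    -- years (from today) that high demand is expected to last
    margin_boom   -- yearly margin from the new capacity while the boom lasts
    margin_after  -- yearly margin from the new capacity after demand cools
    horizon_years -- how far ahead we look
    """
    total = -capex
    for year in range(1, horizon_years + 1):
        if year <= build_years:
            continue  # still under construction, no revenue
        total += margin_boom if year <= boom_years else margin_after
    return total


# Made-up numbers: $1B fab, five years to build, ten-year horizon.
short_boom = expansion_payoff(1.0, 5, 3, 0.5, 0.05, 10)   # boom ends before the fab opens
long_boom = expansion_payoff(1.0, 5, 10, 0.5, 0.05, 10)   # boom outlasts the build
print(f"boom lasts 3y: {short_boom:+.2f}B, boom lasts 10y: {long_boom:+.2f}B")
```

With these made-up inputs the short-boom case ends up negative and the long-boom case positive, so the whole decision hinges on how long you believe demand will hold.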
The thing is they tend to only do that when they can get a technological competitive advantage. The priority access gives them a locked in competitive edge, for a while. It’s not clear there is an opportunity like that in memory.
There’s a lot to criticize Sam Altman for saying or popularizing culturally but I’ve come to think his “this is the worst it will ever be” is, in the long run, actually a very intriguing and underrated point.
In a decade training LLMs to the current level of sophistication, which is in my opinion rather advanced and probably has lots of additional upside just from constructing better RL training regime independently of hardware advancement, will become just as table stakes as running a database is now. I highly recommend everyone look into the Allen Institute’s projects in GitHub and HF because they have open source training materials (including an LLM from scratch off common crawl, and some quite interesting tunes of qwen) to get a taste for what will be in the near future afternoon projects or educational material. The future is going to be wild
By the time everything matures, the current iteration of OpenAI and Anthropic will most likely be long gone, along with their current business models.
Posits do a little better if your numbers are biased enough toward 1, but not much better. A 16 bit posit in a near-ideal situation matches an 18 bit IEEE float, and in a pretty wide range of situations loses to either fp16 or bf16.
Training anything at 8 bits is going to be tough, and it's hard to say if the flexible exponent is worth the precision tradeoffs.
Unsure what you mean by this... A posit16 has up to 11 bits of precision. There's no such thing as an 18 bit IEEE float.
> and in a pretty wide range of situations loses to either fp16 or bf16
Many papers have compared neural networks at 16 bits or 8 bits, and posits beat the hell out of floats and it's not even close. Which is very much expected. As they're particularly suited to this task. But also in other domains, like numerical weather simulations, where tests have shown 16-bit posits can replace 32-bit floats.
Is this excluding the implied bit?
In that case a short float has 10, but if you're messing with formats you can staple on an extra bit of precision and an extra bit of exponent.
> There's no such thing as an 18 bit IEEE float.
There's a lot of custom sizes out there. But if you keep following IEEE rules then there's no special circuitry needed, just a small scaling factor.
nVidia also laid out a 19 bit format that's a superset of both fp16 and bf16.
> Many papers have compared neural networks at 16 bits or 8 bits, and posits beat the hell out of floats and it's not even close.
Can you link a paper that shows posits beating floats at different sizes?
I found a 2021 paper that compares various posits to 32 bit floats, and finds that the model quality is close for some of them. It does not compare any smaller floats.
> Which is very much expected. As they're particularly suited to this task.
Posits show their value when you need a huge exponent range and your numbers focus very closely around 1. How strongly do neural nets fit that pattern?
And how often is their advantage better than 1 or 2 bits?
If you can keep your weights within a range of 9 orders of magnitude, I expect fp16 to do just fine since it loses a bit on some numbers and gains a bit or two on other numbers.
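For anyone who wants to check the "9 orders of magnitude" and "implied bit" numbers, here is a small sketch that derives them from the bit layout. It assumes the usual IEEE-style rules (biased exponent, top exponent code reserved for inf/NaN, one implicit leading bit) and looks only at normal numbers, ignoring subnormals. The function name is my own.

```python
import math


def float_format_stats(exponent_bits, fraction_bits):
    """Range and precision of an IEEE-style binary float (normals only)."""
    bias = 2 ** (exponent_bits - 1) - 1
    max_exp = (2 ** exponent_bits - 2) - bias  # all-ones exponent is inf/NaN
    min_exp = 1 - bias                         # all-zeros exponent is zero/subnormal
    max_normal = (2 - 2.0 ** -fraction_bits) * 2.0 ** max_exp
    min_normal = 2.0 ** min_exp
    return {
        "precision_bits": fraction_bits + 1,   # stored fraction plus the implied bit
        "max_normal": max_normal,
        "min_normal": min_normal,
        "decades": math.log10(max_normal / min_normal),
    }


fp16 = float_format_stats(exponent_bits=5, fraction_bits=10)
bf16 = float_format_stats(exponent_bits=8, fraction_bits=7)
print("fp16:", fp16)
print("bf16:", bf16)
```

fp16 comes out at 11 bits of precision (10 stored plus the implied one) over roughly 9 decimal orders of magnitude of normal range, while bf16 trades that down to 8 bits of precision for a far wider range.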
> But also in other domains, like numerical weather simulations, where tests have shown 16-bit posits can replace 32-bit floats.
Can you link this too? I found a 2019 paper that shows them beating fp16 and falling short of fp64, but no fp32 comparison. They also noted that 16,0 posits and bf16 did badly.
They did conclude that 16 bit posits were probably good enough to beat out measurement error and be suitable for the bulk of simulation, but that same chart showed that fp16 was almost good enough. So again I wonder how many bits you'd actually need, since if you're considering rebuilding your FPUs it would be silly to exclude "float sizes that aren't powers of two".
The difficult question is more whether foreseeable memory demand will remain at the current level, grow even further, or shrink again.
Original source (paid): https://asia.nikkei.com/business/tech/semiconductors/memory-...
> No new DRAM fabs are being built.
See https://manufacturing.economictimes.indiatimes.com/news/hi-t...
Also, inference cost predictions were made before this price jump, so we really haven't started paying for it yet. Inference will not be getting cheaper.
The up-front investment of a memory fab is measured in billions, and it takes years to construct and get running. The margin on the chips themselves is terrible, so without scale it's not worth even trying. DDR5 is an industry standard that takes some effort to conform to, but the licence fees are a drop in the bucket compared to the cost of creating a fab.
The fabricators were cautious about increasing production, and slow to start planning. It takes further time to build up capacity, and if demand drops, they may end up producing DRAM at a loss when the market flips over to oversupply. The demand whiplash could kill any company that dared to bet on increasing production. See the "bullwhip effect" https://en.wikipedia.org/wiki/Bullwhip_effect which has killed semiconductor fabricators before.
There is a discussion to be had about how to maintain national semiconductor production in Europe and the US as a strategic industry, but historic attempts have all failed.
Also that's not what the bullwhip effect is - although I know what you are saying. The bullwhip scenario is about the effect of communication and batching through various layers in the supply chain, this is more similar to the cobweb effect/theory.
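Since the cobweb model came up: here is a toy version, under textbook assumptions that are mine and not anything measured from the DRAM market. Demand is linear in the current price (a - b*P), and producers set this period's supply from last period's price (c + d*P_prev), which stands in for the long fab lead time. All parameter values are hypothetical.

```python
def cobweb_prices(a, b, c, d, p0, periods):
    """Price path of a linear cobweb model.

    Demand:  Qd_t = a - b * P_t
    Supply:  Qs_t = c + d * P_(t-1)   (capacity decided on last period's price)
    Market clearing each period gives P_t = (a - c - d * P_(t-1)) / b.
    """
    prices = [p0]
    for _ in range(periods):
        supply = c + d * prices[-1]      # capacity committed at the old price
        prices.append((a - supply) / b)  # price that clears that supply
    return prices


# Supply less price-sensitive than demand (d < b): swings die out.
damped = cobweb_prices(a=100, b=1.0, c=10, d=0.8, p0=80, periods=20)
# Supply more price-sensitive than demand (d > b): swings grow.
explosive = cobweb_prices(a=100, b=1.0, c=10, d=1.2, p0=50, periods=5)
print([round(p, 1) for p in damped[:6]])
print([round(p, 1) for p in explosive])
```

In both runs the price overshoots and undershoots the equilibrium (a - c) / (b + d) in alternating periods. Whether the swings shrink or grow depends only on d / b, and the explosive case eventually produces nonsense prices, which is a limit of the toy model rather than a prediction.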
If it was just variable costs and new capacity was available today they’d do it. But there are substantial fixed costs and delays to increasing capacity, and that uncertainty makes it risky.
The current RAM manufacturers were convicted of conspiracy to manipulate prices back in the 2000s or thereabouts. Doing so is their modus operandi, but this time the government is participating in the racket.
BTW all RAM is severely overpriced, not only the one using the latest process nodes.
Look up Qimonda, ProMOS, Elpida. They invested in capacity during booms.
Now it's 2021 and someone gets a tanker stuck in the Suez, sending the price of oil sky-high. How long does the ship have to be stuck before you spend those billions of dollars on a bet that it'll recoup before someone gets the ship out?
It's always easier to see the right move in hindsight!
Question is, without hindsight, 2022 rolls around, Ethereum moves to PoS, do you sell NVDA?
In a world where TSMC is functionally capable of the same level of production but not in such a complicated geopolitical situation regarding semiconductor manufacturing, things would be quite different.
It's the main reason outsourcing fabs is so much more economical. If NVIDIA built fabs just for itself, the fab's CAPEX would be amortized over fewer components than if a third party did, even if NVIDIA was the largest customer. It's also one of the main reasons Intel fell behind. So much of their cashflow went to building fabs that made an order of magnitude fewer chips than TSMC's. Even worse, they had to write down the CAPEX for the fabs, which affected their financial statements.
Anyways, companies like Apple and Nvidia have very long-term horizons and contracts, which probably include right-of-first-refusal clauses on capacity, etc. In the short to medium term, Apple probably isn't paying much more for most components. If this memory shortage lasts decades, they'll eventually end up paying more.
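The amortization point is just division, but it helps to see the order-of-magnitude effect. A minimal sketch with made-up numbers (the $20B fab cost, the five-year life, and both volumes are assumptions for illustration, not real figures for any company):

```python
def capex_per_chip(fab_capex, chips_per_year, useful_life_years):
    """Fab cost spread evenly over every chip made during its useful life."""
    return fab_capex / (chips_per_year * useful_life_years)


FAB_CAPEX = 20_000_000_000  # hypothetical $20B fab
LIFE_YEARS = 5              # hypothetical useful life

captive = capex_per_chip(FAB_CAPEX, 10_000_000, LIFE_YEARS)    # single in-house customer
foundry = capex_per_chip(FAB_CAPEX, 100_000_000, LIFE_YEARS)   # many customers, 10x volume
print(f"captive fab: ${captive:.0f}/chip, shared foundry: ${foundry:.0f}/chip")
```

Same fab, ten times the volume, one tenth the CAPEX burden per chip. That gap exists before anyone has paid for wafers, masks, or labor.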
You can get 100x the output with the same energy use.
It would be economically unviable to run the older ones if the supply of newer ones were unconstrained, but that’s not the world we live in.
Edit: It looks like they replaced INT8 and INT4 with FP8 and FP4, with the same speedups of 2x and 4x relative to FP16. That's an improvement but not that big of an improvement.
Maybe the capabilities of newer GPUs allow AWS to charge higher margins for them? I don't actually know.
Likewise it's probably dwarfed by improvements in how we make dram - continuing the roughly exponential (maybe a bit less recently) scaling of chips - but not necessarily.
The 2x from returning to previous costs is interesting because it's practically guaranteed, and it's on top of everything else. We're just currently "overpaying" (relative to the stable market price) for the manufacture of dram because of a sudden increase in demand.
> this is just not true at all, there are massive leaps from algorithms, data, etc. every year. scale is one axis of many and you need to get them all correct.
Or the more likely scenario that the AI bubble bursts and the hyperscalers realize they have built too many data centers.
Really?
How long do we have to wait until that ... cost reduction hits us?
Safe to say at least a year or two. It'd be shocking if it took a decade.
It just takes that long to get a fab up and running.
Bought an extra one by accident, paid $218.99 March 2025
Goes for $1400 now. I haven't gotten around to selling it.
That is to say, at least you were able to buy them at $350 today; on the current trajectory there will be no supply at all in a few months.
Optane is a technology I’m still mad never became mainstream. It would be particularly useful today when trying to run local models.
[0] - https://www.45drives.com
Looking at the current prices, even of the same RAM, is just insane. Those companies really need to pay us compensation here. The whole "free market" notion does not work when you have de facto monopolies and mega-corporations abusing the average Joe and average Jane.
That is not normal pricing. Before the pricing collapse in early 2020, 96GB of DDR5 would have cost about $450 to $500. And I will restate that the cost of DRAM hasn't really changed much in the past 20 years. Its price just goes up and down in cycles.
So in reality it is more like going from $500 to $1300. But to consumers it felt more like going from $200 to $1300.
Crucial is already selling DRAM made by CXMT, and China is already throwing money at it. I doubt the memory bubble will burst in the next 12-24 months, in the sense of going back to money-losing DRAM pricing, as they will all pivot to HBM or other money-making products. But the bulk of lower-end consumer DDR5 or LPDDR5 will go to Chinese foundries, assuming they have figured out how to make them well. They have improved but are still far behind the industry leaders.
Normally memory makers would push the next DDR standard to market just to push out Chinese competitors. I am not sure it will work the same way this time around; DDR5 has plenty of other uses and demand.
Historically the price has always trended downward. When I first got into computing $200 could buy you 128 MB (yes M) of ram. Really nice systems had 512 MB.
That's obviously changed over the decades as process shrinks have led to higher memory density. We should generally expect that RAM will get cheaper up until the point where process shrinks stop happening. They've definitely slowed, but they haven't stopped.
I am keeping a piece of paper that came with my Tex Murphy game which stated that one could get 32MB of RAM for as little as $700 (1990s dollars) which would drastically improve the game!
Yes, if you look across 40 years. But the floor for the DRAM spot price was ~$2/GB in 2008, and it touched that 2-3 times over the next 15 years. It wasn't until the early 2020s that it broke through that floor into the $1 range.
Process shrinks happen, but the majority of a DRAM part can't be shrunk by process any more.
Crucial was disestablished this year.
i compensate by never paying for AI
I don't see it going away. I mean, it may not grow as fast as now, but I don't see it going away either. I get why the memory makers do not want to bankrupt themselves, but it feels like there's got to be some way to push that risk off onto model providers and other people in the ecosystem to allow us to grow RAM capacity more like 50% per year.
I don't actually know what the rate of growth before October was, I'm sure someone round here will though.
As for 20-25% growth not being enough, I think it's not that far off, if we assume data center build out plans hit a wall and slow down significantly, and the AI heat starts to cool off.
20-25% may not be enough in the short term, but if the AI build-out stops within this year, we will have a massive oversupply instead of an undersupply.
Let me explain. Imagine CXMT grows massive and builds a lot of fabs, so much so that it becomes the leader in multiple segments, and then market demand cools off.
Then CXMT, the company that invested massively, has oversupply, so it undercuts every other memory company.
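To make the 20-25% vs 50% growth rates concrete, here is the compound-growth arithmetic. It is a sketch only: the rates are the ones floated in this thread, and the assumption that capacity compounds smoothly every year is mine.

```python
import math


def years_to_multiply(growth_rate, factor=2.0):
    """Years of compound growth at `growth_rate` needed to grow by `factor`."""
    return math.log(factor) / math.log(1.0 + growth_rate)


for rate in (0.20, 0.25, 0.50):
    print(f"{rate:.0%}/yr -> capacity doubles in {years_to_multiply(rate):.1f} years")
```

At 20-25% per year, capacity takes a bit over three to nearly four years to double; at 50% it takes under two. That is the gap between waiting out a shortage and building through it, and also the gap that turns into oversupply if demand stalls.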
Aka, Samsung, SK Hynix are dead, and to protect Micron now US has 10000% tariff on the supply of memory.
Imagine. Because that has happened, if you don't play the boom and bust game someone will because the market is very large during a boom, and generally the player scaling more isn't the one with margins to protect and generally has the ability to undercut others.
Asian memory chip giants were made by undercutting European and American companies. American companies adapted by moving manufacturing to Asia, and European ones got bought for pennies or dissolved.
But can massive gains still be made? Definitely.
The entire AI hype is based on the paper Attention Is All You Need, and attention basically means loading a huge matrix of all the tokens into memory. How well you can optimize this attention layer is basically how most architectures try to solve for performance and memory usage.
The only one with significant gains here is DeepSeek (or so I would like to believe, because others don't make their work open for folks like me, outside the big AI labs, to read). Their MLA architecture reduced KV-cache memory requirements by up to 90%, and of course that's a purely architectural change.
With some quantization, like TurboQuant from Google, you could push it down to ~1/3 of that. So 96% memory savings when talking about the KV cache.
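The stacked-savings arithmetic, spelled out. The 90% and ~1/3 figures are the ones quoted above, taken at face value; the helper name is mine. The point is that the reductions multiply rather than add.

```python
def remaining_fraction(*reductions):
    """Fraction of memory left after applying each fractional reduction in turn."""
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r
    return remaining


mla_reduction = 0.90          # "up to 90%" from the architectural change
quant_reduction = 1 - 1 / 3   # quantization leaves ~1/3 of what is left

left = remaining_fraction(mla_reduction, quant_reduction)
print(f"KV cache left: {left:.1%}, total savings: {1 - left:.1%}")
```

That works out to about 3.3% of the original KV cache, so savings of roughly 96.7% in the best case, in line with the ballpark figure above.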
But the models are close to being saturated for quantization based memory optimizations. We will have to see some architectural changes for a significant shift now.
24b param models today are way more powerful than 24b param models 2 years ago.
We just haven’t reached the diminishing return of gen AI capabilities yet.
Models will get more useful if you have higher context size or higher param size. Then people will just use the models even more, leading to even more memory demand.
They are drowning in money but they don't invest in new production in order to maintain high prices. By doing so, they form a virtual trust with monopoly control over pricing. What you call "risk" for them is our best hope, China can't enter the market soon enough.
Oops, the US government is blocking the Chinese chip industry in every way possible and thus becomes a de facto member of the aforementioned anti-competitive and anti-consumer trust.
Micron doesn't make RAM for the consumer market, they serve corporations only. That's been the case for about 1.5 years now.
> and US did the same against Japan in the past
And the USSR self-isolated from China like 20-30 years before they... disappeared.
I got my RAM rma'ed 6 months ago, yes it was intense
What if it's in everyone's interest to buy computers at, say, 1/3rd the rate and switch everything over to HBM?
The discrepancy between compute and memory has been growing for ages. Perhaps a painful switch to HBM is exactly what we need?
Would you rather have 3 intermediate computers with low memory bandwidth, or wait a little longer statistically so that we can all enjoy a new computer at 1/3rd the rate but much higher bandwidth than the area ratio?
As always, some interpret certain recent events as reason to conclude "but this time it's different." Occasionally they are correct. But that doesn't change the fact that it's reasonable to assume some of the recent extreme, rapid price inflation is due to shorter term market distortion. It's also pretty clear that some of the recent increase in demand represents a stable increase in the long-term trendline. The question is how much is long-term stable and how much is short-term distortion.
People used to get into gaming pcs as an affordable hobby, now it’s making general aviation look like plan B.
The only hope left is really Apple, but even Apple has conspicuously delayed the launch of the M5-gen Mac mini and Mac Studio, mostly because even Apple can't source enough DRAM to fully supply all its product lines.
You don't even have to drop down to old indie games. You just have to turn off the FPS counter and stop pixel peeping screenshots.
You can still play fantastic games with amazing gameplay, great storytelling, and even requiring quite a GPU. But you won't upgrade your GPU or RAM. If it gets broken, people have already gotten their money back instead of a replacement (whether that is legal or not depends on your jurisdiction, and regardless: it is happening). So the demand for and adoption of, say, 240 Hz 4k OLED gaming is going to slow. I currently sport two 1440p IPS monitors capable of 144 Hz, with an AMD 6700 XT, 64 GB DDR4, and a 5700X3D. I'll wait on upgrading that to a 4k rig.
What I will do is buy a Nintendo Switch 2 before the price increase hits. Why? Great gameplay for kids.
Prices haven't risen THAT much and are quite affordable. And if you look at the improved quality of upscalers (DLSS 4.5 for example), gaming is now more affordable than ever, despite the increased cost of components.
Of course, the 5090 prices are insane, as are for SOME memory models, but that's nothing new and represents a fairly small market share.
> When I started building gaming pcs, the top top card was 750$ (NZD)
When I started building gaming PCs, the top $700 cards didn't even provide comfortable performance or graphics. Back then, you were supposed to have several of this connected SLI or somethin. And even then, it wasn't always reliable, and it resulted in stuttering, lag, and graphical artifacts (in the cases where it worked). Today, even $700 graphics cards are a much better product from a user perspective than the high-end cards of that time (and that's not even taking into account that $700 back then was worth much more than it is now).
As for how much the prices have actually risen, it’s not hard to see if this is true or not. If doubling of prices doesn’t raise your eyebrows, I’m not sure what will.
When would this have been? I cannot remember a time this was accurate for the games of the time, outside of a handful of meme titles like the original Crysis that made bad hardware bets. Most cards fulfilled the needs of the software and hardware of the time. I'd say the biggest issue was that for a time, software and hardware were advancing so rapidly that you wouldn't get very long out of your hardware, but that's just the reality of rapid development and not the fault or failure of any specific hardware release.
> Back then, you were supposed to have several of this connected SLI or somethin.
SLI was aimed squarely at enthusiasts, not at joe-average PC gamer and it was certainly never a requirement. It existed as a halo feature for people chasing maximum performance, benchmark scores, and bragging rights.
I just don't see the cost savings of sharing a GPU overcoming the extra expense + profit such a service would need.
Of course, less latency is always better, although a traceroute from my IP to a major city (Sydney) 1,500 km away shows about 11 ms of latency with optimal routing. (Real-life test, traceroute via an ISP Looking Glass.)
Not if Nvidia is running the service.
Seems quite possible to me that Nvidia sells to the public just enough graphics cards to keep any frisky antitrust investigators off its back and reserves the rest for GeForce NOW, its "pay monthly for limited access to a remote gaming PC" service. The cards for NOW are billed to the BU running NOW at or below cost, the few cards available to consumers and System Integrators naturally have a huge markup due to extremely constrained supply, and Nvidia uses the fact that they are the thing behind the LLM Boom to ensure that they have -what a System Integrator in 2022 would recognize as- a reasonable price for just enough RAM for the computers that NOW rents access to.
Downvoters: notice the speculative nature of the previous paragraph. I'm not claiming that this is happening right now. I'm claiming that it's quite possibly more profitable for Nvidia to bill monthly for limited remote access to computers with Nvidia graphics cards in them than it is to sell those cards at retail and to SIs.
It's just far more likely that these GPUs actually do cost a ton to make right now.
No, only Nvidia makes and sells Nvidia GPUs. They're the sole supplier of the GPUs used in 95% of the graphics cards sold in the US.
> If both AMD and nVidia teamed up, it would leave a gap that either intel or some Chinese startup would jump on.
Fascinating.
a) Explain why the only even vaguely-recent cheap video cards were made by Intel, and why it looks like Intel has pretty much stopped making video cards? [0]
b) Tell me how that Chinese startup gets past USian Sinophobic/protectionist trade barriers?
c) Tell me how that Chinese startup convinces the big gaming development houses to ignore the advice of Nvidia's driver engineering team that just so happens to make their games work great on the hardware in NOW and really, really poorly on that unknown-to-US-customers Chinese startup?
> It's just far more likely that these GPUs actually do cost a ton to make...
You seem to have not been paying much attention to the reports of Nvidia, AMD, and major RAM and storage suppliers changing focus from the consumer market to the far more profitable datacenter (read as "LLM") market. Several such suppliers have exited the consumer space entirely. As any residential renter in San Francisco [1] can tell you, extremely limited supply drives price up to obscene levels.
[0] This shift in Intel's focus may or may not be related to Nvidia becoming the third- or fourth-largest Intel shareholder.
[1] ...or any other "hot" market with large, artificial barriers to entry...
Nvidia happens to have been the best option for a long time. But there are many alternatives: game consoles, for example, aren't particularly tied to the Nvidia/AMD market, and the ARM space offers tons of options. Apple makes a powerful GPU for their MacBooks that isn't dependent on either of the major two.
Valve, Sony, and Nintendo are in a good position to move away from AMD in the future if they aren't providing competitively priced GPUs. Valve has been working on an x86 emulator for ARM for their Steam Frame which would pave the way towards PC games running on ARM chips.
This whole situation is largely like this because demand for hardware spiked rapidly. Processes and production take a long time to change, and no one knows if these prices are long term or if it's going to crash back to normal in a year. If the elevated prices remain for the future, competitors will move in. But they aren't going to develop new products and production in the case where it all crashes back to normal and Nvidia continues selling affordable GPUs to gamers.
I just don't see any scenario where Nvidia remains the only option while also not selling their GPUs to consumers and requiring them to rent them. By the time that happens the competition would have crushed them.
There's no hostility. I'm of the opinion that you're ignorant of the wider political and economic factors that have led to us being in the situation under discussion. I know it's uncommon for the younger generations to believe that one can say "You're either ignorant or willfully blinding yourself to the entirety of the situation." as a statement of plain fact rather than an insult, but everyone would be better off if they'd permanently load that possibility into their brains.
Regardless, there's nothing two Internet Nobodies can say or do that will have any meaningful effect on the situation under discussion... so I guess we'll wait and see if -in five or ten years- "market forces" have made it so the overwhelming majority of "P"Cs are Chromebook-esque thin clients that are pretty much exclusively used to access subscription -or ad-laden- SAASes.
Can’t afford a computer because they bought up all the supply? They’ll conveniently sell it back to you with a subscription!
You’ll own nothing and be happy.
They've intentionally crafted an unsustainable business model in an effort to get users in the front door and raise their MAUs. We've seen this story before. We should know precisely where it's headed.
Sorry that “it is going unused”? From what I've read, most AI providers are capacity constrained.
However, that the hyperscalers and AI companies aren't doing this says a lot about their true beliefs about how much future demand AI will have.
AI companies claim they will need a ton of massive expansion, but are unwilling to take on the risk of the capital needed for that expansion.
I'm hearing a lot of sad whining from AI folks about how these chip makers are holding them back, but who actually has the money to finance the expansion easily? Chip makers have been through this game far longer. When Sam Altman went around claiming it was time for $7T of fabs, the AI companies made it clear that they were willing to make ridiculous claims, which eliminated their credibility.
What's needed now is for them to funnel a tiny amount of their massive piles of cash into financing fabs directly.
With what money? They have to spend the money they get on hardware ASAP else they are left behind.
Just look at how Intel has struggled to compete in recent years, and they have been in the business for decades.
They forgot Moore's main lesson: only the paranoid survive. They thought they could coast, and it nearly killed them.
"Only the Paranoid Survive" is rather a quote and book title by Andrew S. Grove.
Most users don't seem to care about storing everything they generate in cloud services and this could easily be sold as an alternative to owning "expensive" desktop or laptop hardware.
If hyperscalers are using more RAM, and that RAM is not available for consumers, it means all the heavy stuff will happen in the cloud. Why would we want both the hyperscalers and consumers to have RAM simultaneously? Consumers would want more RAM to run local models, but then the hyperscalers' capacity would go unused.
Most memory companies have backroom deals to exchange tit-for-tat patent violations against each other.
Not sure how a new memory manufacturer comes into being without getting sunk by licensing costs?
And by doing this, they ensure local LLMs never become feasible for the vast majority of people and AI companies solidify subscriptions forever.
The reason memory prices can stay high for years in this mega cycle is that the 3 players will be very cautious about overbuilding. They’d rather underbuild, make great profit (not maximum) and reduce the risk of going bust if this suddenly ends.
Same for TSMC in chips.
Great opportunity for Chinese companies though. This shortage is exactly what Chinese companies need to scale.
Then why do only 3 companies make it?
When Samsung had to sell memory at a loss after COVID, no one came to save them. They buffered their memory division using profits from their other businesses. That’s how Samsung survives memory downturns.
According to some stories, this is how Samsung convinced TSMC to not enter the memory business - that you need a nation or other lines of business to prevent bankruptcies.
The market has stabilized to 3 players.
Because it's an incredibly capital intensive process, involving billions of dollars of investment into manufacturing infrastructure.
That is to say, making memory is quite hard.
Other examples from outside of tech of easy but capital intensive processes are power generation and railroads. Very easy to do, but easy to end up broken by overbuilding for demand that fails to materialize or stay stable for the duration of your financing.
I didn’t say owning a memory business is easy.
Placing the bet isn't as hard as making an accurate prediction.
These two aren't related.
DRAM is a commodity because you can replace a chip from Hynix with a chip from Micron; they have the same behaviour.
And price-competitive DRAM isn't easy to manufacture, or China would have made it already.
Exactly, so what’s the incentive for anyone to sink half a billy into building out more capacity?
The existing players get to rest on their laurels and succeed whether or not the AI bubble busts.
Samsung, SK Hynix, and Micron all have to balance between capex spending, making as much profit as possible, and risk of bankruptcy.
Heck, the US is now pressuring ASML to not sell even DUV machines to China, period.
Right now their opportunity cost is too high.
> risky it is to spin up a new fab
You don't need a new fab. You can build memory in a 20-year-old fab.
This boom is orders of magnitude bigger than before. The attention will be endless.
I’m sure Nvidia, Elon, Tim Cook, OpenAI, Anthropic are already whispering in Trump’s ears to do something.
Journalism is dead. There will be no more investigative journalists like the type you describe.
You can't expect me to believe that any of those would want any kind of antitrust action against anybody.
Memory prices and shortages directly impact all of their profit margins and revenue.
Memory is a commodity, so I think you will be very lonely in your quest.
The VRAM in the 5090 is only made by one country in the world.
The 50xx series is special, because its ram is so dependent on a single commodity. It’s not like a 4090 or a 3090; their VRAM chips have been around for years.
If there’s a shortage or interruption in GDDR7 VRAM, it seems like every GPU that requires it would explode in value.
I hope I don’t regret posting this because I’d really like to buy one myself…
I really need to shut up, or bite the bullet and buy one.
If you graph the tokens per second on the 5090, your jaw will hit the floor at how cheap it is
What's the use case here? Churning out massive amounts of slop code through autonomous agents? Running OpenClaw 24/7? I think the proliferation of Codex and Claude Code, compared to any of the cheaper open models, suggests that at least for most software development, the 50-75% discount of open models isn't worth the hassle of the decreased intelligence.
My use case would primarily be in search, integration, and indexing other software projects with my own, as well as transcription/indexing of interesting video and audio content (eg Dwarkesh interviews) that I don’t have time to watch but want to easily search and apply to my projects, and search/indexing for useful information from things like Linux kernel and security mailing lists. Basically there is a lot of stuff that, if the cost were low enough, I would point a reasonably intelligent AI at to distill out useful information and apply it to my projects, or just cherry pick the interesting things out and surface them to me so I don’t have to wade through all the mundane stuff and man-made slop getting in the way.
All of that feels like something that a $20 ChatGPT Plus subscription is for, maybe with slightly better tool use capabilities. There's no way that a $4000 purchase on a GPU would ever be worth it if all you're doing is running a handful of queries per day.
It’s ok if you don’t want to do the same kind of thing but I find it weird how dismissive so many people get about wanting to use LLMs for large projects, or how anybody who says they’re using them for these kinds of things (I’m doing similar for other stuff) gets challenged on what they’re doing it for.
Free for approximately 8 hours (assuming perfect weather conditions) and excluding unit cost and maintenance cost.
It has a cost.
I don't use it for coding, I have $20 Gemini, $20 codex, etc.
But then I got the framework board for $1700, now it's $2700
(Intuitively, that's because the issue of whether any active weights are being shared among requests - thus, any memory throughput is being reused - is a generalized birthday problem. That's why even having a few parallel requests is quite effective. Especially since the "random" choice of experts happens anew at any single layer, so there's a lot of independent samples.)
For prefill, it's really easy to batch MoE and get really good tk/s, even on a single stream.
For decode, you will run into the problem that:
1) you need more parallel requests which means more memory for context
2) 5 requests will not give you very much expert overlap on parallel requests
I'm not sure what you are claiming. Decode is bottlenecked by memory bandwidth. To see a speedup of 2x, you have to ensure each expert weight memory fetch can be used by 2 parallel streams. What exactly is the average factor you are claiming for 5x parallel streams (due to "birthday paradox" factors)? The birthday paradox isn't really relevant here. It's about coverage, not parallelism.
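For what it's worth, the "average factor" being asked about can be estimated with a back-of-the-envelope sketch. This assumes routing is uniform and independent per token and per layer (real routers are neither), and the 256-expert, top-8 configuration is only an illustrative assumption, not a claim about any specific model. The function names are made up for this example.

```python
def expected_distinct_experts(num_experts: int, top_k: int, batch: int) -> float:
    """Expected number of distinct experts hit in one MoE layer by `batch` tokens.

    Assumes each token picks `top_k` distinct experts uniformly at random,
    independently of the other tokens (an idealization of real routing).
    """
    # Probability that one particular expert is NOT picked by one token.
    p_missed_by_one_token = 1.0 - top_k / num_experts
    # Probability it is missed by every token, then complement and sum over experts.
    return num_experts * (1.0 - p_missed_by_one_token ** batch)


def weight_reuse_factor(num_experts: int, top_k: int, batch: int) -> float:
    """Expert activations served per expert weight fetch (1.0 means no sharing)."""
    activations = batch * top_k
    return activations / expected_distinct_experts(num_experts, top_k, batch)


if __name__ == "__main__":
    # Hypothetical MoE layer: 256 routed experts, 8 active per token.
    for batch in (1, 2, 5, 32, 128):
        print(batch, round(weight_reuse_factor(256, 8, batch), 2))
```

Under those assumptions, 5 parallel streams come out at a reuse factor of roughly 1.06, i.e. almost every expert fetch still serves a single stream. Shared or always-on experts and attention weights are reused by every stream regardless and are not modeled here.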
> Memory for context is an issue, but recent models like DeepSeek V4 use very little of it even at relatively large contexts.
This is not true.
If it's 4k instead of 2k MSRP, that's a 100% increase.
The RTX 5090 is faster than an H200. It just has less RAM (32 GB vs 141 GB), doesn't have NVLink, and technically isn't allowed to be used in a datacenter.
The datacenter GPUs sell at an 80% margin. They're incredibly overpriced. But the laws of supply and demand are undefeated and so here we all are.
H200 has HBM and much more 64-bit compute
RTX 5090 has more CUDA cores that run at a higher clock speed. H200 has more RAM and significantly more RAM bandwidth.
Which one is net faster depends on your use case. But you may be very surprised that many workflows are faster on an RTX 5090!
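A rough way to see the bandwidth side of that trade-off: single-stream decode has to read every active weight once per token, so memory bandwidth sets a ceiling on tokens per second. The sketch below assumes vendor-quoted peak bandwidths (about 1,792 GB/s for the RTX 5090 and 4.8 TB/s for the H200; check the current spec sheets) and a hypothetical 8B-parameter dense model with 8-bit weights. It ignores KV-cache traffic and compute, which is exactly where the 5090's clocks can win.

```python
def decode_tokens_per_sec_ceiling(bandwidth_gb_s: float,
                                  active_params_billions: float,
                                  bytes_per_param: float) -> float:
    """Upper bound on single-stream decode speed when memory bandwidth is the limit.

    Assumes every active weight is read exactly once per generated token and
    nothing else (KV cache, activations) consumes bandwidth.
    """
    gb_read_per_token = active_params_billions * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# Assumed peak memory bandwidths in GB/s, taken from vendor spec sheets.
ASSUMED_BANDWIDTH_GB_S = {"RTX 5090": 1792.0, "H200": 4800.0}

if __name__ == "__main__":
    for name, bandwidth in ASSUMED_BANDWIDTH_GB_S.items():
        # Hypothetical 8B dense model, 1 byte per parameter (8-bit weights).
        ceiling = decode_tokens_per_sec_ceiling(bandwidth, 8.0, 1.0)
        print(f"{name}: at most ~{ceiling:.0f} tokens/s per stream")
```

For bandwidth-bound decode the H200 should come out ahead by roughly the bandwidth ratio; workloads that fit in 32 GB and are limited by compute rather than memory traffic are the ones where the 5090 can be faster.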
Also had to do an Intel build, and there was no way we were going CUDIMM at current prices. =3
A US soldier I know commented that the Iranian AI slop is "scary and powerful".
No doubt Cloud Gaming is in the cards for the future; only purists like myself with an RTX 5090 will pay a premium for offline gaming.
Once enough gaming compute runs at the edge it also allows for more technically advanced games than would currently be economically feasible (but aren’t made mostly for lack of a market/adoption of cloud gaming and the resulting lack of technical know-how). So I think it will stick and probably end up winning over the holdouts, once the cost of rendering the games they want to play with consumer hardware becomes too large to stomach.
NVIDIA in their recent quarterly report stopped categorizing "GeForce" as a single category, and merged it into "Edge-Computing".
If you are a PC Gamer or PC Enthusiast as I am, then we have some dark times ahead.
Or, we could be fucked.
"Order yours now, for just $99.99 per month, hardware included! Order today, and you will get three months of 'Office Suite' for free, with a small additional cost of $49.99 after month 4. On a tight budget? Switch to the yearly subscription, and pay comfortably in 18 installments."
Why were tech-savvy investors unable to figure this out when the datacenter craze had already started?
How do we explain memory lagging behind the quickly rising demand for every other datacenter component?
https://davidoks.blog/p/ai-is-killing-the-cheap-smartphone
Maybe long-term purchase agreements from big buyers might have helped convince them it's okay to build, but apparently it didn't happen.
The entire sector is now facing a critical RAM starvation crisis where memory manufacturers are actively slow-rolling supply just to keep prices high and avoid running out entirely.
This has created an unprecedented supply-and-demand distortion where desperate companies are getting rejected even at a 5x markup, and mission-critical SKUs are skyrocketing to 10x and 20x their baseline value.
It is a macroeconomic squeeze at a staggering scale, and the massive venture scale opportunity lies in capturing the value created by this memory gatekeeper.
From the perspective of an armchair economist, the winners will be the investors who invest in RAM wisely. The losers will likely be cash strapped SAAS companies. They’re almost completely dependent on a fleet of servers in the hyperscalers, and they’re leasing those servers and services. That leaves small SAAS companies exposed to incoming inflation in the cost of hosting.
Which they will pass on to their customers. If their product provides enough value the customers will pay.....
A lot of capex is supposed to go into the datacentres; didn't they know that datacentres need to be filled with RAM, among other things? I wonder if at some point we will discover that there is a shortage of fibre optic cables or SFPs ...
PS: Obviously armchair economist here too ... but to me it doesn't seem too difficult to have foreseen the increase in demand.
AI growth is locked in now, only if it were to stop will demand be abated.
As long as the discussion seems focused on memory, I'd suspect the latter, but if it's really the semiconductor boules/wafers, then I'd expect the boule growers to profit, not the memory makers, who just pass on the cost.
So which is it?
DRAM is just extremely specialised.
I asked for evidence, and different people keep feeding me opposite stories: one insists it's not fab capacity but wafer competition, with a recent article claiming HBM3E takes 3 times as much wafer area per bit as LPDDR5X. Others tell me the complete opposite: it's fab capacity, not a wafer shortage.
Do we have citable references to ground either set of claims?
From your sibling comment, I think you're interpreting the 3x HBM stat as contributing to making wafers scarce. It's more that the next wafer to be processed in a fab is especially precious, making the opportunity cost larger. The beach sand remains plentiful.
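To put a number on that opportunity cost, here is a small sketch that takes the 3x area-per-bit figure from the article mentioned upthread at face value (it is an unverified claim, not a measured value) and asks how total bit output changes as a fab shifts wafer starts to HBM. The wafer shares below are hypothetical, and yield, stacking, and packaging losses are ignored.

```python
def relative_bit_output(hbm_wafer_share: float,
                        hbm_area_per_bit_ratio: float = 3.0) -> float:
    """Total DRAM bits produced, relative to running every wafer as conventional DRAM.

    hbm_wafer_share: fraction of wafer starts moved to HBM (0.0 to 1.0).
    hbm_area_per_bit_ratio: wafer area HBM needs per bit versus the conventional
        part; 3.0 is the unverified figure quoted in the thread.
    """
    if not 0.0 <= hbm_wafer_share <= 1.0:
        raise ValueError("hbm_wafer_share must be between 0 and 1")
    conventional_bits = 1.0 - hbm_wafer_share
    hbm_bits = hbm_wafer_share / hbm_area_per_bit_ratio
    return conventional_bits + hbm_bits


if __name__ == "__main__":
    for share in (0.0, 0.1, 0.3, 0.5):
        total = relative_bit_output(share)
        print(f"{share:.0%} of wafers to HBM -> {total:.0%} of baseline bits")
```

With the same number of wafers and the same fab, moving 30% of wafer starts to HBM would cut total bits shipped to about 80% of baseline under these assumptions, which is a squeeze on supply even if raw silicon is plentiful.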
So which is the bottleneck: fabs or boule growing?
Also consider how most solar panels are monocrystalline silicon. How credible is a silicon wafer shortage ... really? There is so much disinformation in this market...
Surely they need GPU capacity and would need memory for those GPUs but OpenAI doesn't build GPUs or any hardware, right? So did they pay to keep the supply locked up, or do they have the ability to put that ram into use?
I only feel sorrow for the Electron devs, they will have a hard time.
SeedLM from Apple is an interesting approach for inference memory efficiency. I'd like to see someone try and build that into training so that it's not a post training compression step.
Doesn't matter anyway when things are not reasonably priced. I have been stuck at the same memory capacity in my personal system for the better part of two decades, partially due to the above and the current pricing.
If you made it 10x cheaper right now you would see a truly unimaginable wave of bot slop.
we are going to have amazing cheap used hardware for a decade
WallStreetBets has been disturbingly accurate in its predictions - basically anything related to AI.
Memory squeeze will get worse before it gets better.
As you stated it, it would merely be a property of (nearly) all demand curves. Jevons paradox only happens sometimes. It isn't a law.
Generally when someone replaces their vehicle, the new one is more fuel efficient than the old one, even if they buy the same model.