It’s still not that common to have one die. Servers use ECC memory, which can keep you running longer even as a DIMM starts to fail. Server RAM is also rarely run at the speeds consumer RAM is run at.
The thing with server memory is that once you get to the point where you have hundreds of servers, each with 12 to 24 DIMMs, the chance that you have a bad one somewhere goes up.
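
To put rough numbers on the fleet-size point, here is a minimal sketch of the arithmetic. It assumes DIMMs fail independently and uses a made-up per-DIMM probability (`p_single = 0.001`) purely for illustration; it is not a measured failure rate, and the fleet sizes are hypothetical.

```python
def p_any_bad(n_dimms: int, p_single: float) -> float:
    """Probability that at least one of n_dimms DIMMs is bad,
    assuming each is bad independently with probability p_single."""
    return 1.0 - (1.0 - p_single) ** n_dimms


# ASSUMPTION: illustrative per-DIMM probability, not real-world data.
p_single = 0.001

# Hypothetical setups: one desktop, a small fleet, a larger fleet.
for servers, dimms_per_server in [(1, 4), (100, 12), (500, 24)]:
    n = servers * dimms_per_server
    print(f"{n:>6} DIMMs: {p_any_bad(n, p_single):.1%} chance of at least one bad DIMM")
```

Any single DIMM is still unlikely to be bad; the fleet-wide chance climbs only because the same small probability is applied across thousands of modules.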

It isn’t magic, but RAM doesn’t tend to just stop working. You’ll start seeing error correction events reported on a bad DIMM before it fails outright, which usually means it’s replacement time.
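
If you want to see those events yourself, here is a rough sketch that assumes a Linux host with an EDAC driver loaded, where per-memory-controller corrected and uncorrected error counters show up under `/sys/devices/system/edac/mc`. Whether those files exist depends on the platform and kernel, and many servers report the same events through the BMC instead. The helper name `read_edac_counts` is my own, not part of any tool.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters.

    ASSUMPTION: an EDAC driver is loaded and exposes ce_count / ue_count
    for each memory controller. Controllers without readable counters
    are skipped.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # counter missing or unreadable on this platform
        counts[mc.name] = (ce, ue)
    return counts


# Prints nothing if the host exposes no EDAC counters.
for name, (ce, ue) in read_edac_counts().items():
    print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected count that keeps climbing on one controller is the early warning described above; how many events justify a swap is a judgment call or a vendor policy, not something this sketch decides.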