Big infrastructure crashes…
Outsourcing, cost cutting, lack of *regular* testing, lack of technical management, lack of procedures, lack of redundancy... and guess what is going to happen?
- Morgan Chase blames Oracle for online bank crash
- Massive Bank Failure Due to Human Error, IBM Blamed
- Downtime nightmare could cost DBS dearly
- NetApp and TMS involved in Virgin Blue outage
- ‘Amateur’ IBM brings down Air New Zealand
- Bank of America Online Banking Experiences Down Time
- Massive computer outage halts some Va. agencies
- Facebook: More Details on Today’s Outage
- Australian bank customers are demanding compensation after a systems failure left thousands out of cash over the weekend when accounts were not updated
- Re: Corrupt rollback segment or was it Netapp storage offlining LUNs (Netapp storage misconfiguration fault?)
- Her Majesty’s Revenue and Customs systems downed by upgrades
- Alaska Airlines cancels flights after outage
- Kevin Closson’s blog entry on the Alaska Airlines system crash
- UK net banking website falls flat on its bank
- Amazon EC2 goes down, taking with it Reddit, Foursquare and Quora
- London Ambulance Service downed by upgrade cockup
- HSBC UK systems major outage — 8-node beast down, was it Veritas Cluster for RAC, VXFS or RAC problem?
- Titsup EMC VNX kit unleashes 5 days of chaos in Sweden… two disk failures on the flash tier/cache without spares? (AKA let’s save some bucks?)
- Amazon outage: Life of our patients is at stake – I am desperately asking you to contact
- Disaster Recovery Disaster: Drill Gone Wrong Leads To Loss Of Data On 800K (IBM-managed disaster recovery exercise gone wrong)
- E24Cloud: Odzyskiwanie danych ze środowisk wirtualizacyjnych (Data recovery from virtualization environments) [PL] – a detailed case study of a total SAN failure in the cloud
- RBS collapse details revealed: Arrow points to defective part — or was it outsourcing again?
- 16.9 million customers at RBS, NatWest and Ulster Bank frozen out of their accounts for days, with ongoing issues in some cases – Half the team at the heart of the RBS disaster WERE in India
- City’s IT Infrastructure Brought To Its Knees By Data Center Outage
- Hundreds of websites go titsup in Prime Hosting disk meltdown – has the vendor of the array ever heard about RAID6 scrubbing/regular data validation and monitoring?
Excellent presentations/entries on avoiding big disasters and on their causes:
Articles from Availability Digest, an extremely good archive!
Want to share a link? Drop me an email.