Q&A: What caused the global IT chaos and how long will it take to fix?

Airlines grounded and media firms knocked off air after ‘blue screen of death’ error screens seen on Microsoft Windows workstations following CrowdStrike update

A major global IT outage has caused significant disruptions worldwide. Photograph: Sasko Lazarov / RollingNews.ie

A major global IT outage hit businesses around the world on Friday, grounding airlines and knocking media companies off air.

In Ireland, Ryanair was advising customers travelling on Friday to arrive at the airport earlier and to check in there rather than through its app. In the UK, trains, health services and airlines were affected by the IT issue. Sky News was temporarily off the air, although it later resumed broadcasting – with the outage the top story throughout the morning. Emergency services in Alaska were also affected by the problem.

DownDetector, a website that tracks outages, was practically melting under the strain.

In short: global chaos.

What happened?

Late on Thursday, reports began to emerge in Australia and the US that people were experiencing issues with Microsoft’s cloud services, including Microsoft Azure, a cloud computing service that is used by businesses across the globe, and its office software suite Microsoft 365.

Shortly afterwards, some users also reported the “blue screen of death” on computers, preventing them from accessing systems or rebooting.

Initially, there were fears that a well-coordinated cyberattack was to blame for the problems, given the range and breadth of the targets. But it seems that the problem was more mundane.

What is the problem?

There are two issues at play here. First, the Azure issue, which started on Thursday night. Microsoft’s cloud services, including Microsoft 365, were affected for many customers as a result.

But then came the “blue screen of death”. A software upgrade to global cybersecurity company CrowdStrike’s Falcon sensor, which is designed to identify and block threats to IT services, seemed to be the root of the problem.

Microsoft’s Azure unit said it was aware of an issue in which virtual machines running the Windows OS and the CrowdStrike Falcon agent were getting stuck rebooting. “We recommend customers that are able to, to restore from a backup from before this time,” Microsoft’s status update page said.

The CrowdStrike issue affected only Windows systems, meaning Linux and Mac users have been spared.

On X, the platform formerly known as Twitter, CrowdStrike chief executive George Kurtz said the problem was caused by a defect found in a single content update for Windows hosts. “The issue has been identified, isolated and a fix has been deployed.”

So it appears to have been two separate problems and a case of unfortunate timing.

Is it fixed?

Hopefully. The Azure problem has been sorted by Microsoft, and Microsoft 365 services are showing as “up and running”.

CrowdStrike’s Kurtz has apologised and said the company would work with all of its customers as they try to get their operations back online.

“We’re deeply sorry for the impact that we’ve caused to customers, to travellers, to anyone affected by this, including our company,” he said on NBC News’ Today programme. “Many of the customers are rebooting the system and it’s coming up and it’ll be operational. It could be some time for some systems that won’t automatically recover.”

There may still be some delays for customers, however, as companies deal with the knock-on effects of the outage.

What happens now?

Given the widespread nature of the problem, there is likely to be scrutiny of the dominance of a handful of companies in the tech sector. With so many services taken down by a single update, it is inevitable that people will start to question the wisdom of the current dependence on cloud services.

“CrowdStrike has had a catastrophic error that has taken a large percentage of the global IT systems offline. On the one hand it’s shown how large CrowdStrike’s market share is, but it’s also shown how fragile the interconnected world we live in can be,” said Richard Ford, CTO of Integrity360. “In this instance a small change has led to a huge global impact, and the questions will be how and why it happened. CrowdStrike were very bullish in their mission statement: ‘We Stop Breaches’. Unfortunately, this time, they’ve created the outage.”

Who is CrowdStrike?

You might be familiar with the name; the company was among those that helped investigate the cyberattacks on the Democratic National Committee and their connection to Russian intelligence services. US-based CrowdStrike is a global cybersecurity company that offers threat intelligence and monitoring systems that actively detect intrusions and deal with them. However, to do that the software must have wide-ranging access to systems, which means that when things go wrong, it can cause chaos for the companies it is trying to protect.