Last weekend saw one of the largest collapses of online infrastructure since McAfee’s 2010 Windows XP bug, caused by a faulty update on systems running CrowdStrike. As the damage is slowly being reversed, it’s instructive to look at the consequences of the CrowdStrike outage and what they can tell us about the future.
What happened?
CrowdStrike, a major enterprise cybersecurity player, pushed an update to its Falcon Sensor product on Friday, July 19. The update caused the systems it was installed on to crash and display the blue screen of death. Worse, affected computers then entered a boot loop: they would power on, attempt to boot, and immediately crash again.
As a consequence of the boot loop, further updates could not be pushed out to fix the issue. Affected systems needed to be fixed manually, one at a time. This left companies across the world and across industries completely frozen out.
The outage grounded planes, froze trains, and generally wreaked havoc. Microsoft said it was taking “mitigation actions” and working on a solution. CrowdStrike CEO George Kurtz also posted on X, stating that “CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts.”
What’s the immediate workaround?
There is a manual intervention that can restore systems to functionality:
- Boot Windows into Safe Mode or the Windows Recovery Environment (WinRE).
- Go to the C:\Windows\System32\drivers\CrowdStrike directory.
- Locate and delete the file matching “C-00000291*.sys”.
- Boot normally.
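The file-removal step above can also be sketched in code. What follows is a minimal Python illustration, not official CrowdStrike or Microsoft tooling. It assumes a Python interpreter is available on the machine being recovered, which will often not be the case in a recovery environment, where the same deletion is done by hand. The function name `remove_matching_files` and the `dry_run` flag are hypothetical and exist only for this example; the directory and file pattern are the ones given in the steps.

```python
from pathlib import Path

# Directory and file pattern as given in the manual workaround steps.
DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def remove_matching_files(driver_dir: Path = DRIVER_DIR, dry_run: bool = True) -> list[Path]:
    """Find files matching PATTERN directly inside driver_dir.

    With dry_run=True (the default) nothing is deleted; the matches are
    only returned so they can be reviewed first.
    """
    matches = sorted(p for p in driver_dir.glob(PATTERN) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()  # delete the matching file
    return matches


if __name__ == "__main__":
    # Report what would be removed, without touching anything.
    for path in remove_matching_files(dry_run=True):
        print(f"would delete: {path}")
```

Defaulting to a dry run is a deliberate choice here: deleting from a system driver directory is not something to do by accident, so the sketch only deletes when `dry_run=False` is passed explicitly.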
The problem is that manually fixing individual systems takes time, and as a major player in the enterprise market, CrowdStrike was in active use on critical systems ranging from airports to hospitals. Because these systems are time-sensitive, some users might instead consider restoring them from backups or using Windows’ built-in shadow copies.
Either way, the proposed fixes are not perfect – costing time and potentially also data if backups are not fully up to date. The overall impact is still being calculated but the cumulative economic damage promises to be immense.
What’s next?
The weekend’s events will still take time to fully process. That being said, the consequences of CrowdStrike’s misstep are likely to be bittersweet – bringing positive change to balance out the severe financial loss and difficulties it caused.
The fallout highlights, more than ever, the need to keep a set of backups and archives for your critical data. The majority of the current losses come from organizations that are completely unable to function until they have recovered access to their data. With a backup, they could have restored to a previous state from before the update, with a less severe impact on their operations. Better still, in some cases employees with end-user access to archives could have remained productive while the main systems were being repaired.
As such, it is to be hoped that organizations worldwide will recognize the vulnerability of online systems and take steps to address it. It is perhaps useful that the incident was not a cyberattack or a deliberate attempt to disrupt business, but a simple error; that makes it harder to pretend this cannot happen to anyone, anywhere, and will hopefully prompt immediate action by many to make sure the next incident is less severe.