CrowdStrike, Microsoft, and the Biggest Global IT Blockage 🦅

Arun Kumar Dave
Jul 23, 2024

So, what exactly happened?

On Friday, July 19, 2024, at 04:09 UTC, CrowdStrike released a sensor configuration update to the systems running its sensor. It was a regular update, like hundreds before it, and it affected only Windows hosts (not Linux or Mac). The update was designed to target newly observed malicious named pipes being used by common C2 (command-and-control) frameworks in cyberattacks.

The configuration update triggered a logic error that resulted in an operating system crash. We still don’t know what caused that logic error. But with the update being applied to machines automatically, and CrowdStrike’s driver accessing kernel-level modules, the error crashed the operating system and left machines on the Blue Screen of Death (BSOD).

CrowdStrike’s sensor is an endpoint security driver, one of many drivers in the OS. It watches for malicious access to, or anomalous behavior of, files and folders on the machine, and its access goes as deep as kernel-level modules.

Every machine (not just PCs or laptops) that ran Windows with CrowdStrike installed and was online during that window crashed. Machines at airports, banks, and communication providers all went down.

Usually, when something like this happens, the devs and everyone else in the office rush to find the issue, reproduce it, and fix it. Most likely they revert the code, which is sometimes the best option, or ship a new patch with the fix. The time taken is usually proportional to the size of the issue.

CrowdStrike took a little over an hour to fix the issue, shipping the corrected update at 05:27 UTC. That’s pretty much the time any other company like Google, Meta, or Microsoft itself would have taken. But did the problem get resolved?

No.

Why not?

Because the problem had occurred on user machines, not on a cloud server. Remember, Google had an outage caused by a load balancer issue; they took 2 hours to fix it, and then everything was back online and returned to normal. That’s because it was in the cloud. The same happened with Facebook/Instagram last year.

So CrowdStrike released a corrected update, and Microsoft was quick to publish a workaround. Within 2 minutes they had posted the fix:

  1. Boot Windows into Safe Mode or the Windows Recovery Environment
  2. Navigate to the C:\Windows\System32\drivers\CrowdStrike directory.
  3. Locate the file matching “C-00000291*.sys”, and delete it.
  4. Boot the host normally.
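
Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it presumes a Python interpreter is available on the machine (often not the case in Safe Mode or the recovery environment), and the function name `remove_faulty_channel_files` and its `dry_run` flag are hypothetical, not part of any CrowdStrike or Microsoft tooling. Only the directory and the file pattern come from the steps above.

```python
from pathlib import Path

# File pattern quoted in step 3 of the published workaround.
FAULTY_PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(driver_dir, pattern=FAULTY_PATTERN, dry_run=True):
    """Return the files in driver_dir matching pattern.

    Hypothetical helper: with dry_run=True (the default) it only lists
    the matches; with dry_run=False it also deletes them.
    """
    matches = sorted(Path(driver_dir).glob(pattern))
    for path in matches:
        if not dry_run:
            path.unlink()  # delete the matching file
    return matches
```

Calling `remove_faulty_channel_files(r"C:\Windows\System32\drivers\CrowdStrike")` would list the matching files without touching them; passing `dry_run=False` would delete them. Defaulting to a dry run is a deliberate choice, since deleting the wrong file from a drivers directory is hard to undo.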

But the problem is that neither CrowdStrike nor Microsoft could do anything beyond this point. It was now the job of individual machine owners, or of SysAdmins where machines sit on a shared network, to apply the fix. And if the machines are BitLocker-protected, it’s a nightmare: the recovery key is needed before anyone can reach the drive from Safe Mode or the recovery environment. There’s no way IT admins can push an automatic update to a machine that cannot boot.

Now, how many airport staff can carry out that process, or even reboot a system into Safe Mode? And imagine the scale: Microsoft said around 8.5 million machines were affected by this outage. Every single one had to be rebooted manually by someone at least once.

That’s the reason the outage lasted hours or even days in some parts of the world.

Is this the first time something like this has happened?

No. CrowdStrike shipped a faulty update like this a few months ago that affected Debian Linux servers, but nobody seemed bothered because very few machines were affected. And the fix arrived only after a day.

Who was affected the most?

Most US services, such as airlines, banks, call centers, and transportation. It was 9:09 pm Pacific Time on Thursday when the issue hit, and the fix came at around 10:27 pm.

Some of India’s services: it was 9:39 am IST when the issue hit. But CrowdStrike’s market share in India is very low compared to the US, and it affected only Windows servers. Imagine if this had happened to McAfee, which holds a major share of India’s endpoint security market.
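
For anyone who wants to check the local times, here is a small sketch that converts the two UTC timestamps quoted earlier (04:09 for the release, 05:27 for the fix). It assumes the date is Friday, July 19, 2024, that the US time meant is Pacific Daylight Time (UTC-7), and that India is on IST (UTC+5:30). The helper name `to_local` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for July 2024 (assumptions: PDT = UTC-7, IST = UTC+5:30).
PDT = timezone(timedelta(hours=-7), "PDT")
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def to_local(utc_hhmm, tz, day=(2024, 7, 19)):
    """Convert an 'HH:MM' UTC time on the given day to local 'YYYY-MM-DD HH:MM'."""
    hour, minute = map(int, utc_hhmm.split(":"))
    utc_time = datetime(*day, hour, minute, tzinfo=timezone.utc)
    return utc_time.astimezone(tz).strftime("%Y-%m-%d %H:%M")


print("Release, US Pacific:", to_local("04:09", PDT))  # 2024-07-18 21:09
print("Fix, US Pacific:    ", to_local("05:27", PDT))  # 2024-07-18 22:27
print("Release, India:     ", to_local("04:09", IST))  # 2024-07-19 09:39
```

So the update landed late Thursday evening on the US West Coast and at the start of the working day in India.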

And, who was affected the least?

Parts of Europe and the UAE, mostly because people there were asleep when the problem occurred, and by the time they turned on their machines, CrowdStrike had already pushed the fix.

So, what is the future solution? Should Microsoft just stop using CrowdStrike?

Not really. Services like CrowdStrike are essential for safeguarding systems against malicious activity, and they provide critical information that is especially useful at a time like this, when hacking is so widespread.

Source: CrowdStrike demo video

CrowdStrike holds a very high reputation in the software industry for providing centralized endpoint protection against ransomware to many companies, including more than 200 of the Fortune 500.

But why do big companies prefer Windows over Linux/Ubuntu, which is free and open source? Linux/Ubuntu is free, of course, but it is difficult for ordinary individuals to use. Windows is easy to use, even for SysAdmins, and that’s why big companies prefer Microsoft’s OS.

But as they say, with great power comes great..?

This update was essentially a configuration change, and people around the world are arguing that such changes cannot be tested against every affected file. As a software developer myself, I couldn’t agree more. As always, reactions everywhere are mixed.

Now comes the BIG question.

Are we ready for AI? At the pace we’re moving, is it really safe to say such blockages won’t happen again?

This incident was a human error: the misconfiguration of the .sys file was unintentional, and we can see how many services it disrupted. It is very much possible to “intentionally” misconfigure something to cause even bigger harm. Very much possible. And the damage could even harm someone physically.

Every time a new problem at this scale arises, we (humans and machines both) learn not to repeat it, but the variety of such problems is not finite.

References -

  1. https://enlyft.com/tech/products/crowdstrike-falcon-platform
  2. https://crowdstrike.com/explore/business-value-of-crowdstrike/falcon-platform-demo
  3. https://www.youtube.com/watch?v=wAzEJxOo1ts
  4. https://learn.microsoft.com/en-us/windows/win32/ipc/named-pipes
