This content originally appeared on DEV Community and was authored by Olivier Miossec
On Friday morning, 19 July 2024, a faulty update from the security firm CrowdStrike put millions of corporate computers into a BSOD loop.
I will not go over the details here; other articles explain the incident better than I could.
The root cause of the bug is a faulty driver file update: the files matching C-00000291*.sys trigger the BSOD, and removing them solves the problem.
On a physical machine, you start Windows in Safe Mode and remove these files, but it is not so easy in the cloud.
Fortunately, Microsoft has come up with a solution for Azure.
The solution is based on several Azure CLI commands. They copy the OS disk of the affected VM, deploy a repair VM with access to that disk, remove the faulty driver files, and then swap the repaired disk back onto the original VM.
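The repair commands are provided by the vm-repair extension for the Azure CLI. If it is not already installed (the CLI usually offers to install it on first use), you can add it up front:

az extension add --name vm-repair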
az vm repair create -g <TargetVMResourceGroup> -n <TargetVM> --verbose
az vm repair run -g <TargetVMResourceGroup> -n <TargetVM> --run-id win-crowdstrike-fix-bootloop --run-on-repair --verbose
az vm repair restore -g <TargetVMResourceGroup> -n <TargetVM> --verbose
These commands will download the CrowdStrike Fix payload, copy the disk, create a new VM in a separate resource group, perform the fix, and replace the disk of the target VM.
The operation is very manual and fully interactive: you are prompted to confirm with Yes, and you must provide a username and a password for the repair VM (the password must be at least 12 characters long and contain a special character, a number, and both upper- and lower-case letters).
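The prompts can be reduced by passing the repair VM credentials directly to the create command. A possible non-interactive variant, assuming your version of the vm-repair extension exposes the --repair-username and --repair-password parameters (check az vm repair create --help), and with repairadmin as a placeholder username:

az vm repair create -g <TargetVMResourceGroup> -n <TargetVM> --repair-username repairadmin --repair-password '<StrongPasswordOfAtLeast12Chars>' --verbose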
It works well, but it takes at least 40 minutes per VM, so scaling it across many machines is difficult.
There is another solution. Because the VM stays in working condition for one to two minutes after a restart before entering the BSOD loop, it is possible to use a script to delete the faulty driver files during that window.
You can imagine a scenario where you stop the Azure VM, start it again, and immediately execute a script via Run Command (a sample sequence follows the script below).
Here is an example of a script you can apply via Run Command:
# Default installation folder of the CrowdStrike Falcon sensor drivers
$crowdStrikeDefaultFolder = "C:\Windows\System32\drivers\CrowdStrike\"
$crowdStrikeFolderExist = Test-Path -Path $crowdStrikeDefaultFolder -ErrorAction SilentlyContinue

If ($crowdStrikeFolderExist) {
    # Only the channel files matching C-00000291*.sys are faulty
    $faultyDriverFiles = "$($crowdStrikeDefaultFolder)C-00000291*.sys"
    try {
        # -ErrorAction Stop turns non-terminating errors into exceptions so the catch block runs
        $faultyDriverFilesList = Get-ChildItem -Path $faultyDriverFiles -ErrorAction Stop
    }
    catch {
        Write-Error "Unable to get driver list"
        Write-Error -Message "Exception Type: $($_.Exception.GetType().FullName) $($_.Exception.Message)"
        exit 0
    }
    # Delete each faulty driver file, reporting but not stopping on individual failures
    foreach ($faultyDriver in $faultyDriverFilesList) {
        try {
            Remove-Item -Path $faultyDriver.FullName -Force -ErrorAction Stop
        }
        catch {
            Write-Error "Unable to delete driver file $($faultyDriver.FullName)"
            Write-Error -Message "Exception Type: $($_.Exception.GetType().FullName) $($_.Exception.Message)"
        }
    }
}
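To put the scenario together, here is one possible sequence with the Azure CLI: stop and start the VM, then push the script above through Run Command. The file name fix-crowdstrike.ps1 is only a placeholder for wherever you saved the script:

az vm stop -g <TargetVMResourceGroup> -n <TargetVM>
az vm start -g <TargetVMResourceGroup> -n <TargetVM>
az vm run-command invoke -g <TargetVMResourceGroup> -n <TargetVM> --command-id RunPowerShellScript --scripts @fix-crowdstrike.ps1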
It will not correct all the VMs, because success depends on how the VM boots and whether the Azure VM agent can run the script before the CrowdStrike agent is fully started, but it will reduce the number of affected VMs. This way you can scale the fix across several VMs.
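To run the fix at scale, you can loop the same stop/start/Run Command sequence over every VM in a resource group. This is only a sketch, assuming all the affected VMs live in the same resource group and the script is saved as fix-crowdstrike.ps1:

for vm in $(az vm list -g <TargetVMResourceGroup> --query "[].name" -o tsv); do
  az vm stop -g <TargetVMResourceGroup> -n "$vm"
  az vm start -g <TargetVMResourceGroup> -n "$vm"
  az vm run-command invoke -g <TargetVMResourceGroup> -n "$vm" --command-id RunPowerShellScript --scripts @fix-crowdstrike.ps1
done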