Notes from the field: Windows 2019 Storage Replica lock-up on VMware

On one of my latest projects consisting of a new Windows Server 2019 setup on VMware and making use of Storage Replica in a server to server setup for replicating home drives and profiles I came across a random lock-up of the VM and by that inaccessible shares.

The setup was all working until the failover part. It seems there is an delay of some sort and the failover isn’t instant or takes a while to be active with the server being unresponsive and disconnecting any form of management to the VM in question(VM tools are not responding as well and console login will not work in this failover time). I’ve tried the actions again of doing a storage replica failover and I got an BSOD on the VM stating: HAL INITIALIZATION FAILED I’ve tried all of this in a separate test setup and had this working without any problems on Server 2016, and Server 2019. Only this time it gave me this strange behavior. The difference in my own setup is HW level 14 and this new one had HW level 15 and the hosts are 6.7 13981272 build and my own setup is 6.7 14320388 build (older builds have also worked fine for me)

After some troubleshooting and providing the BSOD dump findings to VMware GSS support it became clear that version 10341 of VMware tools was the troublemaker. The solution was to upgrade to the latest 10346 VMware tools. The vmm2core tool provided me with the means of creating a dump file with the VM in question.