What to do when Windows replication fails

Why Windows replication fails with Hystax VMware Replication Agent - quiesce failure

Challenge

Replication fails when a Windows machine is being replicated with Hystax VMware Replication Agent with the following error message:

"Backup creation failed with code 500, reason: An error occurred while quiescing the virtual machine. See the virtual machine's event log for details."

ESXi logs show that the quiesce operation failed.



Cause

VMware Tools installed on the machine cannot complete the quiesce (sync) operation that ESXi requests in order to take a clean snapshot.
This is a known VMware issue described in the VMware KB.

Solution

There are several ways to resolve the issue:

Reset VSS service

Restart the VSS service on the machine if you see the following error:
  pyVmomi.VmomiSupport.vim.fault.GenericVmConfigFault: (vim.fault.GenericVmConfigFault) {
     dynamicType = <unset>,
     dynamicProperty = (vmodl.DynamicProperty) [],
     msg = 'An error occurred while saving the snapshot: Failed to quiesce the virtual machine.',
     faultCause = <unset>,
     faultMessage = (vmodl.LocalizableMessage) [
        (vmodl.LocalizableMessage) {
           dynamicType = <unset>,
           dynamicProperty = (vmodl.DynamicProperty) [],
           key = 'msg.checkpoint.save.fail2.std3',
           arg = (vmodl.KeyAnyValue) [
              (vmodl.KeyAnyValue) {
                 dynamicType = <unset>,
                 dynamicProperty = (vmodl.DynamicProperty) [],
                 key = '1',
                 value = 'msg.snapshot.error-QUIESCINGERROR'
              }
           ],
           message = 'An error occurred while saving the snapshot: Failed to quiesce the virtual machine.'
        },
        (vmodl.LocalizableMessage) {
           dynamicType = <unset>,
           dynamicProperty = (vmodl.DynamicProperty) [],
           key = 'msg.snapshot.vigor.take.error',
           arg = (vmodl.KeyAnyValue) [
              (vmodl.KeyAnyValue) {
                 dynamicType = <unset>,
                 dynamicProperty = (vmodl.DynamicProperty) [],
                 key = '1',
                 value = 'msg.snapshot.error-QUIESCINGERROR'
              }
           ],
           message = 'An error occurred while taking a snapshot: Failed to quiesce the virtual machine.'
        }
     ],
     reason = 'An error occurred while saving the snapshot: Failed to quiesce the virtual machine.'
  }
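
If you collect agent or ESXi error text in a script, a quick check for the quiesce markers can help decide whether this article applies. The sketch below is only an illustration: the function name is hypothetical, and the marker strings are taken from the error messages quoted in this article, not from an official list.

```python
# Substrings copied from the errors quoted in this article (lower-cased).
# This is an assumed, non-exhaustive list, not an official VMware catalogue.
_QUIESCE_MARKERS = (
    "msg.snapshot.error-quiescingerror",
    "failed to quiesce the virtual machine",
    "error occurred while quiescing the virtual machine",
)


def is_quiesce_failure(fault_text: str) -> bool:
    """Return True if the error text contains one of the quiesce markers."""
    text = fault_text.lower()
    return any(marker in text for marker in _QUIESCE_MARKERS)


if __name__ == "__main__":
    sample = "value = 'msg.snapshot.error-QUIESCINGERROR'"
    print(is_quiesce_failure(sample))
```

A match only indicates that the snapshot failed at the quiesce step; it does not tell you which of the fixes in this article will help.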

To restart VSS, follow the steps below:
      1. Go to Administrative Tools 🠒 Services and find the "Volume Shadow Copy" service.
      2. Start or restart the "Volume Shadow Copy" service. Make sure that the startup type is Manual and the service status is Started.
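
The same steps can be scripted instead of using the Services console. The following is a minimal sketch, assuming it runs on the affected Windows machine from an elevated (Administrator) prompt. It uses the standard Windows commands `sc config`, `net stop` and `net start` with the service name `VSS`; the Python function names are hypothetical.

```python
import subprocess
import sys
from typing import List

SERVICE = "VSS"  # service name of "Volume Shadow Copy"


def vss_restart_commands(service: str = SERVICE) -> List[List[str]]:
    """Build the commands that set the startup type to Manual and restart the service."""
    return [
        ["sc", "config", service, "start=", "demand"],  # "demand" means Manual
        ["net", "stop", service],   # returns an error if the service is already stopped
        ["net", "start", service],
    ]


def restart_vss() -> None:
    """Run each command and print its exit code; requires Administrator rights."""
    for cmd in vss_restart_commands():
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(" ".join(cmd), "-> exit code", result.returncode)


if __name__ == "__main__":
    if sys.platform == "win32":
        restart_vss()
    else:
        print("This sketch is intended to run on the Windows machine itself.")
```

A non-zero exit code from `net stop` is expected when the service was not running; what matters is that `net start` succeeds.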
 



Reboot the machine

- Reboot the machine
- Create a new replication of this machine 

To avoid this in the future, make sure that:
 - VMware Tools is up and running on the machine (start or repair it if necessary)
 - VMware Tools is not blocked by Windows firewall (add VMware Tools to the exceptions or turn the firewall off)
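
To check from a script whether VMware Tools is running, you can query its Windows service. The sketch below assumes that the service is registered under the name `VMTools` and that `sc query` prints its output in English; both are assumptions to verify on your machine, and the function names are hypothetical.

```python
import re
import subprocess
from typing import Optional

VMTOOLS_SERVICE = "VMTools"  # assumed service name of VMware Tools

# Matches a line such as "STATE              : 4  RUNNING" in `sc query` output.
_STATE_RE = re.compile(r"STATE\s*:\s*\d+\s+([A-Z_]+)")


def parse_service_state(sc_output: str) -> Optional[str]:
    """Extract the state name (e.g. RUNNING, STOPPED) from `sc query` output."""
    match = _STATE_RE.search(sc_output)
    return match.group(1) if match else None


def vmware_tools_running(service: str = VMTOOLS_SERVICE) -> bool:
    """Return True if `sc query` reports the service as RUNNING (Windows only)."""
    result = subprocess.run(["sc", "query", service], capture_output=True, text=True)
    return parse_service_state(result.stdout) == "RUNNING"
```

If the state is not RUNNING, start the service or repair the VMware Tools installation before creating a new replication.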


Change quiesce policy

When a snapshot of a VMware machine fails to quiesce during backup, the replication agent can handle the failure in one of several ways.
By default, the backup is stopped on error, but you can select an alternative in advance by setting the "quiesce_strategy" key in the hvragent configuration file.
The config file is located at: /run/config/hvragent.conf



"quiesce_strategy" can be assigned one of the following values:
  1. "enforce" (default) - A quiesce attempt is made. In case of failure, backup is canceled.
  2. "optional" - A quiesce attempt is made. In case of failure, a non-quiesced snapshot is created.
  3. "disable" - Always create non-quiesced snapshots.
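
The difference between the three values can be sketched as a small decision function. This is only an illustration of the behavior described above, not the agent's actual code: `QuiesceError`, `create_snapshot` and the `take_snapshot` callable are hypothetical names introduced for the example.

```python
from typing import Callable

VALID_STRATEGIES = ("enforce", "optional", "disable")


class QuiesceError(Exception):
    """Hypothetical error raised when a quiesced snapshot cannot be taken."""


def create_snapshot(strategy: str, take_snapshot: Callable[[bool], str]) -> str:
    """Apply a quiesce strategy; take_snapshot(quiesce) returns a snapshot id."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown quiesce_strategy: {strategy!r}")
    if strategy == "disable":
        # Never attempt to quiesce.
        return take_snapshot(False)
    try:
        return take_snapshot(True)
    except QuiesceError:
        if strategy == "enforce":
            # Default: the failure cancels the backup.
            raise
        # "optional": fall back to a non-quiesced snapshot.
        return take_snapshot(False)
```

Note that a non-quiesced snapshot is crash-consistent rather than application-consistent, so "optional" and "disable" trade consistency guarantees for a backup that completes.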