Monday 6 April 2015

Exchange DAG Node down, Windows Cluster Service will not start

Problem: After a server reboot, an Exchange DAG member is down and the Cluster Service fails to start.

The Cluster Service cannot be started manually; it terminates with Event ID 7024:

"The Cluster Service terminated with service-specific error cannot create a file when that file already exists." 


Error Messages in Event Log:



Service Control Manager Event ID 7001

The Cluster Service service depends on the Csv File System Driver service which failed to start because of the following error: 
The system cannot find the file specified.


Service Control Manager Event ID 7000

The Csv File System Driver service failed to start due to the following error: 
The system cannot find the file specified.



MSExchangeRepl Event ID 4113

Database redundancy health check failed.
Database copy: DB_NAME
Redundancy count: 1

Error: Passive copy 'DB_NAME\EXCH_SERVER' is not UP according to clustering.


MSExchangeRepl Event ID 2060

The Microsoft Exchange Replication service encountered a transient error while attempting to start a replication instance for DB_NAME\EXCH_SERVER. The copy will be set to failed. Error: The NetworkManager has not yet been initialized. Check the event logs to determine the cause.
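
When several DAG members need to be checked, it can help to compare the events collected from a server against the combination listed above. The following is only a minimal sketch: the names SIGNATURE, missing_signature_events and matches_signature are hypothetical, and it assumes the events have already been exported as (source, event ID) pairs by some other means.

```python
# Hypothetical helper, not part of Exchange or Windows tooling.
# The (source, event ID) pairs are the ones listed in this post.
SIGNATURE = {
    ("Service Control Manager", 7024),  # Cluster Service terminated
    ("Service Control Manager", 7001),  # Cluster Service dependency failed
    ("Service Control Manager", 7000),  # Csv File System Driver failed to start
    ("MSExchangeRepl", 4113),           # database redundancy health check failed
    ("MSExchangeRepl", 2060),           # replication instance could not start
}


def missing_signature_events(observed):
    """Return the signature events that do not appear in `observed`.

    `observed` is any iterable of (source, event_id) tuples.
    """
    return SIGNATURE - set(observed)


def matches_signature(observed):
    """True when every event from this post is present in `observed`."""
    return not missing_signature_events(observed)
```

A match only suggests the same symptom pattern; the Device Manager check in the Solution section is still what confirms the cause.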




Solution:

In Device Manager, the Microsoft Failover Cluster Virtual Adapter (the netft.sys driver) shows a yellow exclamation mark and reports:

“This device is not working properly because Windows cannot load the drivers required for this device. (Code 31)”

Remove the "Microsoft Failover Cluster Virtual Adapter" and reinstall it using the following steps:

1. In Device Manager, under Network adapters, click View-->Show hidden devices, then uninstall "Microsoft Failover Cluster Virtual Adapter".
2. Reboot the server for the change to take effect.
3. After the reboot, open Device Manager and select Network adapters.
4. From the Action menu, select “Add legacy hardware”, then click Next.
5. Select “Install the hardware that I manually select from a list (Advanced)” and click Next.
6. Select “Network adapters” and click Next.
7. Select “Microsoft” in the left pane and “Microsoft Failover Cluster Virtual Adapter” in the right pane, then complete the wizard.
8. Once the adapter has been added, you should be able to start the Cluster Service and let the DAG catch up on the queued log files.
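
After step 8, one way to confirm the result without opening the Services console is to read the state that `sc query` reports. This is a minimal sketch, assuming English-language `sc` output and that the services are registered under the names NetFT (the virtual adapter driver) and ClusSvc (the Cluster Service); the function names parse_sc_state and service_state are hypothetical.

```python
import re
import subprocess


def parse_sc_state(sc_output):
    """Extract the state name (e.g. RUNNING, STOPPED) from `sc query` output.

    Returns None when no STATE line is found, for example when the
    service does not exist and sc prints an error instead.
    """
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_output, re.MULTILINE)
    return match.group(1) if match else None


def service_state(name):
    """Run `sc query <name>` on the local Windows server and return its state."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)
```

On the repaired server, service_state("NetFT") and service_state("ClusSvc") would both be expected to return RUNNING once the Cluster Service has started; any other value means the steps above are worth repeating.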

NOTE: If the issue comes back after another reboot, follow the same steps again, but also disable IPv6 on the network adapters, since it is not required when Exchange is not installed on a Domain Controller.