Cluster network ‘SAN1’ is partitioned

You may encounter this error about your storage networks when setting up a Windows 2008 Failover Cluster. The following errors (Event ID 1129) will show up in Cluster Events:

Cluster network ‘SAN1’ is partitioned. Some attached failover cluster nodes cannot communicate with each other over the network. The failover cluster was not able to determine the location of the failure. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

and…

Cluster network interface ‘Node1 – SAN1’ for cluster node ‘Node1’ on network ‘SAN1’ is unreachable by at least one other cluster node attached to the network. The failover cluster was not able to determine the location of the failure. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

The suggested fixes didn’t make sense to me, as this was a storage network and of course the nodes can’t (and shouldn’t) communicate over it. As it turns out, the fix is a simple configuration change that tells the cluster that node communication is not allowed on this network.

In Failover Cluster Manager, expand the cluster and select ‘Networks’. Right-click the appropriate storage network and select ‘Properties’.

Choose the "Do not allow cluster network communication on this network" option.

[Screenshot: san_cluster_network]

Repeat for any other appropriate networks and you should stop seeing this error in the logs.
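If you have several storage networks, the same change can probably be scripted rather than clicked through. The sketch below is not something I ran as part of the original fix; it only builds the PowerShell one-liner for each network. It assumes the FailoverClusters PowerShell module is available (it ships with Windows Server 2008 R2 and later, not the original 2008 release) and that setting the network's Role property to 0 matches the "Do not allow cluster network communication on this network" option. The helper name `build_set_role_command` and the network name ‘SAN2’ are made up for illustration.

```python
from enum import IntEnum


class ClusterNetworkRole(IntEnum):
    """Values of the cluster network Role property."""
    NONE = 0                # do not allow cluster network communication
    CLUSTER_ONLY = 1        # cluster communication only
    CLUSTER_AND_CLIENT = 3  # cluster communication plus client connections


def build_set_role_command(network_name: str, role: ClusterNetworkRole) -> str:
    """Return a PowerShell one-liner that sets the Role of one cluster network.

    Hypothetical helper: it only builds the command text and never runs it.
    """
    # A single quote inside a PowerShell single-quoted string is escaped by doubling it.
    escaped = network_name.replace("'", "''")
    return f"(Get-ClusterNetwork -Name '{escaped}').Role = {int(role)}"


if __name__ == "__main__":
    # 'SAN1' comes from the event text; 'SAN2' is an assumed second storage network.
    for name in ("SAN1", "SAN2"):
        print(build_set_role_command(name, ClusterNetworkRole.NONE))
```

Paste the printed lines into an elevated PowerShell session on one of the cluster nodes, then check the network's properties in Failover Cluster Manager to confirm the option changed.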


5 Comments

  1. Tom says:

    Hi Rhys,

    Thanks for this nice article.

    We recently had a first occurrence of this entry in our event log. It led to a major disturbance in communication with our SQL Server.

    After rebooting the servers, the error disappeared from the event log and communication with the SQL Server returned to normal.

    Have you ever seen this error cause such a disturbance in communication?

    Best regards,
    Tom

  2. Rhys says:

    Hi Tom,

    No, we’ve never had any disruption due to this error. The cluster nodes simply shouldn’t have been trying to communicate over this network as it’s not possible.

    Rhys

  3. Lisman says:

    Hi Rhys;

    I thank you so much. I have been struggling with this issue for some time now and your article helped resolve the issue.

  4. Yemi says:

    Hi Rhys,

    Thank you for this article. It solved the SQL Server cluster network issues we’ve been having for a while. Microsoft needs to update their knowledge base because nothing on their site points to the root cause.

    Regards,
    Yemi

  5. Azeem says:

    This post deserves a post.

    I did this for an Exchange 2010 cluster network issue and it’s now resolved.

    Thank you for the smart solution.
