Automatic Switchover for eight-node MetroCluster

September 7, 2020
by ProLion Team

With the release of ONTAP 9, NetApp made it possible to expand two four-node Fabric MetroClusters into one eight-node MetroCluster.

Why eight-node MetroCluster?

Initially, many asked which use cases would require an eight-node MetroCluster. One case has proven very useful: migrating from one four-node MetroCluster to another. In that situation it makes sense to first expand to an eight-node MetroCluster and then shrink back to a four-node configuration after the migration.

We are also seeing the eight-node MetroCluster more and more frequently in permanent production use. For larger organizations, it can be very beneficial to manage different NetApp controllers within one cluster without giving up the high availability of a MetroCluster.

Four-node MetroCluster
Eight-node MetroCluster

Highest availability: What about the automatic switchover?

One question that always remains with MetroCluster, and should also be asked with an eight-node setup: What about the automatic switchover?

The NetApp solution

NetApp offers two potential solutions for automatic switchover with the MetroCluster: Tiebreaker and Mediator.

Customers who already have experience with MetroCluster understand that Mediator is not a viable option, because it is only available for MetroCluster IP, and an eight-node MetroCluster is only suitable for fabric-attached systems.

NetApp Tiebreaker supports eight-node MetroCluster configurations. However, there are a few limitations. For example, the Tiebreaker does not trigger any alarms or actions if one HA pair is down on the disaster side, so the affected services stay offline. In addition, the NetApp ONTAP 9 Documentation Center notes that operating the Tiebreaker in active mode introduces several risks, and only in active mode would the Tiebreaker initiate a switchover at all.

All these challenges can be overcome with ProLion ClusterLion!

The ProLion solution

ClusterLion can already perform a switchover even if only one HA pair fails. And thanks to the secure power off mechanism and the independent infrastructure, the risk of wrong decisions is also reduced to an absolute minimum.
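
To make the difference in trigger conditions concrete, here is a minimal sketch in Python. It is purely illustrative: the function names and the list-of-booleans model of a site are assumptions made for this example, and neither function is taken from Tiebreaker or ClusterLion. It only contrasts a rule that waits for the whole site to fail with a rule that reacts as soon as a single HA pair fails.

```python
def whole_site_down(ha_pairs_up: list[bool]) -> bool:
    """Hypothetical rule: trigger only when every HA pair at the site is down."""
    return not any(ha_pairs_up)


def any_ha_pair_down(ha_pairs_up: list[bool]) -> bool:
    """Hypothetical rule: trigger as soon as at least one HA pair is down."""
    return not all(ha_pairs_up)


# One site of an eight-node MetroCluster holds two HA pairs.
# In this example the first pair is up and the second has failed.
site = [True, False]

print("whole-site rule triggers:", whole_site_down(site))
print("single-pair rule triggers:", any_ha_pair_down(site))
```

With one of two HA pairs down, only the second rule reports a failure, which is the situation described above where services on the failed pair would otherwise stay offline.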

ClusterLion in a nutshell

ClusterLion can be retrofitted at any time without interrupting your existing storage cluster during operations. After commissioning, ClusterLion permanently monitors the power supply, the interconnects and selected services of the storage cluster. If cluster services are affected by a failure, this is recognised instantly and ClusterLion reacts accordingly. First, ClusterLion cuts off the power supply to the affected cluster nodes, thereby creating a consistent state within the cluster. This also prevents an unintentional restart of those nodes. In a second step, ClusterLion initiates the transfer of cluster services to the still-functioning location via in-house UMTS connections. Your services continue to run securely and in an orderly manner at the still-functioning location. The result is excellent availability, courtesy of ClusterLion.
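
The order of operations described above (power off first, switchover second) can be sketched as follows. This is a minimal illustration under stated assumptions, not ClusterLion's implementation: `SiteStatus`, `Action` and `plan_failover` are hypothetical names, and the rule that nothing happens when the surviving site is itself unhealthy is an assumption added for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    POWER_OFF = "power_off"
    SWITCHOVER = "switchover"


@dataclass
class SiteStatus:
    """Health of one site as seen by a hypothetical monitor."""
    name: str
    power_ok: bool = True
    interconnect_ok: bool = True
    services_ok: bool = True

    def healthy(self) -> bool:
        return self.power_ok and self.interconnect_ok and self.services_ok


def plan_failover(failed: SiteStatus, survivor: SiteStatus) -> list[tuple[Action, str]]:
    """Return the ordered steps: fence the failed site, then switch over."""
    if failed.healthy():
        return []  # nothing is wrong, so no action is planned
    if not survivor.healthy():
        return []  # assumption: without a healthy target, leave it to an operator
    return [
        (Action.POWER_OFF, failed.name),     # step 1: create a consistent state
        (Action.SWITCHOVER, survivor.name),  # step 2: move services to the survivor
    ]


# Example: services at site_a have failed while site_b is healthy.
for action, site in plan_failover(SiteStatus("site_a", services_ok=False),
                                  SiteStatus("site_b")):
    print(action.value, site)
```

Planning the power-off before the switchover mirrors the reasoning in the text: once the affected nodes are without power, they cannot restart unexpectedly while services are being taken over at the other location.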

ClusterLion functionality overview

More information can be found at