It has been a while and we're still working on the Sun cluster. We have
narrowed the issue down to the failover/fencing agent for Sun Cluster.
Either it is not happy or we've misconfigured it. iSCSI itself is fine, but
when we use an iSCSI LUN as a quorum device, bad things happen.
I have a 2-node Solaris cluster running Sun Cluster 3.1.
The OS is Solaris 10 on x86.
We are trying to use the N-series to provide a quorum device via iSCSI. We can attach to the drive just fine from both nodes as long as it is “just a drive”, but as soon as we make it a quorum device its reservation number changes and the cluster nodes kill each other off due to iSCSI reservation conflicts. In short, it sounds like the failover fencing driver is not working properly. Whether this is due to our config or something else is the question. I have been through the NetApp doc on setting this up, as have the Sun techs. What would be really helpful is a GoToMeeting session with someone who knows this application, to double-check my configs and see what we can find out.
One possibility we have wondered about is whether the driver is hardcoded to recognize particular IQNs. The NetApp hardware and the N-series seem to differ here ... Could this be what is causing the issues, that is, the failover fencing driver being hardcoded to look for a particular IQN? It is my understanding that IBM has not yet “branded” this particular agent (why, I don’t know) but is in the process of doing so. In the meantime, IBM supplied us with the NetApp version of the driver from the NOW site.