HP StorageWorks Clustered File System 3.6.
Legal and notice information © Copyright 1999-2008 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
1 HP Technical Support
    HP Storage Website
    HP NAS Services Website
2 Overview
    Event Logs
    View Event Logs
    Event Notifier Services
1 HP Technical Support

Telephone numbers for worldwide technical support are listed on the following HP website: http://www.hp.com/support. From this website, select the country of origin. For example, the North American technical support number is 800-633-3600.

NOTE: For continuous quality improvement, calls may be recorded or monitored.
HP NAS Services Website

The HP NAS Services site allows you to choose from convenient HP Care Pack Services packages or implement a custom support solution delivered by HP ProLiant Storage Server specialists and/or our certified service partners. For more information, go to http://www.hp.com/hps/storage/ns_nas.html. For the latest documentation, go to http://www.hp.com/support/manuals.
2 Overview

HP Clustered File System generates an event message when an error condition or failure occurs or when the status of the cluster changes. To provide an audit trail of cluster operations, a message is also generated when a user requests and is granted or denied authorization to perform a task. Event messages are logged and can be viewed either with the Cluster Event Viewer provided with the HP Management Console or with command-line tools.
The message is also sent to the HP Clustered File System mxlogd process, which takes these actions:
• Sends the message to the event notifier services configured on the server. If the message has been selected to trigger a notifier service, the appropriate action will take place (send an SNMP trap, send email, or run a script).
• Sends the message to all servers in the cluster. The servers, including the server where the event occurred, copy the message into their own cluster logs.
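The two mxlogd actions described above can be pictured with a small model. This is only an illustrative sketch, not HP's implementation: the `Event`, `Notifier`, `Server`, and `dispatch` names are hypothetical, and the notifier actions are stand-ins for sending an SNMP trap, sending email, or running a script.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Event:
    event_id: int   # message ID, e.g. 4514
    origin: str     # server where the event occurred
    text: str       # message text


@dataclass
class Notifier:
    name: str                        # hypothetical label: "snmp", "email", "script"
    selected_ids: Set[int]           # event IDs selected to trigger this notifier
    action: Callable[[Event], None]  # stand-in for the trap, email, or script


@dataclass
class Server:
    name: str
    cluster_log: List[Event] = field(default_factory=list)


def dispatch(event: Event, notifiers: List[Notifier], servers: List[Server]) -> List[str]:
    """Model of the two actions: trigger selected notifiers, then copy to every log."""
    triggered = []
    for notifier in notifiers:
        if event.event_id in notifier.selected_ids:
            notifier.action(event)
            triggered.append(notifier.name)
    # Every server, including the one where the event occurred, logs the message.
    for server in servers:
        server.cluster_log.append(event)
    return triggered


sent: List[str] = []
notifiers = [
    Notifier("email", {4514}, lambda e: sent.append(f"email:{e.event_id}")),
    Notifier("snmp", {13905}, lambda e: sent.append(f"snmp:{e.event_id}")),
]
servers = [Server("node1"), Server("node2")]
triggered = dispatch(Event(4514, "node1", "Probe failed: monitor probe failed."),
                     notifiers, servers)
```

In this model only the notifier whose selection includes the event ID fires, while the log copy is unconditional for all servers.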
Event Viewer, and then click on Matrix Server to see the log messages. You can use the options on the Action menu to manipulate the event log. Note that the Windows event log on a particular server includes only the messages that were generated on that server.

Event Notifier Services

HP Clustered File System provides the following event notifier services:
• SNMP Notifier Service.
Alerts

Certain events called Alerts are tracked by an HP Clustered File System component. When the condition causing the event is resolved, HP Clustered File System closes the Alert. Event messages for Alerts are displayed in the Alerts pane of the HP CFS Management Console and are also written to the event logs in the same manner as other event messages.
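The open-and-close behavior of Alerts can be illustrated with a short sketch. It assumes, based on the wording used in the alert table of the next chapter, that a resolving message carries the phrase "Alert N is resolved" in its Action text. The `resolved_alert` function and `AlertTracker` class are hypothetical names for illustration and are not part of the product.

```python
import re
from typing import Optional, Set

# Phrase used by resolving entries in the alert table, e.g. "Alert 4522 is resolved."
RESOLVED_RE = re.compile(r"Alert (\d+) is resolved")


def resolved_alert(action_text: str) -> Optional[int]:
    """Return the ID of the alert that this message closes, or None."""
    match = RESOLVED_RE.search(action_text)
    return int(match.group(1)) if match else None


class AlertTracker:
    """Toy model: an alert stays open until its resolving message arrives."""

    def __init__(self) -> None:
        self.open_alerts: Set[int] = set()

    def handle(self, event_id: int, action_text: str = "") -> None:
        closed = resolved_alert(action_text)
        if closed is not None:
            self.open_alerts.discard(closed)   # condition resolved: close the alert
        else:
            self.open_alerts.add(event_id)     # simplification: treat the event as an open alert


tracker = AlertTracker()
tracker.handle(4522, "The monitor probe failed on the specified server.")
open_after_failure = set(tracker.open_alerts)
tracker.handle(4523, "Action. None. Alert 4522 is resolved.")
open_after_resolution = set(tracker.open_alerts)
```

Here alert 4522 is open after the failure message and closed once message 4523 arrives, mirroring how the Alerts pane would stop showing it.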
3 Event Messages

This chapter lists alert messages and corrective actions for HP Clustered File System and FS Option for Windows.

Alert Descriptions

The following table lists Alerts generated by HP Clustered File System, FS Option for Windows.

ID Message and Corrective Action
101 License is invalid. HP Clustered File System will be terminated in hour(s) minute(s).
Action. The ClusterPulse process has recognized a license violation. This message will be repeated every 15 minutes.
107 Virtual host IP conflict. Network address is replying to pings.
Action. Determine which server owns the IP address assigned to the virtual host. If the server owning the address is configured in the cluster but HP Clustered File System is down, reboot the server to get the operating system to release the IP address. Otherwise, another device reachable on the network already owns the IP address.
4507 Device monitor script configuration resolved.
Action. None. Alert 4506 is resolved.
4508 Probe failed: monitor process creation failed.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server.
4514 Probe failed: monitor probe failed.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host(s) associated with the monitor. If the virtual host(s) were active on the affected server, they may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host(s).)
Action.
4521 Virtual host address has been restored.
Action. None. Alert 4520 is resolved.
4522 Probe failed: virtual host address release failed.
The monitor probe failed on the specified server. Another attempt will be made to activate the virtual host.
4523 Virtual host address release resolved.
Action. None. Alert 4522 is resolved.
4524 Probe failed: Unsupported monitor type ''.
4528 Probe failed: Monitor type '' will terminate to load newer function.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4534 Probe failed: Monitor probe function is NULL.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host(s) associated with the monitor. If the virtual host(s) were active on the affected server, they may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host(s).)
Action.
4540 Probe failed: Partition to monitor is not specified.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host(s) associated with the monitor. If the virtual host(s) were active on the affected server, they may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host(s).)
Action.
4546 Probe failed: Invalid parameters.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action. Verify that the monitor is configured correctly.
4552 Probe failed: SSL_write error
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action. Verify that the monitor is configured correctly.
4558 Probe failed: Server replied with error code: ''
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4564 Probe failed: SNMP URL contains invalid OID.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4570 Probe failed: SNMP Get request failed.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4576 Probe failed: Filesystem fsync failed.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host(s) associated with the monitor. If the virtual host(s) were active on the affected server, they may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host(s).)
Action.
4582 Probe failed: Filesystem write failed.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host(s) associated with the monitor. If the virtual host(s) were active on the affected server, they may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host(s).)
Action.
4588 Probe failed: Filesystem is not mounted.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host(s) associated with the monitor. If the virtual host(s) were active on the affected server, they may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host(s).)
Action.
4594 Probe failed: . NIS service is not available.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4600 Probe failed: . NIS RPC service is unknown.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4606 Probe failed: Socket operation requires a valid IP address.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4612 Probe failed: Socket receive error:
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4618 Probe failed: Socket connection has timed out after seconds.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
4624 Probe failed: DNS query failed.
The monitor probe failed on the specified server. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action. Verify that the monitor is configured correctly.
4630 Probe failed: feature license is unavailable.
The monitor feature license is unavailable. The server may no longer be suitable for the virtual host associated with the monitor. If the virtual host was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and the virtual host.)
Action.
13905 Reboot ASAP as it stopped cluster network communication at date/time but attempts to exclude it from the SAN were unsuccessful! Rebooting it will allow normal cluster operation to continue. Alternatively, if the server cannot be rebooted, but can be confirmed to have no access to the SAN, run 'mx server markdown ' to restore normal cluster operation.
Action.
13909 Membership partitions are corrupt or inaccessible, preventing SAN access.
Action. Determine the state of each membership partition. Open the Configure Cluster window and go to the Storage Settings tab, which shows the state of each partition. If a single membership partition is corrupt, use the Repair feature to resilver the partition while the cluster is online.
13915 Membership partition corrupt and must be repaired as soon as possible.
Action. Use the mx config mp repair command or the Repair button on the Storage Settings tab of the Configure Cluster window to repair the partition while the cluster is online. If the cluster is offline, use the mx config mp repair command or resilver the partition with the mprepair utility.
13916 Membership Partition corruption resolved.
13923 is unable to join the cluster because the fencing information it provided does not appear to be valid for this cluster configuration. As a result, this server will not be allowed to mount filesystems. This problem may be due to a configuration error or fencing hardware problem.
Action. Check the fencing hardware and the fencing configuration for the server.
13924 Invalid fencing information from resolved.
Action.
13929 Membership Partition does not belong to this cluster; cannot use.
Action. Determine whether the specified membership partition belongs to this cluster. Use the Storage Settings tab on the Configure Cluster window or the mpdump command to see the membership partitions configured for the cluster. If the partition does belong to the cluster, first verify that no other cluster is using it.
13935 FenceAgent: disabled status being reported on .
Fabric fencing is configured and there is an operational problem with the specified FibreChannel switch.
Action. Check the switch for faulting or failed components such as GBICs and/or faulting slots.
13936 FenceAgent: disabled status being reported on resolved.
Action. None. Alert 13935 is resolved.
17005 This cluster is unable to take control of SAN, because the servers are unable to perform fencing operations, possibly due to a networking or fencing hardware failure or misconfiguration. As a result, some or all filesystem operations may be paused throughout the cluster. In addition, filesystem mounts and unmounts and disk imports and deports cannot be performed.
Action.
17010 This cluster takes control of SAN failure resolved.
Action. None. Alert 17009 is resolved.
17011 Singleton cluster unable to take control of SAN. Possibly this server has not been added to the cluster or has been deleted from the cluster, or possibly a network failure has partitioned this server from the rest of the cluster. As a result, some or all filesystem operations may be paused throughout the cluster.
17016 Inaccessible majority of membership partitions resolved.
Action. None. Alert 17015 is resolved.
17017 Membership partition is unwritable, possibly due to a SAN or storage hardware failure. If other membership partitions become inaccessible, HP Clustered File System’s ability to recover from a server failure will be compromised.
Action. None of the servers in the cluster can write to the specified membership partition.
17024 Stalled server waiting for filesystem locks resolved.
Action. None. Alert 17023 is resolved.
17025 Filesystem suspended.
Action. The filesystem has been suspended by the HP Clustered File System psfssuspend command or by a third-party application such as a backup utility. Writes to the filesystem will be blocked until the filesystem is resumed.
17031 has lost a significant portion of its SAN access, including access to all the membership partitions, possibly due to a SAN hardware failure. As a result, this server is ineligible to become the cluster ADM.
Action. The specified server is unable to write to any of the membership partitions. Ensure that the server can access the membership partitions and also has write access to them.
40505 NT service stop failure resolved.
Action. None. Alert 40504 is resolved.
40506 Failure to shutdown NT service in order to start monitoring.
Action. Check the cluster log and the Service event log for a possible cause of the failure. If the shared storage is mounted, check the ERRORLOG for the instance on the SAN. If you are unable to resolve this problem, contact HP Support.
40507 NT service shutdown failure resolved.
40523 Entered maintain mode.
Action. None. Alert 40522 is resolved.
50000 configuration error: the share resource is not set.
Action. A Cluster File Share failure occurred on the specified server. The file share may not be accessible on that server. Verify that the Cluster File Share is configured correctly.
50001 configuration error resolved: the share resource has been set.
Action. None.
50009 configuration: a null users failure is resolved.
Action. None. Alert 50008 is resolved.
50010 probe failed: out of memory.
Action. A Cluster File Share failure occurred on the specified server. The file share may not be accessible on that server. Verify that the Cluster File Share is configured correctly.
50011 probe: an out of memory failure is resolved.
Action. None.
50019 Share name collision between the shared and monitored resources attributes resolved.
Action. None. Alert 50018 is resolved.
50020 Cannot access the shared resource: .
Action. A Cluster File Share failure occurred on the specified server. The file share may not be accessible on that server. Verify that the Cluster File Share is configured correctly.
50021 Failure accessing shared resource resolved.
Action. None.
50032 configuration error: the share path is not set.
Action. A Virtual File Share failure occurred on the specified server. The server may no longer be suitable for the Virtual File Server associated with the Virtual File Share. If the Virtual File Server was active on the affected server, it may have failed over to another server.
50039 configuration: a null users failure is resolved.
Action. None. Alert 50038 is resolved.
50040 probe failed: out of memory.
Action. A Virtual File Share failure occurred on the specified server. The server may no longer be suitable for the Virtual File Server associated with the Virtual File Share.
50046 Cannot find the shared resource: .
Action. A Virtual File Share failure occurred on the specified server. The server may no longer be suitable for the Virtual File Server associated with the Virtual File Share. If the Virtual File Server was active on the affected server, it may have failed over to another server. (Whether failover occurs is dependent on the configuration of the monitor and Virtual File Server.
50052 Share of subdirectory failed. failure 1 of .
Action. A Virtual File Share failure occurred on the specified server. The server may no longer be suitable for the Virtual File Server associated with the Virtual File Share. If the Virtual File Server was active on the affected server, it may have failed over to another server.