VERITAS Volume Manager 4.
© Copyright 2005 - 2006 Hewlett-Packard Development Company L.P.
Technical Support

Publication History

The manual publication date and part number indicate its current edition. The publication date will change when a new edition is released. The manual part number will change when extensive changes are made. To ensure that you receive the new editions, you should subscribe to the appropriate product support service. See your HP sales representative for details.

• First Edition: December 2002, 5187-1878, HP-UX 11i Version 1(B.11.
Contents

Technical Support
1. Recovery from Hardware Failure
   Introduction
   Listing Unstartable Volumes
   Displaying Volume and Plex States
3. Recovery from Boot Disk Failure
   Introduction
   Recovery from a Failed VxVM Root Mirror Disk
   Recovery by Booting from Recovery Media
   Starting VxVM after Booting from Recovery Media
Preface

The VERITAS Volume Manager Troubleshooting Guide provides information about how to recover from hardware failure, and how to understand and deal with VERITAS Volume Manager (VxVM) error messages during normal operation. It includes guidelines for recovering from the failure of disks and other hardware upon which virtual software objects such as subdisks, plexes and volumes are constructed in VxVM.

How This Guide Is Organized

This guide is organized as follows:
• “Recovery from Hardware Failure” on page 11
• “Recovery from Failure of Instant Snapshot Operations” on page 35
• “Recovery from Boot Disk Failure” on page 45
• “Logging Commands and Transactions” on page 53
• “Backing Up and Restoring Disk Group Configurations” on page 59
• “Error Messages” on page 65

Refer to the Release Notes for information about the other documentation that is provided with this product.
Table 1 Typographic Conventions (Continued)

Typeface: |
Usage: In command synopsis, a vertical bar separates mutually exclusive arguments.
Example: mount [ suid | nosuid ]

Typeface: blue text
Usage: An active hypertext link.
Example: In PDF and HTML files, click on links to move to the specified location.

Related Documentation

For more information about VERITAS 4.1 products, refer to the following documents located in the /usr/share/doc directory:
• VERITAS File System 4.1 Release Notes
• VERITAS File System 4.
1 Recovery from Hardware Failure

Introduction

Veritas Volume Manager (VxVM) protects systems from disk and other hardware failures and helps you to recover from such events. This chapter describes recovery procedures and information to help you prevent loss of data or system access due to disk and other hardware failures. If a volume has a disk I/O failure (for example, because the disk has an uncorrectable error), VxVM can detach the plex involved in the failure.
Recovery from Hardware Failure Displaying Volume and Plex States # vxinfo [-g diskgroup] [volume ...
See the “Creating and Administering Plexes” and “Administering Volumes” chapters in the VERITAS Volume Manager Administrator’s Guide for a description of the possible plex and volume states.

Understanding the Plex State Cycle

Changes in plex state are part of normal operations, and do not necessarily indicate abnormalities that must be corrected.
Figure 1-2, “Additional Plex State Transitions,” shows additional transitions that are possible between plex states as a result of hardware problems, abnormal system shutdown, and intervention by the system administrator. When first created, a plex has state EMPTY until the volume to which it is attached is initialized. Its state is then set to CLEAN.
“Recovering an Unstartable Mirrored Volume” on page 15 and subsequent sections describe the actions that you can take if a system crash or I/O error leaves no plexes of a mirrored volume in a CLEAN or ACTIVE state. For information on the recovery of RAID-5 volumes, see “Failures on RAID-5 Volumes” on page 19 and subsequent sections.
Recovery from Hardware Failure Recovering an Unstartable Volume with a Disabled Plex in the RECOVER State NOTE Following severe hardware failure of several disks or other related subsystems underlying all the mirrored plexes of a volume, it may be impossible to recover the volume using vxmend. In this case, remove the volume, recreate it on hardware that is functioning correctly, and restore the contents of the volume from a backup or from a snapshot image.
Forcibly Restarting a Disabled Volume

If a disk failure caused a volume to be disabled, and the volume does not contain any valid redundant plexes, you must restore the volume from a backup after replacing the failed disk.
c1t1d0 auto:simple mydg03 mydg online
. . .

• Use the vxedit set command to clear the flag for each disk that is marked as failing (in this example, mydg02):

# vxedit set failing=off mydg02

• Use the vxdisk list command to verify that the failing flag has been cleared:

# vxdisk list
DEVICE  TYPE         DISK    GROUP  STATUS
c1t1d0  auto:simple  mydg01  mydg   online
c1t2d0  auto:simple  mydg02  mydg   online
c1t3d0  auto:simple  mydg03  mydg   online
. . .
-  -  mydg03  mydg  failed was: c1t3d0
-  -  mydg04  mydg  failed was: c1t4d0

2. Once the fault has been corrected, the disks can be reattached by using the following command to rescan the device list:

# /usr/sbin/vxdctl enable

3. Use the vxreattach command with no options to reattach the disks:

# /etc/vx/bin/vxreattach

After reattachment takes place, recovery may not be necessary unless a disk was faulty and had to be replaced.
Recovery from Hardware Failure Failures on RAID-5 Volumes If a loss of sync occurs while a RAID-5 volume is being accessed, the volume is described as having stale parity. The parity must then be reconstructed by reading all the non-parity columns within each stripe, recalculating the parity, and writing out the parity stripe unit in the stripe. This must be done for every stripe in the volume, so it can take a long time to complete.
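The parity reconstruction described above can be illustrated with a minimal Python sketch. It assumes the usual RAID-5 scheme in which the parity stripe unit is the bytewise XOR of the data stripe units in the same stripe; it says nothing about how VxVM implements resynchronization internally, and the function names are hypothetical.

```python
from functools import reduce


def parity_unit(data_units):
    """XOR equal-length stripe units together, byte by byte."""
    if not data_units:
        raise ValueError("need at least one stripe unit")
    length = len(data_units[0])
    if any(len(unit) != length for unit in data_units):
        raise ValueError("stripe units must all be the same length")
    # zip(*units) walks the units in parallel, one byte position at a time.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_units))


def resync_parity(stripes):
    """Recompute parity for every stripe (hypothetical helper).

    Each stripe is a list of its non-parity stripe units. The work grows
    with the number of stripes, which is why a full resynchronization of
    a large volume can take a long time.
    """
    return [parity_unit(units) for units in stripes]


def rebuild_unit(surviving_units, parity):
    """Recover one missing data unit from the survivors and valid parity."""
    return parity_unit(list(surviving_units) + [parity])
```

Because XOR is its own inverse, the same routine that computes parity can also rebuild a single missing unit, but only while the parity for that stripe is valid.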
v  r5vol      -         ENABLED  DEGRADED  204800  RAID    -        raid5
pl r5vol-01   r5vol     ENABLED  ACTIVE    204800  RAID    3/16     RW
sd disk01-01  r5vol-01  disk01   0         102400  0/0     c2t9d0   ENA
sd disk02-01  r5vol-01  disk02   0         102400  1/0     c2t10d0  dS
sd disk03-01  r5vol-01  disk03   0         102400  2/0     c2t11d0  ENA
pl r5vol-02   r5vol     ENABLED  LOG       1440    CONCAT  -        RW
sd disk04-01  r5vol-02  disk04   0         1440    0       c2t12d0  ENA
pl r5vol-03   r5vol     ENABLED  LOG       1440    CONCAT  -        RW
sd disk05-0
sd disk04-01  r5vol-02  disk04   0    1440  0       c2t12d0  ENA
pl r5vol-03   r5vol     ENABLED  LOG  1440  CONCAT  -        RW
sd disk05-01  r5vol-12  disk05   0    1440  0       c2t14d0  ENA

Default Startup Recovery Process for RAID-5

VxVM may need to perform several operations to restore fully the contents of a RAID-5 volume and make it usable. Whenever a volume is started, any RAID-5 log plexes are zeroed before the volume is started.
Recovering a RAID-5 Volume

The types of recovery that may typically be required for RAID-5 volumes are the following:
• “Parity Resynchronization” on page 23
• “Log Plex Recovery” on page 25
• “Stale Subdisk Recovery” on page 25

Parity resynchronization and stale subdisk recovery are typically performed when the RAID-5 volume is started, or shortly after the system boots. They can also be performed by running the vxrecover command.
If a volume without valid RAID-5 logs is started and the process is killed before the volume is resynchronized, the result is an active volume with stale parity. The following example shows the output of the vxprint -ht command for a stale RAID-5 volume:

V NAME PL NAME SD NAME SV NAME ...
Log Plex Recovery

RAID-5 log plexes can become detached due to disk failures. These RAID-5 logs can be reattached by using the att keyword for the vxplex command. To reattach the failed RAID-5 log plex, use the following command:

# vxplex att r5vol r5vol-l1

Stale Subdisk Recovery

Stale subdisk recovery is usually done at volume start time.
Recovery from Hardware Failure Failures on RAID-5 Volumes NOTE RAID-5 subdisk moves are performed in the same way as subdisk moves for other volume types, but without the penalty of degraded redundancy. Starting RAID-5 Volumes When a RAID-5 volume is started, it can be in one of many states. After a normal system shutdown, the volume should be clean and require no recovery.
Figure 1-3 Invalid RAID-5 Volume
[Figure: a RAID-5 plex built from subdisks disk00-00 through disk05-00. Stripe W: Data, Data, Parity. Stripe X: Data, Parity, Data. Stripe Y: Parity, Data, Data. Stripe Z: Data, Data, Parity.]

This example shows four stripes in the RAID-5 array. All parity is stale and subdisk disk05-00 has failed. This makes stripes X and Y unusable because two failures have occurred within those stripes.
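The reasoning behind Figure 1-3 can be checked with a short Python sketch. The stripe layout is transcribed from the figure, and it is an assumption of this sketch (inferred from the statement that stripes X and Y are unusable) that the failed subdisk disk05-00 backs the third column of every stripe. The function name is hypothetical.

```python
# Stripe layout transcribed from Figure 1-3, columns listed left to right.
STRIPES = {
    "W": ["Data", "Data", "Parity"],
    "X": ["Data", "Parity", "Data"],
    "Y": ["Parity", "Data", "Data"],
    "Z": ["Data", "Data", "Parity"],
}

# Assumption: the failed subdisk disk05-00 provides the third column.
FAILED_COLUMN = 2


def unusable_stripes(stripes, failed_column, parity_stale):
    """Return the stripes whose data cannot be reconstructed.

    A stripe tolerates one failure. Losing a data unit while the parity
    is stale counts as two failures, so the data unit cannot be rebuilt.
    Losing only the parity unit leaves all of the data readable.
    """
    bad = []
    for name, units in stripes.items():
        lost_data = units[failed_column] == "Data"
        if lost_data and parity_stale:
            bad.append(name)
    return bad
```

With stale parity, the sketch reports stripes X and Y, matching the figure. With valid parity it reports none, because a single lost data unit could then be rebuilt from the surviving units.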
Recovery from Hardware Failure Recovering from Incomplete Disk Group Moves # vxmend [-g diskgroup] fix unstale subdisk • If some subdisks are stale and need recovery, and if valid logs exist, the volume is enabled by placing it in the ENABLED kernel state and the volume is available for use during the subdisk recovery. Otherwise, the volume kernel state is set to DETACHED and it is not available during subdisk recovery.
Recovery from Hardware Failure Recovery from Failure of a DCO Volume Automatic recovery depends on being able to import both the source and target disk groups. If this is not possible (for example, if one of the disk groups has been imported on another host), perform the following steps to recover the disk group: Step 1. Use the vxprint command to examine the configuration of both disk groups. Objects in disk groups whose move is incomplete have their TUTIL0 fields set to MOVE. Step 2.
Recovery from Hardware Failure Recovery from Failure of a DCO Volume Persistent FastResync uses a data change object (DCO) volume to perform tracking of changed regions in a volume. If an error occurs while reading or writing a DCO volume, it is detached and the badlog flag is set on the DCO. All further writes to the volume are not tracked by the DCO.
Recovery from Hardware Failure Recovering a Version 0 DCO Recovering a Version 0 DCO For a version 0 DCO, perform the following steps to recover the DCO volume: 1. Correct the problem that caused the I/O failure. 2.
Recovery from Hardware Failure Recovering a Version 20 DCO If a snapshot volume and the original volume are in different disk groups, you must perform a separate snapclear operation on each volume: # vxassist -g diskgroup1 snapclear volume snap_obj_to_snapshot # vxassist -g diskgroup2 snapclear snapvol snap_obj_to_volume Here snap_obj_to_volume is the name of the snap object associated with the snapshot volume, snapvol, that points to the original volume.
Recovery from Hardware Failure Recovering a Version 20 DCO # vxsnap [-g diskgroup] unprepare volume For the example output, the command would take this form: # vxsnap -g mydg unprepare vol1 4. Start the volume using the vxvol command: # vxvol [-g diskgroup] start volume For the example output, the command would take this form: # vxvol -g mydg start vol1 5.
2 Recovery from Failure of Instant Snapshot Operations This chapter describes how to recover from various failure and error conditions that may occur during instant snapshot operations: • “Failure of vxsnap prepare” on page 36 • “Failure of vxsnap make for Full-Sized Instant Snapshots” on page 37 • “Failure of vxsnap make for Break-Off Instant Snapshots” on page 38 • “Failure of vxsnap make for Space-Optimized Instant” on page 39 • “Failure of vxsnap restore” on page 40 • “Failure of vxsnap reatt
Failure of vxsnap prepare

If a vxsnap prepare operation fails prematurely, the vxprint command may show the new DCO volume in the INSTSNAPTMP state. VxVM can usually recover the DCO volume without intervention. However, in certain situations, this recovery may not succeed.
Failure of vxsnap make for Full-Sized Instant Snapshots

If a vxsnap make operation fails during the creation of a full-sized instant snapshot, the snapshot volume may go into the DISABLED state, be marked invalid and be rendered unstartable.
Failure of vxsnap make for Break-Off Instant Snapshots

If a vxsnap make operation fails during the creation of a third-mirror break-off instant snapshot, the snapshot volume may go into the INSTSNAPTMP state. VxVM can usually recover the snapshot volume without intervention. However, in certain situations, this recovery may not succeed.
Failure of vxsnap make for Space-Optimized Instant Snapshots

If a vxsnap make operation fails during the creation of a space-optimized instant snapshot, the snapshot volume may go into the INSTSNAPTMP state. VxVM can usually recover the snapshot volume without intervention. However, in certain situations, this recovery may not succeed.
Failure of vxsnap restore

If a vxsnap restore operation fails, the volume being restored may go into the DISABLED state.
Failure of vxsnap reattach or refresh

If a vxsnap reattach or refresh operation fails, the volume being refreshed may go into the DISABLED state, be marked invalid and be rendered unstartable. You can use the following command to check that the inst_invalid flag is set to on:

# vxprint [-g diskgroup] -F%inst_invalid volume

Use the following steps to recover the volume:

1.
Copy-on-write Failure

If an error is encountered while performing an internal resynchronization of a volume’s snapshot, the snapshot volume goes into the INVALID state, and is made inaccessible for I/O and instant snapshot operations. Use the following steps to recover the snapshot volume:

1. Use the vxsnap command to dissociate the volume from the snapshot hierarchy:

# vxsnap [-g diskgroup] dis snapshot_volume

2.
I/O Errors During Resynchronization

Snapshot resynchronization (started by vxsnap syncstart, or by specifying sync=on to vxsnap) stops if an I/O error occurs, and displays the following message on the system console:

VxVM vxsnap ERROR V-5-1-6840 Synchronization of the volume volume stopped due to I/O error

After correcting the source of the error, use the following command to restart the resynchronization operation:

#
I/O Failure on a DCO Volume

If an I/O failure occurs on a DCO volume, its FastResync maps and DRL log cannot be accessed, and the DCO volume is marked with the BADLOG flag. DRL logging and recovery, and instant snapshot operations are not possible with the volume until you recover its DCO volume using the procedure described in “Recovering a Version 20 DCO” on page 32.
3 Recovery from Boot Disk Failure
Introduction

Veritas Volume Manager (VxVM) protects systems from disk and other hardware failures and helps you to recover from such events. This chapter describes recovery procedures and provides information that helps to prevent loss of data or system access due to the failure of the boot (root) disk. For information about recovering volumes and their data on non-boot disks, see Chapter 1, “Recovery from Hardware Failure,” on page 11.
Recovery from a Failed VxVM Root Mirror Disk

If a failed primary boot disk is under VxVM control and is mirrored, follow these steps to replace it:

Step 1. Replace the failed boot disk. Depending on the system hardware, this may require you to shut down and power off the system.

Step 2.
Recovery by Booting from Recovery Media

If there is a failure to boot from the VxVM boot disk on HP-UX 11i, and no bootable root mirror is available, it may be necessary to boot from an alternate boot source, or from recovery media such as the following:
• HP-UX 11i installation CD.
• Bootable recovery tape.
• Secondary boot disk in the configuration.
• HP-UX Ignite-UX server that is accessible over a LAN.
TY NAME           ASSOC       KSTATE    LENGTH  PLOFFS  STATE ...
v  rootvol        root        DISABLED  393216  -       ACTIVE ...
pl rootvol-01     rootvol     DISABLED  393216  -       STALE ...
sd rootdisk01-02  rootvol-01  ENABLED   393216  0       - ...
pl rootvol-02     rootvol     DISABLED  393216  -       STALE ...
sd rootdisk02-02  rootvol-02  ENABLED   393216  0       - ...
Using VxVM Maintenance Mode Boot (MMB)

Another method for performing limited recovery on a VxVM boot disk is to use the VxVM Maintenance Mode Boot (MMB). MMB mode is initiated by booting the system and gaining control at the ISL prompt.
Recovery from Boot Disk Failure Using VxVM Maintenance Mode Boot (MMB) Step 5.
Recovery by Reinstallation

NOTE If you configured VxVM rootability by installing via Ignite-UX, consult the “System Recovery” section of the Ignite-UX Administration Guide before consulting this section. In many instances, reinstalling from a saved Ignite-UX configuration is sufficient to recover a failed boot disk.
4 Logging Commands and Transactions

This chapter provides information on how to administer logging of commands and transactions in VERITAS Volume Manager (VxVM). For information on how to administer error logging, see “Error Messages” on page 65.

Logging Commands

The vxcmdlog command allows you to log the invocation of other VxVM commands to a file.
Logging Commands and Transactions Logging Commands The size of the command log is checked after an entry has been written so the actual size may be slightly larger than that specified. When the log reaches a maximum size, the current command log file, cmdlog, is renamed as the next available historic log file, cmdlog.number, where number is an integer from 1 up to the maximum number of historic log files that is currently defined, and a new current log file is created.
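The renaming rule described above can be sketched in Python. This is an illustration only, not VxVM code: it assumes that "next available" means the lowest unused number between 1 and the configured maximum, and it deliberately returns None when every historic slot is taken, because the behavior in that case is not stated in the text above. The function name and the max_files parameter are hypothetical.

```python
def next_historic_name(existing, max_files, base="cmdlog"):
    """Pick the name that the current log would be renamed to.

    existing  -- file names already present in the log directory
    max_files -- maximum number of historic log files currently defined
    """
    used = set()
    for name in existing:
        prefix, dot, suffix = name.rpartition(".")
        # Only names of the form base.<integer> count as historic logs.
        if prefix == base and dot and suffix.isdigit():
            used.add(int(suffix))
    for number in range(1, max_files + 1):
        if number not in used:
            return f"{base}.{number}"
    return None  # no free slot; handling of this case is not covered here
```

For example, with cmdlog, cmdlog.1 and cmdlog.2 present and a maximum of five historic files, the sketch selects cmdlog.3. Passing base="translog" applies the same rule to the transaction log described later in this chapter.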
If there is an error reading from the settings file, command logging switches to its built-in default settings. This may mean, for example, that logging remains enabled after being disabled using the vxcmdlog -m off command. If this happens, use the vxcmdlog utility to recreate the settings file, or restore the file from a backup. See the vxcmdlog(1M) manual page for more information about the vxcmdlog utility.
Logging Commands and Transactions Logging Transactions NOTE The .translog file is a binary and should not be edited. The size of the transaction log is checked after an entry has been written so the actual size may be slightly larger than that specified. When the log reaches a maximum size, the current transaction log file, translog, is renamed as the next available historic log file, translog.
NOTE The client ID is the same as that recorded for the corresponding command line in the command log. See “Logging Commands” on page 53 and “Associating Command and Transaction Logs” on page 57 for more information.

If there is an error reading from the settings file, transaction logging switches to its built-in default settings.
Logging Commands and Transactions Associating Command and Transaction Logs /usr/sbin/vxdg -m import foodg NOTE If there are multiple matches for the combination of the client and process ID, you can determine the correct match by examining the time stamp. If a utility opens a conditional connection to vxconfigd, its client ID is shown as zero in the command log, and as a non-zero value in the transaction log. You can use the process ID and time stamp to relate the log entries in such cases.
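The matching rules in this section can be sketched in Python. The record layout below is hypothetical and does not reproduce the on-disk format of either log; it only captures the three values the text relies on (client ID, process ID and time stamp). The max_skew window is likewise an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical parsed log record, not the real log file format."""
    timestamp: float  # seconds, any consistent clock
    client_id: int
    pid: int
    text: str


def find_transactions(command, transactions, max_skew=5.0):
    """Return transaction entries that plausibly belong to a command.

    Entries must share the command's process ID. The client ID must
    also match, unless the command log shows a client ID of zero
    (a conditional connection to vxconfigd), in which case only the
    process ID and time stamp are used. Candidates are ordered by how
    close their time stamp is to the command's time stamp.
    """
    candidates = [
        t for t in transactions
        if t.pid == command.pid
        and (command.client_id == 0 or t.client_id == command.client_id)
    ]
    close = [t for t in candidates
             if abs(t.timestamp - command.timestamp) <= max_skew]
    return sorted(close, key=lambda t: abs(t.timestamp - command.timestamp))
```

When several entries survive the filter, the first element of the result is the one closest in time, which mirrors the advice to examine the time stamp when there are multiple matches.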
5 Backing Up and Restoring Disk Group Configurations

Disk group configuration backup and restoration allows you to back up and restore all configuration data for VERITAS Volume Manager (VxVM) disk groups, and for VxVM objects such as volumes that are configured within the disk groups. Using this feature, you can recover from corruption of a disk group’s configuration that is stored as metadata in the private region of a VM disk.
Backing Up and Restoring Disk Group Configurations If such errors occur, you can restore the disk group configuration from a backup after you have corrected any underlying problem such as failed or disconnected hardware. Configuration data from a backup allows you to reinstall the private region headers of VxVM disks in a disk group whose headers have become damaged, to recreate a corrupted disk group configuration, or to recreate a disk group and the VxVM objects within it.
Backing Up a Disk Group Configuration

VxVM uses the disk group configuration daemon to monitor the configuration of disk groups, and to back up the configuration whenever it is changed. By default, the five most recent backups are preserved. If required, you can also back up a disk group configuration by running the vxconfigbackup command.
Restoring a Disk Group Configuration

You can use the vxconfigrestore utility to restore or recreate a disk group from its configuration backup. The restoration process has two stages: precommit and commit. In the precommit stage, you can examine the configuration of the disk group that would be restored from the backup.
If any of the disk headers are reinstalled, a saved copy of the disks’ attributes is used to recreate their private and public regions. These disks are also assigned new disk IDs. The VxVM objects within the disk group are then recreated using the backup configuration records for the disk group. This process also has the effect of creating new configuration copies in the disk group.
Resolving Conflicting Backups for a Disk Group

In some circumstances where disks have been replaced on a system, there may exist several conflicting backups for a disk group. In this case, you see a message similar to the following from the vxconfigrestore command:

VxVM vxconfigrestore ERROR V-5-1-6012 There are two backups that have the same diskgroup name with different diskgroup id : 1047336696.19.xxx.
6 Error Messages This chapter provides information on error messages associated with the Veritas Volume Manager (VxVM) configuration daemon (vxconfigd), the kernel, and other utilities. It covers most informational, failure, and error messages displayed on the console by vxconfigd, and by the Veritas Volume Manager kernel driver, vxio. These include some errors that are infrequently encountered and difficult to troubleshoot. NOTE Some error messages described here may not apply to your system.
Error Messages Configuring Logging in the Startup Script To enable logging of console output to the file /var/adm/configd.log, edit the startup script for vxconfigd as described in “Configuring Logging in the Startup Script” on page 66 or invoke vxconfigd under the C locale as shown here: # vxconfigd [-x [1-9]] -x log There are 9 possible levels of debug logging; 1 provides the least detail, and 9 the most.
NOTE By default, vxconfigd is started at boot time with the -x syslog option. This redirects vxconfigd console messages to syslog. If you want to retain this behavior when restarting vxconfigd from the command line, include the -x syslog argument, as restarting vxconfigd does not preserve the option settings with which it was previously running.
Error Messages Understanding Messages A panic is a severe event as it halts a system during its normal operation. A panic message from the kernel module or from a device driver indicates a hardware problem or software inconsistency so severe that the system cannot continue. The operating system may also provide a dump of the CPU register contents and a stack trace to aid in identifying the cause of the panic.
Error Messages Understanding Messages VxVM), the second field (1) represents information about the product component, and the third field (3141) is the message index. The text of the error message follows the message number. Messages This section contains a list of messages that you may encounter during the operation of VERITAS Volume Manager. However, the list is not exhaustive and the second field may contain the name of different command, driver or module from that shown here.
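The message layout described under “Understanding Messages” can be split into its parts with a small Python sketch. It is based only on the message lines shown in this chapter (product, component, severity, then an identifier of the form V-n-n-n followed by the text); the class and function names are hypothetical, and the severity pattern is an assumption that covers the severities that appear in this guide, including the two-word FATAL ERROR.

```python
import re
from typing import NamedTuple, Optional


class VxMessage(NamedTuple):
    product: str          # for example "VxVM"
    component: str        # for example "vxio" or "vxconfigd"
    severity: str         # for example "NOTICE", "WARNING", "FATAL ERROR"
    first_field: int      # first numeric field of the identifier
    component_field: int  # second field: product component information
    index: int            # third field: the message index
    text: str


# Assumed pattern, derived from the message lines quoted in this chapter.
_PATTERN = re.compile(
    r"^(\S+)\s+(\S+)\s+(FATAL ERROR|[A-Z]+)\s+V-(\d+)-(\d+)-(\d+)\s+(.*)$"
)


def parse_message(line: str) -> Optional[VxMessage]:
    """Split one console message line into its fields, or return None."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    product, component, severity, f1, f2, f3, text = match.groups()
    return VxMessage(product, component, severity, int(f1), int(f2), int(f3), text)
```

Keeping the three numeric fields separate makes it straightforward to look up an entry in the list that follows, which is ordered by identifier.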
Error Messages Understanding Messages V-5-0-34 VxVM vxdmp NOTICE V-5-0-34 added disk array disk_array_serial_number • Description: A new disk array has been added to the host. • Action: None. V-5-0-35 VxVM vxdmp NOTICE V-5-0-35 Attempt to disable controller controller_name failed. Rootdisk has just one enabled path. • Description: An attempt is being made to disable the one remaining active path to the root disk controller. • Action: The path cannot be disabled.
Error Messages Understanding Messages V-5-0-110 VxVM vxdmp NOTICE V-5-0-110 disabled controller controller_name connected to disk array disk_array_serial_number • Description: All paths through the controller connected to the disk array are disabled. This usually happens if a controller is disabled for maintenance. • Action: None. V-5-0-111 VxVM vxdmp NOTICE V-5-0-111 disabled dmpnode dmpnode_device_number • Description: A DMP node has been marked disabled in the DMP database.
Error Messages Understanding Messages V-5-0-145 VxVM vxio WARNING V-5-0-145 DRL volume volume is detached • Description: A Dirty Region Logging volume became detached because a DRL log entry could not be written. If this is due to a media failure, other errors may have been logged to the console. • Action: The volume containing the DRL log continues in operation.
• Description: A node failed to join a cluster. This may be caused by the node being unable to see all the shared disks. Other error messages may provide more information about the disks that cannot be found.
• Action: Use the vxdisk -s list command on the master node to see what disks should be visible to the slave node. Then check that the operating system and VxVM on the failed node can also see these disks.
Error Messages Understanding Messages • Action: None; under normal startup conditions, this message should not occur. If necessary, start VxVM and re-attempt the operation. V-5-0-194 VxVM vxio WARNING V-5-0-194 Kernel log full: volume detached • Description: A plex detach failed because the kernel log was full. As a result, the mirrored volume will become detached. • Action: It is unlikely that this condition ever occurs. The only corrective action is to reboot the system.
Error Messages Understanding Messages • Description: A subdisk was detached from a RAID-5 volume because of the failure of a disk or an uncorrectable error occurring on that disk. • Action: Check for other console error messages indicating the cause of the failure. Replace a failed disk as soon as possible. V-5-0-243 VxVM vxio WARNING V-5-0-243 Overlapping mirror plex detached from volume volume • Description: An error has occurred on the last complete plex in a mirrored volume.
Error Messages Understanding Messages • Action: If the volume is mirrored, no further action is necessary since the alternate mirror’s contents will be written to the failing mirror; this is often sufficient to correct media failures. If this error occurs often, but never leads to a plex detach, there may be a marginally defective region on the disk at the position indicated. It may eventually be necessary to remove data from this disk (see the vxevac(1M) manual page) and then to reformat the drive.
Error Messages Understanding Messages • Action: Check for obvious problems with the disk (such as a disconnected cable). If hot-relocation is enabled and the disk is failing, recovery from subdisk failure is handled automatically. V-5-1-90 VxVM vxconfigd ERROR V-5-1-90 mode: Unrecognized operating mode • Description: An invalid string was specified as an argument to the -m option. Valid strings are: enable, disable, and boot. • Action: Supply a correct option argument.
Error Messages Understanding Messages • Description: The given directory could not be removed because vxconfigd could not fork in order to run the rm utility. This is not a serious error. The only side effect of a directory not being removed is that the directory and its contents will continue to use space in the root file system. The most likely cause for this error is that your system does not have enough memory or paging space to allow vxconfigd to fork.
Error Messages Understanding Messages V-5-1-123 VxVM vxconfigd ERROR V-5-1-123 Disk group group: Disabled by errors • Description: This message indicates that some error condition has made it impossible for VxVM to continue to manage changes to a disk group. The major reason for this is that too many disks have failed, making it impossible for vxconfigd to continue to update configuration copies. There should be a preceding error message that indicates the specific error that was encountered.
Error Messages Understanding Messages V-5-1-135 VxVM vxconfigd FATAL ERROR V-5-1-135 Memory allocation failure during startup • Description: This implies that there is insufficient memory to start up VxVM. • Action: This error should not normally occur, unless your system has very small amounts of memory. Adding swap space probably will not help, because this error is most likely to occur early in the boot sequence, before swap areas have been added.
Error Messages Understanding Messages — Description: vxconfigd could not open the /etc/fstab file, for the reason given. The /etc/fstab file is used to determine which volume (if any) to use for the /usr file system. — Action: This error implies that your root file system is currently unusable. You may be able to repair the root file system by mounting it after booting from a network or CD-ROM root file system.
• Description: These errors indicate that the volume cannot be started because the volume contains no valid complete plexes. This can happen, for example, if disk failures have caused all plexes to be unusable. It can also happen as a result of actions that caused all plexes to become unusable (for example, forcing the dissociation of subdisks, or detaching, dissociating, or offlining plexes).
V-5-1-528
VxVM vxconfigd NOTICE V-5-1-528 Detached volume volume
• Description: The specified volume was detached as a result of a disk failure, or as a result of the administrator removing a disk with vxdg -k rmdisk. A failing disk is indicated by a "Detached disk" message. Unless the disk error is transient and can be fixed with a reboot, the contents of the volume should be considered lost.
• Action: Contact VERITAS Technical Support.
V-5-1-546
VxVM vxconfigd WARNING V-5-1-546 Disk disk in group group: Disk device not found
• Description: No physical disk can be found that matches the named disk in the given disk group. This is equivalent to failure of that disk. (Physical disks are located by matching the disk IDs in the disk group configuration records against the disk IDs stored in the VERITAS Volume Manager header on the physical disks.
• Description: This can result from using vxdctl hostid to change the VERITAS Volume Manager host ID for the system. The error indicates that one of the disks in a disk group could not be updated with the new host ID. This usually indicates that the disk has become inaccessible or has failed in some other way.
• Description: On system startup, vxconfigd failed to import the disk group associated with the named disk. A message related to the specific failure is given in reason. Additional error messages may be displayed that give more information on the specific error. In particular, this is often followed by:
VxVM vxconfigd ERROR V-5-1-579 Disk group group: Errors in some configuration copies:
Disk device, copy number: Block bno: error ...
V-5-1-571
VxVM vxconfigd ERROR V-5-1-571 Disk group group, Disk disk: Skip disk group with duplicate name
• Description: Two disk groups with the same name are tagged for auto-importing by the same host. Disk groups are identified both by a simple name and by a long unique identifier (disk group ID) assigned when the disk group is created. Thus, this error indicates that two disks indicate the same disk group name but a different disk group ID.
• Action: Reinitialize the disks in the group with larger log areas. Note that this requires that you restore data on the disks from backups. See the vxdisk(1M) manual page. To reinitialize all of the disks, detach them from the group with which they are associated, reinitialize and re-add them. Then deport and re-import the disk group to effect the changes to the log areas for the group.
• Action: The action to be taken depends on the reason given in the error message:
Disk is in use by another host
No valid disk found containing disk group
The first message indicates that disks have been moved from a system that has crashed or that failed to detect the group before the disk was moved. The locks stored on the disks must be cleared.
CAUTION Be careful when using the -f option. It can cause the same disk group to be imported twice from different sets of disks, causing the disk group to become inconsistent.
These operations can also be performed using the vxdiskadm utility. To deport a disk group using vxdiskadm, select menu item 8 (Remove access to (deport) a disk group). To import a disk group, select item 7 (Enable access to (import) a disk group).
V-5-1-768
VxVM vxconfigd NOTICE V-5-1-768 Offlining config copy number on disk disk: Reason: reason
• Description: An I/O error caused the indicated configuration copy to be disabled. This is a notice only, and does not normally imply serious problems, unless this is the last active configuration copy in the disk group.
V-5-1-1186
VxVM vxconfigd ERROR V-5-1-1186 Volume volume for mount point /usr not found in bootdg disk group
• Description: The system is configured to boot with /usr mounted on a volume, but the volume associated with /usr is not listed in the configuration of the boot disk group.
VxVM vxconfigd ERROR V-5-1-1589 enable failed: Error check group configuration copies. Database file not found
• Description: Regular startup of vxconfigd failed. This error can also result from the command vxdctl enable. The directory /var/vxvm/tempdb is inaccessible. This may be because the root file system is corrupted or full or, if /var is a separate file system, because /var has become corrupted or has not been mounted.
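Because V-5-1-1589 points to an inaccessible /var/vxvm/tempdb directory, a quick check of that directory can narrow down the cause. The following is a diagnostic sketch only; the function name check_tempdb_dir is hypothetical, and the check does not repair anything.

```shell
#!/bin/sh
# Hypothetical diagnostic helper: report whether the directory exists and
# can be read and searched. Defaults to the path named in the description.
check_tempdb_dir() {
    dir=${1:-/var/vxvm/tempdb}
    if [ -d "$dir" ] && [ -r "$dir" ] && [ -x "$dir" ]; then
        echo "accessible: $dir"
    else
        echo "inaccessible: $dir"
        return 1
    fi
}

# If the check fails, examine the root file system and, where /var is a
# separate file system, whether /var is mounted and intact.
check_tempdb_dir /var/vxvm/tempdb || echo "check the root and /var file systems"
```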
• Description: The -k (kill existing vxconfigd process) option was specified, but a running configuration daemon process could not be killed. A configuration daemon process, for purposes of this discussion, is any process that opens the /dev/vx/config device (only one process can open that device at a time). If there is a configuration daemon process already running, then the -k option causes a SIGKILL signal to be sent to that process.
• Description: This message is returned by the vxdmpadm utility when an attempt is made to enable a controller that is not working or is not physically present.
• Action: Check hardware and see if the controller is present and whether I/O can be performed through it.

V-5-1-2353
VxVM vxconfigd ERROR V-5-1-2353 Disk group group: Cannot recover temp database: reason
Consider use of "vxconfigd -x cleartempdir" [see vxconfigd(1M)].
V-5-1-2630
VxVM vxconfigd WARNING V-5-1-2630 library and vxconfigd disagree on existence of client number
• Description: This warning may safely be ignored.
• Action: None required.

V-5-1-2824
VxVM vxconfigd ERROR V-5-1-2824 Configuration daemon error 242
• Description: A node failed to join a cluster, or a cluster join is taking too long. If the join fails, the node retries the join automatically.
• Action: Use the vxprint command to display the status of the disk groups involved. If vxprint shows that the TUTIL0 field for a disk group is set to MOVE, and you are certain that no disk group move, split or join should be in progress, use the vxdg command to clear the field as described in “Recovering from Incomplete Disk Group Moves” on page 28. Otherwise, retry the operation.
• Action: Objects specified for a disk group move, split or join must be either disks or top-level volumes.

V-5-1-2907
VxVM vxdg ERROR V-5-1-2907 diskgroup: Disk group does not exist
• Description: The disk group does not exist or is not imported.
• Action: Use the correct name, or import the disk group and try again.
V-5-1-2933
VxVM vxdg ERROR V-5-1-2933 diskgroup: Cannot remove last disk group configuration copy
• Description: The requested disk group move, split or join operation would leave the disk group without any configuration copies.
• Action: None. The operation is not supported.

V-5-1-3009
VxVM vxdg ERROR V-5-1-3009 object: Name conflicts with imported diskgroup
• Description: The target disk group of a split operation already exists as an imported disk group.
• Action: If the disk group is not imported by another cluster, retry the import using the -C (clear import) flag.

V-5-1-3024
VxVM vxconfigd ERROR V-5-1-3024 vxclust not there
• Description: An error during an attempt to join a cluster caused vxclust to fail. This may be caused by the failure of another node during a join or by the failure of vxclust.
• Action: Retry the join. An error message on the other node may clarify the problem.
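The import retry with the -C (clear import) flag, recommended earlier on this page, can be sketched as follows. The helper prints the command rather than running it, so that it can be reviewed first. The function name print_clear_import and the disk group name mydg are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the import retry with the
# -C (clear import) flag for the named disk group.
# Use -C only after confirming that no other cluster has the group imported.
print_clear_import() {
    dg=$1
    [ -n "$dg" ] || { echo "usage: print_clear_import diskgroup" >&2; return 1; }
    echo "vxdg -C import $dg"
}

# mydg is a placeholder disk group name.
print_clear_import mydg
```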
• Action: Before retrying the join, use vxdg reminor (see the vxdg(1M) manual page) to choose a new minor number range either for the disk group on the master or for the conflicting disk group on the slave. If there are open volumes in the disk group, the reminor operation will not take effect until the disk group is deported and updated (either explicitly or by rebooting the system).
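The reminor step can be sketched as below. The helper only prints the command for review. It assumes the form vxdg reminor diskgroup new-base-minor; confirm the exact syntax in the vxdg(1M) manual page for your release. The function name, the disk group name mydg, and the base minor number 30000 are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a reminor command.
# Assumed syntax: vxdg reminor diskgroup new-base-minor (see vxdg(1M)).
print_reminor() {
    dg=$1
    base=$2
    [ -n "$dg" ] || { echo "disk group name required" >&2; return 1; }
    case "$base" in
        ''|*[!0-9]*)
            echo "base minor must be a non-negative integer" >&2
            return 1 ;;
    esac
    echo "vxdg reminor $dg $base"
}

# mydg and 30000 are placeholder values; choose a range that does not
# conflict with any disk group on the other node.
print_reminor mydg 30000
```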
• Description: The disk group could not be activated because it is activated in a conflicting mode on another node in a cluster.
• Action: Retry later, or deactivate the disk group on conflicting nodes.

V-5-1-3049
VxVM vxconfigd ERROR V-5-1-3049 Retry rolling upgrade
• Description: An attempt was made to upgrade a cluster to a higher protocol version when a transaction was in progress.
• Action: Retry the upgrade at a later time.
V-5-1-3243
VxVM vxdmpadm ERROR V-5-1-3243 The VxVM restore daemon is already running. You can stop and restart the restore daemon with desired arguments for changing any of its parameters.
• Description: The vxdmpadm start restore command has been executed while the restore daemon is already running.
• Action: Stop the restore daemon and restart it with the required set of parameters as shown in the vxdmpadm(1M) manual page.
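The stop-and-restart sequence for the restore daemon can be sketched as follows. The helper prints the two commands for review instead of running them. The interval and policy arguments are assumed to take the form shown; check the vxdmpadm(1M) manual page for the parameters that your release supports. The function name and the sample values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands that stop the
# restore daemon and restart it with new parameters.
# Assumed argument form: interval=seconds policy=name (see vxdmpadm(1M)).
print_restore_restart() {
    interval=$1
    policy=$2
    if [ -z "$interval" ] || [ -z "$policy" ]; then
        echo "usage: print_restore_restart interval policy" >&2
        return 1
    fi
    echo "vxdmpadm stop restore"
    echo "vxdmpadm start restore interval=$interval policy=$policy"
}

# 400 and check_all are illustrative values only.
print_restore_restart 400 check_all
```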
V-5-1-3828
VxVM vxconfigd ERROR V-5-1-3828 upgrade operation failed: Already at highest version
• Description: An upgrade operation has failed because a cluster is already running at the highest protocol version supported by the master.
• Action: No further action is possible as the master is already running at the highest protocol version it can support.
• Description: An error was detected while adding a DCO object and DCO volume to a mirrored volume. There is at least one snapshot plex already created on the volume. Because this snapshot plex was created when no DCO was associated with the volume, there is no DCO plex allocated for it.
• Action: See the section “Adding a Version 0 DCO and DCO Volume” in the chapter “Administering Volume Snapshots” of the VERITAS Volume Manager Administrator’s Guide.
• Action: If a connection to SAL is desired, ensure that the correct version of SAL is installed and configured correctly. Otherwise, suppress communication between vxassist and SAL by adding the following line to the vxassist defaults file (usually /etc/default/vxassist):
salcontact=no

V-5-1-4625
VxVM vxassist WARNING V-5-1-4625 SAL authentication failed...
• Description: The SAN Access Layer (SAL) rejects the credentials that are supplied by the vxassist command.
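The salcontact=no line mentioned earlier on this page can be added to the vxassist defaults file without creating duplicates. The following is a minimal sketch; the function name disable_sal_contact is hypothetical, and the example runs against a scratch file rather than /etc/default/vxassist.

```shell
#!/bin/sh
# Hypothetical helper: append salcontact=no to the given defaults file
# unless that exact line is already present.
# Note: it does not rewrite an existing salcontact line that has a
# different value; review the file by hand in that case.
disable_sal_contact() {
    defaults_file=${1:-/etc/default/vxassist}
    grep -qx 'salcontact=no' "$defaults_file" 2>/dev/null ||
        echo 'salcontact=no' >> "$defaults_file"
}

# Demonstrate on a scratch file; calling the helper twice adds one line.
scratch=$(mktemp)
disable_sal_contact "$scratch"
disable_sal_contact "$scratch"
cat "$scratch"
rm -f "$scratch"
```

To apply the change to the real defaults file, call the helper with no argument, or with the path used on your system, after making a backup copy.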
V-5-1-5162
VxVM vxplex ERROR V-5-1-5162 Plexes do not belong to the same snapshot volume.
• Description: An attempt was made to snap back plexes that belong to different snapshot volumes.
• Action: Specify the plexes in separate invocations of vxplex snapback.

V-5-1-5929
VxVM vxconfigd NOTICE V-5-1-5929 Unable to resolve duplicate diskid.
c8t0d1 /0/4/0/0.8.0.108.0.0.1
c8t0d2 /0/4/0/0.8.0.108.0.0.2
controllers
#
product
#
pathgroups

- Case 2: Some arrays such as EMC and HDS provide mirroring in hardware. When a LUN pair is split, depending on how the process is performed, this may result in two disks with the same disk ID. Check with your array vendor to make sure that you are using the correct split procedure.