VERITAS Volume Manager 4.
Disclaimer
The information contained in this publication is subject to change without notice. VERITAS Software Corporation makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. VERITAS Software Corporation shall not be liable for errors contained herein or for incidental or consequential damages in connection with the furnishing, performance, or use of this manual.
Contents
Preface  vii
How This Guide Is Organized  vii
Conventions  viii
Getting Help
Recovering a Version 0 DCO  21
Recovering a Version 20 DCO  22
Chapter 2. Recovery from Failure of Instant Snapshot Operations  25
Failure of vxsnap prepare
Resolving Conflicting Backups for a Disk Group  46
Chapter 6. Error Messages  49
Logging Error Messages  49
Configuring Logging in the Startup Script  50
Understanding Messages
vi VERITAS Volume Manager Troubleshooting Guide
Preface
The VERITAS Volume Manager Troubleshooting Guide provides information about how to recover from hardware failure, and how to understand and deal with VERITAS Volume Manager (VxVM) error messages during normal operation. It includes guidelines for recovering from the failure of disks and other hardware upon which virtual software objects such as subdisks, plexes and volumes are constructed in VxVM.
Conventions
Convention: monospace
Usage: Used for path names, commands, output, directory and file names, functions, and parameters.
Examples: Read tunables from the /etc/vx/tunefstab file. See the ls(1) manual page for more information.
Convention: monospace (bold)
Usage: Indicates user input.
Examples: # ls pubs
C:\> dir pubs
Convention: italic
Usage: Identifies book titles, new terms, emphasized text, and variables replaced with a name or value.
Example: See the User’s Guide for details.
Getting Help
For technical assistance, visit http://support.veritas.com and select phone or email support. This site also provides access to resources such as TechNotes, product alerts, software downloads, hardware compatibility lists, and the VERITAS customer email notification service. Use the Knowledge Base Search feature to access additional product information, including current and past releases of product documentation.
Documentation Feedback
1 Recovery from Hardware Failure
VERITAS Volume Manager (VxVM) protects systems from disk and other hardware failures and helps you to recover from such events. This chapter describes recovery procedures and information to help you prevent loss of data or system access due to disk and other hardware failures. If a volume has a disk I/O failure (for example, because the disk has an uncorrectable error), VxVM can detach the plex involved in the failure.
Listing Unstartable Volumes
An unstartable volume can be incorrectly configured or have other errors or conditions that prevent it from being started. To display unstartable volumes, use the vxinfo command. This displays information about the accessibility and usability of volumes:
# vxinfo [-g diskgroup] [volume ...
Understanding the Plex State Cycle
Changes in plex state are part of normal operation and do not necessarily indicate abnormalities that must be corrected. A clear understanding of the various plex states and their interrelationship is necessary to perform the recovery procedures described in this chapter. The figure, “Main Plex State Cycle,” shows the main transitions that take place between plex states in VxVM.
Additional Plex State Transitions
[Figure: plex state transitions. Legend: PS = Plex State. PS values shown: EMPTY, CLEAN, ACTIVE, OFFLINE, IOFAIL. PKS values shown: DISABLED, ENABLED. Transitions shown: Create plex; Initialize plex (vxvol init clean); Start up (vxvol start); Shut down (vxvol stop); After crash and reboot (vxvol start); Recover data (vxvol resync); Take plex offline (vxmend off); Put plex online (vxmend on); Resync data (vxplex att); Uncorrectable I/O failure; Resync fails.]
Recovering an Unstartable Mirrored Volume
A system crash or an I/O error can corrupt one or more plexes of a mirrored volume and leave no plex CLEAN or ACTIVE. You can mark one of the plexes CLEAN and instruct the system to use that plex as the source for reviving the others as follows:
1.
Recovering an Unstartable Volume with a Disabled Plex in the RECOVER State
A plex is shown in the RECOVER state if its contents are out-of-date with respect to the volume. This can happen if a disk containing one or more of the plex’s subdisks has been replaced or reattached. If a plex is shown as being in this state, it can be recovered as follows:
1.
Forcibly Restarting a Disabled Volume
If a disk failure caused a volume to be disabled, and the volume does not contain any valid redundant plexes, you must restore the volume from a backup after replacing the failed disk.
3. Use the vxdisk list command to verify that the failing flag has been cleared:
# vxdisk list
DEVICE   TYPE          DISK     GROUP   STATUS
c1t1d0   auto:simple   mydg01   mydg    online
c1t2d0   auto:simple   mydg02   mydg    online
c1t3d0   auto:simple   mydg03   mydg    online
. . .
Reattaching Failed Disks
You can perform a reattach operation if a disk could not be found at system startup, or if VxVM is started with some disk drivers unloaded and unloadable (causing disks to enter the failed state).
You can use the command vxreattach -c to check whether reattachment is possible, without performing the operation. Instead of reattaching the disk, the command displays the disk group and disk media name where the disk can be reattached. See the vxreattach(1M) manual page for more information on the vxreattach command.
Failures on RAID-5 Volumes
Failures are seen in two varieties: system failures and disk failures.
Disk Failures
An uncorrectable I/O error occurs when disk failure, cabling or other problems cause the data on a disk to become unavailable. For a RAID-5 volume, this means that a subdisk becomes unavailable. The subdisk cannot be used to hold data and is considered stale and detached. If the underlying disk becomes available or is replaced, the subdisk is still considered stale and is not used.
A disk containing a RAID-5 log plex can also fail. The failure of a single RAID-5 log plex has no direct effect on the operation of a volume provided that the RAID-5 log is mirrored. However, loss of all RAID-5 log plexes in a volume makes it vulnerable to a complete failure. In the output of the vxprint -ht command, failure within a RAID-5 log plex is indicated by the plex state being shown as BADLOG rather than LOG.
The volume is not made available while the parity is resynchronized because any subdisk failure during this period makes the volume unusable. This can be overridden by using the -o unsafe start option with the vxvol command. If any stale subdisks exist, the RAID-5 volume is unusable.
Caution The -o unsafe start option is considered dangerous, as it can make the contents of the volume unusable. Using it is not recommended.
2. Any existing log plexes are zeroed and enabled.
Note Following severe hardware failure of several disks or other related subsystems underlying a RAID-5 plex, it may be impossible to recover the volume using the methods described in this chapter. In this case, remove the volume, recreate it on hardware that is functioning correctly, and restore the contents of the volume from a backup.
Parity Resynchronization
In most cases, a RAID-5 array does not have stale parity.
Parity is regenerated by issuing VOL_R5_RESYNC ioctls to the RAID-5 volume. The resynchronization process starts at the beginning of the RAID-5 volume and resynchronizes a region equal to the number of sectors specified by the -o iosize option. If the -o iosize option is not specified, the default maximum I/O size is used. The resync operation then moves on to the next region until the entire length of the RAID-5 volume has been resynchronized.
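The region-by-region pass described above can be pictured with a small shell sketch. This is only an illustration of the iteration, not VxVM code: the function name resync_regions and the sample sizes are hypothetical, and the region size simply plays the role that the -o iosize value plays in the text.

```shell
#!/bin/sh
# Print the start offset and length (in sectors) of each region that a
# region-by-region pass over a volume would visit.
# Both arguments are illustrative values, not VxVM defaults.
resync_regions() {
    vol_len=$1      # total volume length in sectors
    region=$2       # region size in sectors (the role played by -o iosize)
    [ "$region" -gt 0 ] || return 1     # guard against a zero-size region
    offset=0
    while [ "$offset" -lt "$vol_len" ]; do
        remaining=$((vol_len - offset))
        if [ "$remaining" -lt "$region" ]; then
            len=$remaining              # final, shorter region
        else
            len=$region
        fi
        echo "$offset $len"
        offset=$((offset + len))
    done
}

# A 1000-sector volume walked in 256-sector regions.
resync_regions 1000 256
```

With these sample values the pass visits four regions, the last of which is shorter than the others because the volume length is not a multiple of the region size.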
Recovery After Moving RAID-5 Subdisks
When RAID-5 subdisks are moved and replaced, the new subdisks are marked as STALE in anticipation of recovery. If the volume is active, the vxsd command may be used to recover the volume. If the volume is not active, it is recovered when it is next started. The RAID-5 volume is degraded for the duration of the recovery operation. Any failure in the stripes involved in the move makes the volume unusable.
When this occurs, the vxvol start command returns the following error message:
VxVM vxvol ERROR V-5-1-1236 Volume r5vol is not startable; RAID-5 plex does not map entire volume length.
At this point, the contents of the RAID-5 volume are unusable. Another possible way that a RAID-5 volume can become unstartable is if the parity is stale and a subdisk becomes detached or stale.
Forcibly Starting RAID-5 Volumes
You can start a volume even if subdisks are marked as stale: for example, if a stopped volume has stale parity and no RAID-5 logs, and a disk becomes detached and then reattached. The subdisk is considered stale even though the data is not out of date (because the volume was in use when the subdisk was unavailable) and the RAID-5 volume is considered invalid.
Recovering from Incomplete Disk Group Moves
If the system crashes or a subsystem fails while a disk group move, split or join operation is being performed, VxVM attempts either to reverse or to complete the operation when the system is restarted or the subsystem is repaired. Whether the operation is reversed or completed depends on how far it had progressed. Automatic recovery depends on being able to import both the source and target disk groups.
Recovery from Failure of a DCO Volume
Note The FastResync feature is not supported by VxVM 4.1 on the HP-UX 11i v3 platform.
Note The procedures in this section depend on the DCO version number. See the VERITAS Volume Manager Administrator’s Guide for information about DCO versioning.
Persistent FastResync uses a data change object (DCO) volume to perform tracking of changed regions in a volume.
This output shows the mirrored volume, vol1, its snapshot volume, SNAP-vol1, and their respective DCOs, vol1_dco and SNAP-vol1_dco. The two disks, mydg03 and mydg04, that hold the DCO plexes for the DCO volume, vol1_dcl, of vol1 have failed. As a result, the DCO volume, vol1_dcl, of the volume, vol1, has been detached and the state of vol1_dco has been set to BADLOG.
Recovering a Version 0 DCO
For a version 0 DCO, perform the following steps to recover the DCO volume:
1. Correct the problem that caused the I/O failure.
2.
If a snapshot volume and the original volume are in different disk groups, you must perform a separate snapclear operation on each volume:
# vxassist -g diskgroup1 snapclear volume snap_obj_to_snapshot
# vxassist -g diskgroup2 snapclear snapvol snap_obj_to_volume
Here snap_obj_to_volume is the name of the snap object associated with the snapshot volume, snapvol, that points to the original volume.
4. Start the volume using the vxvol command:
# vxvol [-g diskgroup] start volume
For the example output, the command would take this form:
# vxvol -g mydg start vol1
5. Prepare the volume again using the following command:
# vxsnap [-g diskgroup] prepare volume [ndcomirs=number] \
[regionsize=size] [drl=yes|no|sequential] \
[storage_attribute ...
2 Recovery from Failure of Instant Snapshot Operations
This chapter describes how to recover from various failure and error conditions that may occur during instant snapshot operations:
◆ Failure of vxsnap prepare
◆ Failure of vxsnap make for Full-Sized Instant Snapshots
◆ Failure of vxsnap make for Break-Off Instant Snapshots
◆ Failure of vxsnap make for Space-Optimized Instant Snapshots
◆ Failure of vxsnap restore
◆ Failure of vxsnap reattach or refresh
◆ Copy-on-write Failure
◆ I/O Error
Failure of vxsnap make for Full-Sized Instant Snapshots
If a vxsnap make operation fails during the creation of a full-sized instant snapshot, the snapshot volume may go into the DISABLED state, be marked invalid and be rendered unstartable. You can use the following command to check that the inst_invalid flag is set to on:
# vxprint [-g diskgroup] -F%inst_invalid snapshot_volume
VxVM can usually recover the snapshot volume without intervention.
Failure of vxsnap make for Space-Optimized Instant Snapshots
If a vxsnap make operation fails during the creation of a space-optimized instant snapshot, the snapshot volume may go into the INSTSNAPTMP state. VxVM can usually recover the snapshot volume without intervention. However, in certain situations, this recovery may not succeed.
3. Use the following command to start the volume:
# vxvol [-g diskgroup] start volume
4. Re-run the failed reattach or refresh command.
Note This results in a full resynchronization of the volume. Alternatively, remove the snapshot volume and recreate it if required.
I/O Errors During Resynchronization
Snapshot resynchronization (started by vxsnap syncstart, or by specifying sync=on to vxsnap) stops if an I/O error occurs, and displays the following message on the system console:
VxVM vxsnap ERROR V-5-1-6840 Synchronization of the volume volume stopped due to I/O error
After correcting the source of the error, use the following command to restart the resynchronization operation:
# vxsnap [-b] [-g diskgroup] syncstart volume
See the
I/O Failure on a DCO Volume
3 Recovery from Boot Disk Failure
VERITAS Volume Manager (VxVM) protects systems from disk and other hardware failures and helps you to recover from such events. This chapter describes recovery procedures and provides information that helps to prevent loss of data or system access due to the failure of the boot (root) disk. For information about recovering volumes and their data on non-boot disks, see “Recovery from Hardware Failure” on page 1.
Recovery by Booting from Recovery Media
If there is a failure to boot from the VxVM boot disk on HP-UX 11i version 2, and no bootable root mirror is available, it may be necessary to boot from an alternate boot source, or from recovery media such as the following:
◆ HP-UX 11i version 2 installation disc.
◆ Bootable recovery tape.
◆ Secondary boot disk in the configuration.
◆ HP-UX Ignite-UX server that is accessible over a LAN.
Starting VxVM after Booting from Recovery Media
You can use the vx_emerg_start utility to start VxVM after booting a system from recovery media. This command allows a rootable VxVM configuration to be repaired in the event of a catastrophic failure. The command is invoked as shown here:
# vx_emerg_start hostname
The hostname argument specifies the name (node name) of the system that you are repairing.
Using VxVM Maintenance Mode Boot (MMB)
When you have recovered the volumes on the VxVM root disk, and performed any other necessary repairs, reboot the system:
# reboot
Fixing a Missing or Corrupt /etc/vx/volboot File
The following messages may be displayed at boot time if the /etc/vx/volboot file is missing or its contents are incorrect:
vxvm:vxconfigd: ERROR: enable failed: Volboot file not loaded transactions are disabled.
Recovery by Reinstallation Caution The VxVM configuration daemon, vxconfigd, does not normally run in MMB mode, and only one copy of the root volume data is used. If the system has a mirrored root volume, writing to the root file system can thus cause file system corruption when both mirrors are subsequently configured. To prevent this, start VxVM in MMB mode by running the vx_emerg_start command.
4 Logging Commands and Transactions
This chapter provides information on how to administer logging of commands and transactions in VERITAS Volume Manager (VxVM). For information on how to administer error logging, see “Error Messages” on page 49.
Logging Commands
The vxcmdlog command allows you to log the invocation of other VxVM commands to a file. The following table demonstrates the usage of vxcmdlog:
Command        Description
vxcmdlog -l    List current settings for command logging.
Note The .cmdlog file is a binary and should not be edited.
The size of the command log is checked after an entry has been written, so the actual size may be slightly larger than that specified. When the log reaches its maximum size, the current command log file, cmdlog, is renamed as the next available historic log file, cmdlog.number, where number is an integer from 1 up to the maximum number of historic log files that is currently defined, and a new current log file is created.
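The naming rule for historic log files can be illustrated with a short shell sketch. It is a model of the rule as stated above, not the VxVM implementation: the function name next_historic_log is hypothetical, the demonstration runs in a scratch directory rather than the real log directory, and what VxVM does when every slot is already in use is not covered here.

```shell
#!/bin/sh
# Echo the first unused historic log name (cmdlog.1 .. cmdlog.MAX) in DIR.
# Returns non-zero if every slot is taken; the behavior of VxVM in that
# case is not modelled by this sketch.
next_historic_log() {
    dir=$1      # directory holding the log files
    max=$2      # maximum number of historic log files
    n=1
    while [ "$n" -le "$max" ]; do
        if [ ! -e "$dir/cmdlog.$n" ]; then
            echo "cmdlog.$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Demonstration in a scratch directory (not the real log directory).
demo=$(mktemp -d)
touch "$demo/cmdlog" "$demo/cmdlog.1" "$demo/cmdlog.2"
next_historic_log "$demo" 5
rm -rf "$demo"
```

In the demonstration, cmdlog.1 and cmdlog.2 already exist, so the next available historic name is cmdlog.3.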
Logging Transactions
The vxtranslog command allows you to log VxVM transactions to a file. The following table demonstrates the usage of vxtranslog:
Command              Description
vxtranslog -l        List current settings for transaction logging.
vxtranslog -m on     Turn on transaction logging.
vxtranslog -s 512k   Set the maximum transaction log file size to 512KB.
vxtranslog -n 10     Set the maximum number of historic transaction log files to 10.
Each log file contains a header that records the host name, host ID, and the date and time that the log was created.
Associating Command and Transaction Logs
The client and process IDs that are recorded for every request and command assist you in correlating entries in the command and transaction logs.
5 Backing Up and Restoring Disk Group Configurations
Disk group configuration backup and restoration allows you to back up and restore all configuration data for VERITAS Volume Manager (VxVM) disk groups, and for VxVM objects such as volumes that are configured within the disk groups. Using this feature, you can recover from corruption of a disk group’s configuration that is stored as metadata in the private region of a VM disk.
Backing Up a Disk Group Configuration
If VxVM cannot update a disk group’s configuration because of disk errors, it disables the disk group and displays the following error:
VxVM vxconfigd ERROR V-5-1-123 Disk group group: Disabled by errors
If such errors occur, you can restore the disk group configuration from a backup after you have corrected any underlying problem such as failed or disconnected hardware.
To back up a disk group manually, use this command:
# /etc/vx/bin/vxconfigbackup diskgroup
To back up all disk groups, use this version of the command:
# /etc/vx/bin/vxconfigbackup
For more information, see the vxconfigbackup(1M) manual page.
Restoring a Disk Group Configuration
You can use the vxconfigrestore utility to restore or recreate a disk group from its configuration backup. The restoration process has two stages: precommit and commit.
To commit the changes that are required to restore the disk group configuration, use the following command:
# /etc/vx/bin/vxconfigrestore -c [-l directory] {diskgroup | dgid}
If no disk headers are reinstalled, the configuration copies in the disks’ private regions are updated from the latest binary copy of the configuration that was saved for the disk group.
The following is a sample extract from such a backup file that shows the timestamp and disk group ID information:
TIMESTAMP
Tue Apr 15 23:27:01 PDT 2003
. . .
DISK_GROUP_CONFIGURATION
Group: mydg
dgid: 1047336696.19.xxx.veritas.com
. . .
Use the timestamp information to decide which backup contains the relevant information, and use the vxconfigrestore command to restore the configuration by specifying the disk group ID instead of the disk group name.
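As an illustration of picking the disk group ID out of such a file, the following shell sketch prints the token that follows the dgid: label. The function name backup_dgid and the sample file are hypothetical, and the line layout of the sample is an assumption modelled on the extract above; the sketch tolerates the label appearing anywhere on a line.

```shell
#!/bin/sh
# Print the disk group ID recorded in a backup file: the token that
# follows the "dgid:" label, wherever that label sits on the line.
backup_dgid() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "dgid:") { print $(i + 1); exit }
    }' "$1"
}

# Hypothetical sample file modelled on the extract shown in the text.
sample=$(mktemp)
cat > "$sample" <<'EOF'
TIMESTAMP
Tue Apr 15 23:27:01 PDT 2003
DISK_GROUP_CONFIGURATION
Group: mydg
dgid: 1047336696.19.xxx.veritas.com
EOF
backup_dgid "$sample"
rm -f "$sample"
```

The value printed for the sample is the dgid that would be passed to vxconfigrestore in place of the disk group name.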
6 Error Messages
This chapter provides information on error messages associated with the VERITAS Volume Manager (VxVM) configuration daemon (vxconfigd), the kernel, and other utilities. It covers most informational, failure, and error messages displayed on the console by vxconfigd, and by the VERITAS Volume Manager kernel driver, vxio. These include some errors that are infrequently encountered and difficult to troubleshoot.
Note Some error messages described here may not apply to your system.
To enable logging of console output to the file /var/adm/configd.log, edit the startup script for vxconfigd as described in “Configuring Logging in the Startup Script,” or invoke vxconfigd under the C locale as shown here:
# vxconfigd [-x [1-9]] -x log
There are 9 possible levels of debug logging; 1 provides the least detail, and 9 the most.
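A wrapper script might validate the debug level before building the argument list shown above. The following is a minimal sketch under that assumption: the function name vxconfigd_log_args is hypothetical, and the function only prints the arguments; it does not start vxconfigd.

```shell
#!/bin/sh
# Echo the vxconfigd arguments for file logging at a given debug level.
# Accepts only the single-digit levels 1-9 described in the text; prints
# nothing and returns non-zero otherwise. The function only builds the
# argument string; it does not start vxconfigd.
vxconfigd_log_args() {
    case $1 in
        [1-9]) echo "-x $1 -x log" ;;
        *)     return 1 ;;
    esac
}

vxconfigd_log_args 3
```

Rejecting values outside 1 to 9 up front keeps a typing mistake from reaching the daemon's command line.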
Note By default, vxconfigd is started at boot time with the -x syslog option. This redirects vxconfigd console messages to syslog. If you want to retain this behavior when restarting vxconfigd from the command line, include the -x syslog argument, as restarting vxconfigd does not preserve the option settings with which it was previously running.
◆ FATAL ERROR
A fatal error message from a configuration daemon, such as vxconfigd, indicates a severe problem with the operation of VxVM that prevents it from running. The following is an example of such a message:
VxVM vxconfigd FATAL ERROR V-5-0-591 Disk group bootdg: Inconsistency -- Not loaded into kernel
◆ ERROR
An error message from a command indicates that the requested operation cannot be performed correctly.
Messages
This section contains a list of messages that you may encounter during the operation of VERITAS Volume Manager. However, the list is not exhaustive, and the second field may contain the name of a different command, driver or module from that shown here. If you encounter a product error message, record the unique message number preceding the text of the message.
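When collecting messages from a console log, the unique message number can be pulled out mechanically. The sketch below assumes only the V-n-n-n shape visible in the messages in this guide; the function name vxvm_msg_id is hypothetical.

```shell
#!/bin/sh
# Print the unique message number (for example V-5-0-591) found in a
# VxVM message line, or nothing if the line contains none.
vxvm_msg_id() {
    printf '%s\n' "$1" |
        sed -n 's/.*\(V-[0-9][0-9]*-[0-9][0-9]*-[0-9][0-9]*\).*/\1/p'
}

vxvm_msg_id 'VxVM vxconfigd FATAL ERROR V-5-0-591 Disk group bootdg: Inconsistency -- Not loaded into kernel'
```

For the sample line, the function prints V-5-0-591, which is the number to record when reporting the problem.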
V-5-0-35
VxVM vxdmp NOTICE V-5-0-35 Attempt to disable controller controller_name failed. Rootdisk has just one enabled path.
◆ Description: An attempt is being made to disable the one remaining active path to the root disk controller.
◆ Action: The path cannot be disabled.
V-5-0-55
VxVM vxio WARNING V-5-0-55 Cannot find device number for boot_path
◆ Description: The boot path retrieved from the system PROMs cannot be converted to a valid device number.
V-5-0-110
VxVM vxdmp NOTICE V-5-0-110 disabled controller controller_name connected to disk array disk_array_serial_number
◆ Description: All paths through the controller connected to the disk array are disabled. This usually happens if a controller is disabled for maintenance.
◆ Action: None.
V-5-0-111
VxVM vxdmp NOTICE V-5-0-111 disabled dmpnode dmpnode_device_number
◆ Description: A DMP node has been marked disabled in the DMP database.
V-5-0-145
VxVM vxio WARNING V-5-0-145 DRL volume volume is detached
◆ Description: A Dirty Region Logging volume became detached because a DRL log entry could not be written. If this is due to a media failure, other errors may have been logged to the console.
◆ Action: The volume containing the DRL log continues in operation.
V-5-0-164
VxVM vxio WARNING V-5-0-164 Failed to join cluster name, aborting
◆ Description: A node failed to join a cluster. This may be caused by the node being unable to see all the shared disks. Other error messages may provide more information about the disks that cannot be found.
◆ Action: Use the vxdisk -s list command on the master node to see what disks should be visible to the slave node.
V-5-0-181
VxVM vxio WARNING V-5-0-181 Illegal vminor encountered
◆ Description: An attempt was made to open a volume device before vxconfigd loaded the volume configuration.
◆ Action: None; under normal startup conditions, this message should not occur. If necessary, start VxVM and re-attempt the operation.
V-5-0-194
VxVM vxio WARNING V-5-0-194 Kernel log full: volume detached
◆ Description: A plex detach failed because the kernel log was full.
V-5-0-216
VxVM vxio WARNING V-5-0-216 mod_install returned errno
◆ Description: A call made to the operating system mod_install function to load the vxio driver failed.
◆ Action: Check for additional console messages that may explain why the load failed. Also check the console messages log file for any additional messages that were logged but not displayed on the console.
V-5-0-249
VxVM vxio WARNING V-5-0-249 RAID-5 volume entering degraded mode operation
◆ Description: An uncorrectable error has forced a subdisk to detach. At this point, not all data disks exist to provide the data upon request. Instead, parity regions are used to regenerate the data for each stripe in the array. Consequently, access takes longer and involves reading from all drives in the stripe.
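The regeneration of data from parity described for this message can be illustrated with XOR arithmetic. The sketch below assumes the usual RAID-5 scheme, in which the parity block is the XOR of the data blocks in a stripe; it is not taken from VxVM internals, the function name xor_all is hypothetical, and blocks are modelled as small integers for illustration.

```shell
#!/bin/sh
# Rebuild one missing data block of a stripe from the surviving blocks
# and the parity block, using XOR parity (the usual RAID-5 scheme;
# assumed here, not taken from VxVM internals).
xor_all() {
    acc=0
    for v in "$@"; do
        acc=$((acc ^ v))
    done
    echo "$acc"
}

# Three data blocks in one stripe, modelled as integers.
d0=23
d1=42
d2=7
parity=$(xor_all "$d0" "$d1" "$d2")

# Suppose the subdisk holding d1 is detached: recover it from the rest.
rebuilt=$(xor_all "$d0" "$d2" "$parity")
echo "$rebuilt"     # prints 42, the value of the missing block
```

Every surviving block in the stripe has to be read to rebuild the missing one, which is why access in degraded mode takes longer.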
V-5-0-252
VxVM vxio NOTICE V-5-0-252 read error on object subdisk of mirror plex in volume volume (start offset length length) corrected
◆ Description: A read error occurred, which caused a read of an alternate mirror and a writeback to the failing region. This writeback was successful and the data was corrected on disk.
◆ Action: None; the problem was corrected automatically. Note the location of the failure for future reference.
V-5-1-91
VxVM vxconfigd WARNING V-5-1-91 Cannot create device device_path: reason
◆ Description: vxconfigd cannot create a device node either under /dev/vx/dsk or under /dev/vx/rdsk. This should happen only if the root file system has run out of inodes.
◆ Action: Remove some unwanted files from the root file system.
V-5-1-116
VxVM vxconfigd WARNING V-5-1-116 Cannot open log file log_filename: reason
◆ Description: The vxconfigd console output log file could not be opened for the given reason.
◆ Action: Create any needed directories, or use a different log file path name as described in “Logging Error Messages” on page 49.
V-5-1-122
VxVM vxconfigd WARNING V-5-1-122 Detaching plex plex from volume volume
◆ Description: This error only happens for volumes that are started automatically by vxconfigd at system startup. The plex is being detached as a result of I/O failure, disk failure during startup or prior to the last system shutdown or crash, or disk removal prior to the last system shutdown or crash.
V-5-1-124
VxVM vxconfigd ERROR V-5-1-124 Disk group group: update failed: reason
◆ Description: I/O failures have prevented vxconfigd from updating any active copies of the disk group configuration. This usually implies a large number of disk failures.
V-5-1-169
VxVM vxconfigd ERROR V-5-1-169 cannot open /dev/vx/config: reason
◆ Description: The /dev/vx/config device could not be opened. vxconfigd uses this device to communicate with the VERITAS Volume Manager kernel drivers. The most likely reason is “Device is already open.” This indicates that some process (most likely vxconfigd) already has /dev/vx/config open. Less likely reasons are “No such file or directory” or “No such device or address.
V-5-1-249
VxVM vxconfigd NOTICE V-5-1-249 Volume volume entering degraded mode
◆ Description: Detaching a subdisk in the named RAID-5 volume has caused the volume to enter “degraded” mode. While in degraded mode, performance of the RAID-5 volume is substantially reduced. More importantly, failure of another subdisk may leave the RAID-5 volume unusable. Also, if the RAID-5 volume does not have an active log, then failure of the system may leave the volume unusable.
V-5-1-525
VxVM vxconfigd NOTICE V-5-1-525 Detached log for volume volume
◆ Description: The DRL or RAID-5 log for the named volume was detached as a result of a disk failure, or as a result of the administrator removing a disk with vxdg -k rmdisk. A failing disk is indicated by a “Detached disk” message.
◆ Action: If the log is mirrored, hot-relocation tries to relocate the failed log automatically. Use either vxplex dis or vxsd dis to remove the failing logs.
V-5-1-543
VxVM vxconfigd ERROR V-5-1-543 Differing version of vxconfigd installed
◆ Description: A vxconfigd daemon was started after stopping an earlier vxconfigd with a non-matching version number. This can happen, for example, if you upgrade VxVM and then run vxconfigd without first rebooting.
◆ Action: Reboot the system.
poorly-attached cable, or from a disk that fails to spin up fast enough. Alternatively, this may happen as a result of a disk being physically removed from the system, or from a disk that has become unusable due to a head crash or electronics failure. Any RAID-5 plexes, DRL log plexes, RAID-5 subdisks or mirrored plexes containing subdisks on this disk are unusable. Such disk failures (particularly on multiple disks) may cause one or more volumes to become unusable.
This will result in the disk being taken out of active use in its disk group, if it has not already been taken out of use. If the disk is still operational, which should not be the case, vxdisk prints:
device: Okay
If the disk is listed as “Okay,” try running vxdctl hostid again. If it still results in an error, contact VERITAS Technical Support.
A more serious failure is indicated by errors such as:
Configuration records are inconsistent
Disk group has no valid configuration copies
Duplicate record in configuration
Format error in configuration copy
Invalid block number
Invalid magic number
These errors indicate that all configuration copies have become corrupt (due to disk failures, writing on the disk by an application or the administrator, or bugs in VxVM).
◆ A disk group cannot be auto-imported due to some temporary failure. If you create a new disk group with the same name as the failed disk group and reboot, the new disk group is imported first. The auto-import of the older disk group fails because more recently modified disk groups have precedence over older disk groups.
◆ A disk group is deported from one host using the -h option to cause the disk group to be auto-imported on reboot from another host.
◆ Action: If some of the copies failed due to transient errors (such as cable failures), then a reboot or re-import may succeed in importing the disk group. Otherwise, the disk group configuration may have to be restored.
To clear the locks during import, use the following command:
# vxdg -C import diskgroup
Caution: Be careful when using the vxdisk clearimport or vxdg -C import command on systems that have dual-ported disks. Clearing the locks allows those disks to be accessed at the same time from multiple hosts and can result in corrupted data.
An import operation fails if some disks for the disk group cannot be found among the disk drives attached to the system.
volume that was remapped may no longer be remapped. Also, volumes that are remapped once are not guaranteed to be remapped to the same device number in further reboots.
◆ Action: Use the vxdg reminor command to renumber all volumes in the offending disk group permanently. See the vxdg(1M) manual page for more information.
V-5-1-809
VxVM vxplex ERROR V-5-1-809 Plex plex in volume volume is locked by another utility.
◆ Description: The vxplex command fails because a previous operation to attach a plex did not complete. The vxprint command should show that one or both of the temporary and persistent utility fields (TUTIL0 and PUTIL0) of the volume and one of its plexes are set.
◆ Case 1: The /etc/fstab file was erroneously updated to indicate the device for the /usr file system is a volume, but the volume named is not in the boot disk group. This should happen only as a result of direct manipulation by the administrator.
◆ Case 2: The system somehow has a duplicate boot disk group, one of which contains the /usr file system volume and one of which does not (or uses a different volume name), and vxconfigd somehow chose the wrong boot disk group.
◆ Action: If the root file system is full, increase its size or remove files to make space for the tempdb file. If /var is a separate file system, make sure that it has an entry in /etc/fstab. Otherwise, look for I/O error messages during the boot process that indicate either a hardware problem or misconfiguration of any logical volume management software being used for the /var file system. Also verify that the encapsulation (if configured) of your boot disk is complete and correct.
◆ Action: This error can result from a kernel error that has made the configuration daemon process unkillable, from some other kind of kernel error, or from another user starting a second configuration daemon process after the SIGKILL signal was sent. To test for this last condition, run vxconfigd -k again. If the error message reappears, contact VERITAS Technical Support.
V-5-1-2353
VxVM vxconfigd ERROR V-5-1-2353 Disk group group: Cannot recover temp database: reason
Consider use of "vxconfigd -x cleartempdir" [see vxconfigd(1M)].
◆ Description: This can happen if you kill and restart vxconfigd, or if you disable and enable it with vxdctl disable and vxdctl enable. This error indicates a failure related to reading the file /var/vxvm/tempdb/group.
V-5-1-2630
VxVM vxconfigd WARNING V-5-1-2630 library and vxconfigd disagree on existence of client number
◆ Description: This warning may safely be ignored.
◆ Action: None required.
V-5-1-2824
VxVM vxconfigd ERROR V-5-1-2824 Configuration daemon error 242
◆ Description: A node failed to join a cluster, or a cluster join is taking too long. If the join fails, the node retries the join automatically.
V-5-1-2860
VxVM vxdg ERROR V-5-1-2860 Transaction already in progress
◆ Description: One of the disk groups specified in a disk group move, split or join operation is currently involved in another unrelated disk group move, split or join operation (possibly as the result of recovery from a system failure).
◆ Action: Use the vxprint command to display the status of the disk groups involved.
V-5-1-2870
VxVM vxdg ERROR V-5-1-2870 volume: Volume or plex device is open or mounted
◆ Description: An attempt was made to perform a disk group move, split or join on a disk group containing an open volume.
◆ Action: It is most likely that a file system configured on the volume is still mounted. Stop applications that access volumes configured in the disk group, and unmount any file systems configured in the volumes.
V-5-1-2922
VxVM vxconfigd ERROR V-5-1-2922 Disk group exists and is imported
◆ Description: A slave tried to join a cluster, but a shared disk group already exists in the cluster with the same name as one of its private disk groups.
◆ Action: Use the vxdg -n newname import diskgroup operation to rename either the shared disk group on the master, or the private disk group on the slave.
V-5-1-3009
VxVM vxdg ERROR V-5-1-3009 object: Name conflicts with imported diskgroup
◆ Description: The target disk group of a split operation already exists as an imported disk group.
◆ Action: Choose a different name for the target disk group.
V-5-1-3024
VxVM vxconfigd ERROR V-5-1-3024 vxclust not there
◆ Description: An error during an attempt to join a cluster caused vxclust to fail. This may be caused by the failure of another node during a join or by the failure of vxclust.
◆ Action: Retry the join. An error message on the other node may clarify the problem.
V-5-1-3032
VxVM vxconfigd ERROR V-5-1-3032 Master sent no data
◆ Description: During the slave join protocol, a message without data was received from the master. This message is only likely to be seen in the case of an internal VxVM error.
◆ Action: Contact VERITAS Technical Support.
V-5-1-3033
VxVM vxconfigd ERROR V-5-1-3033 Join in progress
◆ Description: An attempt was made to import or deport a shared disk group during a cluster reconfiguration.
V-5-1-3049
VxVM vxconfigd ERROR V-5-1-3049 Retry rolling upgrade
◆ Description: An attempt was made to upgrade a cluster to a higher protocol version when a transaction was in progress.
◆ Action: Retry the upgrade at a later time.
V-5-1-3050
VxVM vxconfigd ERROR V-5-1-3050 Version out of range for at least one node
◆ Description: Before you upgrade a cluster by running vxdctl upgrade, all nodes must be able to support the new protocol version.
V-5-1-3243
VxVM vxdmpadm ERROR V-5-1-3243 The VxVM restore daemon is already running. You can stop and restart the restore daemon with desired arguments for changing any of its parameters.
◆ Description: The vxdmpadm start restore command has been executed while the restore daemon is already running.
◆ Action: Stop the restore daemon and restart it with the required set of parameters as shown in the vxdmpadm(1M) manual page.
V-5-1-3828
VxVM vxconfigd ERROR V-5-1-3828 upgrade operation failed: Already at highest version
◆ Description: An upgrade operation has failed because a cluster is already running at the highest protocol version supported by the master.
◆ Action: No further action is possible as the master is already running at the highest protocol version it can support.
V-5-1-4267
VxVM vxassist WARNING V-5-1-4267 volume volume already has at least one snapshot plex
Snapshot volume created with these plexes will have a dco volume with no associated dco plex.
◆ Description: An error was detected while adding a DCO object and DCO volume to a mirrored volume. There is at least one snapshot plex already created on the volume. Because this snapshot plex was created when no DCO was associated with the volume, there is no DCO plex allocated for it.
V-5-1-4620
VxVM vxassist WARNING V-5-1-4620 Error while retrieving information from SAL
◆ Description: The vxassist command does not recognize the version of the SAN Access Layer (SAL) that is being used, or detects an error in the output from SAL.
◆ Action: If a connection to SAL is desired, ensure that the correct version of SAL is installed and configured correctly.
V-5-1-5161
VxVM vxplex ERROR V-5-1-5161 Plex plex not attached.
◆ Description: An attempt was made to snap back a detached plex.
◆ Action: Reattach the snapshot plex to the snapshot volume.
V-5-1-5162
VxVM vxplex ERROR V-5-1-5162 Plexes do not belong to the same snapshot volume.
◆ Description: An attempt was made to snap back plexes that belong to different snapshot volumes.
◆ Action: Specify the plexes in separate invocations of vxplex snapback.
The following example shows a vxvm.exclude file with paths c8t0d0, c8t0d1, and c8t0d2 excluded from VxVM:
exclude_all 0
paths
c8t0d0 /0/4/0/0.8.0.108.0.0.0
c8t0d1 /0/4/0/0.8.0.108.0.0.1
c8t0d2 /0/4/0/0.8.0.108.0.0.2
#
controllers
#
product
#
pathgroups
◆ Case 2: Some arrays such as EMC and HDS provide mirroring in hardware. When a LUN pair is split, depending on how the process is performed, this may result in two disks with the same disk ID.
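To check which paths are currently excluded, the vxvm.exclude example shown earlier can be read programmatically. The following Python sketch is illustrative only and is not a VxVM tool. It assumes a layout in which each of the keywords paths, controllers, product and pathgroups starts a section on its own line and a line containing only # ends it; this layout is inferred from the example and should be verified against the file on your system. The function name parse_exclude_file is hypothetical.

```python
from typing import Dict, List

# Section keywords as they appear in the example file.
SECTION_NAMES = ("paths", "controllers", "product", "pathgroups")


def parse_exclude_file(text: str) -> Dict[str, List[List[str]]]:
    """Group the entries of an exclude file by section.

    Assumption: a keyword line opens a section and a lone '#' closes it.
    Lines outside any section (such as 'exclude_all 0') are ignored.
    """
    sections: Dict[str, List[List[str]]] = {name: [] for name in SECTION_NAMES}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "#":
            current = None          # end of the current section
        elif line in SECTION_NAMES:
            current = line          # start of a new section
        elif current is not None:
            sections[current].append(line.split())
    return sections


if __name__ == "__main__":
    sample = """exclude_all 0
paths
c8t0d0 /0/4/0/0.8.0.108.0.0.0
c8t0d1 /0/4/0/0.8.0.108.0.0.1
c8t0d2 /0/4/0/0.8.0.108.0.0.2
#
controllers
#
product
#
pathgroups
#
"""
    for device, hw_path in parse_exclude_file(sample)["paths"]:
        print(device, hw_path)
```

Each entry is kept as a list of whitespace-separated fields, so a path entry yields the device name followed by its hardware path.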
Index Symbols .cmdlog file 37 .translog file 39 /etc/vx/cbr/bk/diskgroup.dgid dgid .binconfig file 44 dgid .cfgrec file 44 dgid .diskinfo file 44 dgid.dginfo file 44 /etc/vx/log logging directory 37, 39 /stand/rootconf file 34 /var/adm/configd.log file 49 /var/adm/syslog/syslog.
EMPTY plex state 3 ENABLED plex kernel state 3 ENABLED volume kernel state 12 ERROR messages 52 error messages A virtual disk device is open 67 All transactions are disabled 64 Already at highest version 91 Attempt to disable controller failed 90 Attempt to enable a controller that is not available 80 can’t import diskgroup 91 Can’t locate disk(s) 92 Cannot assign minor 87 Cannot auto-import group 43, 71 Cannot find disk on slave node 86 Cannot kill existing daemon 79 cannot open /dev/vx/config 66 Cannot re
Record volume is in disk group diskgroup1 plex is in group diskgroup2 77 Reimport of disk group failed 74 Request crosses disk group boundary 84 Retry rolling upgrade 89 Return from cluster_establish is Configuration daemon error 82 Rootdg cannot be imported during boot 34 Skip disk group with duplicate name 72 some subdisks are unusable and the parity is stale 16 startup script 50 Synchronization of the volume stopped due to I/O error 29 System startup failure 65 The VxVM restore daemon is already running
recovering mirrored volumes 5 process ID in command logging file 38 in transaction logging file 40 N NEEDSYNC volume state 13 NOTICE messages 52 notice messages added disk array 53 Attempt to disable controller failed 54 Detached disk 63 Detached log for volume 68 Detached plex in volume 68 Detached subdisk in volume 68 Detached volume 68 disabled controller connected to disk array 55 disabled dmpnode 55 disabled path belonging to dmpnode 55 enabled controller connected to disk array 56 enabled dmpnode 56
recovering stale RAID-5 14 stale, starting volume 17 SYNC volume state 11, 13 syslog error log file 50 system reinstalling 35 system failures 9 T Technical assistance ix transactions associating with commands 41 logging 39 translog file 39 TUTIL0 field clearing MOVE flag 18 V V-5-0-106 54 V-5-0-108 54 V-5-0-110 55 V-5-0-111 55 V-5-0-112 55 V-5-0-144 55 V-5-0-145 56 V-5-0-146 56 V-5-0-147 56 V-5-0-148 56 V-5-0-164 57 V-5-0-166 57 V-5-0-168 57 V-5-0-181 58 V-5-0-194 58 V-5-0-196 58 V-5-0-2 53 V-5-0-207 58 V-5
V-5-1-3031 87 V-5-1-3032 88 V-5-1-3033 88 V-5-1-3034 88 V-5-1-3042 88 V-5-1-3046 88 V-5-1-3049 89 V-5-1-3050 89 V-5-1-3091 89 V-5-1-3212 89 V-5-1-3243 90 V-5-1-3362 90 V-5-1-3486 90 V-5-1-3689 90 V-5-1-3828 91 V-5-1-3848 91 V-5-1-4220 91 V-5-1-4267 92 V-5-1-4277 92 V-5-1-4551 92 V-5-1-4620 93 V-5-1-4625 93 V-5-1-480 67 V-5-1-484 67 V-5-1-5150 93 V-5-1-5160 93 V-5-1-5161 94 V-5-1-5162 94 V-5-1-525 68 V-5-1-526 68 V-5-1-527 68 V-5-1-528 68 V-5-1-543 69 V-5-1-544 69 V-5-1-545 69 V-5-1-546 69 V-5-1-554 70 V-5-1
vxsnap make recovery from failure of 26 vxsnap prepare recovery from failure of 25 vxsnap reattach recovery from failure of 27 vxsnap refresh recovery from failure of 27 vxsnap restore recovery from failure of 27 vxtranslog controlling transaction logging 39 VxVM emergency startup 33 RAID-5 recovery process 11 starting after booting from recovery media 33 using Maintenance Mode Boot (MMB) 34 vxvol recover command 14 vxvol resync command 13 vxvol start command 5 W WARNING messages 52 warning messages Cannot