HP Matrix Operating Environment 7.2 Recovery Management User Guide
Abstract
The HP Matrix Operating Environment 7.2 Recovery Management User Guide contains information on installation, configuration, testing, and troubleshooting HP Matrix Operating Environment recovery management (Matrix recovery management).
© Copyright 2013 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
1 Matrix recovery management Overview (page 5)
2 Installing and configuring Matrix recovery management (page 8)
Installation and configuration overview (page 8)
Installation and configuration prerequisites (page 8)
Installing and licensing Matrix recovery management
5 Issues, limitations, and suggested actions (page 48)
Limitations (page 48)
Hyper-V support limitation for bidirectional configuration (page 48)
No automatic synchronization of configuration between sites
1 Matrix recovery management Overview Matrix recovery management is a component of the HP Matrix Operating Environment that provides disaster recovery protection for logical servers and for Matrix infrastructure orchestration services. Logical servers and Matrix infrastructure orchestration services (IO services) that are included in a Matrix recovery management configuration are referred to as DR Protected logical servers and IO services.
Figure 1 Recovery Group Sets
Features and benefits of Matrix 7.2 recovery management
• Provides an automated failover mechanism for DR Protected logical servers, DR Protected IO services, and associated storage.
• Provides a disaster recovery solution for logical servers and IO services managed by the HP Matrix Operating Environment. NOTE: Supports DR Protection of IO services with virtual server groups only.
• Includes Recovery Group startup order settings that let you determine which Recovery Groups are recovered first during a site failover. • Includes a Copy feature that makes it easy to create multiple Storage Replication Groups with the same configuration parameters. By reading this HP Matrix Operating Environment 7.2 Recovery Management User Guide, you will gain a better understanding of Matrix recovery management concepts and configuration testing.
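As an illustration of how Recovery Group startup order settings could determine the recovery sequence, the following Python sketch sorts Recovery Groups by start order. It is not part of the product. The RecoveryGroup class and the recovery_sequence function are hypothetical names, and the sketch assumes that a lower number starts earlier and that Recovery Groups without a start order are recovered last.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryGroup:
    name: str
    start_order: Optional[int] = None  # the start order is optional


def recovery_sequence(groups):
    """Return groups in the order they would be recovered.

    Assumptions (not stated in this guide): lower numbers start first,
    and groups with no start order go last in their original order.
    """
    return sorted(
        groups,
        key=lambda g: (g.start_order is None, g.start_order or 0),
    )


groups = [
    RecoveryGroup("web", 2),
    RecoveryGroup("reports"),
    RecoveryGroup("database", 1),
]
print([g.name for g in recovery_sequence(groups)])
```

In this sketch the database group is recovered before the web group, and the group with no start order comes last.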
2 Installing and configuring Matrix recovery management This chapter contains sections on Matrix recovery management installation prerequisites, networking setup, storage setup, logical server setup, Matrix recovery management configuration, export and import operations, and DR Protection for IO services. IMPORTANT: If you intend to create DR Protected IO services, see “DR protection for IO services” (page 22) before starting the Matrix recovery management installation and configuration process.
NOTE: • It is assumed that networking and storage replication links are present between the Local Site and the Remote Site. • When planning a disaster recovery solution that uses Matrix recovery management with VMware hypervisor based IO services, make sure that the hosts at the primary site and the remote site match.
if you configure a logical server to use a different IP or subnet at each site in the Matrix recovery management configuration. • When running on physical targets (VC hosted) or non VMware ESX virtual targets (VM hosted), Matrix recovery management does not ensure that logical servers use the same MAC addresses at both sites. When running on VMware ESX hosted virtual targets, Matrix recovery management does ensure that logical servers use the same MAC address at both sites.
includes EMC as a storage server type. For more information see “Creating and installing a User Defined storage adapter” (page 14). • If a DR Protected logical server at the Local Site is VC hosted, the replicated boot and data LUNs on the array at the Remote Site must be presented to the corresponding recovery logical server.
General storage setup notes • For information on storage setup of cross-technology logical servers (logical servers capable of being VC hosted or VM hosted), see: Dynamic workload movement with CloudSystem Matrix. • For a list of supported storage, see the HP Insight Management Support Matrix at http://www.hp.com/go/matrixoe/docs. • Hyper-V virtual machines in clustered environments must be stored on cluster shared volumes.
on HP P9000 RAID Manager Software to manage P9000 storage replication. HP P9000 RAID Manager Software instances and configuration files must be configured to manage various device groups that are configured in Matrix recovery management. For more information, see the following: ◦ HP P9000 Cluster Extension Software documentation available at http://h20000.www2.hp.com. Click Manuals, then go to Storage→Storage Software→Storage Replication Software→HP Cluster Extension Software.
NOTE: After you install and configure Matrix recovery management at the Local Site, you must manually fail over 3PAR remote copy groups to the Remote Site before attempting to install and configure Matrix recovery management at the Remote Site. For 3PAR periodic (asynchronous) remote copy, the manual failover action will not synchronize the data in the remote copy group volumes.
7. Create a Recovery Group by using the Matrix recovery management GUI, and associate that Recovery Group with the Storage Replication Group for the nonintegrated storage.
8. Perform a Matrix recovery management Export operation at the Local Site to generate an exportconfig file, then perform an Import operation to import that exportconfig file at the Remote Site.
Multiple User Defined storage adapters
Matrix recovery management allows multiple User Defined storage adapters to coexist in a Matrix recovery management configuration. For each User Defined storage adapter type, you can create a new subdirectory and place your implementation of the three commands for that storage type.
a. Rescan storage using VM host management tools, for example, VMware Virtual Center or Microsoft Hyper-V Management Console, to ensure that the VM host recognizes the failed over storage. • The replicated disk on the Remote Site Hyper-V host must be configured with the same drive letter that is assigned to the Local Site disk it is replicated from.
Matrix recovery management GUI overview The home screen for the Matrix recovery management user interface includes tabs for configuration and administration tasks—see Figure 2 (page 18).
Matrix recovery management configuration steps Figure 3 (page 19) illustrates the six-step Matrix recovery management configuration process. After the Matrix recovery management configuration process is completed at the Local Site, Matrix recovery management must be configured at the Remote Site.
6. From the Sites tab at the Remote Site, import the Local Site Matrix recovery management configuration. For more information, see “Matrix recovery management export and import operations” (page 20).
7. Test the recovery logical servers. All imported Recovery Groups are in maintenance mode, allowing activation of the recovery logical servers. For more information, see “Testing Recovery Groups” (page 26).
◦ For HP P9000, the user name and RAID instance number must match. ◦ For HP 3PAR storage systems, the password file name must match. If any of these items does not match between the exporting site and the importing site, the import operation fails. NOTE: To manage HP 3PAR remote copy, the encrypted password file for both the Local Site and Remote Site InServ storage servers must be available on the CMS at each site, and the name of the password file must be the same on the CMS at each site.
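The matching requirements for HP P9000 and HP 3PAR can be checked before an import is attempted. The following Python sketch is only an illustration: the StorageServerSettings class, its field names, and the import_mismatches function are hypothetical and do not correspond to any Matrix recovery management interface. It covers only the two requirements listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageServerSettings:
    storage_type: str        # "P9000" or "3PAR" in this sketch
    user_name: str = ""      # compared for P9000 only
    raid_instance: int = 0   # compared for P9000 only
    password_file: str = ""  # compared for 3PAR only


def import_mismatches(exporting, importing):
    """Return descriptions of mismatches; an empty list means no conflict."""
    problems = []
    if exporting.storage_type == "P9000":
        if exporting.user_name != importing.user_name:
            problems.append("P9000 user name differs")
        if exporting.raid_instance != importing.raid_instance:
            problems.append("P9000 RAID instance number differs")
    elif exporting.storage_type == "3PAR":
        if exporting.password_file != importing.password_file:
            problems.append("3PAR password file name differs")
    return problems


local_site = StorageServerSettings("P9000", user_name="dradmin", raid_instance=10)
remote_site = StorageServerSettings("P9000", user_name="dradmin", raid_instance=11)
print(import_mismatches(local_site, remote_site))
```

A non-empty result indicates that the import operation would be expected to fail until the settings are aligned at both sites.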
NOTE: • Recovery Groups can be imported one at a time only. You must repeat the import...→Select import file... procedure for each Recovery Group that you import. • If a Recovery Group in the exportconfig file has the same name as a Recovery Group at the importing site, it is not imported.
NOTE: • When you use Matrix recovery management to DR protect VMware ESX based IO services that have been deployed from an IO template using an ICVirt template or a VM template on the vCenter server at the Local Site, an ICVirt or VM template with the same name must be available at the Remote Site before you perform a Matrix recovery management import operation to import the site configuration.
NOTE: If one datastore is specified in volume.dr.list, the DR Protected IO services are provisioned on the datastore specified. If multiple datastores are specified in volume.dr.list, the DR Protected IO services are provisioned on the datastore in volume.dr.list that is both available for that server pool and also has the most free disk space. If multiple datastores are specified in volume.dr.list, and the IO template specifies one of the datastores in volume.dr.
For more information, see the HP Matrix Operating Environment 7.2 Infrastructure Orchestration User Guide available at http://www.hp.com/go/matrixoe/docs.
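The first two datastore selection rules described in the preceding note for volume.dr.list can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's implementation: the choose_datastore function and its parameters are hypothetical, free space is expressed in GB for the example only, and the rule involving a datastore named in the IO template is not modeled.

```python
def choose_datastore(dr_list, pool_datastores, free_space_gb):
    """Pick a datastore for a DR Protected IO service.

    dr_list:         datastore names from volume.dr.list
    pool_datastores: names available to the server pool (hypothetical input)
    free_space_gb:   mapping of name to free space (hypothetical input)
    """
    # Rule 1: a single listed datastore is always used.
    if len(dr_list) == 1:
        return dr_list[0]
    # Rule 2: among listed datastores available to the pool,
    # use the one with the most free disk space.
    candidates = [d for d in dr_list if d in pool_datastores]
    if not candidates:
        return None  # assumption: no listed datastore is usable
    return max(candidates, key=lambda d: free_space_gb.get(d, 0))


print(choose_datastore(
    ["ds1", "ds2", "ds3"],
    {"ds1", "ds2"},
    {"ds1": 100, "ds2": 250, "ds3": 900},
))
```

In the example, ds3 has the most free space but is not available to the server pool, so ds2 is selected.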
3 Testing and failover operations This chapter describes Recovery Group testing, planned failovers, and unplanned failovers using the Matrix recovery management Activate and Deactivate operations. Testing Recovery Groups There are two ways to test Recovery Groups: • Using Maintenance Mode to test individual Recovery Groups.
4. Place the Recovery Group into Maintenance Mode at the Remote Site using the Enable Maintenance Mode button in the Matrix recovery management Recovery Groups tab. For logical servers, use the Logical Servers Activate operation in the Tools menu of the Visualization tab in Matrix OE visualization to activate the logical servers in the Recovery Group at the Remote Site. Depending on the type of logical server, the activation may be on VC blades, VM hosts, or both.
1. Shut down the applications and operating system on each Matrix recovery management DR Protected logical server and each server associated with DR Protected IO services.
2. Click the Deactivate... button. The Deactivate Recovery Groups at the Local Site window appears. For more information about the Recovery Groups contained in a Recovery Group Set, select the Recovery Group Set and click View Recovery Group.
3. Select each Recovery Group Set that you want to activate, or click the check box on the left side of the banner of the Activate Recovery Groups at the Local Site window to select all of the Recovery Group Sets for activation.
4. Click Activate Recovery Groups to start the activation operation. A window appears asking you to confirm that it is OK to proceed. Click OK. You are directed to the Jobs tab, where you can monitor the progress of the activation Job.
• Start Order
• Power-Up Delay
3. Select each Recovery Group Set that you want to activate at the recovery site. The objective is for all Recovery Group Sets that were previously activated at the site where the site-wide event occurred to be activated at the recovery site.
4. Click Activate Recovery Groups to start the activation operation. A window appears asking you to confirm that it is OK to proceed. Click OK. You are directed to the Jobs tab, where you can monitor the progress of the activation Job.
4 Dynamic workload movement with CloudSystem Matrix This chapter explains how you can configure cross-technology logical servers that can be managed with Matrix recovery management. The HP Matrix Operating Environment facilitates the fluid movement of workloads between dissimilar servers within a site and across sites. Workloads can be moved between physical servers and virtual machines and between dissimilar physical servers.
Capabilities and limitations Using the tools and procedures described in this chapter you can: • Configure and manage a logical server that can perform physical to virtual cross-technology movements within the datacenter. • Configure and manage a DR Protected logical server that can be failed over across data centers in a cross-technology movement.
Figure 4 Same LUN number across physical and virtual targets • The target WWN values used to present the Logical Unit must be the same across virtual and physical targets. NOTE: The recovery logical server that provides DR protection at the Remote Site has its own set of target WWN/LUN values that differ from the target WWN/LUN values for the logical server at the Local Site.
Figure 5 ESX host network name
Figure 6 Virtual Connect Enterprise Manager network name
• When moving a logical server between physical and virtual servers within a site, the following server IDs are not preserved: ◦ Network MAC addresses ◦ Server/Initiator WWNs (On a virtual machine, the storage adapter is a virtual SCSI controller.
both types of servers. The recovery site can have a physical/virtual combination also, or have only virtual machine hosts. Supported platforms The procedures for enabling movement between physical and virtual servers described in this chapter apply to physical servers, hypervisors, and workload operating systems supported by Matrix recovery management. For more information, see the HP Insight Management Support Matrix at http://www.hp.com/go/matrixoe/docs.
i. Copy the executable cp011231.exe to the physical server where the image is currently running.
ii. Run cp011231.exe to install PINT and start the PINT service. For more information, see “Configuring and managing portable OS images” (page 38).
2. Create a portability group that includes all potential physical and VM host targets. This step sets up the portability group that defines the list of potential targets for the logical server.
NOTE: When the logical server is first moved to a virtual machine, you may want to add additional tools to the server, for example, VMware tools. In the HP Matrix Operating Environment, the VM configuration created does not include a virtual CD/DVD drive. You can use the VM management console to modify the VM configuration to include a virtual CD/DVD drive. 5. Configure inter-site movement between physical and virtual targets (disaster recovery use case).
NOTE: The Matrix recovery management Site configuration can be set up to preferentially activate the logical server on a physical server at one site and a VM host at the other site. For more information, see “Setting a failover target type preference” (page 46).
The command-line interface for PISA is described below. The options are mutually exclusive. PISA runs on supported versions of Windows only, and it requires that the user be a member of the Administrator user group.
Usage: hppisa
    -h, -?, -help     Show this information
    -e, -enable       Enable the LSI driver
    -d, -disable      Disable the LSI driver
After these changes are made, the OS image can be moved back and forth between physical servers and virtual machines.
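Because the PISA options are mutually exclusive, a script that calls PISA should pass exactly one option per invocation. The following Python sketch shows one way to enforce that. It is a hedged illustration: the build_pisa_command and run_pisa helper names are hypothetical, and the sketch assumes that the hppisa executable is on the PATH of a supported Windows system and that the script runs with Administrator rights.

```python
import subprocess

# Options taken from the usage text above; one per invocation.
PISA_FLAGS = {"enable": "-e", "disable": "-d", "help": "-h"}


def build_pisa_command(action, executable="hppisa"):
    """Return the argument list for a single PISA action."""
    if action not in PISA_FLAGS:
        raise ValueError(f"unknown PISA action: {action!r}")
    return [executable, PISA_FLAGS[action]]


def run_pisa(action):
    """Run PISA; raises CalledProcessError on a nonzero exit code.

    Assumption: hppisa is installed and on the PATH (Windows only).
    """
    return subprocess.run(build_pisa_command(action), check=True)


# Only the command line is printed here, so the sketch can be
# examined on a system where PISA is not installed.
print(build_pisa_command("enable"))
```

Calling run_pisa("enable") before a move to a virtual machine would then correspond to running hppisa -e manually.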
The HP Matrix Operating Environment provides default portability groups depending on the resources found within your data center. The Default portability groups include: • ESX—All ESX Hypervisors • HYPERV—All Hyper-V Hypervisors • Each Virtual Connect Domain Group—Each VCDG has its own Default portability group. You can also create User Defined portability groups that extend the portability of a logical server to unlike technologies.
Figure 10 Selecting group members and targets Provide a name and optional description for the portability group. The name will be used for defining logical servers. The set of Group Types is selected automatically based on the targets inserted into the portability group. Valid combinations of targets include: • A single Virtual Connect Domain Group (VCDG) • A set of ESX Hypervisors • A set of Hyper-V Hypervisors • A set consisting of a single VCDG plus a set of ESX Hypervisors.
Figure 11 Selecting a portability group To view the portability group for any logical server, click the View movable logical server details icon in Matrix OE visualization as illustrated in Figure 12 (page 42). Figure 12 View movable logical server details icon The details for this logical server are displayed as illustrated in Figure 13 (page 42).
Logical servers can be made portable through techniques described in “Portability groups” (page 39). NOTE: You must determine whether the provisioned operating system within a logical server performs as desired on a variety of platforms. If a logical server has never been active on a platform type, the HP Matrix Operating Environment shows a warning for each target of that type in the Target Selection page during moves and activations. You must determine whether the target is valid.
When defining storage for a portable logical server, you must select SAN Storage Entry. For flexibility and movement between underlying technology types, storage must be presented to the WWNs tied to the Virtual Connect server profile, and storage must also be presented to any ESX VM hosts that are potential targets for the logical server.
Targets for a logical server are selected from that logical server's portability group. The portability group members are then further filtered based on resource availability, including CPU and memory resources as well as network and SAN connectivity. NOTE: Networks in Virtual Connect must be named identically to their corresponding networks (port groups) on ESX Hypervisors. Differences in names prevent the Unlike Move operation from identifying networks with similar connectivity.
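The target filtering described above can be pictured with a short Python sketch. The Target and LogicalServer classes and the eligible_targets function are hypothetical names used for illustration only, and the resource model is deliberately simplified. The sketch assumes that network names are compared as exact, case-sensitive strings, which reflects the requirement that Virtual Connect networks and ESX port groups be named identically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    free_cpus: int
    free_memory_gb: int
    networks: frozenset   # Virtual Connect network or ESX port group names
    san_connected: bool


@dataclass(frozen=True)
class LogicalServer:
    cpus: int
    memory_gb: int
    networks: frozenset


def eligible_targets(server, portability_group):
    """Filter portability group members by resources and connectivity."""
    return [
        t for t in portability_group
        if t.free_cpus >= server.cpus
        and t.free_memory_gb >= server.memory_gb
        and t.san_connected
        and server.networks <= t.networks  # every network name must match
    ]


server = LogicalServer(cpus=4, memory_gb=16, networks=frozenset({"prod_net"}))
group = [
    Target("blade1", 8, 64, frozenset({"prod_net"}), True),
    Target("esx1", 8, 64, frozenset({"Prod_Net"}), True),  # name differs
    Target("esx2", 2, 64, frozenset({"prod_net"}), True),  # too few CPUs
]
print([t.name for t in eligible_targets(server, group)])
```

In the example, esx1 is excluded only because its port group name does not match the Virtual Connect network name.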
Moving between blade types For logical servers with target attributes, the logical server management software can identify more possible targets when moving or activating a server. As with all cross-technology logical servers, you must ensure that the logical server can function appropriately on various platforms. If a particular target is proven to be unsuitable, it is easy to remove that type of target to more accurately describe the logical server's portability.
You must specify the target type preferred for all sites on the CMS at each site: • If you specify Virtual as the target type preferred for a site, all cross-technology logical servers whose Recovery Groups prefer that site are activated on VM hosts during an Activate operation at that site. A physical server is chosen only if no VM hosts are available.
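The preference behavior can be summarized in a short Python sketch. The pick_target function and its inputs are hypothetical and are not a Matrix recovery management interface. The text above states the fallback rule for a Virtual preference; applying the same fallback in the opposite direction for a Physical preference is an assumption of this sketch.

```python
def pick_target(preferred_type, available):
    """Choose a target name for a cross-technology logical server.

    preferred_type: "virtual" or "physical" (site preference)
    available:      list of (name, type) tuples for usable targets
    """
    preferred = [name for name, kind in available if kind == preferred_type]
    if preferred:
        return preferred[0]
    # Fall back to the other type only when no preferred target exists.
    # Assumption: the fallback is symmetric for both preference values.
    others = [name for name, kind in available if kind != preferred_type]
    return others[0] if others else None


print(pick_target("virtual", [("blade1", "physical"), ("esx1", "virtual")]))
print(pick_target("virtual", [("blade1", "physical")]))
```

The first call selects the VM host even though a physical server is listed first; the second call falls back to the physical server because no VM host is available.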
5 Issues, limitations, and suggested actions
This chapter lists issues and limitations for this release, categorized as follows:
Limitations: Limitations of the implemented functions and features of this release
Major issues: Issues that may significantly affect functionality and usability in this release
Minor issues: Issues that may be noticeable but do not have a significant impact on functionality or usability
Limitations
• Only IO services that include virtual servers and on-premise (not cloud) res
ESX configuration setting required for VMFS datastores of Matrix recovery management managed logical servers to be visible at Remote Site Under the following conditions, Matrix recovery management requires a specific ESX configuration setting to retain the signature of a VMFS datastore so it will be visible at the Remote Site: • You have asymmetric HP P6000 Continuous Access Software array models at the Local and Remote Site.
recommends as a best practice that you keep LUN numbers the same for corresponding disks across sites. Suggested actions Assess the impact of these discrepancies on any licensing arrangements in use for the operating system and applications running on DR Protected logical servers.
6 Troubleshooting This chapter provides troubleshooting information in the following categories: • “Configuration troubleshooting” (page 51) • “Configuration error messages” (page 53) • “Warning messages” (page 56) • “Matrix recovery management Job troubleshooting” (page 57) • “Failover error messages” (page 60) • “Matrix recovery management log files” (page 61) • “DR Protected IO services troubleshooting” (page 61) Configuration troubleshooting To troubleshoot Matrix recovery management confi
• Unable to add or edit HP P6000 Storage Replication Group
Possible causes include:
◦ Matrix recovery management is unable to obtain Storage Replication Group information from Command View servers to validate the Storage Replication Group information provided by the user.
• Unable to add or edit HP P9000 Storage Replication Group
Possible causes include:
◦ The Storage Replication Group is not configured to be managed by the RAID manager instances.
• No configuration operation can be run
Possible causes include:
◦ An Activate, Deactivate, or Import operation is in progress.
◦ Another configuration operation may be in progress.
• Unable to import Storage Management Servers as part of an import operation
Possible causes include:
◦ The Storage Management Server was not discovered in the HP Matrix Operating Environment user interface.
Error message: Cannot verify the host name specified.
Cause: The host name specified for the CMS at either the Local Site or the Remote Site cannot be resolved in DNS.
Action: Verify that a valid DNS entry with a fully qualified domain name exists for each CMS.
Error message: Cannot create/edit the site information.
Cause: The host name specified for the CMS does not include a fully qualified domain name associated with the local CMS.
then go to Storage→Storage Software→Storage Device Management Software→HP P6000 Command View Software. 2. Confirm that the port number specified during Storage Management Server configuration in Matrix recovery management is the same as the WBEM port number configured on the HP P6000 Command View server (for example, 5989). For more information, see the “CIMOM” server configuration section in the HP P6000 Command View Software Installation Guide. 3.
Error message: Unable to run Matrix recovery management operations because Matrix recovery management Job is in progress or another Matrix recovery management configuration operation is in progress.
Cause: While an Activate or Deactivate Job is in progress, no configuration operation is allowed. While a Matrix recovery management configuration operation is in progress, no other Matrix recovery management configuration operation is allowed.
Warning message: Warning: Matrix recovery management is quiesced. No new operations will be allowed.
Cause: Matrix recovery management has been quiesced. All configuration buttons (Create, Edit, Delete, and so on) are disabled.
Action: Wait for Matrix recovery management to be unquiesced.
Warning message: Warning: Unable to remove CLX credentials for (these server credentials may not exist in CLX).
an Activate Job. It has an Entity of type site and an Operation of type activate. You will also notice the Failed icon in the Status column indicating that Job 3288 has failed. Figure 21 Jobs screen For a failed Job, click the check box next to the Job Id to get detailed information about the associated Sub Jobs. A site Job contains a Sub Job for each Recovery Group. Similarly, each Recovery Group has Sub Jobs for its Storage Replication Group and logical server, respectively.
Figure 23 Restarting a failed job NOTE: Restarting the Job retries only Sub Jobs that previously failed; servers associated with completed Jobs or Sub Jobs are not impacted. IMPORTANT: If correcting the problem that caused the Job to fail included reconfiguration of logical servers, before you restart the Job, go to the Recovery Groups tab and delete the Recovery Groups that contain the reconfigured logical servers.
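The restart behavior, in which only previously failed Sub Jobs are retried, can be pictured as a walk over the Job hierarchy described earlier (a site Job, a Sub Job for each Recovery Group, and Sub Jobs for the Storage Replication Group and logical server). The following Python sketch is a simplified model for illustration only: the Job class, its status values, and the jobs_to_retry function are hypothetical and do not represent the product's internal data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    status: str                      # "completed" or "failed" in this sketch
    sub_jobs: list = field(default_factory=list)


def jobs_to_retry(job):
    """Return the names of the lowest-level failed Jobs under job.

    Completed Jobs and their Sub Jobs are skipped entirely, which
    mirrors the statement that they are not impacted by a restart.
    """
    if job.status != "failed":
        return []
    failed_children = [n for sub in job.sub_jobs for n in jobs_to_retry(sub)]
    return failed_children or [job.name]


site_job = Job("site activate", "failed", [
    Job("RG1", "completed", [
        Job("RG1 storage", "completed"),
        Job("RG1 logical server", "completed"),
    ]),
    Job("RG2", "failed", [
        Job("RG2 storage", "completed"),
        Job("RG2 logical server", "failed"),
    ]),
])
print(jobs_to_retry(site_job))
```

In this model, only the failed logical server Sub Job of RG2 is selected for retry; RG1 and the completed storage Sub Job of RG2 are left alone.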
• Matrix recovery management job failed because of unlocatable logical server in Matrix OE logical server management.
Possible causes include:
◦ A logical server managed by Matrix recovery management was removed from Matrix OE logical server management before it was unmanaged in Matrix recovery management.
• Matrix recovery management job failed because an operation failed in Matrix OE logical server management for the logical server.
Matrix recovery management log files There are several log files available with detailed information that you can view to help identify the sources of Matrix recovery management failover or failback problems: • For errors that occur during the initial Matrix recovery management configuration steps, view the mxdomainmgr(0).log file located in the logs directory where HP Systems Insight Manager is installed on the system. • For errors that occur during a failover, check the lsdt.
DR Protected IO services configuration troubleshooting
In addition to the configuration issues addressed in this User Guide that are common to both logical servers and IO services, the following configuration issues apply to IO services only:
• Failed to get a list of IO services that can be included in a recovery group
Possible Causes:
◦ Matrix infrastructure orchestration Windows service is not running.
◦ There are no IO services that are DR protection enabled.
IO services configuration error messages
Error message: Unable to get the IO service.
Cause: The IO service does not exist or Matrix recovery management failed to get the IO service information from the Matrix infrastructure.
Action: Check the Matrix recovery management and IO log files for more information on the failure. If the IO service does not exist in IO, it is possible that the IO service was removed. If the IO service exists, restart IO and retry the operation.
DR Protected IO services failover troubleshooting
In addition to the failover issues addressed in this User Guide that are common to both logical servers and IO services, the following failover issues apply to IO services only:
• Failed to activate IO service in a Recovery Group
Possible Causes:
◦ Storage resources are not available.
◦ The IO service is in an invalid state for activation.
◦ The IO service does not exist.
◦ The Matrix infrastructure orchestration Windows service is not running.
7 Support and other resources Information to collect before contacting HP Be sure to have the following information available before you contact HP: • Software product name • Hardware product model number • Operating system type and version • Applicable error message • Third-party hardware or software • Technical support registration number (if applicable) How to contact HP Use the following methods to contact HP technical support: • In the United States, see the Customer Service / Contact HP U
With this service, Insight Management customers benefit from expedited problem resolution as well as proactive notification and delivery of software updates. For more information about this service, see the following website: http://www.hp.com/services/insight. Registration for this service takes place following online redemption of the license certificate.
Matrix recovery management documentation For more information on Matrix recovery management, see the following sources: • HP Insight Management Support Matrix Provides Matrix recovery management support information along with other HP Insight hardware, software, and firmware support information. Available at http://www.hp.com/go/matrixoe/docs. • HP Matrix Operating Environment 7.
WARNING An alert that calls attention to important information that, if not understood or followed, results in personal injury. CAUTION An alert that calls attention to important information that, if not understood or followed, results in data loss, data corruption, or damage to hardware or software. IMPORTANT An alert that calls attention to essential information. NOTE An alert that contains additional or supplementary information. TIP An alert that provides helpful information.
A Recover the CSV from online pending state If you perform an activate operation at the remote site without taking the CSV offline at the local site, you might see the following symptoms at the local site: • No storage information is available when navigating to the storage view in the Windows Failover Cluster Management tool. • The CSV is in the online pending state. To recover from these symptoms, perform the following steps: 1.
B Documentation feedback
HP is committed to providing documentation that meets your needs. To help us improve the documentation, send any errors, suggestions, or comments to Documentation Feedback (docsfeedback@hp.com). Include the document title and part number, version number, or the URL when submitting your feedback.
Glossary bidirectional failover A Matrix recovery management feature that allows Recovery Group Sets to be activated or deactivated at either the Local Site or the Remote Site. At any point in time there can be activated and deactivated Recovery Group Sets at both sites. In the event of a disaster, or to accommodate site maintenance, all of the Recovery Group Sets in the Matrix recovery management configuration can be deactivated at one site, and activated at the other site.
rehearsal, the Recovery Group and its corresponding logical servers and IO services can be brought back under the control of Matrix recovery management. Matrix infrastructure orchestration services Matrix infrastructure orchestration services (IO services) quickly provision infrastructure to automatically activate physical and virtual servers, storage, and networking from pools of shared resources. More information on Matrix infrastructure orchestration is available at http:// www.hp.
Recovery Group Set A set of Recovery Groups that share the same Preferred and Secondary sites. Recovery Groups cannot be activated or deactivated individually. Instead, all Recovery Groups that share the same Preferred and Secondary site must be activated or deactivated as a set. Recovery Group Sets can be selected for activation or deactivation at the Local site. Recovery Group Start Order An optional number that specifies the order in which a Recovery Group is to be started during a site failover.
VM hosted logical server A logical server running on a virtual machine under the control of a hypervisor.