Maintenance Best Practices for Direct-Attached SCSI Solutions
Revision 1.
1.0 Revision History
2.0 Introduction
3.0 Overview of Steps Ensuring RAID Best Practices
Cabling Practices
1.0 Revision History.

Date      Revision  Explanation of Changes
02/25/04  0.1       Initial draft
04/19/04  0.2       Adding Core Team Input.
05/11/04  0.3       Format rewrite
05/20/04  0.4       Format Changes, adding cabling best practice, and bus mode comments
05/25/04  0.5       Adding Core Team Input.
05/25/04  1.0       Official Release.

2.0 Introduction.
3.0 Overview of Steps Ensuring RAID Best Practices.

Cabling Practices
• Ensure that properly qualified cables are being used.
• Ensure that SCSI cables are properly secured to the PV22x and the controller.
• Verify that cables are not excessively bent.
• Inspect cabling for cuts and exposed shielding.
• Use a cable length appropriate for the installation.

Maintenance of Arrays.
• Run regular consistency checks on the system.
4.0 Cabling Practices

4.1 Only properly qualified cables should be used

The SCSI standard has been through several iterations, with a speed increase at each revision. As a result, several different SCSI cables may look identical but have differing capabilities. Verify part numbers carefully to confirm that each cable can operate properly at the transmission speed of the system.

4.
5.0 Maintenance of Arrays.

RAID arrays are an industry standard for protecting important data through redundancy. This redundancy may take the form of parity calculations that are distributed throughout the array. It may also take the form of simple mirroring, which maintains a complete copy of the data and needs no parity calculations to reconstruct missing elements.
Consistency Check

A consistency check is a data-level check that both verifies the data inside each block or stripe and checks for bad blocks. It performs read operations on both the user-data areas of the logical drive and the currently unused areas that contain no user data. The check finds and repairs stripes where data and parity do not match.
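As a rough illustration of what a data and parity mismatch means, the following Python sketch models a parity-protected stripe in which the parity block is the bytewise XOR of the data blocks, as in standard RAID 5. It is not the controller firmware's implementation. The function names and the in-memory stripe layout are hypothetical, and the repair strategy shown (recomputing parity from the data blocks) is an assumption made for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity_block):
    """A stripe is consistent when the stored parity equals the XOR of its data."""
    return xor_blocks(data_blocks) == parity_block


def repair_parity(data_blocks):
    """Assumed repair strategy: regenerate parity from the data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing_block(surviving_blocks, parity_block):
    """Reconstruct one lost data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Hypothetical three-block stripe with two bytes per block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Simulate stale parity, detect the mismatch, then repair it.
stale_parity = b"\x00\x00"
if not stripe_is_consistent(data, stale_parity):
    stale_parity = repair_parity(data)
```

The same XOR relationship is what allows a rebuild: `rebuild_missing_block([data[0], data[2]], parity)` recovers the second data block. A mirrored (RAID 1) array needs no such calculation, because the check there is a direct comparison of the two copies.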
5.1 Setup for automated scheduling of consistency checks on Windows systems.

1. On systems running a Windows OS with Array Manager installed, you can use the ‘Scheduled Tasks’ option from the menu under the ‘Accessories’ folder.
2. Click ‘Next’ and the following screen will appear:
3. Click ‘Browse’ and locate the file ‘amcli.exe’. The AMCLI executable is located in the Array Manager installation directory.
4. Select the file and click ‘OK’.
5. Click ‘Next’ and the following screen will appear:
6. Enter a name for this task and select how often the task should be run. The task should run at least once a month.
7. Click ‘Next’ and the following screen will appear:
8. Select the time at which the Consistency Check should run. The check affects system performance, so schedule it for a low-traffic time.
9. Click ‘Next’ and the following screen will appear:
10. Fill in the name and password fields appropriately so the task can be executed correctly.
11. Click ‘Next’ and the following screen will appear:
12. Select the checkbox for ‘Open advanced properties for this task when I click Finish.’
13. Click ‘Finish’ and the following screen will appear:
14. In the ‘Run’ textbox you can type different parameters. The following is example syntax for scheduling a consistency check on virtual disk 1:

    "C:\PathName\amcli.exe" /c1

    where PathName is the path to the AMCLI executable. This is the command the scheduler executes every time it runs this event.
15.
To view the other AMCLI command options, display the AMCLI help information by entering ‘amcli /?’ or look at the Array Manager help file.

Sample of command line options:

    amcli /da

where the d option indicates display and the a option indicates adapter. This command displays the status of the system controllers (adapters). The possible status values are:
• None -- The controller does not have a battery.
5.2 Upgrade firmware, drivers and Array Manager concurrently to the latest versions.

Enclosure firmware, RAID controller firmware, the RAID controller driver, and Array Manager are maintained in block releases. Observe these block releases and install or upgrade their components together to get maximum performance, dependability and functionality from your direct-attached SCSI solution. All the latest versions of code are available through support.dell.
5.3.1 Reconditioning the Battery using Array Manager.

To recondition the battery using Array Manager, right-click the controller icon and then select the “Properties” option. Once the “Controller Properties” dialog box (Figure 6) is displayed, click the “Recondition” button. Once the reconditioning has started, the status of the battery displayed in Array Manager may not change.
5.4 Monitor System Event Logs and Array Manager Event Logs.

System Events are generated for informational purposes, such as for record-keeping activities, or for notifying the user of events that may affect the physical security and availability of their data. If you are using a Windows-based OS and you have Array Manager installed, a comprehensive list of event types is provided in the help file of the application.
You can also retrieve the events from any NetWare server by using the Windows console. These events are shown in the Array Manager Event Log. The use of Array Manager is recommended for Windows-based or NetWare-based operating systems.

5.5 Backup and Recovery of Data.

Implementing a comprehensive backup and recovery strategy is recommended to ensure the preservation of your data.
Alternatively, use Array Manager on NetWare-based systems to retrieve issue information. All the latest Array Manager versions for Windows and NetWare can retrieve NVRAM information for all the RAID controllers except the PERC 2\SC\DC. This NVRAM dump contains all the configuration information as well as the contents of the NVRAM logs.
Appendix.

To avoid loss of data integrity or the need to recover lost arrays, follow these steps:

PERC 3\SC\DC\QC and PERC 4\SC\DC\Di RAID rebuild behavior

1.1 Pre-May 2003 behavior

When a media error is found during a rebuild, the PERC RAID firmware will not complete the rebuild process. The new HDD is marked failed, and the virtual disk is kept in a degraded state.
When the rebuild process finds the media error, Array Manager pops up the message "rebuild failed" as follows:

Screenshot for Array Manager showing multiple media errors (03 11) with a “rebuild completed” message:
1.2 Post May 2003 behavior

For the May release, changes were introduced in the PERC firmware and Array Manager to improve the customer experience in the RAID rebuild scenario.

RAID 1 and RAID 5 Customer Experience:
• The new RAID FW handles media errors and completes the rebuild process, creating an Array Manager event for each of the media errors and completing the rebuild with a "completed with error" message.
Screenshot for Array Manager:
1.3 Array Manager Online help files

The online help for Array Manager was modified to include more information and explanation for the above message “Completed with Errors”. The following is the content added to the last version of Array Manager (build 555):

1.3.1 Rebuild Completes with Errors on PERC Subsystem 1 Controller.

1.3.1.
1.3.1.3 Event 691 Received during a Rebuild or while a Virtual Disk is Degraded

Do the following if you receive event 691 during a rebuild or while the virtual disk is in a degraded state:
1. Replace the damaged drive.
2. Create a new virtual disk and allow the virtual disk to completely resynchronize.
3. After changing the switch to Split Bus Mode, power down both servers and reboot the PV220s/PV221s.
4. After the PV220s/PV221s has been rebooted, power up each of the attached servers.
5. Failure to power cycle in this instance will likely cause symptoms such as no POST or a hang during HBA/PERC controller initialization.
6.
Customer using Linux DellMgr utility.

The administrator will see the percentage increase from 0% to 100% and then change to “INCON” as follows:

As with the Ctrl-M application, DellMgr does not display a message when media errors are found during the rebuild. The administrator can run DellMon on the Linux system to obtain further events and notifications related to the RAID subsystem.
mentioned above. The file shows the media errors (01 33) between the “rebuild started” and “rebuild completed” messages.

Sample Event logs generated by DellMon:

[01/13/2004 (11:53:23)]: Adapter 1: No of Charge Cycles = 4
[01/13/2004 (11:54:14)]: Adapter 1: Battery Voltage GOOD.
[01/13/2004 (22:57:53)]: Adapter 1: Battery Fast Charging FAILED.
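Log entries in this layout can be filtered with a short script. The following Python sketch is one possible approach, assuming that every entry follows the `[MM/DD/YYYY (HH:MM:SS)]: Adapter N: message` pattern seen in the sample above. The names `DellMonEvent` and `parse_line` are hypothetical, and treating the word "FAILED" as the marker of a problem entry is an assumption for illustration, not a documented DellMon convention.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern inferred from the sample entries only; other entry layouts may exist.
LINE_RE = re.compile(
    r"\[(?P<date>\d{2}/\d{2}/\d{4}) \((?P<time>\d{2}:\d{2}:\d{2})\)\]: "
    r"Adapter (?P<adapter>\d+): (?P<message>.*)"
)


class DellMonEvent(NamedTuple):
    timestamp: datetime
    adapter: int
    message: str


def parse_line(line: str) -> Optional[DellMonEvent]:
    """Parse one log line; return None if it does not match the pattern."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(
        f"{match['date']} {match['time']}", "%m/%d/%Y %H:%M:%S"
    )
    return DellMonEvent(timestamp, int(match["adapter"]), match["message"].strip())


SAMPLE = """\
[01/13/2004 (11:53:23)]: Adapter 1: No of Charge Cycles = 4
[01/13/2004 (11:54:14)]: Adapter 1: Battery Voltage GOOD.
[01/13/2004 (22:57:53)]: Adapter 1: Battery Fast Charging FAILED.
"""

events = [event for event in map(parse_line, SAMPLE.splitlines()) if event]

# Assumed convention: entries containing "FAILED" deserve attention.
failed = [event for event in events if "FAILED" in event.message]
```

Lines that do not match the pattern are skipped rather than raising an error, so the script tolerates entries in other layouts.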
2.2.2 Customer using Ctrl+M utility.

The PERC 3\SC\DC\QC and the PERC 4\SC\DC\Di RAID controllers can also be managed with the Ctrl-M utility. This utility is accessed during the boot process by pressing Ctrl-M. The administrator can monitor the rebuild process from the Ctrl-M screen. The screen shows the percentage completed during the rebuild process until 100% is displayed.