PolyServe® Matrix Server Release Notes
PolyServe Matrix Server 3.1.2 for Red Hat Enterprise Linux AS/ES 4.
Copyright © 2004-2006 PolyServe, Inc. Use, reproduction and distribution of this document and the software it describes are subject to the terms of the software license agreement distributed with the product (“License Agreement”). Any use, reproduction, or distribution of this document or the described software not explicitly permitted pursuant to the License Agreement is strictly prohibited unless prior written permission from PolyServe has been received.
Contents
Contents of the Matrix Server 3.1.2 Release
Contents of the Matrix Server 3.1.1 Release
New Features in Matrix Server 3.1.0
Implementation Changes in Matrix Server 3.1.0
PolyServe Kernel Patches
Contents of the Matrix Server 3.1.2 Release

This release includes the following new features:
• A configurable Global Event Delay for device monitors. The delay can minimize unwanted failover/failback operations that can occur when a monitor becomes active after it has been down on all servers.
• A new mx server markdown command that can be used to tell Matrix Server that a server is down and does not need to be fenced.
• Support for RHEL4 Update 3 and the 2.6.9-34.EL kernel.
• Defect 12381. Multiple processes could hang while attempting to perform write operations.
• Defect 12385. Down nodes were included in aggregate values on the Performance Dashboard.
• Defect 12389. Unwanted failover/failback operations occurred when a monitor became active after it had been down on all servers.
• Defect 12421. A race condition could prevent the perfservice process from starting.
• Defect 12463.
• Defect 12522. A method was needed to notify Matrix Server that a node is down and does not need to be fenced.
• Defect 12528. UIDs used for the MSA 1500, EVA 3000, EVA 6000 and EVA 8000 arrays needed to be upgraded to the new format used by Matrix Server.
• Defect 12567. When fabric fencing was configured with Cisco switches running the 2.1b firmware, Matrix Server did not fence the correct node.
• Defect 12647.
• Defect 12875. A flaw in mxregd’s recovery processing could prevent recovery from finishing, which in turn caused dependent components of the system to stop functioning.
• Defect 12877. The perfservice logs grew too large and could eventually fill the /var filesystem.
• Defect 12892. grpcommd needed to be able to shut down Matrix Server if ClusterPulse was hung.
• Defect 12899. grpcommd crashed during a membership transition.
• Defect 13300. Sanpulse could terminate unexpectedly, causing Matrix Server to shut down on the node.
• Defect 13316. A disk that temporarily appeared to have a different UID caused Matrix Server to shut down.
• Defect 13345. mxcheck did not verify that the operating system was for a supported machine architecture.
• Defect 13580. Reads or writes with large block sizes on a dboptimized filesystem could cause the system to crash in psfs_get_blocks_direct_IO().
Contents of the Matrix Server 3.1.1 Release

This release includes the following new features:
• A Performance Dashboard that can be used for monitoring activities such as cluster-wide resource utilization and PSFS filesystem I/O traffic.
• Support for EVA snapclones. A snapclone is similar to a hardware snapshot, except that it completely copies the source filesystem data at a particular point in time.
• Support for RHEL4 Update 2 and the 2.6.9-22.EL and 2.6.9-22.0.
• Defect 11067. On the Applications tab, a “drag and drop” operation could remove a virtual host or device monitor configured on only one server. A service monitor could also be removed.
• Defect 11113. On the Quotas window, a warning was displayed that multiple (unequal) filesystem quotas were selected even though the group had only one filesystem.
• Defect 11126. The Management Console crashed if a quota limit was not specified on the Add Quota dialog.
• Defect 11420. Problems with message-queue handling could cause a temporary slowdown of Matrix Server administrative operations or could cause Matrix Server to shut down on a server.
• Defect 11431. Snapshots occasionally did not appear on the Management Console.
• Defect 11446.
• Defect 11774. The operation to destroy a snapshot did not always complete successfully.
• Defect 11831. When the Properties dialog was opened for a SHARED_FILESYSTEM monitor, a space was added to the file name, causing the monitor to go down.
• Defect 12042. psfsck crashed when rebuilding a large filesystem.
• Defect 12052.
Implementation Changes in Matrix Server 3.1.0

The following implementation changes have been made in this release:
• The Management Console has been enhanced with new icons and features for improved presentation and control.
• The /etc/polyserve directory has been moved to /etc/opt/polyserve, and the /var/polyserve directory has been moved to /var/opt/polyserve.
• The unregistered network port numbers used by Matrix Server can now be modified if necessary.
monitors. Changes to the application affect all Matrix Server entities assigned to the application.

PolyServe Kernel Patches

The PolyServe Matrix Server 3.1.2 distribution contains a Linux kernel patch set that can be applied at the customer’s discretion. The problems that these patches address have been reported to Red Hat. The following patches are provided for the 2.6.9-11.EL kernel:
• 00_pid_alive.patch. This patch corrects a process management bug in the Linux kernel.
• 02_rescan-partitions.patch. This patch restores the original return value of check_partition() to 0 instead of -EIO. Returning -EIO breaks some of the partition code, which can prevent psd devices from being imported into the matrix. If this patch is not installed, you will need to ensure that all LUNs have a valid partition table in place.
• 03_imapping_race.patch.
Open Issues and Workarounds

The following open issues affect Matrix Server operations.

Matrix Server

Defect 982. Service monitor attempts to start before filesystem is mounted
PSFS filesystems can be configured to be mounted automatically when the system is booted. This configuration is called a persistent mount. When a server is booted, Matrix Server does not wait for the persistent mounts to complete before it begins other matrix operations.
Defect 1735. Matrix Server does not shut down applications
When Matrix Server is shut down, it terminates processes that have open files on PSFS filesystems. To avoid this problem, we recommend that you stop applications that are using the filesystem before you shut down Matrix Server.
Defect 7823. Password prompt requires old password
When you change the admin password on the Management Console and then export it to other servers in the matrix, you will be prompted for the password on those servers. You will need to specify the old password, not the newly assigned password. (The old password is in effect on the servers until the export operation is complete.
Defect 8473. mxcheck asks for FibreChannel switch information
When mxcheck is run, it asks for the names or addresses of the FibreChannel switches in the matrix. It uses this information to test the access to the switches. If you will not be placing FibreChannel switches under matrix control, or you do not want to test switch access at this time, simply press Enter at the prompt. mxcheck will then continue to execute.
Defect 9615. Nodes stall waiting for locks
If you are seeing alerts stating that nodes are stalled waiting for locks for a particular filesystem, the filesystem may be experiencing contention on Full Zone Bitmaps. To aid in diagnosing this problem, determine whether the following apply:
• Full Zone Bitmaps (FZBMs) are enabled on the filesystem.
Defect 10191. Using hostnames in .matrixrc file can cause connection delays
When servers are specified by hostname in the .matrixrc file, long connection delays (possibly minutes per hostname entry) can occur if there is a slow or unresponsive DNS server on the network. During this time, the Management Console and mx commands might be unresponsive.
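To find which entries are exposed to this delay, it can help to separate server entries that are literal IP addresses from those that need a DNS lookup. The following is a minimal sketch only. It assumes the server identifiers have already been extracted from the .matrixrc file into a list of strings; it does not parse the file itself, and the function names are hypothetical.

```python
import ipaddress


def is_ip_literal(entry: str) -> bool:
    """Return True if the entry is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(entry.strip())
        return True
    except ValueError:
        return False


def entries_needing_dns(entries):
    """Return the entries that are hostnames and therefore require DNS resolution.

    `entries` is assumed to be a list of server identifiers taken from
    .matrixrc by some other means (hypothetical input format).
    """
    return [e for e in entries if not is_ip_literal(e)]
```

Any entry returned by entries_needing_dns() depends on DNS resolution and is a candidate for the delay described above.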
Complete the following steps:
1. Locate the following line in the mxinit.conf file:
   # grpcommd_start_options = { "--nodaemon", "--signalparent" };
2. Remove the “#” symbol at the beginning of the line to uncomment it.
3. Add “--multicast 255.255.255.255” near the end of the line. Be sure to insert a comma after the “--signalparent” option. For example:
   grpcommd_start_options = { "--nodaemon", "--signalparent", "--multicast 255.255.255.
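The three steps above can also be applied to the line programmatically. The sketch below is illustrative only: it operates on a single line of text, assumes the commented line has exactly the form shown in step 1, and uses a hypothetical function name. It does not locate or rewrite mxinit.conf itself.

```python
import re

# Matches the commented-out option line shown in step 1.
COMMENTED = re.compile(
    r'^\s*#\s*(grpcommd_start_options\s*=\s*\{)(.*?)(\}\s*;\s*)$'
)


def enable_multicast(line: str, address: str = "255.255.255.255") -> str:
    """Uncomment the grpcommd_start_options line and append --multicast.

    Lines that do not match the expected commented form are returned
    unchanged. Any trailing newline on a matching line is dropped.
    """
    m = COMMENTED.match(line)
    if m is None:
        return line
    head, options, tail = m.groups()
    options = options.rstrip()
    # The comma is inserted after the last existing option ("--signalparent").
    return f'{head}{options}, "--multicast {address}" {tail.strip()}'
```

Back up mxinit.conf before applying any automated change, and compare the resulting line against the example in step 3.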
Defect 12331. mxcheck fails without an error message
If the /tmp directory is full, the mxcheck utility will fail and Matrix Server will not start. When this situation occurs, mxcheck should display an error message explaining that there is insufficient space in /tmp.
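Because mxcheck gives no indication of the cause, free space in /tmp can be checked separately before starting Matrix Server. The following is a minimal sketch, not part of the product. The 10 MB threshold is an arbitrary assumption, since these notes do not state how much space mxcheck requires, and the function name is hypothetical.

```python
import shutil


def has_free_space(path: str = "/tmp", min_free_bytes: int = 10 * 1024 * 1024) -> bool:
    """Return True if the filesystem holding `path` has at least min_free_bytes free.

    The default threshold is an assumed value chosen for illustration only.
    """
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes
```

For example, a startup wrapper could call has_free_space("/tmp") and report the shortage itself if the result is False.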
Defect 14027. mx fs destroy partially destroys snapshots
Do not use the mx fs destroy command to remove a snapshot. If you attempt to delete a snapshot with the mx fs destroy command, the snapshot filesystem will be removed but the snapshot will remain on the array. You will then need to manually remove the snapshot LUNs from the array. (After removing the LUNs, you will see disk import errors on the console for the LUNs that were removed.
Defect 14200. Emulex driver is incompatible with certain kernel versions
The Emulex driver provided with Matrix Server 3.1.2 is incompatible with Red Hat kernel versions later than 2.6.9-22 running on the RHEL4 operating system.
Workaround. Download the Emulex 8.0.16.27 driver from the Emulex Web site and then install and configure it as described in the MatrixLink Knowledge Base article “Emulex Driver Does Not Load on RHEL4.
For example, server A may have a process that executes cd /a/b/c. Server B has a process that renames /a/b/c to /a/c and removes directory b. If the process on server A executes cd .., it will be allowed to change directory into directory b, which no longer exists. Any operations that attempt to modify the contents of directory b will fail with the appropriate error. To get out of directory b, you will need to either execute another cd ..
Defect 5814. QLogic driver can exhaust kernel memory
Under certain high-stress I/O conditions, the QLogic FibreChannel driver can exhaust kernel memory. This condition typically occurs while a server is running at very high I/O rates, and is possibly triggered by storage subsystem error recovery.
Defect 8346. Partition tables can be constructed incorrectly
When partitioning a disk, make sure the first partition begins at an offset beyond sector one on the disk. This is necessary because the Linux kernel supports a number of different partition table formats, and some of them make use of sector one in addition to sector zero.
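The check described above can be sketched for the common DOS/MBR partition table format. This example is an illustration under stated assumptions: it handles only a 512-byte MBR sector with the standard four primary entries, the function names are hypothetical, and it does not cover the other partition table formats that the kernel supports.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446  # Start of the four 16-byte primary entries
ENTRY_SIZE = 16


def partition_start_sectors(mbr: bytes):
    """Return the starting LBA of each in-use primary partition in an MBR sector."""
    if len(mbr) < SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    starts = []
    for i in range(4):
        offset = PARTITION_TABLE_OFFSET + i * ENTRY_SIZE
        entry = mbr[offset:offset + ENTRY_SIZE]
        part_type = entry[4]
        # Bytes 8-11: starting LBA; bytes 12-15: sector count (little-endian).
        start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
        if part_type != 0 and num_sectors > 0:
            starts.append(start_lba)
    return starts


def first_partition_is_safe(mbr: bytes) -> bool:
    """Return True if every in-use partition starts beyond sector one."""
    return all(start > 1 for start in partition_start_sectors(mbr))
```

The mbr argument would typically be the first 512 bytes read from the disk device. A partition table with no in-use entries is reported as safe.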
Using Oracle with PolyServe Matrix Server

PolyServe Matrix Server has undergone a high degree of Oracle performance and stress testing by the PolyServe Database Engineering team. See the PolyServe Web site for the recommended Oracle release for use with PolyServe Matrix Server.

Asynchronous I/O Support

While certain Linux distributions may support Asynchronous I/O for raw partitions and non-clustered filesystems, these implementations are not supported on clustered filesystems.
• disk_async_io = FALSE
• _lgwr_async_io = FALSE
• _dbwr_async_io = FALSE

If 10 DBWR slaves are not sufficient for a given workload, the Oracle session wait event “free buffer waits” will be a predominant wait event as reported through statspack or utlestat.sql. To address this, simply increase the value assigned to the init.ora parameter dbwr_io_slaves.

Copyright © 1999-2006 PolyServe, Inc. All rights reserved.