Red Hat Enterprise Linux 6 Storage Administration Guide
Deploying and configuring single-node storage in Red Hat Enterprise Linux 6
Edition 0

Authors: Josef Bacik, Kamil Dudka, Hans de Goede, Doug Ledford, Daniel Novotny, Nathan Straz, David Wysochanski, Michael Christie, Sachin Prabhu, Rob Evers, David Howells, David Lehman, Jeff Moyer, Eric Sandeen, Mike Snitzer
Editor: Don Domingo
Storage Administration Guide This guide provides instructions on how to effectively manage storage devices and file systems on Red Hat Enterprise Linux 6. It is intended for use by system administrators with basic to intermediate knowledge of Red Hat Enterprise Linux or Fedora.
Preface

1. Document Conventions
This manual uses several conventions to highlight certain words and phrases and draw attention to specific pieces of information. In PDF and paper editions, this manual uses typefaces drawn from the Liberation Fonts set. The Liberation Fonts set is also used in HTML editions if the set is installed on your system. If not, alternative but equivalent typefaces are displayed. Note: Red Hat Enterprise Linux 5 and later includes the Liberation Fonts set by default.

1.1. Typographic Conventions
Preface Close to switch the primary mouse button from the left to the right (making the mouse suitable for use in the left hand). To insert a special character into a gedit file, choose Applications → Accessories → Character Map from the main menu bar. Next, choose Search → Find… from the Character Map menu bar, type the name of the character in the Search field and click Next. The character you sought will be highlighted in the Character Table.
Notes and Warnings public class ExClient { public static void main(String args[]) throws Exception { InitialContext iniCtx = new InitialContext(); Object ref = iniCtx.lookup("EchoBean"); EchoHome home = (EchoHome) ref; Echo echo = home.create(); System.out.println("Created Echo"); System.out.println("Echo.echo('Hello') = " + echo.echo("Hello")); } } 1.3. Notes and Warnings Finally, we use three visual styles to draw attention to information that might otherwise be overlooked.
Preface 2.2. We Need Feedback! If you find a typographical error in this manual, or if you have thought of a way to make this manual better, we would love to hear from you! Please submit a report in Bugzilla: http://bugzilla.redhat.com/ against the product Red_Hat_Enterprise_Linux. When submitting a bug report, be sure to mention the manual's identifier: doc-Storage_Administration_Guide If you have a suggestion for improving the documentation, try to be as specific as possible when describing it.
Chapter 1. Overview The Storage Administration Guide contains extensive information on supported file systems and data storage features in Red Hat Enterprise Linux 6. This book is intended as a quick reference for administrators managing single-node (i.e. non-clustered) storage solutions. 1.1.
Chapter 2. Storage Considerations During Installation Many storage device and file system settings can only be configured at install time. Other settings, such as file system type, can only be modified up to a certain point without requiring a reformat. As such, it is prudent that you plan your storage configuration accordingly before installing Red Hat Enterprise Linux 6. This chapter discusses several considerations when planning a storage configuration for your system.
Chapter 2. Storage Considerations During Installation

File System   Max Supported Size   Max File Size   Max Subdirectories (per directory)   Max Depth of Symbolic Links   ACL Support   Details
Ext3          16TB                 2TB             32,000                               8                             Yes           Chapter 6, The Ext3 File System
Ext4          16TB                 16TB            65,000 [1]                           8                             Yes           Chapter 7, The Ext4 File System
XFS           100TB                16TB            65,000 [1]                           8                             Yes           Chapter 9, The XFS File System

[1] When the link count exceeds 65,000, it is reset to 1 and no longer increases.
iSCSI Detection and Configuration Warning Removing/deleting RAID metadata from disk could potentially destroy any stored data. Red Hat recommends that you back up your data before proceeding. To delete RAID metadata from the disk, use the following command: dmraid -r -E /device/ For more information about managing RAID devices, refer to man dmraid and Chapter 13, Redundant Array of Independent Disks (RAID).
Chapter 3. LVM (Logical Volume Manager) LVM is a tool for logical volume management which includes allocating disks, striping, mirroring and resizing logical volumes. With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices which might span two or more disks. The physical volumes are combined into logical volumes, with the exception of the /boot/ partition.
Chapter 3. LVM (Logical Volume Manager) Figure 3.2. Logical Volumes On the other hand, if a system is partitioned with the ext3 file system, the hard drive is divided into partitions of defined sizes. If a partition becomes full, it is not easy to expand the size of the partition. Even if the partition is moved to another hard drive, the original hard drive space has to be reallocated as a different partition or not used.
Using system-config-lvm LogVol03 - (LVM) swap (28 extents). The logical volumes above were created in disk entity /dev/hda2 while /boot was created in /dev/hda1. The system also consists of 'Uninitialized Entities' which are illustrated in Figure 3.7, “Uninitialized Entities”. The figure below illustrates the main window in the LVM utility. The logical and the physical views of the above configuration are illustrated below. The three logical volumes exist on the same physical volume (hda2). Figure 3.3.
Chapter 3. LVM (Logical Volume Manager) Figure 3.5. Logical View Window On the left side column, you can select the individual logical volumes in the volume group to view more details about each. In this example the objective is to rename the logical volume name for 'LogVol03' to 'Swap'. To perform this operation select the respective logical volume and click on the Edit Properties button.
Utilizing Uninitialized Entities 3.2.1. Utilizing Uninitialized Entities 'Uninitialized Entities' consist of unpartitioned space and non LVM file systems. In this example partitions 3, 4, 5, 6 and 7 were created during installation and some unpartitioned space was left on the hard disk. Please view each partition and ensure that you read the 'Properties for Disk Entity' on the right column of the window to ensure that you do not delete critical data.
Chapter 3. LVM (Logical Volume Manager) Figure 3.8. Unallocated Volumes Clicking on the Add to Existing Volume Group button will display a pop up window listing the existing volume groups to which you can add the physical volume you are about to initialize. A volume group may span across one or more hard disks. In this example only one volume group exists as illustrated below. Figure 3.9.
Migrating Extents The figure below illustrates the logical view of 'VolGroup00' after adding the new volume group. Figure 3.10. Logical view of volume group In the figure below, the uninitialized entities (partitions 3, 5, 6 and 7) were added to 'VolGroup00'. Figure 3.11. Logical view of volume group 3.2.3. Migrating Extents To migrate extents from a physical volume, select the volume and click on the Migrate Selected Extent(s) From Volume button.
Chapter 3. LVM (Logical Volume Manager) the volume group, a pop up window will be displayed from which you can select the destination for the extents or automatically let LVM choose the physical volumes (PVs) to migrate them to. This is illustrated below. Figure 3.12. Migrate Extents The figure below illustrates a migration of extents in progress. In this example, the extents were migrated to 'Partition 3'. Figure 3.13.
Adding a New Hard Disk Using LVM which were initially in hda2 are now in hda3. Migrating extents allows you to move logical volumes in case of hard disk upgrades or to manage your disk space better. Figure 3.14. Logical and physical view of volume group 3.2.4. Adding a New Hard Disk Using LVM In this example, a new IDE hard disk was added. The figure below illustrates the details for the new hard disk. From the figure below, the disk is uninitialized and not mounted.
Chapter 3. LVM (Logical Volume Manager) 3.2.5. Adding a New Volume Group Once initialized, LVM will add the new volume to the list of unallocated volumes where you can add it to an existing volume group or create a new volume group. You can also remove the volume from LVM. The volume if removed from LVM will be listed in the list of 'Uninitialized Entities' as illustrated in Figure 3.15, “Uninitialized hard disk”. In this example, a new volume group was created as illustrated below. Figure 3.16.
Extending a Volume Group Figure 3.17. Create new logical volume The figure below illustrates the physical view of the new volume group. The new logical volume named 'Backups' in this volume group is also listed. Figure 3.18. Physical view of new volume group 3.2.6. Extending a Volume Group In this example, the objective was to extend the new volume group to include an uninitialized entity (partition). This was to increase the size or number of extents for the volume group.
Chapter 3. LVM (Logical Volume Manager) volume group, click on the Extend Volume Group button. This will display the 'Extend Volume Group' window as illustrated below. On the 'Extend Volume Group' window, you can select disk entities (partitions) to add to the volume group. Please ensure that you check the contents of any 'Uninitialized Disk Entities' (partitions) to avoid deleting any critical data (see Figure 3.15, “Uninitialized hard disk”).
Editing a Logical Volume Clicking on the Edit Properties button will display the 'Edit Logical Volume' popup window from which you can edit the properties of the logical volume. On this window, you can also mount the volume after making the changes and have it remounted when the system is rebooted. Please note that you should indicate the mount point. If the mount point you specify does not exist, a popup window will be displayed prompting you to create it. The 'Edit Logical Volume' window is illustrated below.
Chapter 3. LVM (Logical Volume Manager) Figure 3.22. Edit logical volume - specifying mount options The figure below illustrates the logical and physical view of the volume group after the logical volume was extended to the unused space. Please note in this example that the logical volume named 'Backups' spans across two hard disks. A volume can be striped across two or more physical devices using LVM.
References Figure 3.23. Edit logical volume 3.3. References Use these sources to learn more about LVM. Installed Documentation • rpm -qd lvm2 — This command shows all the documentation available from the lvm package, including man pages. • lvm help — This command shows all LVM commands available. Useful Websites • http://sources.redhat.com/lvm2 — LVM2 webpage, which contains an overview, link to the mailing lists, and more. • http://tldp.
Chapter 4. Partitions

The utility parted allows users to:
• View the existing partition table
• Change the size of existing partitions
• Add partitions from free space or additional hard drives

By default, the parted package is included when installing Red Hat Enterprise Linux. To start parted, log in as root and type the command parted /dev/sda at a shell prompt (where /dev/sda is the device name for the drive you want to configure).
Creating a Partition • ntfs • reiserfs • hp-ufs • sun-ufs • xfs If a Filesystem of a device shows no value, this means that its file system type is unknown. The Flags column lists the flags set for the partition. Available flags are boot, root, swap, hidden, raid, lvm, or lba. Tip To select a different device without having to restart parted, use the select command followed by the device name (for example, /dev/sda). Doing so allows you to view or configure the partition table of a device. 4.2.
Chapter 4. Partitions Tip If you use the mkpartfs command instead, the file system is created after the partition is created. However, parted does not support creating an ext3 file system. Thus, if you wish to create an ext3 file system, use mkpart and create the file system with the mkfs command as described later. The changes start taking place as soon as you press Enter, so review the command before executing it.
Removing a Partition mount /work 4.3. Removing a Partition Warning Do not attempt to remove a partition on a device that is in use. Before removing a partition, boot into rescue mode (or unmount any partitions on the device and turn off any swap space on the device). Start parted, where /dev/sda is the device on which to remove the partition: parted /dev/sda View the current partition table to determine the minor number of the partition to remove: print Remove the partition with the command rm.
Chapter 4. Partitions print To resize the partition, use the resize command followed by the minor number for the partition, the starting place in megabytes, and the end place in megabytes. For example: resize 3 1024 2048 Warning A partition cannot be made larger than the space available on the device. After resizing the partition, use the print command to confirm that the partition has been resized correctly, is the correct partition type, and is the correct file system type.
Chapter 5. File System Structure 5.1. Why Share a Common Structure? The file system structure is the most basic level of organization in an operating system. Almost all of the ways an operating system interacts with its users, applications, and security model are dependent on how the operating system organizes files on storage devices. Providing a common file system structure ensures users and programs can access and write files. File systems break files down into two logical categories: • Shareable vs.
Chapter 5. File System Structure

Filesystem                         1K-blocks      Used  Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00     11675568   6272120    4810348  57% /
/dev/sda1                             100691      9281      86211  10% /boot
none                                  322856         0     322856   0% /dev/shm

By default, df shows the partition size in 1 kilobyte blocks and the amount of used/available disk space in kilobytes. To view the information in megabytes and gigabytes, use the command df -h. The -h argument stands for "human-readable" format.
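As an illustration, running df with the human-readable option on the same system shows sizes in megabyte and gigabyte units; the figures below are only approximate conversions of the 1-kilobyte values listed above:

df -h
Filesystem                            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00        12G  6.0G  4.6G  57% /
/dev/sda1                              99M  9.1M   85M  10% /boot
none                                  316M     0  316M   0% /dev/shm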
FHS Organization Figure 5.1. GNOME System Monitor File Systems tab 5.2.1.2. The /boot/ Directory The /boot/ directory contains static files required to boot the system, e.g. the Linux kernel. These files are essential for the system to boot properly. Warning Do not remove the /boot/ directory. Doing so renders the system unbootable. 5.2.1.3.
Chapter 5. File System Structure

Table 5.1. Examples of common files in the /dev directory

File         Description
/dev/hda     The master device on primary IDE channel.
/dev/hdb     The slave device on primary IDE channel.
/dev/tty0    The first virtual console.
/dev/tty1    The second virtual console.
/dev/sda     The first device on primary SCSI or SATA channel.
/dev/lp0     The first parallel port.

5.2.1.4. The /etc/ Directory
The /etc/ directory is reserved for configuration files that are local to the machine.
FHS Organization 5.2.1.9. The /proc/ Directory The /proc/ directory contains special files that either extract information from the kernel or send information to it. Examples of such information include system memory, cpu information, and hardware configuration. For more information about /proc/, refer to Section 5.4, “The /proc Virtual File System”. 5.2.1.10. The /sbin/ Directory The /sbin/ directory stores binaries essential for booting, restoring, recovering, or repairing the system.
Chapter 5. File System Structure 5.2.1.13. The /usr/ Directory The /usr/ directory is for files that can be shared across multiple machines. The /usr/ directory is often on its own partition and is mounted read-only.
FHS Organization 5.2.1.14. The /var/ Directory Since the FHS requires Linux to mount /usr/ as read-only, any programs that write log files or need spool/ or lock/ directories should write them to the /var/ directory. The FHS states /var/ is for variable data files, which include spool directories/files, logging data, transient/temporary files, and the like.
Chapter 5. File System Structure in directories for the program using the file. The /var/spool/ directory has subdirectories that store data files for some programs.
The /proc Virtual File System 5.4. The /proc Virtual File System Unlike most file systems, /proc contains neither text nor binary files. Instead, it houses virtual files; hence, /proc is normally referred to as a virtual file system. These virtual files are typically zero bytes in size, even if they contain a large amount of information. The /proc file system is not used for storage per se.
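As a quick illustration, virtual files under /proc can be read with ordinary tools; the following commands print memory and CPU details reported directly by the kernel:

cat /proc/meminfo
cat /proc/cpuinfo

Writing to certain files under /proc/sys/ changes kernel parameters at runtime, which is what the sysctl utility does on your behalf.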
Chapter 6. The Ext3 File System The ext3 file system is essentially an enhanced version of the ext2 file system. These improvements provide the following advantages: Availability After an unexpected power failure or system crash (also called an unclean system shutdown), each mounted ext2 file system on the machine must be checked for consistency by the e2fsck program.
Chapter 6. The Ext3 File System Note If you upgrade to Red Hat Enterprise Linux 6 with the intention of keeping any ext3 file systems intact, you do not need to remake the file system. New Mount Option: data_err A new mount option has been added: data_err=abort. This option instructs ext3 to abort the journal if an error occurs in a file data (as opposed to metadata) buffer in data=ordered mode. This option is disabled by default (i.e. set as data_err=ignore).
Reverting to an Ext2 File System • A mapped device — A logical volume in a volume group, for example, /dev/mapper/ VolGroup00-LogVol02. • A static device — A traditional storage volume, for example, /dev/sdbX, where sdb is a storage device name and X is the partition number. Issue the df command to display mounted file systems. 6.3.
Chapter 7. The Ext4 File System The ext4 file system is a scalable extension of the ext3 file system, which was the default file system of Red Hat Enterprise Linux 5. Ext4 is now the default file system of Red Hat Enterprise Linux 6, and can support files and file systems of up to 16 terabytes in size. It also supports an unlimited number of sub-directories (the ext3 file system only supports up to 32,000).
Chapter 7. The Ext4 File System • Subsecond timestamps 7.1. Creating an Ext4 File System To create an ext4 file system, use the mkfs.ext4 command. In general, the default options are optimal for most usage scenarios, as in: mkfs.ext4 /dev/device Below is a sample output of this command, which displays the resulting file system geometry and features: mke2fs 1.41.
Mounting an Ext4 File System Note It is possible to use tune2fs to enable some ext4 features on ext3 file systems, and to use the ext4 driver to mount an ext3 file system. These actions, however, are not supported in Red Hat Enterprise Linux 6, as they have not been fully tested. Because of this, Red Hat cannot guarantee consistent performance and predictable behavior for ext3 file systems converted or mounted thusly. 7.2.
Chapter 7. The Ext4 File System • G — gigabytes For more information about resizing an ext4 file system, refer to man resize2fs. 7.4. Other Ext4 File System Utilities Red Hat Enterprise Linux 6 also features other utilities for managing ext4 file systems: e2fsck Used to repair an ext4 file system. This tool checks and repairs an ext4 file system more efficiently than ext3, thanks to updates in the ext4 disk structure. e2label Changes the label on an ext4 file system.
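As a brief sketch of these utilities in combination, the device name, size, and label below are placeholders only; resizing an unmounted ext4 file system to 20 gigabytes and then relabeling it might look like the following:

umount /dev/device
e2fsck -f /dev/device
resize2fs /dev/device 20G
e2label /dev/device backup_data
mount /dev/device /mount/point

Running e2fsck before an offline resize ensures the file system is clean; growing a mounted ext4 file system online does not require the unmount step.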
Chapter 8. Global File System 2 The Red Hat GFS2 file system is a native file system that interfaces directly with the Linux kernel file system interface (VFS layer). When implemented as a cluster file system, GFS2 employs distributed metadata and multiple journals. GFS2 is based on a 64-bit architecture, which can theoretically accommodate an 8 EB file system. However, the current supported maximum size of a GFS2 file system is 100 TB.
Chapter 9. The XFS File System XFS is a highly scalable, high-performance file system which was originally designed at Silicon Graphics, Inc. It was created to support extremely large filesystems (up to 16 exabytes), files (8 exabytes) and directory structures (tens of millions of entries). Main Features XFS supports metadata journaling, which facilitates quicker crash recovery. The XFS file system can also be defragmented and enlarged while mounted and active.
Chapter 9. The XFS File System

log      =internal log           bsize=4096   blocks=6400, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Note
After an XFS file system is created, its size cannot be reduced. However, it can still be enlarged using the xfs_growfs command (refer to Section 9.4, “Increasing the Size of an XFS File System”).

For striped block devices (e.g., RAID5 arrays), the stripe geometry can be specified at the time of file system creation.
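For instance, a file system on a RAID device can be created with an explicit stripe unit (the RAID chunk size) and stripe width (the number of data disks); the device name and geometry below are examples only, not recommendations:

mkfs.xfs -d su=64k,sw=4 /dev/device

Here su is specified in bytes (with an optional k suffix) and sw is the number of data disks in the array.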
XFS Quota Management mount -o nobarrier /dev/device /mount/point For more information about write barriers, refer to Chapter 17, Write Barriers. 9.3. XFS Quota Management The XFS quota subsystem manages limits on disk space (blocks) and file (inode) usage. XFS quotas control and/or report on usage of these items on a user, group, or directory/project level. Also, note that while user, group, and directory/project quotas are enabled independently, group and project quotas are mutually exclusive.
Chapter 9. The XFS File System

---------- ---------------------------------
root              0      0      0  00 [------]
testuser     103.4G      0      0  00 [------]
...

To set a soft and hard inode count limit of 500 and 700 respectively for user john (whose home directory is /home/john), use the following command:

xfs_quota -x -c 'limit isoft=500 ihard=700 john' /home/

By default, the limit sub-command recognizes targets as users. When configuring the limits for a group, use the -g option (as in the previous example).
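Similarly, a block limit can be set for a group by adding -g to the limit sub-command; the group name, limits, and path below are hypothetical:

xfs_quota -x -c 'limit -g bsoft=1000m bhard=1200m accounting' /target/path

This sets a soft limit of 1000 MB and a hard limit of 1200 MB of disk space for the group accounting on the XFS file system mounted at /target/path.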
Repairing an XFS File System Note While XFS file systems can be grown while mounted, their size cannot be reduced at all. For more information about growing a file system, refer to man xfs_growfs. 9.5. Repairing an XFS File System To repair an XFS file system, use xfs_repair, as in: xfs_repair /dev/device The xfs_repair utility is highly scalable, and is designed to repair even very large file systems with many inodes efficiently.
Chapter 9. The XFS File System Note You can also use the xfs_freeze utility to freeze/unfreeze an ext3, ext4, GFS2, XFS, or BTRFS file system. The syntax for doing so is also the same. For more information about freezing and unfreezing an XFS file system, refer to man xfs_freeze. 9.7. Backup and Restoration of XFS File Systems XFS file system backup and restoration involves two utilities: xfsdump and xfsrestore. To back up or dump an XFS file system, use the xfsdump utility.
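For example, assuming /path/to/dump-destination stands for a dump file or tape device and /mnt/xfs for a mounted XFS file system, a full (level 0) dump could be taken with:

xfsdump -l 0 -f /path/to/dump-destination /mnt/xfs

The -l option selects the dump level and -f specifies where the dump is written; xfsdump will prompt for session and media labels unless they are supplied on the command line.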
xfsrestore Simple Mode resumed: NO subtree: NO streams: 1 stream 0: pathname: /mnt/test2/backup start: ino 0 offset 0 end: ino 1 offset 0 interrupted: NO media files: 1 media file 0: mfile index: 0 mfile type: data mfile size: 21016 mfile start: ino 0 offset 0 mfile end: ino 1 offset 0 media label: "my_dump_media_label" media id: 4a518062-2a8f-4f17-81fd-bb1eb2e3cb4f xfsrestore: Restore Status: SUCCESS xfsrestore Simple Mode The simple mode allows users to restore an entire file system from a level 0 dump.
Chapter 9. The XFS File System xfs_fsr Used to defragment mounted XFS file systems. When invoked with no arguments, xfs_fsr defragments all regular files in all mounted XFS file systems. This utility also allows users to suspend a defragmentation at a specified time and resume from where it left off later. In addition, xfs_fsr also allows the defragmentation of only one file, as in xfs_fsr /path/ to/file.
Chapter 10. Network File System (NFS) A Network File System (NFS) allows remote hosts to mount file systems over a network and interact with those file systems as though they are mounted locally. This enables system administrators to consolidate resources onto centralized servers on the network. This chapter focuses on fundamental NFS concepts and supplemental information. 10.1. How It Works Currently, there are three versions of NFS. NFS version 2 (NFSv2) is older and is widely supported.
Chapter 10. Network File System (NFS) Important In order for NFS to work with a default installation of Red Hat Enterprise Linux with a firewall enabled, configure IPTables with the default TCP port 2049. Without proper IPTables configuration, NFS will not function properly. The NFS initialization script and rpc.nfsd process now allow binding to any specified port during system start up. However, this can be error-prone if the port is unavailable, or if it conflicts with another daemon. 10.1.1.
NFS Client Configuration rpc.nfsd rpc.nfsd allows explicit NFS versions and protocols the server advertises to be defined. It works with the Linux kernel to meet the dynamic demands of NFS clients, such as providing server threads each time an NFS client connects. This process corresponds to the nfs service. rpc.lockd rpc.lockd allows NFS clients to lock files on the server. If rpc.lockd is not started, file locking will fail. rpc.lockd implements the Network Lock Manager (NLM) protocol.
Chapter 10. Network File System (NFS) If an NFS share was mounted manually, the share will not be automatically mounted upon reboot. Red Hat Enterprise Linux offers two methods for mounting remote file systems automatically at boot time: the /etc/fstab file and the autofs service. Refer to Section 10.2.1, “Mounting NFS File Systems using /etc/fstab” and Section 10.3, “autofs” for more information. 10.2.1.
Improvements in autofs Version 5 over Version 4 autofs uses /etc/auto.master (master map) as its default primary configuration file. This can be changed to use another supported network source and name using the autofs configuration (in /etc/sysconfig/autofs) in conjunction with the Name Service Switch (NSS) mechanism. An instance of the autofs version 4 daemon was run for each mount point configured in the master map and so it could be run manually from the command line for any given mount point.
Chapter 10. Network File System (NFS) An example is seen in the connectathon test maps for the direct mounts below: /- /tmp/auto_dcthon /- /tmp/auto_test3_direct /- /tmp/auto_test4_direct 10.3.2. autofs Configuration The primary configuration file for the automounter is /etc/auto.master, also referred to as the master map which may be changed as described in the Section 10.3.1, “Improvements in autofs Version 5 over Version 4”.
Overriding or Augmenting Site Configuration Files location This refers to the file system location such as a local file system path (preceded with the Sun map format escape character ":" for map names beginning with "/"), an NFS file system or other valid file system location. The following is a sample of contents from a map file (i.e. /etc/auto.
Chapter 10. Network File System (NFS)

* fileserver.example.com:/export/home/&

• The file map /etc/auto.home does not exist.

Given these conditions, let's assume that the client system needs to override the NIS map auto.home and mount home directories from a different server. In this case, the client will need to use the following /etc/auto.master map:

/home /etc/auto.home
+auto.master

And the /etc/auto.home map contains the entry:

* labserver.example.
Using LDAP to Store Automounter Maps

# LDAPv3
# base <> with scope subtree
# filter: (&(objectclass=automountMap)(automountMapName=auto.master))
# requesting: ALL
#

# auto.master, example.com
dn: automountMapName=auto.master,dc=example,dc=com
objectClass: top
objectClass: automountMap
automountMapName: auto.master

# extended LDIF
#
# LDAPv3
# base with scope subtree
# filter: (objectclass=automount)
# requesting: ALL
#

# /home, auto.master, example.com
Chapter 10. Network File System (NFS) 10.4. Common NFS Mount Options Beyond mounting a file system via NFS on a remote host, you can also specify other options at mount time to make the mounted share easier to use. These options can be used with manual mount commands, /etc/fstab settings, and autofs. The following are options commonly used for NFS mounts: intr Allows NFS requests to be interrupted if the server goes down or cannot be reached.
Starting and Stopping NFS sec=krb5i uses Kerberos V5 for user authentication and performs integrity checking of NFS operations using secure checksums to prevent data tampering. sec=krb5p uses Kerberos V5 for user authentication, integrity checking, and encrypts NFS traffic to prevent traffic sniffing. This is the most secure setting, but it also involves the most performance overhead. tcp Instructs the NFS mount to use the TCP protocol. udp Instructs the NFS mount to use the UDP protocol.
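To illustrate, several of these options can be combined in a single mount invocation; the server name, export path, and mount point here are hypothetical:

mount -t nfs -o tcp,intr server.example.com:/export/home /mnt/home

The same comma-separated option string can instead be placed in the fourth field of an /etc/fstab entry or in an autofs map.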
Chapter 10. Network File System (NFS) service nfs restart The condrestart (conditional restart) option only starts nfs if it is currently running. This option is useful for scripts, because it does not start the daemon if it is not running. To conditionally restart the server, as root, type: service nfs condrestart To reload the NFS server configuration file without restarting the service, as root, type: service nfs reload 10.6.
The /etc/exports Configuration File export host1(options1) host2(options2) host3(options3) For information on different methods for specifying hostnames, refer to Section 10.6.4, “Hostname Formats”. In its simplest form, the /etc/exports file only specifies the exported directory and the hosts permitted to access it, as in the following example: /exported/directory bob.example.com Here, bob.example.com can mount /exported/directory/ from the NFS server.
Chapter 10. Network File System (NFS) In this example 192.168.0.3 can mount /another/exported/directory/ read/write and all writes to disk are asynchronous. For more information on exporting options, refer to man exportfs. Additionally, other options are available where no default value is specified. These include the ability to disable sub-tree checking, allow access from insecure ports, and allow insecure file locks (necessary for certain early NFS client implementations).
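As a further illustration, a hypothetical entry granting an entire subnet read/write access with asynchronous writes and sub-tree checking disabled might look like this (the directory and network are placeholders):

/exported/directory 192.168.0.0/24(rw,async,no_subtree_check)

After editing /etc/exports, run exportfs -r as root to apply the changes to the running NFS server.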
Running NFS Behind a Firewall same way they are specified in /etc/exports. Refer to Section 10.6.1, “ The /etc/exports Configuration File” for more information on /etc/exports syntax. This option is often used to test an exported file system before adding it permanently to the list of file systems to be exported. -i Ignores /etc/exports; only options given from the command line are used to define exported file systems. -u Unexports all shared directories.
Chapter 10. Network File System (NFS)

To configure a firewall to allow NFS, perform the following steps:
1. Allow TCP and UDP port 2049 for NFS.
2. Allow TCP and UDP port 111 (rpcbind/sunrpc).
3. Allow the TCP and UDP port specified with MOUNTD_PORT="port".
4. Allow the TCP and UDP port specified with STATD_PORT="port".
5. Allow the TCP port specified with LOCKD_TCPPORT="port".
6. Allow the UDP port specified with LOCKD_UDPPORT="port".

10.6.4. Hostname Formats
Host Access in NFSv4 pointed to an unauthorized machine. At this point, the unauthorized machine is the system permitted to mount the NFS share, since no username or password information is exchanged to provide additional security for the NFS mount. Wildcards should be used sparingly when exporting directories via NFS, as it is possible for the scope of the wildcard to encompass more systems than intended. You can also restrict access to the rpcbind service via TCP wrappers.
Chapter 10. Network File System (NFS)

When exporting an NFS share as read-only, consider using the all_squash option. This option makes every user accessing the exported file system take the user ID of the nfsnobody user.

10.8. NFS and rpcbind

Note
The following section only applies to NFSv2 or NFSv3 implementations that require the rpcbind service for backward compatibility.

The rpcbind utility maps RPC services to the ports on which they listen.
Using NFS over TCP

   100005    2   tcp    839  mountd
   100005    3   udp    836  mountd
   100005    3   tcp    839  mountd

If one of the NFS services does not start up correctly, rpcbind will be unable to map RPC requests from clients for that service to the correct port. In many cases, if NFS is not present in rpcinfo output, restarting NFS causes the service to correctly register with rpcbind and begin working. For instructions on starting NFS, refer to Section 10.5, “Starting and Stopping NFS”.
Chapter 10. Network File System (NFS) becomes available. Since UDP is connectionless, the client continues to pound the network with data until the server re-establishes a connection. The main disadvantage with TCP is that there is a very small performance hit due to the overhead associated with the protocol.
Chapter 11. FS-Cache FS-Cache is a persistent local cache that can be used by file systems to take data retrieved from over the network and cache it on local disk. This helps minimize network traffic for users accessing data from a file system mounted over the network (for example, NFS). The following diagram is a high-level illustration of how FS-Cache works: Figure 11.1. FS-Cache Overview FS-Cache is designed to be as transparent as possible to the users and administrators of a system.
Chapter 11. FS-Cache FS-Cache cannot arbitrarily cache any file system, whether through the network or otherwise: the shared file system's driver must be altered to allow interaction with FS-Cache, data storage/retrieval, and metadata setup and validation. FS-Cache needs indexing keys and coherency data from the cached file system to support persistence: indexing keys to match file system objects to cache objects, and coherency data to determine whether the cache objects are still valid. 11.1.
Using the Cache With NFS tune2fs -o user_xattr /dev/device Alternatively, extended attributes for a file system can be enabled at mount time, as in: mount /dev/device /path/to/cache -o user_xattr The cache back-end works by maintaining a certain amount of free space on the partition hosting the cache. It grows and shrinks the cache in response to other elements of the system using up free space, making it safe to use on the root file system (for example, on a laptop).
Chapter 11. FS-Cache Here, /home/fred and /home/jim will likely share the superblock as they have the same options, especially if they come from the same volume/partition on the NFS server (home0). Now, consider the next two subsequent mount commands: mount home0:/disk0/fred /home/fred -o fsc,rsize=230 mount home0:/disk0/jim /home/jim -o fsc,rsize=231 In this case, /home/fred and /home/jim will not share the superblock as they have different network access parameters, which are part of the Level 2 key.
Statistical Information

When dealing with file system size, the CacheFiles culling behavior is controlled by three settings in /etc/cachefilesd.conf:

brun N%
If the amount of free space rises above N% of total disk capacity, cachefilesd disables culling.

bcull N%
If the amount of free space falls below N% of total disk capacity, cachefilesd starts culling.

bstop N%
If the amount of free space falls below N% of total disk capacity, cachefilesd refuses to allocate further cache space until culling has raised the amount of free space back above this limit.
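A minimal /etc/cachefilesd.conf illustrating these settings is sketched below; the cache directory, tag, and percentages mirror common defaults and should be treated as examples rather than recommendations:

dir /var/cache/fscache
tag mycache
brun 10%
bcull 7%
bstop 3%

After changing the file, restart the cachefilesd service so the new limits take effect.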
Chapter 11. FS-Cache FS-Cache statistics includes information on decision points and object counters. For more details on the statistics provided by FS-Cache, refer to the following kernel document: /usr/share/doc/kernel-doc-version/Documentation/filesystems/caching/ fscache.txt 11.6. References For more information on cachefilesd and how to configure it, refer to man cachefilesd and man cachefilesd.conf. The following kernel documents also provide additional information: • /usr/share/doc/cachefilesd-0.
Chapter 12. Encrypted File System Red Hat Enterprise Linux 6 now supports eCryptfs, a "pseudo-file system" which provides data and filename encryption on a per-file basis. The term "pseudo-file system" refers to the fact that eCryptfs does not have an on-disk format; rather, it is a file system layer that resides on top of an actual file system. The eCryptfs layer provides encryption capabilities. eCryptfs works like a bind mount, as it intercepts file operations that write to the underlying (i.e.
Chapter 12. Encrypted File System

ecryptfs_unlink_sigs
ecryptfs_key_bytes=16
ecryptfs_cipher=aes
ecryptfs_sig=c7fed37c0a341e19
Mounted eCryptfs

The options in this display can then be passed directly to the command line to encrypt and mount a file system using the same configuration. To do so, use each option as an argument to the -o option of mount. For example:

mount -t ecryptfs /home /home -o ecryptfs_unlink_sigs \
 ecryptfs_key_bytes=16 ecryptfs_cipher=aes ecryptfs_sig=c7fed37c0a341e19

12.2.
Chapter 13. Redundant Array of Independent Disks (RAID) The basic idea behind RAID is to combine multiple small, inexpensive disk drives into an array to accomplish performance or redundancy goals not attainable with one large and expensive drive. This array of drives appears to the computer as a single logical storage unit or drive. 13.1. What is RAID? RAID allows information to be spread across several disks.
Chapter 13. Redundant Array of Independent Disks (RAID) RAID controller cards function like a SCSI controller to the operating system, and handle all the actual drive communications. The user plugs the drives into the RAID controller (just like a normal SCSI controller) and then adds them to the RAID controller's configuration, and the operating system won't know the difference. Software RAID Software RAID implements the various RAID levels in the kernel disk (block device) code.
RAID Levels and Linear Support member disks of the array, allowing high I/O performance at low inherent cost but provides no redundancy. Many RAID level 0 implementations will only stripe the data across the member devices up to the size of the smallest device in the array. This means that if you have multiple devices with slightly different sizes, each device will get treated as though it is the same size as the smallest drive.
Chapter 13. Redundant Array of Independent Disks (RAID) you have a sufficiently large number of member devices in a software RAID5 array such that the combined aggregate data transfer speed across all devices is high enough, then this bottleneck can start to come into play. As with level 4, level 5 has asymmetrical performance, with reads substantially outperforming writes. The storage capacity of RAID level 5 is calculated the same way as with level 4.
dmraid mdraid also supports other metadata formats, known as external metadata. Red Hat Enterprise Linux 6 uses mdraid with external metadata to access ISW / IMSM (Intel firmware RAID) sets. mdraid sets are configured and controlled through the mdadm utility. dmraid Device-mapper RAID or dmraid refers to device-mapper kernel code that offers the mechanism to piece disks together into a RAID set. This same kernel code does not provide any RAID configuration mechanism.
Chapter 13. Redundant Array of Independent Disks (RAID) As mentioned earlier in Section 13.5, “ Linux RAID Subsystems”, the dmraid tool cannot configure RAID sets after creation. For more information about using dmraid, refer to man dmraid. 13.8. Advanced RAID Device Creation In some cases, you may wish to install the operating system on an array that can't be created after the installation completes.
Chapter 14. Swap Space 14.1. What is Swap Space? Swap space in Linux is used when the amount of physical memory (RAM) is full. If the system needs more memory resources and the RAM is full, inactive pages in memory are moved to the swap space. While swap space can help machines with a small amount of RAM, it should not be considered a replacement for more RAM. Swap space is located on hard drives, which have a slower access time than physical memory.
Chapter 14. Swap Space 14.2.1. Extending Swap on an LVM2 Logical Volume By default, Red Hat Enterprise Linux 6 uses all available space during installation. If this is the case with your system, then you must first add a new physical volume to the volume group used by the swap space. For instructions on how to do so, refer to Section 3.2.2, “Adding Unallocated Volumes to a Volume Group”. After adding additional storage to the swap space's volume group, it is now possible to extend it.
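A typical sequence for doing so is sketched below, assuming /dev/VolGroup00/LogVol01 is the swap logical volume and 256 MB is being added; adjust the names and size for your system:

swapoff -v /dev/VolGroup00/LogVol01
lvresize /dev/VolGroup00/LogVol01 -L +256M
mkswap /dev/VolGroup00/LogVol01
swapon -v /dev/VolGroup00/LogVol01

The volume must be disabled with swapoff before it is resized and re-initialized with mkswap before swapping is enabled on it again.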
Removing Swap Space

1. Determine the size of the new swap file in megabytes and multiply by 1024 to determine the number of blocks. For example, the block size of a 64 MB swap file is 65536.
2. At a shell prompt as root, type the following command with count being equal to the desired block size:
dd if=/dev/zero of=/swapfile bs=1024 count=65536
3. Set up the swap file with the command:
mkswap /swapfile
4. To enable the swap file immediately but not automatically at boot time:
swapon /swapfile
5.
Chapter 14. Swap Space 14.3.2. Removing an LVM2 Logical Volume for Swap To remove a swap volume group (assuming /dev/VolGroup00/LogVol02 is the swap volume you want to remove): 1. Disable swapping for the associated logical volume: swapoff -v /dev/VolGroup00/LogVol02 2. Remove the LVM2 logical volume of size 512 MB: lvremove /dev/VolGroup00/LogVol02 3.
Chapter 15. Disk Quotas Disk space can be restricted by implementing disk quotas which alert a system administrator before a user consumes too much disk space or a partition becomes full. Disk quotas can be configured for individual users as well as user groups. This makes it possible to manage the space allocated for user-specific files (such as email) separately from the space allocated to the projects a user works on (assuming the projects are given their own groups).
Chapter 15. Disk Quotas • Issue the umount command followed by the mount command to remount the file system. Refer to the man page for both umount and mount for the specific syntax for mounting and unmounting various file system types. • Issue the mount -o remount file-system command (where file-system is the name of the file system) to remount the file system. For example, to remount the /home file system, the command to issue is mount -o remount /home.
Assigning Quotas per User 15.1.4. Assigning Quotas per User The last step is assigning the disk quotas with the edquota command. To configure the quota for a user, as root in a shell prompt, execute the command: edquota username Perform this step for each user who needs a quota.
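For a hypothetical user testuser, the editor invoked by edquota would display something like the following; the first numeric columns show current usage, and the soft and hard columns hold the limits to edit (0 means no limit):

Disk quotas for user testuser (uid 501):
  Filesystem                   blocks     soft     hard   inodes   soft   hard
  /dev/VolGroup00/LogVol02     440436        0        0    37418      0      0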
Chapter 15. Disk Quotas

This command displays the existing quota for the group in the text editor:

Disk quotas for group devel (gid 505):
  Filesystem                   blocks   soft   hard   inodes   soft   hard
  /dev/VolGroup00/LogVol02     440400      0      0    37418      0      0

Modify the limits, then save the file. To verify that the group quota has been set, use the command:

quota -g devel

15.1.6. Setting the Grace Period for Soft Limits
If a given quota has soft limits, you can edit the grace period (i.e.
Reporting on Disk Quotas 15.2.2. Reporting on Disk Quotas Creating a disk usage report entails running the repquota utility.
Chapter 15. Disk Quotas quotaon -vaug /file_system Running quotacheck on a running system If necessary, it is possible to run quotacheck on a machine during a time when no users are logged in, and thus have no open files on the file system being checked. Run the command quotacheck -vaug file_system ; this command will fail if quotacheck cannot remount the given file_system as read-only. Note that, following the check, the file system will be remounted read-write.
Chapter 16. Access Control Lists Files and directories have permission sets for the owner of the file, the group associated with the file, and all other users for the system. However, these permission sets have limitations. For example, different permissions cannot be configured for different users. Thus, Access Control Lists (ACLs) were implemented. The Red Hat Enterprise Linux kernel provides ACL support for the ext3 file system and NFS-exported file systems.
Chapter 16. Access Control Lists 3. Via the effective rights mask 4. For users not in the user group for the file The setfacl utility sets ACLs for files and directories. Use the -m option to add or modify the ACL of a file or directory: setfacl -m rules files Rules (rules) must be specified in the following formats. Multiple rules can be specified in the same command if they are separated by commas. u:uid:perms Sets the access ACL for a user. The user name or UID may be specified.
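For example, to grant a hypothetical user read and write access to a specific file using the u:uid:perms form shown above:

setfacl -m u:andrius:rw /project/somefile

The analogous g:gid:perms, m:perms, and o:perms rule forms adjust the group, mask, and other entries of the ACL in the same way.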
Archiving File Systems With ACLs

getfacl home/john/picture.png

The above command returns the following output:

# file: home/john/picture.png
# owner: john
# group: john
user::rw-
group::r--
other::r--

If a directory with a default ACL is specified, the default ACL is also displayed as illustrated below.
Chapter 16. Access Control Lists

Option     Description
(cont.)    ...files are newer than the files of the same name in the archive. This option only works if the archive is a file or an unblocked tape that may backspace.
-x         Extracts the files from the archive. If used with -U and a file in the archive is older than the corresponding file on the file system, the file is not extracted.
-help      Displays the most important options.
-xhelp     Displays the least important options.
Chapter 17. Write Barriers A write barrier is a kernel mechanism used to ensure that file system metadata is correctly written and ordered on persistent storage, even when storage devices with volatile write caches lose power. File systems with write barriers enabled also ensure that data transmitted via fsync() is persistent throughout a power loss. Enabling write barriers incurs a substantial performance penalty for some applications.
Chapter 17. Write Barriers cached. However, because the cache's volatility is not visible to the kernel, Red Hat Enterprise Linux 6 enables write barriers by default on all supported journaling file systems. Note Write caches are designed to increase I/O performance. However, enabling write barriers means constantly flushing these caches, which can significantly reduce performance.
High-End Arrays MegaCli64 -LDSetProp -DisDskCache -Lall -aALL Note Hardware RAID cards recharge their batteries while the system is operational. If a system is powered off for an extended period of time, the batteries will lose their charge, leaving stored data vulnerable during a power failure. High-End Arrays High-end arrays have various ways of protecting data in the event of a power failure. As such, there is no need to verify the state of the internal drives in external RAID storage.
Chapter 18. Storage I/O Alignment and Size Recent enhancements to the SCSI and ATA standards allow storage devices to indicate their preferred (and in some cases, required) I/O alignment and I/O size. This information is particularly useful with newer disk drives that increase the physical sector size from 512 bytes to 4k bytes. This information may also be beneficial for RAID devices, where the chunk size and stripe size may impact performance.
Chapter 18. Storage I/O Alignment and Size 18.2. Userspace Access Always take care to use properly aligned and sized I/O. This is especially important for Direct I/O access. Direct I/O should be aligned on a logical_block_size boundary, and in multiples of the logical_block_size. With native 4K devices (i.e. logical_block_size is 4K) it is now critical that applications perform direct I/O in multiples of the device's logical_block_size.
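For example, the I/O parameters the kernel exposes for a block device (assumed here to be /dev/sda) can be inspected directly through sysfs:

cat /sys/block/sda/queue/logical_block_size
cat /sys/block/sda/queue/physical_block_size
cat /sys/block/sda/queue/minimum_io_size
cat /sys/block/sda/queue/optimal_io_size

A value of 0 for optimal_io_size indicates that the device reports no optimal I/O size hint.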
ATA ATA ATA devices must report appropriate information via the IDENTIFY DEVICE command. ATA devices only report I/O parameters for physical_block_size, logical_block_size, and alignment_offset. The additional I/O hints are outside the scope of the ATA Command Set. SCSI I/O parameters support in Red Hat Enterprise Linux 6 requires at least version 3 of the SCSI Primary Commands (SPC-3) protocol.
Chapter 18. Storage I/O Alignment and Size For instance, a 512-byte device and a 4K device may be combined into a single logical DM device, which would have a logical_block_size of 4K. File systems layered on such a hybrid device assume that 4K will be written atomically, but in reality it will span 8 logical block addresses when issued to the 512-byte device.
File System tools This is the catch-all for "legacy" devices which don't appear to provide I/O hints. As such, by default all partitions will be aligned on a 1MB boundary. Note Red Hat Enterprise Linux 6 cannot distinguish between devices that don't provide I/O hints and those that do so with alignment_offset=0 and optimal_io_size=0. Such a device might be a single SAS 4K device; as such, at worst 1MB of space is lost at the start of the disk. File System tools The different mkfs.
Chapter 19. Setting Up A Remote Diskless System The Network Booting Service (provided by system-config-netboot) is no longer available in Red Hat Enterprise Linux 6. Deploying diskless systems is now possible in this release without the use of system-config-netboot.
Chapter 19. Setting Up A Remote Diskless System

allow booting;
allow bootp;
class "pxeclients" {
   match if substring(option vendor-class-identifier, 0, 9) = "PXEClient";
   next-server server-ip;
   filename "linux-install/pxelinux.0";
}

Replace server-ip with the IP address of the host machine on which the tftp and DHCP services reside. Now that tftp and DHCP are configured, all that remains is to configure NFS and the exported file system; refer to Section 19.
Configuring an Exported File System for Diskless Clients

exported/root/directory) as read-write. To do this, configure /var/lib/tftpboot/pxelinux.cfg/default with the following:

default rhel6

label rhel6
  kernel vmlinuz-kernel-version
  append initrd=initramfs-kernel-version.img root=nfs:server-ip:/exported/root/directory rw

Replace server-ip with the IP address of the host machine on which the tftp and DHCP services reside. The NFS share is now ready for exporting to diskless clients.
Chapter 20. Solid-State Disk Deployment Guidelines Solid-state disks (SSD) are storage devices that use NAND flash chips to persistently store data. This sets them apart from previous generations of disks, which store data in rotating, magnetic platters. In an SSD, the access time for data across the full Logical Block Address (LBA) range is constant; whereas with older disks that use rotating media, access patterns that span large address ranges incur seek costs.
Chapter 20. Solid-State Disk Deployment Guidelines In addition, keep in mind that logical volumes, device-mapper targets, and md targets do not support TRIM. As such, the default Red Hat Enterprise Linux 6 installation will not allow the use of the TRIM command, since this install uses DM-linear targets. Red Hat also warns that software RAID levels 1, 4, 5, and 6 are not recommended for use on SSDs.
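Where TRIM is supported, for example an ext4 file system created directly on an SSD partition rather than on a logical volume or md device, it can be requested at mount time with the discard option; the device and mount point below are placeholders:

mount -t ext4 -o discard /dev/sda1 /mnt/ssd

Omitting discard preserves the default behavior of not issuing TRIM commands.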
Chapter 21. Online Storage Management It is often desirable to add, remove or re-size storage devices while the operating system is running, and without rebooting. This chapter outlines the procedures that may be used to reconfigure storage devices on Red Hat Enterprise Linux 6 host systems while the system is running. It covers iSCSI and Fibre Channel storage interconnects; other interconnect types may be added in the future.
Chapter 21. Online Storage Management • node_name • port_name • dev_loss_tmo — number of seconds to wait before marking a link as "bad". Once a link is marked bad, I/O running on its corresponding path (along with any new I/O on that path) will be failed. The default dev_loss_tmo value varies, depending on which driver/device is used. If a Qlogic adapter is used, the default is 35 seconds, while if an Emulex adapter is used, it is 30 seconds.
iSCSI

Host / issue_lip:   lpfc: X   qla2xxx: X   (no entry for zfcp or mptfc)
1 Supported as of Red Hat Enterprise Linux 5.4
2 Supported as of Red Hat Enterprise Linux 6.0

21.2. iSCSI
This section describes the iSCSI API and the iscsiadm utility. Before using the iscsiadm utility, install the iscsi-initiator-utils package first; to do so, run yum install iscsi-initiator-utils. In addition, the iSCSI service must be running in order to discover or log in to targets. To start the iSCSI service, run service iscsi start

21.2.1. iSCSI API
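For example, information about the running sessions can be displayed with:

iscsiadm -m session -P 3

The -P 3 option prints the most detailed output, including the attached SCSI devices; running iscsiadm -m session without it prints a one-line summary per session.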
Chapter 21. Online Storage Management • the Logical Unit Number (LUN) This path-based address is not persistent. It may change any time the system is reconfigured (either by on-line reconfiguration, as described in this manual, or when the system is shutdown, reconfigured, and rebooted). It is even possible for the path identifiers to change when no physical reconfiguration has been done, as a result of timing variations during the discovery process when the system boots, or when a bus is re-scanned.
UUID and Other Persistent Identifiers If there are multiple paths from a system to a device, device-mapper-multipath uses the WWID to detect this. Device-mapper-multipath then presents a single "pseudo-device" in /dev/mapper/ wwid, such as /dev/mapper/3600508b400105df70000e00000ac0000. The command multipath -l shows the mapping to the non-persistent identifiers: Host:Channel:Target:LUN, /dev/sd name, and the major:minor number.
Chapter 21. Online Storage Management • File system label These identifiers are persistent, and based on metadata written on the device by certain applications. They may also be used to access the device using the symlinks maintained by the operating system in the /dev/disk/by-label/ (e.g. boot -> ../../sda1 ) and /dev/disk/by-uuid/ (e.g. f8bf09e3-4c16-4d91-bd5e-6f62da165c08 -> ../../sda1) directories. md and LVM write metadata on the storage device, and read that data when they scan devices.
Removing a Path to a Storage Device Another variation of this operation is echo 1 > /sys/class/scsi_device/h:c:t:l/device/delete, where h is the HBA number, c is the channel on the HBA, t is the SCSI target ID, and l is the LUN. Note The older form of these commands, echo "scsi remove-single-device 0 0 0 0" > /proc/scsi/scsi, is deprecated.
Chapter 21. Online Storage Management 21.6. Adding a Storage Device or Path When adding a device, be aware that the path-based device name (/dev/sd name, major:minor number, and /dev/disk/by-path name, for example) the system assigns to the new device may have been previously in use by a device that has since been removed. As such, ensure that all old references to the path-based device name have been removed. Otherwise, the new device may be mistaken for the old device.
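A common way to make a new logical unit visible without rebooting is to write its channel, SCSI target ID, and LUN to the scan file of the corresponding host; c, t, l, and h below are placeholders for the actual numbers:

echo "c t l" > /sys/class/scsi_host/hosth/scan

The new device node should then appear under /dev/ and can be verified with the usual tools.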
commands, such as lsscsi, scsi_id, multipath -l, and ls -l /dev/disk/by-*. This information, plus the LUN number of the new device, can be used as shown above to probe and configure that path to the new device. After adding all the SCSI paths to the device, execute the multipath command, and check to see that the device has been properly configured. At this point, the device can be added to md, LVM, mkfs, or mount, for example.
ifconfig ethX up 6. Start FCoE using: /etc/init.d/fcoe start The FCoE device should appear shortly, assuming all other settings on the fabric are correct. To view configured FCoE devices, run: fcoeadm -i After correctly configuring the Ethernet interface to use FCoE, Red Hat recommends that you set FCoE and lldpad to run at startup.
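A minimal sketch of doing so on a SysV-init system such as Red Hat Enterprise Linux 6 is to enable both services with chkconfig; run levels are left at the chkconfig defaults here:
# Enable the DCB/LLDP agent and the FCoE service at boot
chkconfig lldpad on
chkconfig fcoe on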
To have file systems on FCoE disks that are listed in /etc/fstab (with the _netdev mount option) mounted reliably at boot, a wait-and-mount helper along the lines of the following sketch can be called from a startup script:
mount_fcoe_disks_from_fstab()
{
    # Wait up to 20 seconds for all fstab-listed FCoE block devices to appear
    local timeout=20
    local done=1
    local fcoe_disks=($(egrep 'by-path\/fc-.*_netdev' /etc/fstab | cut -d ' ' -f1))

    test -z $fcoe_disks && return 0

    echo -n "Waiting for fcoe disks . "
    while [ $timeout -gt 0 ]; do
        for disk in ${fcoe_disks[*]}; do
            if ! test -b $disk; then
                done=0
                break
            fi
        done
        test $done -eq 1 && break
        sleep 1
        echo -n ". "
        timeout=$(( $timeout - 1 ))
        done=1
    done
    # Mount any FCoE-backed file systems that are now available
    mount -a 2>/dev/null
}
interconnect scanning is not recommended if free memory is less than 5% of the total memory in more than 10 samples per 100. It is also not recommended if swapping is active (non-zero si and so columns in the vmstat output). The command free can also display the total memory. The following commands can be used to scan storage interconnects.
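For example (a hedged sketch rather than an exhaustive list; the host numbers are hypothetical), a Fibre Channel host can be asked to issue a LIP, or a SCSI host can be rescanned with a wildcard scan:
# Issue a loop initialization primitive (LIP) on Fibre Channel host 3
echo "1" > /sys/class/fc_host/host3/issue_lip
# Rescan every channel, target, and LUN on SCSI host 0
echo "- - -" > /sys/class/scsi_host/host0/scan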
iscsiadm -m discovery -t discovery_type -p target_IP:port -o delete Here, discovery_type can be either sendtargets, isns, or fw. There are two ways to reconfigure discovery record settings: • Edit the /etc/iscsi/iscsid.conf file directly prior to performing a discovery (an alternative that uses iscsiadm itself is sketched below).
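As a hedged sketch of a non-interactive alternative, a single discovery-record setting can be changed with iscsiadm's update operator. The portal address 10.15.84.19:3260 and the discovery.sendtargets.auth.authmethod key below are illustrative assumptions based on standard iscsid.conf settings:
# Update one setting on an existing sendtargets discovery record
iscsiadm -m discovery -t sendtargets -p 10.15.84.19:3260 -o update \
  -n discovery.sendtargets.auth.authmethod -v CHAP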
• Offload iSCSI — like the Chelsio cxgb3i, Broadcom bnx2i and ServerEngines be2iscsi modules, this stack allocates a scsi_host for each PCI device. As such, each port on a host bus adapter will show up as a different PCI device, with a different scsi_host per HBA port. To manage both types of initiator implementations, iscsiadm uses the iface structure.
Using the previous example, the iface settings of the same Chelsio card (i.e. iscsiadm -m iface -I cxgb3i.00:07:43:05:97:07) would appear as:
# BEGIN RECORD 2.0-871
iface.iscsi_ifacename = cxgb3i.00:07:43:05:97:07
iface.net_ifacename =
iface.ipaddress =
iface.hwaddress = 00:07:43:05:97:07
iface.transport_name = cxgb3i
iface.initiatorname =
# END RECORD
21.11.2. Configuring an iface for Software iSCSI
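For software iSCSI, an iface binding can be created and pointed at a specific network interface by MAC address. This is a minimal sketch; the iface name iface0 and the MAC address 00:0F:1F:92:6B:BF are hypothetical:
# Create a new, empty iface configuration
iscsiadm -m iface -I iface0 -o new
# Bind it to the NIC with the given hardware address
iscsiadm -m iface -I iface0 -o update -n iface.hwaddress -v 00:0F:1F:92:6B:BF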
iscsiadm -m iface -I iface_name -o update -n iface.ipaddress -v target_IP For example, to set the iface IP address of a Chelsio card (with iface name cxgb3i.00:07:43:05:97:07) to 20.15.0.66, use: iscsiadm -m iface -I cxgb3i.00:07:43:05:97:07 -o update -n iface.ipaddress -v 20.15.0.66 21.11.4. Binding/Unbinding an iface to a Portal Whenever iscsiadm is used to scan for interconnects, it will first check the iface.
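One way to bind portals to a particular iface explicitly is to name that iface when running discovery. The following is a hedged sketch; the portal 192.168.1.100:3260 is hypothetical, and the iface0 configuration created earlier is assumed to exist:
# Discover the portal, bind the resulting node records to iface0,
# and print the discovered tree (-P 1) for verification
iscsiadm -m discovery -t sendtargets -p 192.168.1.100:3260 -I iface0 -P 1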
If the targets do not send an iSCSI async event, however, you need to scan them manually using the iscsiadm utility. Before doing so, first retrieve the proper --targetname and --portal values.
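A sendtargets discovery against the storage array is the usual way to obtain both values; the portal address 10.16.41.155:3260 below is reused from the example target in this chapter and is otherwise arbitrary:
# Each line of output lists a portal (IP:port,tag) followed by a target name
iscsiadm -m discovery -t sendtargets -p 10.16.41.155:3260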
If your device supports multiple targets, you will need to issue a sendtargets command to the hosts to find new portals for each target. Then, rescan the existing sessions to discover new logical units (i.e. using the --rescan option). Important The sendtargets command used to retrieve --targetname and --portal values overwrites the contents of the /var/lib/iscsi/nodes database.
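To rescan without re-running discovery, existing sessions can be rescanned directly; the session ID of 1 used in the second form is a hypothetical example:
# Rescan every logged-in session for new logical units
iscsiadm -m session --rescan
# Or rescan a single session by its session ID
iscsiadm -m session -r 1 --rescan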
Using our previous example (where proper_target_name is equallogic-iscsi1), the full command would be:
iscsiadm --mode node --targetname \
 iqn.2001-05.com.equallogic:6-8a0900ac3fe0101-63aff113e344a4a2-dl585-03-1 \
 --portal 10.16.41.155:3260,0 --login
21.13. Logging In to an iSCSI Target As mentioned in Section 21.2, “iSCSI”, the iSCSI service must be running in order to discover or log into targets.
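As a brief sketch of the login step itself, the target name and portal below are reused from the example above, and the second command logs in to every discovered node record:
# Log in to one specific target
iscsiadm -m node -T iqn.2001-05.com.equallogic:6-8a0900ac3fe0101-63aff113e344a4a2-dl585-03-1 \
  -p 10.16.41.155:3260 --login
# Log in to all discovered targets
iscsiadm -m node --loginall=all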
Note In order to resize an online file system, the file system must not reside on a partitioned device. 21.14.1. Resizing Fibre Channel Logical Units After modifying the online logical unit size, re-scan the logical unit to ensure that the system detects the updated size.
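A per-path rescan can be triggered through sysfs; /dev/sdb below is a hypothetical sd device representing one path to the resized logical unit:
# Ask the SCSI layer to re-read the capacity of this path
echo 1 > /sys/block/sdb/device/rescan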
First, verify that the multipathd daemon is running using service multipathd status. Once you've verified that multipathd is operational, run the following command:
multipathd -k"resize map multipath_device"
The multipath_device variable is the corresponding multipath entry of your device in /dev/mapper.
For more information about multipathing, refer to the Using Device-Mapper Multipath guide (at http://www.redhat.com/docs/manuals/enterprise/). 21.15. Adding/Removing a Logical Unit Through rescan-scsi-bus.sh The sg3_utils package provides the rescan-scsi-bus.sh script, which can automatically update the logical unit configuration of the host as needed (after a device has been added to the system).
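In its simplest form the script is typically run with no arguments after the new logical unit has been mapped to the host; this is a sketch, and output will vary by configuration:
# Scan all SCSI hosts and report any newly found devices
rescan-scsi-bus.sh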
2. This command will return Blocked when the remote port (along with devices accessed through it) is blocked. If the remote port is operating normally, the command will return Online. 3. If the problem is not resolved within dev_loss_tmo seconds, the rport and devices will be unblocked and all I/O running on that device (along with any new I/O sent to that device) will be failed. Procedure 21.6.
21.16.2.1. NOP-Out Interval/Timeout To help monitor problems in the SAN, the iSCSI layer sends a NOP-Out request to each target. If a NOP-Out request times out, the iSCSI layer responds by failing any running commands and instructing the SCSI layer to requeue those commands when possible. When dm-multipath is being used, the SCSI layer will fail those running commands and defer them to the multipath layer. The multipath layer then retries those commands on another path.
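These timers are controlled from /etc/iscsi/iscsid.conf; the 10-second values shown here are illustrative assumptions, not recommendations from this guide:
# Send a NOP-Out to each target every 10 seconds...
node.conn[0].timeo.noop_out_interval = 10
# ...and treat the path as failed if no response arrives within 10 seconds
node.conn[0].timeo.noop_out_timeout = 10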
By configuring a lower replacement_timeout, I/O is quickly sent to a new path and executed (in the event of a NOP-Out timeout) while the iSCSI layer attempts to re-establish the failed path/session. If all paths time out, then the multipath and device mapper layer will internally queue I/O based on the settings in /etc/multipath.conf instead of /etc/iscsi/iscsid.conf.
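The timer itself is configured in /etc/iscsi/iscsid.conf; the 15-second value below is an illustrative assumption for a dm-multipath setup rather than a recommendation from this guide:
# Fail the session sooner so that dm-multipath can switch paths quickly
node.session.timeo.replacement_timeout = 15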
21.17. Controlling the SCSI Command Timer and Device Status The Linux SCSI layer sets a timer on each command. When this timer expires, the SCSI layer will quiesce the host bus adapter (HBA) and wait for all outstanding commands to either time out or complete. Afterwards, the SCSI layer will activate the driver's error handler. When the error handler is triggered, it attempts the following operations in order (until one successfully executes): 1. Abort the command. 2.
Troubleshooting Procedure 21.7. Working Around Stale Logical Units 1. Determine which mpath link entries in /etc/lvm/cache/.cache are specific to the stale logical unit. To do this, run the following command:
ls -l /dev/mpath | grep stale-logical-unit
2. For example, if stale-logical-unit is 3600d0230003414f30000203a7bc41a00, the following results may appear:
lrwxrwxrwx 1 root root 7 Aug
lrwxrwxrwx 1 root root 7 Aug  2 10:33 /3600d0230003414f30000203a7bc41a00 -> ..
Chapter 22. Device Mapper Multipathing and Virtual Storage Red Hat Enterprise Linux 6 also supports DM-Multipath and virtual storage. Both features are documented in detail in stand-alone books provided by Red Hat. 22.1. Virtual Storage Red Hat Enterprise Linux 6 supports the following file systems/online storage methods for virtual storage: • Fibre Channel • iSCSI • NFS • GFS2 Virtualization in Red Hat Enterprise Linux 6 uses libvirt to manage virtual instances.
Redundancy DM-Multipath can provide failover in an active/passive configuration. In an active/passive configuration, only half the paths are used at any time for I/O. If any element of an I/O path (the cable, switch, or controller) fails, DM-Multipath switches to an alternate path. Improved Performance DM-Multipath can be configured in active/active mode, where I/O is spread over the paths in a round-robin fashion.
Appendix A. Revision History Revision 1.0, Thu Jul 09 2009, Don Domingo <ddomingo@redhat.com>: initial build
Glossary This glossary defines common terms relating to file systems and storage used throughout the Storage Administration Guide. Defragmentation The act of reorganizing a file's data blocks so that they are more physically contiguous on disk. Delayed Allocation An allocator behavior in which disk locations are chosen when data is flushed to disk, rather than when the write occurs. This can generally lead to more efficient allocation because the allocator is called less often and with larger requests.
Glossary available at mkfs time as well. Doing well-aligned allocation I/O can avoid inefficient read-modify-write cycles on the underlying storage. Write Barriers A method to enforce consistent I/O ordering on storage devices which have volatile write caches.
Index Symbols 'software iSCSI offload and interface binding iSCSI, 135 /boot/ directory, 31 /dev/disk persistent naming, 124 /dev/shm , 30 /etc/fstab , 41, 60 /etc/fstab file enabling disk quotas with, 95 /local/directory (client configuration, mounting) NFS, 59 /proc /proc/devices, 37 /proc/filesystems, 37 /proc/mdstat, 37 /proc/mounts, 37 /proc/mounts/, 37 /proc/partitions, 37 /proc/devices virtual file system (/proc), 37 /proc/filesystems virtual file system (/proc), 37 /proc/mdstat virtual file system (
Index FS-Cache, 80 cache setup FS-Cache, 78 cache sharing FS-Cache, 79 cachefiles FS-Cache, 78 cachefilesd FS-Cache, 78 caching, file system overview, 1 CCW, channel command word storage considerations during installation, 4 changing dev_loss_tmo fibre channel modifying link loss behavior, 143 channel command word (CCW) storage considerations during installation, 4 coherency data FS-Cache, 78 command timer (SCSI) Linux SCSI layer, 146 configuration discovery iSCSI, 132 configuring a tftp service for diskles
quotacheck command, using to check, 99 reporting, 99 soft limit, 97 disk storage (see disk quotas) parted (see parted ) diskless systems DHCP, configuring, 115 exported file systems, 116 network booting service, 115 remote diskless systems, 115 required packages, 115 tftp service, configuring, 115 dm-multipath iSCSI configuration, 143 dmraid RAID, 89 dmraid (configuring RAID sets) RAID, 89 documentation LVM, 21 drivers (native), fibre channel, 122 du , 30 dump levels XFS, 54 E e2fsck , 41 e2image (other ex
Index fibre channel online storage, 121 fibre channel API, 121 fibre channel drivers (native), 122 fibre channel over ethernet FCoE, 129 fibre-channel over ethernet storage considerations during installation, 3 file system FHS standard, 29 hierarchy, 29 organization, 29 structure, 29 file system caching overview, 1 file system encryption overview, 1 file system types encrypted file system, 83 ext4, 43 GFS2, 47 XFS, 49 file systems, 29 ext2 (see ext2) ext3 (see ext3) file systems, overview of supported types
solid state disks, 120 iface (configuring for iSCSI offload) offload and interface binding iSCSI, 135 iface binding/unbinding offload and interface binding iSCSI, 136 iface configurations, viewing offload and interface binding iSCSI, 133 iface for software iSCSI offload and interface binding iSCSI, 135 iface settings offload and interface binding iSCSI, 134 importance of write barriers write barriers, 105 increasing file system size XFS, 52 indexing keys FS-Cache, 78 initiator implementations offload and in
Index LUKS/dm-crypt, encrypting block devices using storage considerations during installation, 4 LUN (logical unit number) adding/removing, 142 known issues, 142 required packages, 142 rescan-scsi-bus.
overriding/augmenting site configuration files (autofs), 62 proper nsswitch configuration (autofs version 5), use of, 61 reloading, 67 required services, 58 restarting, 67 rfc2307bis (autofs), 64 rpcbind , 74 security, 72 file permissions, 73 NFSv2/NFSv3 host access, 72 NFSv4 host access, 73 server (client configuration, mounting), 59 server configuration, 68 /etc/exports , 68 exportfs command, 70 exportfs command with NFSv4, 71 starting, 67 status, 67 stopping, 67 storing automounter maps, using LDAP to st
Index proc directory, 33 processing, I/O limit overview, 1 project limits (setting) XFS, 52 proper nsswitch configuration (autofs version 5), use of NFS, 61 Q queue_if_no_path iSCSI configuration, 143 modifying link loss iSCSI configuration, 144 quota (other ext4 file system utilities) ext4, 46 quota management XFS, 51 quotacheck , 96 quotacheck command checking quota accuracy with, 99 quotaoff , 98 quotaon , 98 R RAID advanced RAID device creation, 90 Anaconda support, 89 configuring RAID sets, 89 dmraid
rpcinfo, 74 status, 67 rpcinfo, 74 running sessions, retrieving information about iSCSI API, 123 running status Linux SCSI layer, 146 S sbin directory, 33 scanning interconnects iSCSI, 136 scanning storage interconnects, 131 SCSI command timer Linux SCSI layer, 146 SCSI Error Handler modifying link loss iSCSI configuration, 144 SCSI standards I/O alignment and size, 111 separate partitions (for /home, /opt, /usr/local) storage considerations during installation, 4 server (client configuration, mounting) NF
Index LVM2 creating, 92 extending, 92 reducing, 93 removing, 94 moving, 94 recommended size, 91 removing, 93 symbolic links in /dev/disk persistent naming, 124 sys directory, 33 sysconfig directory, 36 sysfs overview online storage, 121 sysfs interface (userspace access) I/O alignment and size, 110 system information file systems, 29 /dev/shm , 30 T targets iSCSI, 139 TCP, using NFS over NFS, 75 technology preview overview, 1 tftp service, configuring diskless systems, 115 throughput classes solid state di
storage considerations during installation, 3 World Wide Identifier (WWID) persistent naming, 124 write barriers battery-backed write caches, 106 definition, 105 disabling write caches, 106 enablind/disabling, 105 error messages, 106 ext4, 45 high-end arrays, 107 how write barriers work, 105 importance of write barriers, 105 NFS, 107 XFS, 50 write caches, disabling write barriers, 106 WWID persistent naming, 124 X XFS allocation features, 49 backup/restoration, 54 creating, 49 cumulative mode (xfsrestore),