Oracle® Linux Release Notes for Unbreakable Enterprise Kernel Release 3 E48380-04 February 2014
Oracle® Linux: Release Notes for Unbreakable Enterprise Kernel Release 3 Copyright © 2013, 2014, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc.
Table of Contents
Preface
1 New Features and Changes
1.1 Notable Changes
1.1.1 Architecture
Preface The Oracle Linux Unbreakable Enterprise Kernel Release Notes provides a summary of the new features, changes, and known issues in the Unbreakable Enterprise Kernel Release 3. Audience This document is written for system administrators who want to use the Unbreakable Enterprise Kernel with Oracle Linux. It is assumed that readers have a general understanding of the Linux operating system.
Chapter 1 New Features and Changes
The Unbreakable Enterprise Kernel Release 3 (UEK R3) is Oracle's third major release of its heavily tested and optimized operating system kernel for Oracle Linux 6 on the x86-64 architecture. It is based on the mainline Linux kernel version 3.8.13. The 3.8.13-16 release also updates drivers and includes bug and security fixes. Oracle actively monitors upstream check-ins and applies critical bug and security fixes to UEK R3.
1.1.3 Core Kernel Functionality
• To avoid binary incompatibility in applications that do not understand the 3.x versioning scheme, the UNAME26 personality patch can be used to report the kernel version as 2.6.x where x is derived from the real kernel version. The uname26 program is provided to activate the UNAME26 personality patch for 3.x kernels. uname26 does not replace the uname command.
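The following sketch illustrates how the reported version could be derived. It assumes, as in the mainline UNAME26 patch, that the reported value of x is the 3.x minor version plus 40; these notes do not state the rule, and the function name uname26_version is hypothetical.

```shell
# Sketch only: mimic the version that a 3.x kernel would report under the
# UNAME26 personality. Assumption: x = (3.x minor version) + 40.
uname26_version() {
    minor=${1#*.}        # drop the leading "3."  -> "8.13"
    minor=${minor%%.*}   # keep the minor number  -> "8"
    echo "2.6.$((minor + 40))"
}

uname26_version 3.8.13   # prints 2.6.48
```

Under this assumption, an application that parses only 2.6.x version strings would see a UEK R3 kernel as version 2.6.48.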
• The value of the SCSI error-handling timeout is now tunable. If a SCSI device times out while processing file system I/O, the kernel attempts to bring the device back online by resetting the device, followed by resetting the bus, and finally by resetting the controller. The error-handling timeout defines how many seconds the kernel should wait for a response after each recovery attempt before performing the next step in the process.
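As a minimal sketch, the timeout for a device might be inspected and changed as shown below. The example assumes that the value is exposed in seconds as the eh_timeout attribute under /sys/block/device/device, which is not stated in the text above. The SYSFS_ROOT variable and the function names are hypothetical.

```shell
# Sketch only. Assumption: the timeout (in seconds) is exposed as
# /sys/block/<device>/device/eh_timeout. SYSFS_ROOT is a hypothetical
# override that allows the functions to be tried against a scratch directory.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Print the current error-handling timeout for a device such as sda.
get_eh_timeout() {
    cat "$SYSFS_ROOT/block/$1/device/eh_timeout"
}

# Set the error-handling timeout (requires root on a real system).
set_eh_timeout() {
    echo "$2" > "$SYSFS_ROOT/block/$1/device/eh_timeout"
}
```

For example, set_eh_timeout sda 20 followed by get_eh_timeout sda would display 20 if the attribute exists on your system.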
DTrace 0.4 in UEK R3 has the following additional features compared with DTrace 0.3.2 in UEK R2:
• In UEK R2, you had to install separately available packages that contained a DTrace-enabled version of the kernel, and you had to boot the system with this kernel to be able to use DTrace. In UEK R3, DTrace support is integrated with the kernel.
• The DTrace header files in the kernel, kernel modules, and DTrace user-space utility have been restructured to provide better support for custom consumers and DTrace-related utilities.
• The systrace provider has been updated to account for changes in the 3.8.13 kernel.
• Symbol lookup can now be performed by the & operator.
• It is now possible to obtain the correct value for the ERR registers.
• For more information about DTrace, refer to the Oracle Linux 6 Administrator's Solutions Guide and the Oracle Linux 6 Dynamic Tracing Guide, which you can find in the Oracle Linux 6 documentation library at http://docs.oracle.com/cd/E37670_01/index.html.
1.1.8 File Systems
btrfs
In UEK R3, btrfs is based on version 3.8, whereas btrfs in the latest update to UEK R2 is based on version 3.
• The fsync() system call writes the modified data of a file to the hard disk. (3.7)
• Replacing devices without unmounting or otherwise disrupting access to the file system by using the replace subcommand to btrfs, for example: # btrfs replace failed_device replacement_device mountpoint You do not need to unmount the file system or to stop active tasks. If the power fails during replacement, the process resumes when the file system is next mounted. (3.
oriented, pseudo-RAM device such as Xen Transcendent Memory (tmem) or in-kernel compressed memory (zmem). (3.5)
• Safe swapping is supported using network block devices (NBDs) or NFS. (3.6)
1.1.10 Networking
• TCP controlled delay management (CoDel) is a new active queue management algorithm that is designed to handle excessive buffering across a network connection (bufferbloat). The algorithm is based on how long packets are buffered in the queue rather than on the size of the queue.
• The perf trace command can be used to record a workload according to a specified script, and to display a detailed trace of a workload that was previously recorded. This command provides an alternative interface to strace. (3.7) 1.1.
• Automatic Path Migration (APM)
• Active Bonding (AB)
• Shared Request Queue (SRQ)
• Netfilter (NF)
• Support for IB, OFED, and RDS is integrated into the kernel. The OFED user-space RPMs continue to be provided, but the kernel-ib and ofa-kernel RPMs are not required.
• A new iSCSI implementation raises the supported iSCSI target framework to LIO version 4.1. (3.1) 1.1.
1.3 Driver Updates
The Unbreakable Enterprise Kernel supports a large number of hardware and devices. In close cooperation with hardware and storage vendors, Oracle has updated several device drivers. The list given below indicates the drivers whose versions differ from the versions in mainline Linux 3.8.13.
1.3.1 Storage Adapter Drivers
Broadcom
• NetXtreme II Fibre Channel over Ethernet driver (bnx2fc) version 2.3.4.
• NetXtreme II iSCSI driver (bnx2i) version 2.7.6.1d.
Supports Open-iSCSI.
1.3.2 Network Adapter Drivers
Broadcom
• NetXtreme II network adapter driver (bnx2) version 2.2.3n.
• NetXtreme II 10Gbps network adapter driver (bnx2x) version 1.76.54.
• Converged Network Interface Card core driver (cnic) version 2.5.16g.
• Tigon3 Ethernet adapter driver (tg3) version 3.131d.
Emulex
• Blade Engine 2 10Gbps adapter driver (be2net) version 4.6.63.0u.
Intel
• Legacy (PCI and PCI-X*) Gigabit network adapter driver (e1000) version 7.3.
• InfiniBand SCSI RDMA Protocol initiator (ib_srp) version 1.2.
Oracle
• Reliable Datagram Sockets driver (rds) version 4.1. RDS provides in-order, non-duplicated, highly-available, low-overhead, reliable delivery of datagrams between hundreds of thousands of non-connected endpoints. 1.
fuse-devel fuse-libs
• ib-bonding (ip-bond, IPoIB bonding-interface utility)
• ibacm (ib_acm daemon for InfiniBand fabrics) ibacm-devel
• ibutils (OpenIB Mellanox InfiniBand diagnostic utilities)
• infiniband-diags (OpenFabrics Alliance InfiniBand diagnostic utilities) infiniband-diags-compat
• iscsi-initiator-utils (iSCSI daemon and utilities) iscsi-initiator-utils-devel
• kernel-uek (UEK R3 kernel) kernel-uek-debug kernel-uek-debug-devel kernel-uek-devel kernel-uek-doc kernel-uek
libibumad-static
• libibverbs (user-space RDMA (InfiniBand/iWARP) hardware library) libibverbs-devel libibverbs-devel-static libibverbs-utils
• libmlx4 (Mellanox ConnectX InfiniBand HCA user-space driver) libmlx4-devel
• librdmacm (user-space RDMA connection manager) librdmacm-devel librdmacm-utils
• libsdp (user-space Sockets Direct Protocol library) libsdp-devel
• libss (command-line interface parsing library) libss-devel
• lxc (Linux Containers) lxc-devel lxc-libs
• mstflint (Me
• rds-tools (RDS utilities)
• sdpnetstat (sdpnetstat, InfiniBand SDP diagnostic utility)
• srptools (InfiniBand SRP utilities)
• uname26 (uname26, wrapper utility for the UNAME26 personality patch)
• xfsdump (administrative utilities for the XFS file system)
• xfsprogs (XFS file-system utilities) xfsprogs-devel xfsprogs-qa-devel
For details of the channels on which these packages are available, see Chapter 3, Installation and Availability. 1.
Chapter 2 Known Issues This chapter describes the known issues for the Unbreakable Enterprise Kernel Release 3. ACFS Oracle ASM Cluster File System (ACFS) is currently not supported for use with UEK R3. (Bug ID 16318126) ACPI • On some systems you might see ACPI-related error messages in dmesg similar to the following: ACPI Error: [CDW1] Namespace lookup failure, AE_NOT_FOUND ACPI Error: Method parse/execution failed [_SB_.
• Converting an existing ext2, ext3, or ext4 root file system to btrfs does not carry over the associated security contexts that are stored as part of a file's extended attributes. With SELinux enabled and set to enforcing mode, you might experience many permission denied errors after reboot, and the system might be unbootable. To avoid this problem, enforce automatic file system relabeling to run at bootup time. To trigger automatic relabeling, create an empty file named .
• If you use the -s option to specify a sector size to mkfs.btrfs that is different from the page size, the created file system cannot be mounted. By default, the sector size is set to be the same as the page size. (Bug ID 17087232)
CPU microcode update failures on PVM or PVHVM guests
When running Oracle Linux 6 with UEK R3, you might see error messages in dmesg or /var/log/messages similar to this one: microcode: CPU0 update to revision 0x6b failed. You can ignore this warning.
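For the mkfs.btrfs sector-size issue described above (Bug ID 17087232), the following sketch shows one way to compare an intended sector size with the system page size before you use the -s option. The function name sector_size_matches_page and the value 4096 are illustrative assumptions. getconf PAGESIZE is the standard way to query the page size.

```shell
# Sketch only: succeed if the requested sector size equals the page size.
sector_size_matches_page() {
    [ "$1" -eq "$(getconf PAGESIZE)" ]
}

# 4096 is an example value, not a recommendation.
if sector_size_matches_page 4096; then
    echo "sector size 4096 matches the page size"
else
    echo "sector size 4096 differs from the page size; omit the -s option"
fi
```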
Firmware warning message You can safely ignore the following firmware warning message that might be displayed on some Sun hardware: [Firmware Warn]: GHES: Poll interval is 0 for generic hardware error source: 1, disabled. (Bug ID 13696512) Huge pages One-gigabyte (1 GB) huge pages are not currently supported for the following configurations: • HVM guests • PV guests • Oracle Database Two-megabyte (2 MB) huge pages have been tested and work with these configurations.
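To confirm which huge page size a system uses by default, you can read the Hugepagesize field of /proc/meminfo. The following is a minimal sketch. The function name hugepage_size_kb is hypothetical, and the optional file argument exists only so that the function can be tried against a saved copy of the file.

```shell
# Sketch only: print the default huge page size in kB from a
# meminfo-format file (defaults to /proc/meminfo).
hugepage_size_kb() {
    awk '/^Hugepagesize:/ { print $2 }' "${1:-/proc/meminfo}"
}

hugepage_size_kb    # typically prints 2048 when 2 MB huge pages are the default
```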
• To configure connected mode, specify CONNECTED_MODE=yes in the file. • To configure datagram mode, either specify CONNECTED_MODE=no in the file or do not specify this setting at all (datagram mode is enabled by default). Note Before saving your changes, make sure that you have not specified more than one setting for CONNECTED_MODE in the file. 2.
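The check described in the preceding note can be sketched as follows. The function name check_connected_mode is hypothetical, and the configuration file is passed as an argument because its path depends on the interface that you are configuring.

```shell
# Sketch only: fail if a configuration file contains more than one
# uncommented CONNECTED_MODE setting.
check_connected_mode() {
    # grep -c prints 0 and returns non-zero when nothing matches.
    n=$(grep -c '^[[:space:]]*CONNECTED_MODE=' "$1" || true)
    if [ "${n:-0}" -gt 1 ]; then
        echo "CONNECTED_MODE is set $n times in $1" >&2
        return 1
    fi
    return 0
}
```

A file that contains no CONNECTED_MODE setting passes the check, which is consistent with datagram mode being the default.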
2. Shut down and reboot the host system. • The root user in a container can affect the configuration of the host system by setting some /proc entries. (Bug ID 17190287) • Using yum to update packages inside the container that use init scripts can undo changes made by the Oracle template. • Migrating live containers (lxc-checkpoint) is not yet supported. • Oracle Database is not yet supported for use with Linux Containers.
Soft lockup errors when booting When upgrading or installing the UEK R3 kernel on fast hardware, usually with SAN storage attached, the kernel can fail to boot and BUG: soft lockup messages are displayed in the console log. The workaround is to increase the baud rate from the default value of 9600 by amending the kernel boot line in /boot/grub/grub.
• On virtualized systems that are built on Xen version 3, including all releases of Oracle VM 2 including 2.2.2 and 2.2.
Chapter 3 Installation and Availability You can install Unbreakable Enterprise Kernel Release 3 on Oracle Linux 6 Update 4 or newer, running either the Red Hat compatible kernel or a previous version of the Unbreakable Enterprise Kernel. If you are still running an older version of Oracle Linux, first update your system to the latest available update release. The Unbreakable Enterprise Kernel Release 3 is supported on the x86-64 architecture but not on x86. 3.
5. Click Save Subscriptions.
For information about using ULN, see the Oracle Linux Unbreakable Linux Network User's Guide at http://docs.oracle.com/cd/E37670_01/index.html.
3.3 Enabling Access to Public Yum Channels
At the Oracle Public Yum repository at http://public-yum.oracle.
In this example, access is enabled to the ol6_latest and ol6_UEKR3_latest channels but not to the ol6_UEK_latest, ol6_playground_latest and ol6_ofed_UEK channels. You can find more information about installing the software at http://public-yum.oracle.com/, from where you can download a copy of a suitable repository file (http://public-yum.oracle.com/public-yum-ol6.repo).
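To verify which channels are enabled in a repository file, you might use a helper such as the following sketch. The function name enabled_repos is hypothetical, and the sketch assumes that each setting is written as enabled=1 or enabled=0 without surrounding spaces.

```shell
# Sketch only: print the names of the repository sections that contain
# enabled=1. Assumes no spaces around the "=" sign.
enabled_repos() {
    awk -F= '
        /^\[.*\]$/ { repo = substr($0, 2, length($0) - 2); next }
        $1 == "enabled" && $2 == 1 { print repo }
    ' "$1"
}
```

For example, if you saved the downloaded file as /etc/yum.repos.d/public-yum-ol6.repo (an assumed location), enabled_repos /etc/yum.repos.d/public-yum-ol6.repo would list ol6_latest and ol6_UEKR3_latest for the configuration described above.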
Appendix A Other Changes
The following sections describe other features of Unbreakable Enterprise Kernel Release 3 (UEK R3). The mainline version in which a feature was introduced is noted in parentheses.
A.1 Architecture
• vsyscall emulation and vsyscall parameter. (3.1)
• INTEL_MID configuration. (3.1)
• mrst_pmu driver for Intel Moorestown Power Management Unit. (3.1)
• Hardware memory error recovery support for ACPI, APEI, and GHES. (3.
• RAID-5 XOR checksumming is optimized by taking advantage of the 256-bit YMM registers introduced by Advanced Vector Extensions (AVX). (3.5)
• RAID-6 includes Supplemental Streaming SIMD Extensions 3 (SSSE3) optimized recovery functions and a new algorithm for selecting the most appropriate function to use for recovery. (3.
• Add support for the implementation of SEEK_HOLE and SEEK_DATA in lseek(). (3.1)
• Add the ! escape character to / in hostname and comm strings in core dumps. (3.1)
• If the value of the sysctl parameter shm_rmid_forced is set to 1, all shared memory objects are marked for removal with IPC_RMID. As this change breaks POSIX compliance, you need to ensure that no threads are using the orphaned memory. (3.
• The prctl() PR_GET_CHILD_SUBREAPER and PR_SET_CHILD_SUBREAPER options implement simple process supervision of orphaned processes. (3.4)
• Thread stacks are now marked correctly for /proc/pid/maps under procfs. (3.4)
• Restore the sysctl setting kernel.pty.max as the global limit of pseudo terminals (by default, 4096). (3.4)
• Add abilities to turn the reboot notifier on or off, and to enter the debugger and stop kernel execution before rebooting. (3.
• The rcutree.rcu_fanout_leaf boot parameter allows the value of RCU_FANOUT_LEAF to be increased but not decreased. (3.6)
• Firmware files can be loaded directly from the file system rather than from udev. (3.7)
• xattr support in cgroups allows run-time metadata to be attached to cgroups. (3.7)
• The disable_nmi command in kdb disables NMI-entry and releases the port. (3.
• Add a sysfs node to present frequency transition information for power management. (3.8)
A.4 Cryptography
• Ablkcipher now supports encryption and decryption for AES, DES, and 3DES. (3.1)
• Add an eCryptfs mount option to check that the UID of the device being mounted is the same as the expected UID. (3.1)
• The encrypted key type has been extended with the introduction of the ecryptfs format, intended for use with the eCryptfs file system.
• Add Tegra AES hardware driver supporting ecb, cbc, ofb, and ansi_x9.31rng modes, and 128, 192 and 256-bit key sizes. (3.4)
• Add a slice-by-8 algorithm to the existing slice-by-4 algorithm in crc32. The BITS size is expanded from 32 to 64, tables are extended from tab[4][256] to tab[8][256], and inner-loop code is added. (3.4)
• Improve performance of aesni_intel by using parallel LRW and XTS encryption with AES-NI hardware pipelines. (3.
• CEE information and statistics query
• Flash configuration
• Collect and reset fcport statistics
• Configure LUN masking
• Configure QoS and collect statistics
• Support for obtaining SFP information
• Support for FC-transport based Asynchronous Event Notification
• Support for I/O profiling
• Collect or reset fabric statistics
• Configure and query flash boot partition
• Configure trunking on Brocade adapter ports
• Store driver configuration in flash memory
• Brocade-1860 Fabric Adapter 16
• Switching from tree locks to reader/writer locks improves the performance of read and write-intensive workloads. (3.1)
• Performance improvements in several areas, particularly for random write workloads. (3.2)
• Allowing overcommit of ENOSPC reservations to improve performance. (3.2)
• Add automatic backup of superblock information about tree roots for the previous 4 commits. Add the -o recovery mount option to enable use of the root history log if required. (3.
• Add backup mount option. (3.2)
• Allow larger rsize (up to 16 MB) and change the default to 1 MB. (3.2)
• Introduce credit-based flow control. (3.4)
• Add the cache=strict|none mount option to specify the cache type instead of the strictcache and forcedirectio options. The legacy options are now mutually exclusive. (3.5)
• The vers=2.1 mount option forces an SMB2 mount. By default, vers=1 (CIFS) is used. (3.5)
• The vers=2.0 mount option forces an SMB2.02 mount. (3.
• Add persistent function tracing. The kernel can save the function call chain log to a persistent RAM buffer, which can be decoded and dumped after a reboot. You can use the log to determine the function that was called immediately prior to a reset or panic. (3.6)
tmpfs
• Increase the file size limit for tmpfs. (3.1)
• Support fallocate() FALLOC_FL_PUNCH_HOLE and preallocation. (3.5)
XFS
• Improve performance of the inode cache. (3.1)
• Improve scalability of per-file-system quotas.
• Charge the pages dirtied by an exited process to random dirtying tasks. (3.3)
• Allow the poll time and call intervals to balance dirty pages to be controlled by the value of the max_pause parameter. (3.3)
• Fix dirtied pages accounting on sub-page writes. (3.3)
• Introduce the dirty rate limit to compensate a task's think time when computing the final pause time. (3.3)
• Reduce dirty throttling polls and CPU overhead. (3.3)
• Avoid tiny dirty poll intervals. (3.
• Reduce the false sharing effect. (3.1)
• Reduce CPU overhead of check_leaf() with the route cache disabled. (3.1)
• Add support to the virtio_net driver to obtain Rx and Tx ring parameter information from an Ethernet device. Used by the ethtool -g ethX command. (3.2)
• Implement AP isolation on the receiver and sender side for B.A.T.M.A.N. When a node receives a unicast packet, it checks whether the source and destination client can communicate due to the AP isolation. (3.
• RCU conversion in TCP allows access to MD5 keys without locking the listener socket. (3.4)
• For some workloads, allowing splice() to build full TSO packets can reduce the number of logical packets sent by an order of magnitude, making zero-copy TCP faster than one-copy. (3.4)
• Add the SO_PEEK_OFF socket option. (3.4)
• Support peeking offset for datagram sockets, seqpacket sockets, and stream sockets. (3.
interface to the vlan interface if this interface exists. This change allows the iptables REDIRECT target to work with vlan-on-top-of-bridge configurations and the use of iptables -i to match the vlan device name. (3.5)
• Allow byte-based limit mode to be used with netfilter, for example, to support ingress-traffic policing or to detect when a host or port consumes more bandwidth than expected. (3.5)
• Add support for sync threads to netfilter. (3.5)
• Remove ip_queue support from netfilter. (3.
• Add IPv6 REDIRECT target.
• Add IPv6 NAT support.
• Support IPv6 FTP NAT helper.
• Support IPv6 IRC NAT helper.
• Support IPv6 SIP NAT helper.
• Support IPv6 in the amanda NAT helper.
• Add stateless IPv6-to-IPv6 Network Prefix Translation target.
• Remove xt_NOTRACK. (3.7)
• Add link layer control (LLC) core layer to HCI 2, add an SHDLC llc module to the llc core, and add LLCP raw socket support to NFC. (3.7)
• Support IPv6 transmit hashing (and TCP or UDP over IPv6) in the bonding driver. (3.
• Add support for per-association statistics by implementing the SCTP_GET_ASSOC_STATS call for the Stream Control Transmission Protocol (SCTP). (3.8)
• Add a sysctl that allows the selection of the HMAC algorithm (static or dynamic) used by SCTP. (3.8)
• Add support for SO_ATTACH_FILTER required to save the full state of a socket. (3.8)
• Convert tun/tap into a multiqueue device and expose the queues as file descriptors in user space. (3.8) A.
• Add --list-opts option to print long option names for use with bash. (3.7)
• Add script browser. (3.8)
• Add new display options (-F, -p, and -P) to perf diff. (3.8)
• perf inject now supports input from a file. (3.8)
• Add --pre and --post options to perf stat. (3.8)
• Add gtk.command config option to launch the GTK browser. This is equivalent to specifying the --gtk option on the command line. (3.8)
• Add new features to perf trace. (3.8)
• Expose hardware events translations in sysfs. (3.
• Add environment variable name restriction to TOMOYO. (3.2)
• Add socket operation restriction to TOMOYO. (3.2)
• Add control for generation of access granted logs in TOMOYO. (3.2)
• Allow domain transition without execve() in TOMOYO. (3.2)
• Allow audit matching on inode gid. (3.3)
• Allow inter-field comparison in audit rules between the gid of a running task and the gid of an inode. (3.3)
• Add a new audit filter type AUDIT_FIELD_COMPARE to indicate which fields should be compared. (3.
A.14 Virtualization
• Add memory hotplug support for the Xen balloon driver. (3.1)
• Add Xen PCI backend driver. (3.1)
• Implement discard requests and support old-style BARRIER. (3.2)
• Increase recommended maximum number of VCPUs from 64 to 160. (3.4)
• Allow host IRQ sharing for assigned PCI 2.3 devices. (3.4)
• Add infrastructure for software and hardware-based TSC rate. (3.4)
• Move the Hyper-V storage driver out of the staging area. (3.4)
• Add support for VLAN trunking to Hyper-V.