Action Alert 897
This Action Alert describes an issue which affects product functionality, reliability or safety
Intel Action Alert AA-0897-1
5200 NE Elam Young Parkway
Hillsboro, OR 97124
February 13, 2008
Data Inconsistencies may Occur during Online Capacity Expansion and
RAID Level Migration under Heavy I/O when using Intel® SAS/SATA
Hardware RAID Controllers
Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise,
to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such
products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of
Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent,
copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining
applications. Intel may make changes to specifications and product descriptions at any time, without notice. The SRCSAS18E,
SRCSAS144E, SROMBSAS18E, SRCSASJV, SRCSASRB and SRCSATAWB may contain design defects or errors known as
errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Products Affected
SRCSAS18E
SRCSAS144E
SROMBSAS18E
SRCSASJV
SRCSASRB
SRCSATAWB
Description
Intel has identified an issue in which the controllers listed above may experience data inconsistencies under heavy I/O
during Online Capacity Expansion (OCE) and RAID Level Migration (RLM).
Investigation shows that the inconsistencies are limited to specific instances that occur only when the RAID controller
cache is in use (Write Back enabled or Read Ahead enabled) and there are queued I/Os at the completion of an RLM or
OCE operation.
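The two conditions above can be restated as a simple check. This is an illustrative sketch only: the function and parameter names are hypothetical and do not correspond to any Intel tool or firmware interface.

```python
def exposure_conditions_met(write_back: bool, read_ahead: bool,
                            queued_ios_at_completion: int) -> bool:
    """Return True when both conditions named in this alert hold.

    Hypothetical helper for illustration; not an Intel API.
    """
    # Condition 1: the controller cache is in use (either policy enabled).
    cache_in_use = write_back or read_ahead
    # Condition 2: I/Os are still queued when the OCE or RLM operation completes.
    return cache_in_use and queued_ios_at_completion > 0
```

With both cache policies disabled, or with no I/Os queued at completion, the check returns False, which matches the narrowing described above.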
Root Cause
During a reconstruction (OCE or RLM), RAID firmware facilitates online data access to a reconstructing Virtual Disk (VD)
by internally maintaining two VDs: one represents the portion of capacity that has yet to be reconstructed (“original VD”),
while the other represents the data area that has completed data reorganization/reconstruction (“ghost VD”). When a
host request is received during a reconstruction, RAID firmware determines which of these two internal VDs the request
belongs to, and assigns the request to either the original or the ghost VD.
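The assignment step described above can be sketched as follows. This alert does not state how the firmware decides which internal VD a request belongs to, so the sketch assumes that reconstruction proceeds in ascending block order and that a single boundary (`watermark`) separates the reorganized area from the rest. The class, field, and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reconstruction:
    # Assumption: first block address that has NOT yet been reorganized.
    watermark: int

    def route(self, lba: int) -> str:
        """Assign a host request to one of the two internal VDs."""
        # Blocks below the watermark have completed reorganization (ghost VD);
        # blocks at or above it are still in the original layout (original VD).
        return "ghost" if lba < self.watermark else "original"
```

Under these assumptions, `Reconstruction(watermark=1000).route(10)` yields `"ghost"` and `route(1000)` yields `"original"`.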
The problem may occur when the reconstruction cycle in progress is the final cycle: the reconstruction completes and the
ghost VD is removed, so when RAID firmware processes the queued requests that were assigned to the ghost VD, that VD
no longer exists. The inconsistency occurs because the cache buffers used in processing these requests are still
assigned to the ghost VD, creating potential cache aliases to the original VD; if there is a mix of read and write
commands, data in these cache line aliases may become stale relative to the data that the write commands updated
on the disk.
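The aliasing effect can be illustrated with a toy model. This is a simplified sketch, not the controller's actual cache design: it assumes cache lines are keyed by (VD, block address) and that writes reach the disk immediately, and all names are hypothetical.

```python
def demo_stale_alias():
    disk = {5: "A"}   # block 5 initially holds "A"
    cache = {}        # assumption: cache lines keyed by (vd, lba)

    def read(vd, lba):
        key = (vd, lba)
        if key not in cache:          # miss: fill the line from disk
            cache[key] = disk[lba]
        return cache[key]

    def write(vd, lba, data):
        cache[(vd, lba)] = data       # update the line for this VD only
        disk[lba] = data              # simplified: data reaches disk at once

    # A queued read that was assigned to the ghost VD is processed after
    # the ghost VD has been removed; its cache line is tagged "ghost".
    read("ghost", 5)
    # A later write to the same block is assigned to the original VD.
    write("original", 5, "B")
    # The ghost-tagged alias was never updated, so it is now stale.
    return read("ghost", 5), read("original", 5), disk[5]
```

In this model `demo_stale_alias()` returns `("A", "B", "B")`: the aliased cache line still holds the old data while the disk and the original VD's cache line hold the new data.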