On these hosts, the most that is guaranteed is that an implementation is thread-safe provided only one thread makes MPI calls. With HPSS MPI-IO, multiple threads will make MPI calls. HPSS MPI-IO attempts to impose thread-safety on these hosts by utilizing a global lock that must be acquired in order to make an MPI call. However, there are known problems with this approach, and the bottom line is that until these hosts provide true thread-safety, the potential for deadlock within an MPI application will exist when using HPSS MPI-IO in conjunction with other MPI operations. See the HPSS Programmer’s Reference Guide, Volume 1, Release 4.5 for more details.
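The fragment below is a minimal sketch of this kind of global-lock serialization, not the actual HPSS MPI-IO source; the mutex and wrapper names are illustrative. It shows why the approach can deadlock: if the wrapped call blocks (for example, waiting on other ranks) while a second thread needs the lock to issue the matching MPI call, neither thread can proceed.

    /* Illustrative only: a global mutex serializing MPI calls made from
     * multiple threads, in the spirit of the approach described above. */
    #include <mpi.h>
    #include <pthread.h>

    static pthread_mutex_t mpi_global_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Every thread must hold the global lock for the duration of an MPI call. */
    static int locked_file_write(MPI_File fh, const void *buf, int count)
    {
        MPI_Status status;
        int rc;

        pthread_mutex_lock(&mpi_global_lock);
        /* If this call blocks (e.g., waiting for other ranks) while another
         * thread on this host needs the lock to make the MPI call being
         * waited for, the application deadlocks. */
        rc = MPI_File_write(fh, (void *)buf, count, MPI_BYTE, &status);
        pthread_mutex_unlock(&mpi_global_lock);
        return rc;
    }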
Files read and written through HPSS MPI-IO can also be accessed through the HPSS Client API, FTP, Parallel FTP, or NFS interfaces. So even though the MPI-IO subsystem does not offer all the migration, purging, and caching operations that are available in HPSS, parallel applications can still do these tasks through the HPSS Client API or other HPSS interfaces.
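As an illustration of this interoperability, the sketch below uses only standard MPI-IO calls to write a file in parallel; the file name, and the assumption that it lands in HPSS-managed storage where it could later be retrieved with, say, Parallel FTP, are hypothetical.

    /* Minimal MPI-IO write using standard MPI-2 I/O calls.  The path below
     * is illustrative and does not reflect any particular HPSS namespace. */
    #include <mpi.h>
    #include <string.h>

    int main(int argc, char *argv[])
    {
        MPI_File   fh;
        MPI_Status status;
        int        rank;
        char       buf[128];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Each rank writes its own 128-byte block at a disjoint offset. */
        memset(buf, 'A' + (rank % 26), sizeof(buf));
        MPI_File_open(MPI_COMM_WORLD, "/hpss/demo/parallel.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_write_at(fh, (MPI_Offset)rank * sizeof(buf), buf,
                          (int)sizeof(buf), MPI_CHAR, &status);
        MPI_File_close(&fh);

        /* The resulting file could then be read back through the Client API,
         * FTP, Parallel FTP, or NFS, as described above. */
        MPI_Finalize();
        return 0;
    }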
The details of the MPI-IO API are described in the HPSS Programmer’s Reference Guide, Volume 1.
2.5.7 DFS
DFS is offered by the Open Software Foundation (now the Open Group) as part of DCE. DFS is a
distributed file system that allows users to access files using normal Unix utilities and system calls, regardless of the file’s location. This transparency is one of the major attractions of DFS. The advantage of DFS over NFS is that it provides greater security and allows files to be shared globally between many sites using a common name space.
HPSS provides two options for controlling how DFS files are managed by HPSS: archived and
mirrored. The archived option gives users the impression of having an infinitely large DFS file
system that performs at near-native DFS speeds. This option is well suited to sites with large
numbers of small files. However, when using this option, the files can only be accessed through DFS
interfaces and cannot be accessed with HPSS utilities, such as parallel FTP. Therefore, the
performance for data transfers is limited to DFS speeds.
The mirrored option gives users the impression of having a single, common (mirrored) name space where objects have the same path names in DFS and HPSS. With this option, large files can be stored quickly on HPSS, then analyzed at a more leisurely pace from DFS. On the other hand, some operations, such as file creates, perform more slowly when this option is used than when the
archived option is used.
HPSS and DFS define disk partitions differently from one another. In HPSS, the option for how files are mirrored or archived is associated with a fileset. Recall that in DFS, multiple filesets may reside on a single aggregate. However, the XDSM implementation provided in DFS generates events on a
per-aggregate basis. Therefore, in DFS this option applies to all filesets on a given aggregate.
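A rough sketch of why the granularity is the whole aggregate appears below, using the generic X/Open XDSM (DMAPI) interface rather than anything DFS- or HPSS-specific; the header name, mount path, and error handling are assumptions that vary by platform. The point is that event dispositions are registered against a file system (aggregate) handle, so every fileset on that aggregate is covered by the same choice.

    /* Illustrative XDSM (DMAPI) usage: dispositions are per file system. */
    #include <dmapi.h>     /* header name and location vary by platform */
    #include <stdio.h>

    int main(void)
    {
        dm_sessid_t   sid;
        void         *fshanp;
        size_t        fshlen;
        dm_eventset_t eventset;

        /* Create a DM session and obtain a handle for the file system
         * (aggregate) that holds the DFS filesets. */
        if (dm_create_session(DM_NO_SESSION, "hpss-dfs-demo", &sid) != 0 ||
            dm_path_to_fshandle("/dfs/aggregate1", &fshanp, &fshlen) != 0) {
            perror("dmapi setup");
            return 1;
        }

        /* Direct read, write, and truncate events to this session.  Because
         * the disposition is set on the file system handle, it applies to
         * every fileset on the aggregate, not to an individual fileset. */
        DMEV_ZERO(eventset);
        DMEV_SET(DM_EVENT_READ, eventset);
        DMEV_SET(DM_EVENT_WRITE, eventset);
        DMEV_SET(DM_EVENT_TRUNCATE, eventset);
        return dm_set_disp(sid, fshanp, fshlen, DM_NO_TOKEN,
                           &eventset, DM_EVENT_MAX);
    }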
To use the DFS/HPSS interface on an aggregate, the aggregate must be on a processor that has
Transarc’s DFS SMT kernel extensions installed. These extensions are available for Sun Solaris and
IBM AIX platforms. Once an aggregate has been set up, end users can access filesets on the
aggregate from any machine that supports DFS client software, including PCs. The wait/retry logic
in DFS client software was modified to account for potential delays caused by staging data from
HPSS. Using a DFS client without this change may result in long delays for some IO requests.
HPSS servers and DFS both use Encina as part of their infrastructure. Since the DFS and HPSS
release cycles for supporting the latest version of Encina may differ significantly, it is recommended that the DFS server run on a different machine from the HPSS servers.