HP Integrated Archive Platform User Guide Version 2.0 Includes information about using the Integrated Archive Platform (IAP) Web UI. For additional user information on Email Archiving software for Microsoft Exchange and IBM Domino, see the HP Email Archiving software for Microsoft Exchange User Guide and HP Email Archiving software for IBM Domino User Guide contained in those products.
Legal and notice information © Copyright 2004-2008 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Contents
About this guide
Intended audience
Prerequisites
Related documentation
Document conventions and symbols
HP technical support
Subscription service
Other web sites
Fuzzy words
Measuring word similarity
Matching word sequences
Simple word sequences
Proximity word sequences
Matching word sequences in attachments
Boolean query expressions
Nested Boolean query expressions
Query expression examples
Figures
1 IAP Web Interface toolbar, page 17
2 Simple Search page, page 20
3 Advanced Search page (email content type), page 21
4 Query Results page (email content type), page 24
5 Query results navigation bar, page 25
6 Save Criteria page
Tables
1 Document conventions
2 EAs applications, page 11
3 Office 2007 supported file extensions and MIME types, page 13
4 Office 2007 supported features, page 14
5 Office 2007 supported properties, page 14
6 Toolbar buttons, IAP Web Interface
About this guide This guide provides information about using the IAP Web Interface. For additional information on using and configuring Email Archiving software for Microsoft Exchange and IBM Domino, see the HP Email Archiving software for Microsoft Exchange User Guide and HP Email Archiving software for IBM Domino User Guide contained in those products. Intended audience This guide is intended for users of the IAP Web UI.
WARNING! Indicates that failure to follow directions could result in bodily harm or death. CAUTION: Indicates that failure to follow directions could result in damage to equipment or data. IMPORTANT: Provides clarifying information or specific instructions. NOTE: Provides additional information. TIP: Provides helpful hints and shortcuts. HP technical support Telephone numbers for worldwide technical support are listed on the HP support web site: http://www.hp.com/support/.
• http://www.hp.com
• http://www.hp.com/go/storage
• http://www.hp.com/service_locator
• http://www.hp.
1 IAP overview This section introduces HP Integrated Archive Platform from a user perspective. IAP is a fault-tolerant, secure system of hardware and software that archives files and email messages for your organization, and lets you search for archived documents. IAP provides the following main functions: • Automatic, active data archiving (email and specific types of documents) that helps your organization meet regulatory requirements.
Understanding searching and document indexing You can search for any documents archived in your repository (or any other repositories to which you have access), whether the documents are email messages or files. When you search for a document, your query is checked against an index of words that is updated each time a document is archived. Indexing the contents of a document involves cataloging the document words to prepare them for later searching.
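The relationship between archiving, indexing, and searching described above can be pictured as a small inverted index. The following Python sketch is illustrative only and is not the IAP implementation: the function names add_document and search, the document IDs, and the decision to lowercase every word are all assumptions made for the example.

```python
import re
from collections import defaultdict


def add_document(index, doc_id, text):
    """Catalog each word of a newly archived document under its ID."""
    # Assumption: words are runs of letters, digits, and underscores,
    # compared without regard to case.
    for word in re.findall(r"\w+", text.lower()):
        index[word].add(doc_id)


def search(index, word):
    """Return the IDs of all documents whose text contains the word."""
    return sorted(index.get(word.lower(), set()))


# The index grows each time a document is archived.
index = defaultdict(set)
add_document(index, "msg-1", "Quarterly budget review")
add_document(index, "msg-2", "Budget approved")

print(search(index, "budget"))
print(search(index, "review"))
```

Because the lookup goes to the index rather than to the documents themselves, a query only finds words that were cataloged when the document was archived.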
An email message that is entirely plain text, not MIME, is indexed. Also, if an email message has been attached to another email message, the attached email message is not indexed. IAP 2.0 provides document indexing support for Microsoft Office 2007. The supported MIME types and extensions are shown in Table 3. Support for features is shown in Table 4. Support for properties is shown in Table 5. Office 2007 documents archived prior to installing IAP 1.6.1 or 2.0 will not be indexed or content searchable.
NOTE: The following items are not yet supported:
• Notes within PowerPoint slides
• Spreadsheet names within Excel
• Some embedded OLE objects
• Certain text within Excel charts
Also, some documents converted to Microsoft Office version 2007 by the Office File converter may not be properly indexed.
Type Advanced Properties: General Advanced Properties: Summary Advanced Properties: Statistics Advanced Properties: Contents Property Microsoft Word, PowerPoint, and Excel Type No Location No Size No MS-DOS name No Created No Modified No Accessed No Attributes No Title Yes Subject Yes Author Yes Manager No Company No Category Yes Keywords Yes Comments Yes Hyperlink base No Template No Created No Modified No Accessed No Printed No Last saved by Yes Revisio
Type                          Property            Microsoft Word, PowerPoint, and Excel
Advanced Properties: Custom   Checked by          Yes
                              Client              Yes
                              Date completed      Yes
                              Department          Yes
                              Destination         Yes
                              Disposition         Yes
                              Division            Yes
                              Document number     Yes
                              Editor              Yes
                              Forward to          Yes
                              Group               Yes
                              Language            Yes
                              Mailstop            Yes
                              Office              Yes
                              Owner               Yes
                              Project             Yes
                              Publisher           Yes
                              Purpose             Yes
                              Received from       Yes
                              Recorded by         Yes
                              Recorded date       Yes
                              Reference           Yes
                              Source              Yes
                              Status              Yes
                              Telephone number    Yes
                              Typis
2 IAP Web Interface
Use this web-based tool to search for documents archived in the system. You can also save and reuse query or search criteria and results. Major topics include:
• Logging in and out, page 17
• Understanding the user interface, page 17
• Common tasks, page 18
• Troubleshooting, page 34
Logging in and out
Before logging in for the first time, see your system administrator for the URL to use and for the list of supported web browsers. (Microsoft Internet Explorer 6.0 and 7.0 are recommended.
Table 6 Toolbar buttons, IAP Web Interface Button Description New Search Click to display the Simple Search page, where you can submit a query. See “Completing simple searches” on page 19. To display the Advanced Search page, point to this button and click Advanced search from the menu. See “Completing advanced searches” on page 20.
Table 7 IAP Web Interface tasks
Task
    Reference
Search for archived documents
    “Completing simple searches” on page 19 and “Completing advanced searches” on page 20
Display or print the query or search results
    “Displaying query or search results” on page 24
Save query or search criteria
    “Saving query or search criteria” on page 26
Save the results of a search
    “Saving query or search results” on page 27
Send the results to your email account
    “Sending query or search results” on page 28
Export the r
1. Click New Search in the toolbar. The Simple Search page is displayed. Figure 2 Simple Search page 2. Search using all of the following fields on the Simple Search page: • Content Type: Use email to search for email message files. You would use document to search the AuditLog repository as described in Searching audit log repositories, or to search for files in a repository such as those migrated using the HP File Archiving software (formerly known as FMA).
1. Point to New Search in the toolbar and click Advanced search from the menu. The Advanced Search page is displayed. Figure 3 Advanced Search page (email content type) NOTE: Figure 3 shows the Advanced Search page for the email content type. The document content type form varies slightly. See Table 8 for an explanation of the differences.
2. Search using the following fields on the Advanced Search page: • Content Type: Use email to search for email message files. You would use document to search the AuditLog repository as described in Searching audit log repositories, or to search for files in a repository such as those migrated using the HP File Archiving software (formerly known as FMA).
Query Field
    Matches (in the Document)
Folder Name
    The Outlook Exchange folder to which the email belongs. This field supports wildcard searches, as other fields do. (Folder Name appears as a field only if the system administrator has enabled it in the domain.jcml file.)
    Example queries:
    2006 searches all leaf folders with the name “2006,” for example, \Inbox\2006 and \Inbox\test\2006, but not \Inbox.
    \Inbox\2006 searches only folder 2006 in path \Inbox\2006.
Query Field
    Matches (in the Document)
Extension
    File extension. Example: doc for a Microsoft Word file.
Title
    Title of the document. Only some files have associated titles. For example, to see the title of a Word document, select File > Properties in Word. The Title field is shown on the Summary panel of the displayed Properties dialog box.
Author
    Author of the document. Only some files have associated authors. For example, to see the author of a Word document, select File > Properties in Word.
2. From the Query Results page, complete any of the following tasks: • To display the contents of an email or document in the viewing pane, click the item from the list once. Clicking the item twice will instead open the preview pane as a new window. • To display a different group of 50 results, click the different symbols in the query results navigation bar. See Query results navigation bar on page 25 for more information.
Table 9 Query results navigation bar Item Description bars: From left to right, the five bars represent subsequent pages of 50 results (maximum). Click a bar to display its page of results. The dark bar represents the currently displayed results. Note: To see just which documents a given bar represents, hold the mouse pointer over it momentarily to display a tooltip.
2. From the Query Results page, click More Options, and then click Save Current Search Criteria. Or right-click and select Save criteria. The Save Criteria page is displayed. Figure 6 Save Criteria page 3. Enter the name of the criteria you are saving in the Save Query Criteria as field. To erase text entered in the Save Query Criteria as field, click Clear. NOTE: Special characters @ $ % ^ & * # ( ) [ ] / \ { + } ‘ ~ = | are not allowed. 4. Click Save Now. 5.
To save results: 1. Display the Query Results page by completing one of the following tasks: • Submit a simple (see “Completing simple searches” on page 19) or advanced search (see “Completing advanced searches” on page 20). • Submit a search from previously saved criteria (see “Accessing saved criteria” on page 29). 2. From the Query Results page, click More Options, and then click Save Current Results. Or right-click and select Save results. The Save Results page is displayed.
1. Display the Query Results page by completing one of the following tasks: • Submit a simple (see “Completing simple searches” on page 19) or advanced search (see “Completing advanced searches” on page 20). • Submit a search from previously saved criteria (see “Accessing saved criteria” on page 29). • Access previously saved results (see “Accessing saved results” on page 29). 2. From the Query Results page, select the check box next to each item you want to send.
1. Click Query Manager in the toolbar. The default Query Manager page displays all saved results. You can also access this view by clicking Saved Results on the Query Manager page.
Figure 9 Saved Results view, Query Manager page
2. Complete any of the following tasks:
• To display the results, click Reload. The Query Results page is displayed.
• To copy the saved results to the quarantine repository, click Start. A completed message appears in the row when the contents are in the quarantine repository.
NOTE: Deleting a quarantine repository does not delete the items on the IAP. The actual items remain on the IAP according to the retention period set by your administrator. To delete a quarantine repository: 1. Click Query Manager in the toolbar to access the saved results. 2. Click Delete in the Quarantine column to remove access to that quarantine repository and delete its contents during the next retention cycle. Searching audit log repositories Audit log repositories are not available to all users.
Figure 11 Advanced Search page (document content type) 2. From the Timeframe list, select the time period to search. This field searches the audit logs stored to the IAP during a specified time period. Advanced searches only: As an alternative to the By Timeframe field, you can define a time period to search by specifying the Start and end (To) dates. For example, to search for documents dated between March 8, 2003 and March 23, 2003, enter 03/08/2003 in the Start field and 03/23/2003 in the To field. 3.
4. In the Search for field, enter one of the following criteria to search for a specific user or action:
• User ID: Enter the login name of the user, such as jdoe.
• First Name: Enter the first name from the LDAP directory for the user, such as John.
• Last Name: Enter the last name from the LDAP directory for the user, such as Doe.
• Logged actions: Enter one or more of the actions listed in the following table. Or, leave this field blank to search for all logged actions.
6. Click Find Now to start the search. The Query Results page displays the following information:
• User: User for whom the audit log was created.
• Session Start: Start time of the user session.
• Session End: End time of the user session.
• Size: Size of the session audit log file.
• Server: Server (HTTP portal) on which the audit log session was captured.
• Date: Date the audit log file was archived.
7. To display the contents of an audit log file in the viewing pane, click the item from the list.
Unable to display saved results Search results are saved for two weeks and then deleted. If you save the results of a query, but the retention settings delete the files before the end of the two weeks, the results still appear in the application, but cannot be reloaded from the saved search results. Clicking the saved results displays an error because the application cannot find the saved results on IAP.
3 Query expression syntax and matching Query expression syntax and matching describes the IAP Web Interface syntax to use to search and retrieve archived documents (files or email messages), and explains how queries are matched against documents.
Word characters and separators Word characters include all uppercase and lowercase letters, digits, and the following additional characters: • _ (underscore) • # (number/pound/hash sign) • & (ampersand) All other characters are separators (except in queries, wildcards ? and *, and special query characters ~, ", -, and !). However, && by itself is not a word. It is a Boolean operator. When combined with at least one more word character, && can be part of a word. For example, a&&b is a word.
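The word-character rule above can be sketched as a small tokenizer for document text. This is a minimal illustration under stated assumptions, not the IAP indexer: the name split_words is hypothetical, Python's \w class (letters, digits, and underscore in any script) stands in for "all uppercase and lowercase letters, digits, and underscore," and dropping a bare && is one reading of the statement that && by itself is not a word.

```python
import re

# Word characters: letters, digits, underscore (all covered by \w),
# plus '#' and '&'. Every other character acts as a separator.
WORD_RUN = re.compile(r"[\w#&]+")


def split_words(text):
    """Return the runs of word characters in text, skipping a bare '&&'."""
    return [word for word in WORD_RUN.findall(text) if word != "&&"]


print(split_words("a&&b, x_1 #42 (R&D)"))
print(split_words("peas && carrots"))
print(split_words("jdoe@example.com"))
```

The sketch covers document text only. In queries, the wildcard and special query characters listed above are treated differently and are not handled here.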
Table 12 Supported character sets
Supported character set    Description
ISO-8859-1                 Western European, extended ASCII
WINDOWS-1252               (Code pages supported by Windows) Latin 1
US-ASCII                   7-bit American Standard Code for Information Interchange
UTF-8                      Universal (all languages)
ISO-8859-2                 Eastern European
KOI8-R                     Cyrillic (Russian and Bulgarian)
ISO-8859-5                 Cyrillic (Bulgarian, Belarusian, Russian)
WINDOWS-1251               Cyrillic
WINDOWS-1254               (Code pages supported by Windows) Turkish
ISO-8859-9                 Turkish
GB1803
Matching similar words Topics include: • Fuzzy words, page 40 • Measuring word similarity, page 40 Fuzzy words You can search for document words that are textually similar to a given literal query word (that is, one containing no wildcards). To do this, append a tilde (~) character to the word, creating a fuzzy word. For example, the fuzzy word define~ matches the similar words defined and definite, but does not match defining, definition, indefinite, or pine. It also matches define itself.
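One common way to measure how textually similar two words are is the Levenshtein (edit) distance: the smallest number of single-character insertions, deletions, or substitutions that turns one word into the other. The Python sketch below is an illustration under assumptions, not a description of how IAP scores fuzzy words. In particular, the names edit_distance and is_fuzzy_match and the cutoff MAX_DISTANCE = 2 are hypothetical; the cutoff was chosen only because it reproduces the define~ examples above.

```python
def edit_distance(a, b):
    """Levenshtein distance between two words, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# Hypothetical cutoff, picked only to match the examples in this section.
MAX_DISTANCE = 2


def is_fuzzy_match(query_word, document_word):
    """True if the document word is within the assumed cutoff."""
    return edit_distance(query_word, document_word) <= MAX_DISTANCE


for word in ["define", "defined", "definite", "defining", "pine"]:
    print(word, edit_distance("define", word), is_fuzzy_match("define", word))
```

Under this assumed cutoff, define, defined, and definite are accepted (distances 0, 1, and 2), while defining and pine (distance 3 each) are rejected.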
Proximity word sequences You can use simple word sequences to search for words separated by separators but not by other words. To search for document words that are in an ordered sequence, but might be separated by other words, use a proximity word sequence. To write a proximity word sequence, use the same syntax as a simple word sequence, but append a tilde (~) character to the second quote, and follow that with a numeric proximity value.
Table 13 Excel spreadsheet United States Presidents named John John Adams 1797-1801 John Quincy Adams 1825-1829 John Fitzgerald Kennedy 1961-1963 John Tyler 1841-1845 The specific order in which the text in the cells is stored internally depends on: • The version of the product, for example Excel or Quattro Pro, used to generate the spreadsheet • The insertion order for the spreadsheet text For the spreadsheet above, assuming the cell text for names were entered in displayed order from top left
PDF documents PDF documents are another case where the internal text representation can vary widely from the visible presentation in PDF readers. Some issues that can arise: • Text sequences can appear out of order on the same page depending on how the page was composed. • Text can appear doubled or can have spacing inserted into or removed from the internal representation to assist some specific visual presentation.
Boolean operators must be surrounded by one or more separators, typically white space. For example, the query peas&&carrots is not equivalent to the query peas && carrots; peas&&carrots is a single word (& is a word character). Negation operators (- and !) are exceptions to this rule. They must be preceded by a separator, but they need not be followed by a separator. For example, carrot-a6 is a single query word, but carrot -a6, like carrot (- a6), is equivalent to the Boolean expression carrot AND (NOT a6).
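The separator rule for operators can be sketched as a token classifier. This Python example is a simplified illustration, not the IAP query parser: the name classify_query_tokens and the ("OP", ...) and ("WORD", ...) labels are hypothetical, only white space is treated as a separator, and parentheses, quoted sequences, wildcards, and fuzzy words are deliberately not handled.

```python
# Operators are recognized only when they stand alone between separators.
OPERATORS = {"AND": "AND", "&&": "AND", "OR": "OR", "NOT": "NOT"}


def classify_query_tokens(query):
    """Label each white-space-delimited token as an operator or a word."""
    tokens = []
    for raw in query.split():
        if raw in OPERATORS:
            tokens.append(("OP", OPERATORS[raw]))
        elif raw[0] in "-!":
            # A leading '-' or '!' negates whatever follows it,
            # with or without a separator in between.
            tokens.append(("OP", "NOT"))
            if len(raw) > 1:
                tokens.append(("WORD", raw[1:]))
        else:
            # Includes peas&&carrots and carrot-a6: no leading separator,
            # so the embedded characters do not act as operators.
            tokens.append(("WORD", raw))
    return tokens


print(classify_query_tokens("peas&&carrots"))
print(classify_query_tokens("peas && carrots"))
print(classify_query_tokens("carrot -a6"))
```

The sketch only labels tokens. It does not check whether the resulting expression is legal or combine adjacent words with the implicit AND.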
Table 15 Query expression examples
Query expression
    Finds documents with ...
peace OR quiet
    Either peace or quiet, or both, in either order.
peace quiet
peace AND quiet
peace && quiet
    Both peace and quiet, in either order.
peace&&quiet
    The single word peace&&quiet.
peace or quiet
    The three words peace, or, and quiet, in any order. or is a word. The OR operator must be uppercase.
not quiet
    The words not and quiet. The NOT operator must be uppercase.
NOT quiet
    Illegal.
Index Symbols Boolean queries characters, 37 expressions, 43 nested, 44 definition access control list (ACL), 11 archiving, 11 document, 11 IAP, 11 indexing documents, 12 matching, Boolean query expression , 43 repository, 11 routing rule, 11 rule, routing, 11 deleting quarantine repositories, 30 query or search results, 30 saved criteria, 29 digits, 38 displaying Message ID, 23 results, 24 saved criteria, 29 saved results, 29 document conventions, 7 definition, 11 prerequisites, 7 related documentation,
H help obtaining, 8 HP storage web site, 8 Subscriber’s choice web site, 8 technical support, 8 I IAP definition, 11 IAP Web Interface advanced searching, 20 passwords, 34 Query Results page, 24 requirements, 17 searching, 19 toolbar, 17 troubleshooting, 34 implicit Boolean connective (AND), 43 indexed documents types, 12 indexing documents, 12 Integrated Archive Platform See IAP L languages, query expressions, 38, 38 letters, 38 Levenshtein distance, 40 list access control, definition, 11 literal word
S saving query or search criteria, 26 query results, 27 search criteria deleting, 29 displaying saved criteria, 29 saving, 26 Search for field IAP Web Interface, 18 search results deleting, 30 displaying, 24 displaying saved results, 29 exporting, 29 quarantine repository, 30 saving, 27 sending, 28 searching IAP Web Interface, 19, 20 sending query or search results, 28 separators characters, 37 matching word sequences, 40 sequences, matching, 40 similarity, matching words, 40 simple word sequences, 40 spre