User Guide

Chapter 9: Indexing Collections with Verity Spider
-indmimeexclude
Syntax:
-indmimeexclude mime_1 [mime_n] ...
Specifies that documents whose MIME types match the expressions are followed but not indexed.
In Windows, include double-quotation marks around the argument to protect the special
characters, such as the asterisk (*). On UNIX, use single-quotation marks. This is only required
when you run the indexing job from a command line. Quotation marks are not necessary within
a command file (the -cmdfile option).
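The quoting rule above can be illustrated with a short Python sketch. It only builds the text of a command line and does not run Verity Spider. The helper name quote_for_platform is hypothetical, and the option values (collname, style_dir, and the starting URL) are placeholders borrowed from the example in this chapter. On the UNIX side it relies on the standard shlex module, which wraps an argument in single-quotation marks when it contains a shell-special character such as the asterisk.

```python
import shlex

def quote_for_platform(arg: str, windows: bool) -> str:
    """Quote one MIME expression for a command line (illustrative helper)."""
    if windows:
        # Windows convention described above: double-quotation marks.
        return f'"{arg}"'
    # UNIX convention: shlex.quote adds single-quotation marks when needed.
    return shlex.quote(arg)

# Placeholder values; only the last argument needs protection.
base = ["vspider", "-collection", "collname", "-style", "style_dir",
        "-start", "http://web.verity.com", "-indmimeexclude"]

unix_command = " ".join(base + [quote_for_platform("text/*", windows=False)])
windows_command = " ".join(base + [quote_for_platform("text/*", windows=True)])

print(unix_command)
print(windows_command)
```

Within a command file the expression would be written bare, as text/*, because no shell interprets it.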
Use this option to gather some documents, such as HTML tables of contents, to gain access to
other documents for indexing. The -mimeexclude option, on the other hand, prevents specified
documents from being followed at all. For the mime variable, you can include the asterisk (*)
wildcard for text strings; for example:
'text/*'
You cannot use the question mark (?) wildcard, and specifying the -regexp option does not
enable regular expressions for this option.
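The wildcard rule can be pictured with the following Python sketch. It is a conceptual model only, not Verity Spider's implementation. The function name mime_matches is hypothetical, and the sketch assumes that matching is case-sensitive and applies to the whole MIME type. The asterisk matches any run of characters; every other character, including the question mark, is treated literally.

```python
import re

def mime_matches(expression: str, mime_type: str) -> bool:
    """Return True if mime_type matches expression, with '*' as the only wildcard."""
    # Escape each literal piece so that characters such as '?' stay literal,
    # then join the pieces with '.*' to stand in for each asterisk.
    regex = ".*".join(re.escape(part) for part in expression.split("*"))
    return re.fullmatch(regex, mime_type) is not None

print(mime_matches("text/*", "text/html"))           # True
print(mime_matches("text/*", "application/msword"))  # False
print(mime_matches("text/htm?", "text/html"))        # False: '?' is not a wildcard
```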
-indmimeinclude
Syntax:
-indmimeinclude mime_1 [mime_n] ...
Specifies that only documents whose MIME types match the expressions are indexed. Documents
that do not match can still be followed to reach other documents.
The -mimeinclude option cannot index the documents you want if the starting URL itself does not
match the expressions, because the starting URL is then not followed. For the mime variable, you
can include the asterisk (*) wildcard for text strings; for
example:
'text/*'
In Windows, include double-quotation marks around the argument to protect the special
character (*). On UNIX, use single-quotation marks. This is only required when you run the
indexing job from a command line. Quotation marks are not necessary within a command file
(the -cmdfile option).
You cannot use the question mark (?) wildcard, and specifying the -regexp option does not
enable regular expressions for this option.
Example
If you want to index all Word documents at http://web.verity.com, you cannot use:
vspider -collection collname -style style_dir -start
http://web.verity.com -mimeinclude 'application/msword'
This is because the starting point does not match the -mimeinclude criteria. You can use the
-indmimeinclude option to follow all documents (unless you have specified any of the exclude
options) and index only those documents that match your criteria. Replace the -mimeinclude
option with the -indmimeinclude option in the preceding example.
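The difference between the two options can be traced with a small Python model of a crawl. This is a sketch under stated assumptions, not Verity Spider's actual logic. The site contents, the document names, and the crawl and mime_matches functions are all hypothetical. The model assumes that -mimeinclude skips a non-matching document entirely, while -indmimeinclude and -indmimeexclude affect only whether a followed document is indexed.

```python
import re
from collections import deque

def mime_matches(expression, mime_type):
    # '*' is the only wildcard; all other characters are literal.
    regex = ".*".join(re.escape(part) for part in expression.split("*"))
    return re.fullmatch(regex, mime_type) is not None

def matches_any(expressions, mime_type):
    return any(mime_matches(e, mime_type) for e in expressions)

# Hypothetical site: document name -> (MIME type, links found in the document).
SITE = {
    "start.html": ("text/html", ["report.doc", "toc.html"]),
    "toc.html": ("text/html", ["notes.doc"]),
    "report.doc": ("application/msword", []),
    "notes.doc": ("application/msword", []),
}

def crawl(site, start, mimeinclude=None, indmimeinclude=None, indmimeexclude=None):
    """Return the documents that would be indexed, in breadth-first order."""
    indexed, seen, queue = [], set(), deque([start])
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        mime_type, links = site[url]
        # Modeled -mimeinclude: a non-matching document is not followed at all.
        if mimeinclude and not matches_any(mimeinclude, mime_type):
            continue
        # Modeled -indmimeinclude / -indmimeexclude: indexing decision only.
        index_it = True
        if indmimeinclude and not matches_any(indmimeinclude, mime_type):
            index_it = False
        if indmimeexclude and matches_any(indmimeexclude, mime_type):
            index_it = False
        if index_it:
            indexed.append(url)
        queue.extend(links)  # the document is followed either way
    return indexed

print(crawl(SITE, "start.html", mimeinclude=["application/msword"]))
print(crawl(SITE, "start.html", indmimeinclude=["application/msword"]))
```

In this model the first call indexes nothing, because the HTML starting point fails the -mimeinclude test and its links are never seen. The second call follows both HTML pages and indexes only the two Word documents.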