On Windows NT, you should include double quotes around the argument to protect
the special characters such as (*). On UNIX, you should use single quotes. Note that
this is only required when you run the indexing job from a command line. Quotes are
not necessary within a command file (-cmdfile).
To use regular expressions, also specify the -regexp option.
Keep in mind that if your starting points do not match the specified -include
expressions, nothing will be indexed, because the -include option prevents Verity
Spider from even following anything which does not match the specified expressions.
You may want to use -indinclude instead. Where -include stops Verity Spider at any
non-matching document, -indinclude allows Verity Spider to follow documents that do
not match the specified expressions, while indexing only those that do.
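The difference between the two options can be sketched in Python. This is only an illustrative model, not Verity Spider's implementation: the function name decide, the use of fnmatch-style matching against the whole URL, and the example.com URLs are all assumptions made for the example.

```python
from fnmatch import fnmatchcase


def matches_any(url, patterns):
    """Return True if the URL matches at least one wildcard pattern."""
    return any(fnmatchcase(url, p) for p in patterns)


def decide(url, include=(), indinclude=()):
    """Return a (follow, index) pair for one URL.

    Hypothetical model: -include gates following and indexing alike,
    while -indinclude gates indexing only.
    """
    if include and not matches_any(url, include):
        # -include: a non-matching document is not even followed.
        return (False, False)
    # -indinclude: everything is followed, only matches are indexed.
    index = (not indinclude) or matches_any(url, indinclude)
    return (True, index)


toc = "http://example.com/toc.html"
doc = "http://example.com/docs/a.html"

print(decide(toc, include=["*/docs/*"]))     # the table of contents is dropped
print(decide(toc, indinclude=["*/docs/*"]))  # followed, but not indexed
print(decide(doc, indinclude=["*/docs/*"]))  # followed and indexed
```

In this model, a table-of-contents page that does not match the expressions is lost entirely under -include, but under -indinclude it is still followed so that the documents it links to can be reached.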
For document types, use -mimeinclude instead. For example, specify
-mimeinclude text/html rather than -include *.htm.
Note
When specifying a URL, you must use a full, absolute path in the same format as it
appears in the HTML hyperlink. If the link is relative, you must change it to absolute
to use it with -include.
See also -regexp.
-indexclude
Syntax: -indexclude exp_1 [exp_n] ...
Specifies that the files and paths in URLs which match the expressions are not
indexed. They are, however, still followed. If you use backslashes, you must double
them so they are properly escaped. For example:
C:\\test\\docs\\path
You can use wildcard expressions, where the asterisk ( * ) is for text strings and the
question mark ( ? ) is for single characters. For example:
'/my_doc*/year199?'
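To get a feel for what such an expression selects, the wildcard rules can be approximated with Python's fnmatch module. This is a rough sketch only: it assumes that Verity Spider's asterisk and question mark behave like shell-style wildcards applied to the whole path, which this guide does not confirm, and the candidate paths are invented for the example.

```python
from fnmatch import fnmatchcase

# The wildcard expression from the example above.
pattern = "/my_doc*/year199?"

# Hypothetical paths, made up for illustration.
candidates = [
    "/my_docs/year1997",        # '*' covers "s", '?' covers "7"
    "/my_doc/year1990",         # '*' may cover an empty string
    "/my_documents/year19990",  # two characters after "year199"
    "/other/year1997",          # does not start with "/my_doc"
]

# Keep only the paths that the expression matches in full.
matched = [path for path in candidates if fnmatchcase(path, pattern)]

for path in matched:
    print(path)
```

Under these assumptions, only the first two paths match: the question mark stands for exactly one character, so a five-digit year is rejected.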
On Windows NT, you should include double quotes around the argument to protect
the special characters such as (*). On UNIX, you should use single quotes. Note that
this is only required when you run the indexing job from a command line. Quotes are
not necessary within a command file (-cmdfile).
To use regular expressions, also specify the -regexp option.
Use this option when some documents, such as HTML tables of contents, must be
gathered only to gain access to other documents for indexing, and should not
themselves appear in the index.
Where the -exclude option prevents Verity Spider from even following anything
which matches the specified expressions, -indexclude allows Verity Spider to follow
anything while only skipping that which matches the specified expressions.
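The contrast between -exclude and -indexclude can be modeled in a few lines of Python. This is a hedged sketch rather than a description of Verity Spider's internals: the function name crawl_action, the returned labels, the fnmatch-style matching against the whole URL, and the example.com URLs are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def crawl_action(url, exclude=(), indexclude=()):
    """Classify one URL under a hypothetical model of the two options."""
    if any(fnmatchcase(url, p) for p in exclude):
        # -exclude: the document is neither followed nor indexed.
        return "skip"
    if any(fnmatchcase(url, p) for p in indexclude):
        # -indexclude: links are followed, the document is not indexed.
        return "follow only"
    return "follow and index"


toc = "http://example.com/toc/index.html"
report = "http://example.com/reports/q1.html"

print(crawl_action(toc, exclude=["*/toc/*"]))        # skip
print(crawl_action(toc, indexclude=["*/toc/*"]))     # follow only
print(crawl_action(report, indexclude=["*/toc/*"]))  # follow and index
```

In this model, excluding the table of contents with -exclude would also cut off every document reachable only through it, whereas -indexclude keeps those documents reachable.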
For document types, use -indmimeexclude instead.