CONFIGURING AND ADMINISTERING COLDFUSION 9
Indexing Collections with Verity Spider
Last updated 2/21/2012
'/my_doc*/year199?'
In Windows, enclose the argument in double quotation marks to protect special characters, such as the
asterisk (*). On UNIX, use single quotation marks. Quotation marks are required only when you run the indexing
job from a command line; they are not necessary within a command file (the -cmdfile option).
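To see why the quotation marks matter on UNIX, the following minimal sketch (assuming a POSIX shell) prints the expression exactly as a program such as vspider would receive it. The variable name pattern is only illustrative and is not part of Verity Spider.

```shell
# Single quotes stop the shell from expanding * and ? before the
# program sees them.
pattern='/my_doc*/year199?'

# Print the value as one literal argument. The double quotes around the
# variable keep the shell from expanding the wildcards at this point too.
printf '%s\n' "$pattern"
```

If the quotes were omitted and matching paths existed, the shell would pass the expanded file names instead of the expression itself.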
To use regular expressions, also specify the -regexp option.
Whereas the -include option prevents Verity Spider from following anything that does not match the specified
expressions, the -indinclude option allows Verity Spider to follow anything while indexing only what matches
the specified expressions.
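The difference between the two options can be modeled with a short shell sketch. This is only an illustration of the rule described above, not how Verity Spider is implemented: the classify function is hypothetical, it uses the shell's own wildcard matching (which may differ in detail from Verity Spider's), and the second URL is an invented example.

```shell
# Model only: report what would happen to a URL under each option.
# Usage: classify OPTION PATTERN URL
classify() {
  option=$1
  pattern=$2
  url=$3
  case $url in
    # The unquoted variable is treated as a wildcard pattern here.
    $pattern) echo "follow and index"; return ;;
  esac
  # The URL does not match the expression.
  if [ "$option" = "-indinclude" ]; then
    echo "follow, do not index"
  else
    echo "skip"
  fi
}

classify -include '*search*' http://web.verity.com
classify -indinclude '*search*' http://web.verity.com
classify -indinclude '*search*' http://web.verity.com/search/index.html
```

Under this model, a starting point that does not match the expression is skipped with -include but is still followed with -indinclude.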
Example
If you want to index all documents that include "search" in the URL at http://web.verity.com, you cannot use the
following:
vspider -collection collname -start http://web.verity.com -include '*search*'
This is because the starting point does not match the -include option criteria. Instead, use the -indinclude option
to follow all documents (unless you have specified any of the exclude options) and index only those documents that
match your criteria. Replace the -include option with the -indinclude option in the preceding example.
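With that replacement, the command line from the example would look like the one assembled in the following sketch. It is written as a dry run for a POSIX shell: it builds the argument list and prints it, one argument per line, rather than starting vspider, so the quoting can be checked first. The collection name collname is the placeholder from the example.

```shell
# Dry run: build the argument list and print it instead of starting vspider.
set -- vspider -collection collname -start http://web.verity.com -indinclude '*search*'

# Each argument appears on its own line; the expression should be
# printed as *search*, not as expanded file names.
printf '%s\n' "$@"
```

To run the job itself, enter the same words directly at the command line, starting with vspider.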
Note: When specifying a URL, use full, absolute paths in the same format that appears in the HTML hypertext link.
If the link is relative, change it to absolute to use it with the -indinclude option.
See also
“-regexp” on page 128.
-indmimeexclude
Syntax
-indmimeexclude mime_1 [mime_n] ...
Specifies that documents whose MIME types match the specified expressions are followed but not indexed.
In Windows, enclose the argument in double quotation marks to protect special characters, such as the
asterisk (*). On UNIX, use single quotation marks. Quotation marks are required only when you run the indexing
job from a command line; they are not necessary within a command file (the -cmdfile option).
Use this option to gather some documents, such as HTML tables of contents, only to gain access to other documents
for indexing. The -mimeexclude option, on the other hand, prevents the specified documents from being followed at
all. For the MIME variable, you can include the asterisk (*) wildcard for text strings; for example:
'text/*'
You cannot use the question mark (?) wildcard, and you cannot use regular expressions, even if you specify the
-regexp option.
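The contrast between -indmimeexclude and -mimeexclude can be modeled in the same way. The sketch below is only an illustration of the rule as described here: the mime_action function is hypothetical, the MIME types passed to it are sample values, and the shell's wildcard matching stands in for Verity Spider's own.

```shell
# Model only: report what an exclude option does to one MIME type.
# Usage: mime_action OPTION PATTERN MIME_TYPE
mime_action() {
  option=$1
  pattern=$2
  mime=$3
  case $mime in
    # The unquoted variable is treated as a wildcard pattern here.
    $pattern)
      if [ "$option" = "-indmimeexclude" ]; then
        echo "follow, do not index"
      else
        echo "do not follow"
      fi
      return ;;
  esac
  # The MIME type does not match the expression.
  echo "not affected by this option"
}

mime_action -mimeexclude 'text/*' text/html
mime_action -indmimeexclude 'text/*' text/html
mime_action -indmimeexclude 'text/*' application/pdf
```

In this model, a text/html table of contents is still followed under -indmimeexclude, so the documents it links to remain reachable.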
-indmimeinclude
Syntax
-indmimeinclude mime_1 [mime_n] ...
Specifies that only those MIME types that match the expressions are followed and indexed.