System information
125
CONFIGURING AND ADMINISTERING COLDFUSION 9
Indexing Collections with Verity Spider
Last updated 2/21/2012
Specifies the maximum amount of memory, in kilobytes, used by each indexing thread. Specify the number of threads
with the -indexers option.
By default, each indexing thread uses as much memory as is available from the system.
-maxnumdoc
Syntax
-maxnumdoc num_docs
Specifies the maximum number of documents to download or submit for indexing. The value for num_docs does not
necessarily correspond to the number of documents indexed. The following factors affect the actual number:
• Whether the value of num_docs falls within a block of documents dictated by the -submitsize option. If it does,
the entire block of documents must be processed.
• Whether documents retrieved are correctly indexed, because they can be invalid or corrupt.
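The effect of the first factor can be illustrated with a short calculation. The following Python sketch is not part of Verity Spider. It assumes that a partially filled block is always completed, so the effective limit is num_docs rounded up to the next multiple of the -submitsize value. The function name and the sample values are hypothetical.

```python
def max_docs_processed(num_docs: int, submit_size: int) -> int:
    """Estimate the upper bound on documents processed for a -maxnumdoc limit.

    Assumption: when num_docs falls inside a block of submit_size documents,
    the whole block is processed, so the limit rounds up to a block boundary.
    """
    if num_docs < 0 or submit_size <= 0:
        raise ValueError("num_docs must be >= 0 and submit_size must be > 0")
    # Integer ceiling division, then scale back up to whole blocks.
    blocks = -(-num_docs // submit_size)
    return blocks * submit_size


if __name__ == "__main__":
    # A limit of 250 with blocks of 128 falls inside the second block.
    print(max_docs_processed(250, 128))
```

Under this assumption the result is only an upper bound, because the second factor (invalid or corrupt documents) can still reduce the number of documents actually indexed.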
-mimemap
Syntax
-mimemap path_and_filename
Specifies a control file (plain ASCII text) that maps filename extensions to MIME types. This lets you define custom
associations and override the defaults.
The following is the format for the control file:
#file_ext_no_dot mime-type
abc application/word
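To check a control file before passing it to -mimemap, you can read it with a small script. The following Python sketch is not part of Verity Spider and only illustrates the two-column format shown above. It assumes that blank lines and lines beginning with # are ignored and that the two columns are separated by white space. The function name parse_mimemap is hypothetical.

```python
def parse_mimemap(text: str) -> dict[str, str]:
    """Parse control-file text into a {extension: mime_type} mapping."""
    mapping: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' lines carry no mapping.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"expected 'ext mime-type', got: {raw!r}")
        ext, mime = parts
        # The format calls for the extension without a leading dot.
        if ext.startswith("."):
            raise ValueError(f"extension must not start with a dot: {ext!r}")
        mapping[ext] = mime
    return mapping


SAMPLE = """\
#file_ext_no_dot mime-type
abc application/word
"""

if __name__ == "__main__":
    print(parse_mimemap(SAMPLE))
```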
-nocache
Type
Web crawling only
Used with the -noindex or -nosubmit options, this option disables the caching of files during website indexing. This
reduces the demand on your disk space.
Normally, Verity Spider downloads URLs, writes them to a bulk insert file, and downloads the documents themselves.
During indexing, once the -submitsize value has been reached, the cached files are indexed and then deleted. If you
use the -noindex option, the bulk insert file is submitted but not processed by Verity Spider, so the documents are
not deleted until indexing occurs. Indexing is then usually done with mkvdk or collsvc, or by running Verity Spider
again with the -processbif option.
By using the -nocache option with the -noindex or -nosubmit option, you avoid storing files locally. Files are
downloaded only when indexing actually occurs.
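If you build Verity Spider option lists in a script, you can check this pairing before running the job. The following Python sketch is a hypothetical helper, not part of Verity Spider. It only tests whether -nocache appears together with -noindex or -nosubmit, and makes no assumption about how Verity Spider itself reacts to an unpaired -nocache option.

```python
def nocache_is_paired(args: list[str]) -> bool:
    """Return True if -nocache is absent, or present with -noindex or -nosubmit."""
    if "-nocache" not in args:
        return True
    return "-noindex" in args or "-nosubmit" in args


if __name__ == "__main__":
    print(nocache_is_paired(["-nocache", "-noindex"]))
    print(nocache_is_paired(["-nocache"]))
```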
See also
“-noindex” on page 126.