System information
CONFIGURING AND ADMINISTERING COLDFUSION 9
Indexing Collections with Verity Spider
Last updated 2/21/2012
vspider -initialize -collection coll [options]
Here, -initialize stands for either -start or -refresh (use -refresh when starting points have changed), -collection is required and provides a target for Verity Spider, and [options] can be almost any combination of the options described later.
For example:
c:\coldfusion9\verity\k2\_nti40\bin\vspider -common c:\coldfusion9\verity\k2\common
-collection c:\new -start http://localhost -indinclude *
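As a rough illustration of the syntax above, the following Python sketch assembles the same argument list as the example so that it could be handed to subprocess.run. The helper name build_vspider_args and its parameters are hypothetical; only the option names (-common, -collection, -start, -indinclude) and the paths come from the example. The sketch covers only the -start form, not -refresh.

```python
def build_vspider_args(executable, common_dir, collection, start_points, extra_options=()):
    """Assemble an argument list of the form:
    vspider -common <dir> -collection <coll> -start <url> ... [options]

    Hypothetical helper for illustration; it is not part of Verity Spider.
    """
    if not start_points:
        raise ValueError("at least one starting point is required")
    args = [executable, "-common", common_dir, "-collection", collection]
    # Emit one -start instance per starting point.
    for url in start_points:
        args += ["-start", url]
    # Append any further options unchanged.
    args += list(extra_options)
    return args


# Values taken from the example command above.
args = build_vspider_args(
    r"c:\coldfusion9\verity\k2\_nti40\bin\vspider",
    r"c:\coldfusion9\verity\k2\common",
    r"c:\new",
    ["http://localhost"],
    extra_options=["-indinclude", "*"],
)
print(" ".join(args))

# To actually run the task (requires a Verity Spider installation):
#   import subprocess
#   subprocess.run(args, check=True)
```

Passing the arguments as a list means that no command shell interprets them, so a value such as * should reach the program unchanged.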
Dependencies exist for other options, depending on the nature of the indexing task. The following are some examples:
• To build a new collection, use -style.
• To control how Verity Spider operates, including which documents it indexes, use the relevant options from the command-line option reference.
If you do not run the Verity Spider executable from its default installation directory, include that directory in your path, because the executable depends on other files to run properly.
To use the vspider command on UNIX and Linux, the directory that contains the libvdk30.so file must be in your
LD_LIBRARY_PATH variable. In the server configuration, this directory is cf_root/verity/k2/platform/bin; in the
multiserver configuration, this directory is jrun_root/servers/cfusion/WEB-INF/cfusion/verity/k2/platform/bin. For
example, in the server configuration on Linux, this directory is cf_root/verity/k2/_ilnx21/bin.
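One way to satisfy this requirement when launching vspider from a script is to prepare the environment first. The Python sketch below prepends the Verity bin directory to LD_LIBRARY_PATH in a copy of the environment. The function name env_with_verity_libs and the value /opt/coldfusion9 for cf_root are assumptions made for illustration; only the relative path verity/k2/_ilnx21/bin comes from the text above.

```python
import os


def env_with_verity_libs(bin_dir, base_env=None):
    """Return a copy of the environment with bin_dir first in LD_LIBRARY_PATH.

    Hypothetical helper for illustration; it is not part of ColdFusion.
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Split the existing value and drop empty entries.
    parts = [p for p in current.split(os.pathsep) if p]
    # Avoid adding the same directory twice.
    if bin_dir not in parts:
        parts.insert(0, bin_dir)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    return env


# Assumed cf_root; substitute your own installation directory.
CF_ROOT = "/opt/coldfusion9"
verity_bin = CF_ROOT + "/verity/k2/_ilnx21/bin"

env = env_with_verity_libs(verity_bin, base_env={"LD_LIBRARY_PATH": "/usr/local/lib"})
print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run, which leaves the environment of the calling process untouched.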
Using a command file
To make your indexing commands easier to reuse and archive, store them in a command file and pass it with the -cmdfile option. By using an ASCII text file to store a task’s options, you avoid the potential problem of using special characters in an option’s parameter value. For example, the -processbif option requires the use of "!*", and therefore any task using that option must also use the -cmdfile option.
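The following Python sketch shows one way such a command file might be generated. It assumes that the file holds one option per line, followed by its values; that layout, the helper name write_command_file, and the file name index_task.txt are assumptions for illustration and are not confirmed by this section. The options themselves are taken from the earlier example.

```python
import os
import tempfile


def write_command_file(path, options):
    """Write each option and its values on one line of an ASCII text file.

    The one-option-per-line layout is an assumption, not a documented format.
    """
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        for flag, values in options:
            handle.write(" ".join([flag, *values]) + "\n")


# Special characters such as & and * are written as-is, without escaping.
options = [
    ("-collection", ["c:\\new"]),
    ("-start", ["http://localhost/time&/"]),
    ("-indinclude", ["*"]),
]

cmdfile_path = os.path.join(tempfile.mkdtemp(), "index_task.txt")
write_command_file(cmdfile_path, options)

with open(cmdfile_path, encoding="ascii") as handle:
    print(handle.read(), end="")

# The task would then be started by passing the file to the -cmdfile option.
```

Keeping the options in a file also gives you a record of each indexing task that can be archived alongside the collection.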
Command-line option reference
Verity Spider V 5.0 command-line options are case sensitive.
-start
Specifies a starting point for an indexing job. You can specify multiple instances, or use multiple values in a single
instance.
When you execute an indexing job from a command line without a command file (the -cmdfile option), you must URL-escape any special characters in the starting point. To URL-escape a special character, replace it with a percent sign followed by the character’s hexadecimal ASCII code. For example, use /time%26/ instead of /time&/. This allows the operating system to properly process the command string.
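The escaping rule can be sketched in Python with the standard urllib.parse.quote function. The helper name url_escape_start is hypothetical, and the choice to leave ":" and "/" unescaped is an assumption made so that the scheme and path separators of a starting point stay readable.

```python
from urllib.parse import quote


def url_escape_start(url):
    """Percent-encode every character except letters, digits, '_.-~', ':' and '/'.

    Hypothetical helper for illustration; it is not part of Verity Spider.
    """
    return quote(url, safe=":/")


print(url_escape_start("http://localhost/time&/"))
# prints http://localhost/time%26/
```

Note that quote also encodes "%", "?", and "=", so apply it only to starting points that are not already escaped, and check the result for URLs that carry a query string.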
If an indexing task stops, you can rerun the task as-is. The persistent store for the specified collection is read, and only those candidate URLs that are in the queue but not yet processed are parsed. Candidate URLs are URLs with one of the following statuses, as reported by vsdb:
cand, used, inse, upda, dele, fail