Installation guide
The CFINCLUDE PAGE attribute can be used to include CFML pages, in which case the
included page’s Application.cfm (and any OnRequestEnd.cfm) will be processed, unlike a
typical CFINCLUDE TEMPLATE. This behavior is the same as using the
GetPageContext().include() function.
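As a minimal sketch of the two forms described above, the following includes a page with the PAGE attribute and then with the page context call. The path /reports/summary.cfm is hypothetical and used only for illustration.

```cfml
<!--- Hypothetical page path. With PAGE, the included page's Application.cfm
      (and any OnRequestEnd.cfm) is processed, per the text above. --->
<cfinclude page="/reports/summary.cfm">

<!--- The form the text describes as behaving the same way. --->
<cfset GetPageContext().include("/reports/summary.cfm")>
```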
4.4.6 CFINDEX
4.4.6.1 Spidering a Web Site
BlueDragon now adds the ability to index/spider the web pages of a web site. CFINDEX
has traditionally been used to index the content of files within a file system. If you
indexed a directory of CFML files, you were indexing the source code, not the result of
running the pages. Spidering a site actually executes the pages in the site and indexes the
results.
Spidering is supported by way of a new value for the TYPE attribute: website. The KEY
attribute specifies the site to be spidered, and it must contain the full URL of the web
site to index, including http:// or https://.
When spidering a web site, the URL provided in the KEY attribute indicates the starting
page, which doesn't necessarily have to be the home page of the web site. For example,
you could create separate search collections for sub-sections of a web site. The KEY
value must specify a page; if you want to specify the default document for a directory, the
URL must end with a "/". For example, the following are valid KEY values:
<cfindex type="website" key="http://www.newatlanta.com/index.html">
<cfindex type="website" key="http://www.newatlanta.com/bluedragon/index.cfm">
<cfindex type="website" key="http://www.newatlanta.com/">
<cfindex type="website" key="http://www.newatlanta.com/bluedragon/">
The following is not valid (no trailing "/"):
<cfindex type="website" key="http://www.newatlanta.com">
The spidering process simply follows the links found in the starting page, processing any
links that return text/html content (.cfm, .htm, .jsp, .asp, etc.).
Note that this feature can be used to spider your own site or someone else’s. Please use it
responsibly when spidering the web sites of others. The spidering engine does not
currently honor the robots.txt file exclusion standard, but this will be added in the
future.
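The KEY examples above show only the TYPE and KEY attributes. As a hedged sketch of how spidering might fit into a complete index-and-search flow, the following assumes that a collection named siteDocs already exists (a hypothetical name) and that the standard CFINDEX ACTION and COLLECTION attributes are used together with TYPE="website". The text above does not confirm this combination. The columns printed are the standard CFSEARCH result columns; what they contain for spidered pages is not stated here.

```cfml
<!--- Assumption: the collection "siteDocs" was created beforehand. --->
<!--- Spider a sub-section of the site; the KEY ends with "/" to request
      the directory's default document as the starting page. --->
<cfindex action="update"
         collection="siteDocs"
         type="website"
         key="http://www.newatlanta.com/bluedragon/">

<!--- Search the indexed results (not the CFML source) for a term. --->
<cfsearch name="results" collection="siteDocs" criteria="spider">

<cfoutput query="results">
  #results.title# (#results.key#) score: #results.score#<br>
</cfoutput>
```

Using a separate collection per starting URL keeps searches scoped to one sub-section of the site, in line with the separate-collections suggestion above.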
BlueDragon 6.1 CFML Compatibility and Reference Guide