Specifications

Cisco Internet Streamer CDS 2.0-2.3 Software Configuration Guide
OL-13493-04
Appendix B Creating Manifest Files
Working with Manifest Files
prefix (Optional) Combines the hostname from the <server> tag and this field to create
a full prefix. Only content whose URL matches the full prefix is acquired, as
shown in this example:
<server name="xx"> <host name="www.cisco.com" proto="https" port="433" />
</server>
with the following <crawler> tag:
prefix="marketing/eng/"
The full prefix is “https://www.cisco.com:433/marketing/eng/”. Only URLs that
match this prefix are crawled. If a web page refers to “.../marketing/ops”, the
marketing/ops page and its children are not acquired.
If the prefix is omitted, the crawler checks the default full prefix, which is the
hostname portion of the URL from the server. In the example, the default full
prefix is “https://www.cisco.com:433”.
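The prefix rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the CDS implementation: the function names are hypothetical, and it assumes that "matching" the full prefix means the URL starts with it.

```python
def full_prefix(proto: str, host: str, port: int, prefix: str = "") -> str:
    """Combine the <host> fields with the optional crawler prefix."""
    base = f"{proto}://{host}:{port}"
    # With no prefix, the default full prefix is the host portion only.
    return f"{base}/{prefix}" if prefix else base


def in_scope(url: str, proto: str, host: str, port: int, prefix: str = "") -> bool:
    """Return True when the URL starts with the full prefix (assumed semantics)."""
    return url.startswith(full_prefix(proto, host, port, prefix))


if __name__ == "__main__":
    print(full_prefix("https", "www.cisco.com", 433, "marketing/eng/"))
    print(in_scope("https://www.cisco.com:433/marketing/ops/index.html",
                   "https", "www.cisco.com", 433, "marketing/eng/"))
```

With the values from the example, the first call yields the full prefix quoted above, and the marketing/ops URL falls outside it.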
accept (Optional) Uses a regular expression to define which URLs can be crawled, in
addition to requiring that they match the prefix. For example, accept="stock"
means that a URL is crawled only if it meets two conditions: the URL matches the
prefix and also contains a match for the regular expression “stock”.
reject (Optional) Uses a regular expression to reject a URL if it matches the expression.
The URL is first checked for a possible prefix match and then checked for a reject
regular expression. If a URL does not match the prefix, it is immediately rejected.
If a URL matches both the prefix and the reject regular expression, it is rejected
by the expression.
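The order of checks described for accept and reject can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, Python's re syntax stands in for whatever regular expression dialect CDS uses, and applying reject before accept when both are present is a guess, because the table does not say how the two combine.

```python
import re
from typing import Optional


def should_crawl(url: str, full_prefix: str,
                 accept: Optional[str] = None,
                 reject: Optional[str] = None) -> bool:
    # 1. The prefix is checked first; a miss is rejected immediately.
    if not url.startswith(full_prefix):
        return False
    # 2. A reject pattern vetoes a URL that passed the prefix check.
    if reject is not None and re.search(reject, url):
        return False
    # 3. An accept pattern, when given, must also be found in the URL.
    if accept is not None and not re.search(accept, url):
        return False
    return True


if __name__ == "__main__":
    base = "https://www.cisco.com:433/marketing/eng/"
    print(should_crawl(base + "stock/today.html", base, accept="stock"))
    print(should_crawl(base + "news.html", base, accept="stock"))
```

Using re.search rather than re.fullmatch reflects the wording "contains the regular expression string"; a stricter match would need anchors in the pattern.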
max-number (Optional) Specifies the maximum number of crawl job objects that can be
acquired.
maxTotalSizeInMB
maxTotalSizeInKB
maxTotalSizeInB
(Optional) Specifies the maximum size of content that this crawl job can acquire.
The size can be expressed in bytes (B), kilobytes (KB), or megabytes (MB).
Note The size of an acquired file is less than the amount of disk space
required to store it. Stored files carry overhead that adds to the disk
space used for the delivery service. This overhead is approximately
20 KB per file. Take both file size and storage overhead into account
when you configure the delivery service disk quota.
This attribute replaces the max-size-in-B/KB/MB attribute. The
max-size-in-B/KB/MB attribute continues to be supported for backward
compatibility only.
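The note about storage overhead lends itself to a quick estimate. The sketch below is only a back-of-the-envelope aid with hypothetical function names. It treats the approximate 20 KB per-file figure from the note as exact and works entirely in kilobytes.

```python
from typing import Sequence

OVERHEAD_KB_PER_FILE = 20  # approximate per-file storage overhead from the note


def estimated_disk_usage_kb(file_sizes_kb: Sequence[int]) -> int:
    """Sum of the file sizes plus the per-file storage overhead."""
    return sum(file_sizes_kb) + OVERHEAD_KB_PER_FILE * len(file_sizes_kb)


def fits_quota(file_sizes_kb: Sequence[int], quota_kb: int) -> bool:
    """Return True when the estimated usage stays within the disk quota."""
    return estimated_disk_usage_kb(file_sizes_kb) <= quota_kb


if __name__ == "__main__":
    sizes = [100] * 1000  # 1000 files of 100 KB each
    print(estimated_disk_usage_kb(sizes))
```

For 1000 files of 100 KB each, the content totals 100,000 KB but the estimate is 120,000 KB, so a quota sized only to the crawl limit would be about 20 percent too small in this case.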
externalPrefixes (Optional) Specifies additional prefixes so that a crawl job can span multiple
protocols or multiple websites. Prefixes are separated with a bar (|).
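As an illustration of the bar-separated format, the sketch below splits an externalPrefixes value and tests a URL against the main prefix plus the extras. The function names and the sample prefixes are hypothetical, and trimming whitespace around each entry is an assumption rather than documented behavior.

```python
from typing import List


def parse_external_prefixes(value: str) -> List[str]:
    """Split a bar-separated attribute value into individual prefixes."""
    return [part.strip() for part in value.split("|") if part.strip()]


def matches_any_prefix(url: str, main_prefix: str, external_prefixes: str = "") -> bool:
    """Return True when the URL starts with the main prefix or any external one."""
    candidates = [main_prefix] + parse_external_prefixes(external_prefixes)
    return any(url.startswith(prefix) for prefix in candidates)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    extra = "http://www.example.com/docs/|ftp://ftp.example.com/pub/"
    print(parse_external_prefixes(extra))
    print(matches_any_prefix("ftp://ftp.example.com/pub/readme.txt",
                             "https://www.cisco.com:433/marketing/eng/", extra))
```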
externalServers (Optional) Specifies additional hosts for crawl jobs. Use it for crawl jobs that
span multiple hosts where each host has a different user account. This attribute
can be used to refer to the <host> tag with the proper authentication information.
Table B-1 Website or FTP Server Crawl Job Attributes (continued)
Attribute Description