Specifications
Cisco Internet Streamer CDS 2.0-2.3 Software Configuration Guide
OL-13493-04
Appendix B Creating Manifest Files
Working with Manifest Files
You can also use the <options> tag to share attributes at the top-most level of the Manifest file.
Attributes set in the <options> tag are inherited by every <item> tag and by the <crawler> tag in the
Manifest file. However, if a shared attribute is specified in both the <item-group> and <item> tags, or in
both the <options> and <item> tags, the value in the <item> tag takes precedence over the value in the
<item-group> or <options> tag.
The following example illustrates this precedence rule. The first <item> tag takes the ttl value 1440 from
the <options> tag, but the second <item> uses its own ttl value of 60.
<options ttl="1440" />
<item src="index.html" />
<item src="index1.html" ttl="60" />
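The precedence rule can be modeled with a short Python sketch. It is an illustration only, not the CDS implementation. The <CdnManifest> wrapper element, the helper names, and writing <options> as an empty element are assumptions made so that the fragment parses as one XML document. The ordering of <item-group> ahead of <options> is also an assumption; the text above states only that <item> overrides both.

```python
import xml.etree.ElementTree as ET

# Assumed wrapper element so the fragment parses as a single document.
MANIFEST = """
<CdnManifest>
  <options ttl="1440" />
  <item src="index.html" />
  <item src="index1.html" ttl="60" />
</CdnManifest>
"""


def effective_attr(item, name, group_attrs, option_attrs):
    """Look up `name` on the <item> first, then the group, then <options>."""
    for source in (item.attrib, group_attrs, option_attrs):
        if name in source:
            return source[name]
    return None


def resolve_ttls(manifest_xml):
    """Return a dict mapping each item's src to its effective ttl string."""
    root = ET.fromstring(manifest_xml)
    options = root.find("options")
    option_attrs = options.attrib if options is not None else {}
    result = {}
    # Items directly under the root inherit only from <options>.
    for item in root.findall("item"):
        result[item.get("src")] = effective_attr(item, "ttl", {}, option_attrs)
    # Items inside an <item-group> check the group before <options> (assumed order).
    for group in root.findall("item-group"):
        for item in group.findall("item"):
            result[item.get("src")] = effective_attr(
                item, "ttl", group.attrib, option_attrs
            )
    return result


print(resolve_ttls(MANIFEST))
```

With the fragment above, index.html resolves to the shared value 1440 and index1.html keeps its own value of 60.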
Specifying a Crawler Filter
With a rule-based crawler filter, you can crawl an entire website and acquire only content that has
certain predefined characteristics. In contrast, the crawler attributes in the <crawler> tag do not act as
filters; they only define how the crawl is performed. The <matchRule> tag is designed to act as a
rule-based filter. You can define rule-based matches for file extension, size, content type, and
timestamp. In the following example, the crawl job is instructed to crawl the entire website starting at
index.html, but to acquire only files that have the .jpg extension and are 50 kilobytes or larger.
<crawler start-url="index.html" >
  <matchRule>
    <match minFileSizeInKB="50" extension="jpg" />
  </matchRule>
</crawler>
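The effect of such a filter can be sketched in Python as follows. This is a hypothetical model, not the crawler's actual logic. It assumes that the conditions in a single <match> subtag must all hold, that 1 KB equals 1024 bytes, and that extensions are compared without regard to case. The function name and the sample file list are invented for illustration.

```python
def matches(file_name, size_bytes, extension=None, min_size_kb=None):
    """Return True if the file satisfies every condition that is set."""
    if extension is not None:
        if not file_name.lower().endswith("." + extension.lower()):
            return False
    if min_size_kb is not None:
        # Assumed unit: 1 KB = 1024 bytes; the bound is inclusive.
        if size_bytes < min_size_kb * 1024:
            return False
    return True


# Hypothetical crawl results: (file name, size in bytes).
crawled = [
    ("logo.jpg", 80 * 1024),
    ("thumb.jpg", 10 * 1024),
    ("index.html", 200 * 1024),
]

acquired = [
    name
    for name, size in crawled
    if matches(name, size, extension="jpg", min_size_kb=50)
]
print(acquired)
```

Under these assumptions only logo.jpg is acquired: thumb.jpg is too small and index.html has the wrong extension, although both are still crawled.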
There can be multiple <match> subtags within a <matchRule> tag. Table B-2 lists and describes the
<match> subtag attributes.
Table B-2 <match> Subtag Attributes

Attribute          Description
mime-type          Specifies match of these MIME-types.
extension          Specifies match of files with these extensions.
time-before        Specifies match of files modified before this time (using the
                   Greenwich mean time [GMT] time zone) in yyyy-mm-dd hh:mm:ss
                   format.
time-after         Specifies match of files modified after this time (using the
                   Greenwich mean time [GMT] time zone) in yyyy-mm-dd hh:mm:ss
                   format.
minFileSizeInMB    (Optional) Specifies match of content size equal to or larger
minFileSizeInKB    than this value. The size can be expressed in megabytes (MB),
minFileSizeInB     kilobytes (KB), or bytes (B).
maxFileSizeInMB    (Optional) Specifies match of content size equal to or smaller
maxFileSizeInKB    than this value. The size can be expressed in megabytes (MB),
maxFileSizeInB     kilobytes (KB), or bytes (B).
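The size and time attributes in Table B-2 can be normalized before comparison, as in the following Python sketch. The helper names are hypothetical. The sketch assumes binary multiples (1 KB = 1024 bytes, 1 MB = 1024 KB), which the table does not specify, and it assumes that time-before and time-after are exclusive bounds. The size bounds are inclusive, as the table states.

```python
from datetime import datetime, timezone

UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024 * 1024}  # assumed multiples
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # yyyy-mm-dd hh:mm:ss


def size_bounds(attrs):
    """Return (min_bytes, max_bytes); None means that bound is not set."""
    low = high = None
    for unit, factor in UNIT_BYTES.items():
        if "minFileSizeIn" + unit in attrs:
            low = int(attrs["minFileSizeIn" + unit]) * factor
        if "maxFileSizeIn" + unit in attrs:
            high = int(attrs["maxFileSizeIn" + unit]) * factor
    return low, high


def parse_gmt(text):
    """Parse a yyyy-mm-dd hh:mm:ss string as a GMT (UTC) timestamp."""
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


def in_window(modified, attrs):
    """Check a timezone-aware modification time against time-after and time-before."""
    if "time-after" in attrs and modified <= parse_gmt(attrs["time-after"]):
        return False
    if "time-before" in attrs and modified >= parse_gmt(attrs["time-before"]):
        return False
    return True


# Attribute values as they might appear on one <match> subtag.
attrs = {
    "minFileSizeInKB": "50",
    "maxFileSizeInMB": "2",
    "time-after": "2008-01-01 00:00:00",
    "time-before": "2008-12-31 23:59:59",
}
print(size_bounds(attrs))
print(in_window(datetime(2008, 6, 1, tzinfo=timezone.utc), attrs))
```

Converting every bound to bytes and every timestamp to a timezone-aware value keeps the comparisons independent of which unit the Manifest author chose.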