Operation Manual
40 Setting Up Sites and Pages
3. (For the site) Use the two suboptions to allow or prevent search
engines from indexing the entire site (check or uncheck the Index pages
on this site option) or to allow or prevent indexing of all pages linked
from an indexed page (check or uncheck the Follow links from pages option).
- or -
(For the page) Ensure Override site search engine settings is
checked, then check Create robots meta tag and check/uncheck the
equivalent suboptions for the specific page.
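As an illustration of what these two suboptions control, the following is a minimal Python sketch that builds a standard robots meta tag from an "index" choice and a "follow" choice. The function name robots_meta_tag is hypothetical, and the exact markup the program writes into a page may differ; the sketch only assumes the widely used index/noindex and follow/nofollow values.

```python
def robots_meta_tag(index: bool = True, follow: bool = True) -> str:
    """Return a robots meta tag for the given indexing choices.

    index  -- corresponds to allowing the page itself to be indexed
    follow -- corresponds to allowing links on the page to be followed
    (Hypothetical helper; not part of the program described here.)
    """
    index_value = "index" if index else "noindex"
    follow_value = "follow" if follow else "nofollow"
    return f'<meta name="robots" content="{index_value}, {follow_value}">'


# A page kept out of search results, whose links are also not followed.
print(robots_meta_tag(index=False, follow=False))
```

A tag like this belongs in the head section of the page it applies to, which is why the page-level setting can override the site-level one.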
Excluding pages from indexing (Robots file)
This method has the same objective as the robots meta tag, but a robots.txt
file is created instead and no robots meta tag is included in the web pages.
The robots.txt file is stored in the web site's root folder and can be viewed
in any text editor to verify the excluded pages and folders.
To enable a robots.txt file:
1. Choose Site Properties... from the File menu.
2. From the Search Engine menu option, check Create search engine
robots file.
3. (For the site) To allow or prevent search engines from indexing the
entire site, check or uncheck the Index pages on this site option.
- or -
(For a page) To prevent search engines from indexing the page, open the
page properties, ensure Override site search engine settings is checked,
then uncheck the Index this page option.
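Besides opening robots.txt in a text editor, the exclusions can be checked programmatically. The sketch below, in Python, builds a small robots.txt using the standard User-agent and Disallow directives and then tests it with the standard library's urllib.robotparser module. The helper build_robots_txt, the excluded paths, and the www.example.com addresses are assumptions for illustration only; the file the program generates may contain different entries.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(excluded_paths):
    """Build robots.txt text that blocks all crawlers from the given paths.

    Hypothetical helper for illustration; not the program's own generator.
    """
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in excluded_paths]
    return "\n".join(lines) + "\n"


# Hypothetical page and folder to exclude from indexing.
robots_txt = build_robots_txt(["/private.html", "/drafts/"])

# Parse the rules and query them the way a well-behaved crawler would.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for url in (
    "http://www.example.com/index.html",
    "http://www.example.com/private.html",
    "http://www.example.com/drafts/notes.html",
):
    status = "allowed" if parser.can_fetch("*", url) else "excluded"
    print(url, status)
```

Note that robots.txt is advisory: it asks crawlers not to visit the listed paths but does not restrict access to them.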
Including pages in indexing
So far we've looked primarily at methods of excluding web pages from
indexing. Without these controls, search engines index web pages by
discovering page hyperlinks and crawling through them, harvesting keywords,
descriptions, and page text. However, this process may not be efficient if
your site contains only a limited number of inter-page hyperlinks.
To help with this, a search engine sitemap file (sitemap.xml) can be created
to act as a local lookup from which crawlers begin investigating your site.
The file simply lists the pages in your site that you've decided can be
indexed. It also tells search engines when pages were last modified, when
they should check each page again, and how "important" pages are in relation
to each other.
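To make the contents of such a file concrete, here is a minimal Python sketch that writes sitemap markup following the public sitemaps.org protocol, in which lastmod records when a page was modified, changefreq hints at how often to recheck it, and priority gives its relative importance. The function build_sitemap and the example pages are hypothetical, and the file produced by the program may include different entries or formatting.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML text for (url, lastmod, changefreq, priority) tuples.

    Hypothetical helper for illustration; not the program's own generator.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc                # page address
        ET.SubElement(url, "lastmod").text = lastmod        # last modified date
        ET.SubElement(url, "changefreq").text = changefreq  # recheck hint
        ET.SubElement(url, "priority").text = f"{priority:.1f}"  # 0.0 to 1.0
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages; the URLs, dates, and values are assumptions.
pages = [
    ("http://www.example.com/index.html", "2024-01-15", "weekly", 1.0),
    ("http://www.example.com/contact.html", "2023-11-02", "yearly", 0.3),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Only pages that are meant to be indexed should appear in the list, so a page excluded by a robots meta tag or robots.txt would normally be left out.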