Datasheet

www.microsoft.com/sharepoint
Term
Definition
Stemming
A word can take multiple forms that essentially mean the same thing. For example, the
verb "to write" has forms such as writing, wrote, write, and writes. Similarly, nouns
normally have singular and plural forms, such as book and books. The stemming feature
in enterprise search can increase recall of relevant documents by mapping one form of
a word to its variants, so that a query for one form also matches the others.
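As a rough illustration of mapping one form of a word to its variants, the following Python sketch uses a small lookup table. The table contents and the names VARIANTS, BASE_FORM, and expand_term are hypothetical and chosen only for this example; they do not describe how the product's stemmer is implemented, and a real stemmer is language-specific and far more complete.

```python
# Hypothetical variant table built from the examples in the definition above.
VARIANTS = {
    "write": {"write", "writes", "writing", "wrote"},
    "book": {"book", "books"},
}

# Reverse index: every surface form points back to its base form.
BASE_FORM = {form: base for base, forms in VARIANTS.items() for form in forms}


def expand_term(term: str) -> set[str]:
    """Return the query term together with every known variant of it."""
    key = term.lower()
    base = BASE_FORM.get(key)
    # Unknown words are returned unchanged (lowercased) rather than guessed at.
    return set(VARIANTS[base]) if base else {key}
```

With this table, a query for "wrote" would also match items that contain write, writes, or writing, which is the recall gain the definition describes.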
Stop Word
Stop words (sometimes known as noise words) are words that have no value in the index.
For example, "a", "and", and "the" are listed in the stop word file by default. These
words are likely to appear in a high percentage of indexed items, so indexing them does
little to distinguish one item from another. Furthermore, information workers rarely
search for just these types of terms.
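A minimal Python sketch of stop word filtering follows. The set holds only the three words named above, and the names STOP_WORDS and remove_stop_words are assumptions made for this example; the actual stop word file is longer and its format is not shown here.

```python
# Only the three default entries mentioned in the definition; an assumption-sized list.
STOP_WORDS = {"a", "and", "the"}


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stop word list, ignoring case."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]
```

For instance, the tokens The, book, and, a, laptop would be reduced to book and laptop before indexing.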
Synonym
Synonyms are words that mean the same thing as other words. For example, you might
consider laptop and notebook to mean the same thing. Administrators can create
synonyms for keywords that information workers in their organization are likely to
search for. Additionally, synonyms that can improve recall of relevant documents are
stored in thesaurus files.
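The effect of a thesaurus entry on a query can be sketched in Python as below. The in-memory THESAURUS list and the expand_synonyms function are hypothetical stand-ins used for illustration; they do not reflect the real thesaurus file format.

```python
# Hypothetical synonym groups; each set lists words treated as equivalent.
THESAURUS = [
    {"laptop", "notebook"},
]


def expand_synonyms(term: str) -> set[str]:
    """Return the term plus any words grouped with it in the thesaurus."""
    key = term.lower()
    for group in THESAURUS:
        if key in group:
            return set(group)  # copy, so callers cannot modify the thesaurus
    return {key}
```

A search for notebook would then also retrieve items that mention only laptop.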
Word Breaker
Streams of data are retrieved from content repositories, and those streams are broken
down into discrete words for indexing. Word breakers are the components that break
down streams into individual words. Streams to be indexed are normally split at spaces
and punctuation marks, following the particular rules of each language.
Also, when a user enters multiple words into a search box, that query is broken into
discrete terms by a word breaker.
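The simplest case, splitting on spaces and punctuation, can be sketched in Python with a regular expression. The function name break_words is hypothetical, and the sketch deliberately ignores language-specific rules (for example, it would split a contraction at the apostrophe), so it is only an approximation of what a real word breaker does.

```python
import re


def break_words(stream: str) -> list[str]:
    """Split a text stream into words at whitespace and punctuation."""
    # \w+ matches runs of letters, digits, and underscores; anything else separates words.
    return re.findall(r"\w+", stream)
```

The same function could be applied both to content being indexed and to the text a user types into a search box, mirroring the two uses described above.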