Google Search Appliance
Deployment Governance and Operational Models
March 2014
© 2014 Google
Deployment Governance and Operational Models

This document discusses several strategies related to governance of a Google Search Appliance (GSA) solution.

About this document

The recommendations and information in this document were gathered through our work with a variety of clients and environments in the field. We thank our customers and partners for sharing their experiences and insights.
Contents

About this document
Chapter 1 Content Publishing Governance and Best Practices
    Overview
    Determination of relevance and ranking
    Content creation and structure
    Relevancy tuning options
    User experience considerations
Chapter 2 GSA Ownership and Chargeback Model
    Overview
    Ownership by a centralized IT organization
    Ownership by an individual business/functional unit
    Ownership by a third party in a hosted environment
    Ongoing operations and support
    Ongoing search governance
Chapter 3 GSA Content On-boardin
Chapter 1 Content Publishing Governance and Best Practices

Overview

The Google Search Appliance provides algorithms “out of the box” that effectively index and rank common business documents. Because the GSA indexes unstructured content effectively, users don't have to create documents that cater to its indexing and ranking algorithms. Even so, ranking can often be improved by providing the GSA with additional signals.
These guidelines deal mostly with the indexing of additional metadata along with document content. Metadata can enrich content in the GSA’s index. As one example of the type of metadata attributes that can be added to enrich document classification, refer to the Dublin Core set of metadata elements, which provides standards for a base set of text fields that can describe a resource.
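As an illustration of attaching such metadata to a document, the following minimal sketch renders Dublin Core values as HTML meta tags, which is one common way to expose metadata to a crawler. The `DC.` name prefix follows the usual Dublin Core convention for HTML; the helper name `dublin_core_meta_tags` and the sample values are assumptions made for this example, not part of any GSA tooling.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_meta_tags(metadata):
    """Render a dict of Dublin Core values as HTML <meta> tags.

    Keys outside the Dublin Core element set are rejected so that
    typos do not silently produce unrecognized metadata names.
    """
    tags = []
    for element, value in metadata.items():
        if element not in DUBLIN_CORE_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element!r}")
        tags.append(
            f'<meta name="DC.{element}" content="{escape(value, quote=True)}">'
        )
    return "\n".join(tags)


# Hypothetical document metadata, for illustration only.
example = {
    "title": "Q3 Travel & Expense Policy",
    "creator": "Finance Operations",
    "date": "2014-03-01",
    "language": "en",
}
print(dublin_core_meta_tags(example))
```

The generated tags would be placed in the `<head>` of the published HTML page. Which elements to populate is a content governance decision for your organization.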
Data classification/taxonomy

If documents that are to be indexed by the GSA are associated with an enterprise data classification scheme and/or taxonomy, make sure that this ontological information is indexed by the GSA. Feeding ontological information along with the content is not mandatory, but it can enhance the search solution. The taxonomy can be used in combination with metadata filters to restrict document searches to certain levels in the overall hierarchy.
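To make the idea of restricting a search to one level of the hierarchy concrete, here is a small local sketch. It assumes each document carries a hypothetical `taxonomy_path` metadata value with `/`-separated levels. It models only the filtering logic and does not call the GSA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    url: str
    taxonomy_path: str  # hypothetical metadata value, e.g. "Finance/Reports/Quarterly"


def in_taxonomy_branch(doc, branch):
    """Return True if the document sits at or below `branch` in the hierarchy."""
    doc_parts = doc.taxonomy_path.split("/")
    branch_parts = branch.split("/")
    # Compare whole path segments so "Finance/Rep" does not match "Finance/Reports".
    return doc_parts[: len(branch_parts)] == branch_parts


def restrict_to_branch(docs, branch):
    """Keep only the documents classified at or below the given taxonomy node."""
    return [doc for doc in docs if in_taxonomy_branch(doc, branch)]


# Hypothetical documents, for illustration only.
docs = [
    Document("http://intranet.example.com/fin/q3.pdf", "Finance/Reports/Quarterly"),
    Document("http://intranet.example.com/fin/policy.html", "Finance/Policies"),
    Document("http://intranet.example.com/hr/handbook.html", "HR/Policies"),
]

for doc in restrict_to_branch(docs, "Finance"):
    print(doc.url)
```

Storing the full path in a single metadata field is one design choice among several. Its advantage is that a filter at any level of the hierarchy becomes a simple prefix comparison.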
The following table provides an overview of features that may affect how results are displayed to users.

Feature | Comments
Source Biasing | Via pattern matching, bias one source over another.
Date Biasing | Assign more or less importance to document creation date.
Metadata Biasing | Bias documents that have specific metadata attached.
KeyMatches | Although KeyMatches are technically a suggestion feature and not part of organic search results, you can use them to promote documents for certain queries.
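The effect of source biasing can be pictured with a toy model. The sketch below is not the GSA's ranking algorithm, which is internal to the appliance. It only shows how a pattern-based multiplier could reorder results. The URL patterns, multipliers, and scores are all hypothetical.

```python
import re

# Hypothetical bias rules: (URL pattern, multiplier). Values above 1.0
# promote matching sources; values below 1.0 demote them.
SOURCE_BIAS_RULES = [
    (re.compile(r"^https?://policies\.example\.com/"), 1.5),
    (re.compile(r"^https?://archive\.example\.com/"), 0.5),
]


def biased_score(url, base_score):
    """Apply the first matching source-bias multiplier to a base score."""
    for pattern, multiplier in SOURCE_BIAS_RULES:
        if pattern.search(url):
            return base_score * multiplier
    return base_score


def rerank(results):
    """Sort (url, base_score) pairs by biased score, highest first."""
    return sorted(results, key=lambda r: biased_score(r[0], r[1]), reverse=True)


# Hypothetical results with made-up base scores.
results = [
    ("http://archive.example.com/2009/travel.html", 0.9),
    ("http://wiki.example.com/travel", 0.7),
    ("http://policies.example.com/travel.html", 0.6),
]

for url, _ in rerank(results):
    print(url)
```

In this toy run the archived page starts with the highest base score but ends up last, while the policy page moves to the top. That is the kind of outcome to verify with real queries after changing any biasing setting.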
Chapter 2 GSA Ownership and Chargeback Model

Overview

There are three main search solution ownership models that apply to GSA deployments:

● Ownership by a centralized IT organization
● Ownership by an individual business/functional unit
● Ownership by a third party in a hosted environment

Each model has specific characteristics and might have certain advantages or disadvantages based on the landscape at your organization.
Cost and charge models

There are a number of ways to fund and support the search solution:

● Provide blanket IT funds to the shared service search organization and don’t charge business units for GSA usage
● Recoup the cost of deploying and supporting the GSA by charging individual business units for their usage:
  1. Figure out the base GSA costs to operate, including, but not limited to, license cost, data center, racking, support, and so on.
  2.
Depending on the IT landscape at your organization, a GSA deployed under this model might not comply with all IT governance policies in place, in areas such as audit, branding, platform, and architecture. Search deployed under such a model might also be implicitly restricted from expanding to repositories controlled outside of the business unit.
Ongoing operations and support

As touched on briefly in the preceding sections, the GSA search solution requires some ongoing maintenance. First and foremost, help on issues from the Google Enterprise Support Organization is available for the duration of your GSA license. Even so, there needs to be a person or group within your organization that is familiar with and responsible for search.
Chapter 3 GSA Content On-boarding Models

Overview

This chapter addresses strategies and approaches for on-boarding content sources available in your enterprise to the GSA. Even though the GSA should, in many cases, be able to discover, crawl, and index content without much effort, it is best to plan a strategy for how the integration of different content sources and/or applications will proceed.
Consideration | Comments
Users and user types affected by the content integration | Aim for platforms that have the highest impact first.
Ease of integration | In order of increasing difficulty of integration: direct crawl, connector available, crawl through proxy, feed needs to be developed, connector needs to be developed.
Security authentication and authorization mechanism required | Systems that integrate with security mechanisms that are directly supported by the GSA provide the easiest integration.
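The first two considerations above can be combined into a simple prioritization sketch. The difficulty order is taken from the table. Ranking by user impact first and ease of integration second, as well as the source names and user counts, are assumptions made for illustration.

```python
from dataclasses import dataclass

# Integration methods in order of increasing difficulty, as listed in the table.
INTEGRATION_DIFFICULTY = [
    "direct crawl",
    "connector available",
    "crawl through proxy",
    "feed needs to be developed",
    "connector needs to be developed",
]


@dataclass(frozen=True)
class ContentSource:
    name: str
    affected_users: int  # hypothetical estimate of users who benefit
    integration: str     # one of INTEGRATION_DIFFICULTY


def onboarding_order(sources):
    """Order sources by user impact (high first), then by ease of integration."""
    return sorted(
        sources,
        key=lambda s: (-s.affected_users, INTEGRATION_DIFFICULTY.index(s.integration)),
    )


# Hypothetical content sources, for illustration only.
sources = [
    ContentSource("Team wiki", 800, "connector available"),
    ContentSource("Document management system", 5000, "connector needs to be developed"),
    ContentSource("Corporate intranet", 5000, "direct crawl"),
]

for source in onboarding_order(sources):
    print(source.name)
```

An unknown integration method raises `ValueError` from `list.index`, which keeps typos from being silently ranked. A real prioritization would also weigh the security considerations in the table, which this sketch leaves out.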
Content repository/platform vs. business application on-boarding model

There are two high-level approaches to consider in selecting content for on-boarding onto the GSA:

● Content ingestion by platform
● Content ingestion by business application

There are different considerations to make based on these two models. Some of these considerations are discussed in the next sections.
A primary goal of indexing application content on the GSA should be to increase user productivity with that application. When thinking about use cases or end user workflow, identify areas where the GSA could simplify the process.
Initial content analysis

Initial investigation of content integration with the GSA is typically triggered by an incoming request, or by the identification of an area whose content is not integrated with the GSA and is not being returned in search results. In this phase, conduct the following analysis:

1. Analyze the on-boarding factors identified in “Considering content sources for indexing.”
2. Document requirements and populate the “search spec” for this particular system.
   a.
Handling of sensitive data that is newly searchable on the GSA

The GSA is a powerful search tool, which allows users to find documents that they previously may have had no means to find. There are cases where shortly after deployment, users start finding documents in their search results that they should not have access to. These are cases where the GSA correctly respects permissions on the documents, but the permissions in the source system are too open in the first place.
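One way to get ahead of this situation is to review source-system permissions before the content becomes searchable. The sketch below assumes that an export of document URLs and their allowed groups is available from the source system. The export format, the group names, and the URL markers used to guess sensitivity are all hypothetical.

```python
# Hypothetical group names treated as "too broad" for sensitive content.
BROAD_GROUPS = {"Everyone", "Domain Users", "All Employees"}


def overly_open_documents(acls, sensitive_markers):
    """Return URLs of sensitive-looking documents readable by a broad group.

    `acls` maps each document URL to the set of groups allowed to read it,
    as exported from the source system (the export format is an assumption).
    """
    flagged = []
    for url, groups in acls.items():
        looks_sensitive = any(marker in url.lower() for marker in sensitive_markers)
        if looks_sensitive and groups & BROAD_GROUPS:
            flagged.append(url)
    return sorted(flagged)


# Hypothetical permission export, for illustration only.
acls = {
    "http://files.example.com/hr/salaries-2014.xls": {"Everyone"},
    "http://files.example.com/hr/benefits-overview.pdf": {"All Employees"},
    "http://files.example.com/hr/salaries-2013.xls": {"HR Managers"},
}

for url in overly_open_documents(acls, sensitive_markers=["salar", "payroll"]):
    print(url)
```

Documents flagged this way need their permissions corrected in the source system, since the GSA only reflects what the source system allows.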
Chapter 4 GSA Environment Approach

Overview

It is advantageous to test changes in a separate environment before releasing them to end users. As with any type of server or application, a small change to a configuration could have unintended consequences, so a proper testing strategy and staging environment are recommended for changes to any search application.
Testing/QA environment (optional)

The Testing/QA environment is similar to the development environment, as it is a replicated, nonproduction environment. This environment should be kept stable for any testers accessing the system. Pushes of deployment components, configuration changes, or front end changes should be done in phases according to a configuration management process. The testing team should be notified of any changes to this environment that could impact their testing.
Summary

Governance is not one-size-fits-all; it differs from organization to organization. The sections in this document present some concepts you can adopt when you establish governance around your newly deployed GSA search solution. Keep in mind that these concepts can integrate with your existing processes in a number of ways. As with the architecture around the GSA, try to keep governance processes as simple as possible so that deployments can be quick and iterative.