Google Search Appliance
Security
May 2014
© 2014 Google
Security

Security is a key consideration when designing and implementing solutions that integrate data from different sources for enterprise search. It can be one of the most complex parts of these projects, especially on the intranet side, where security is usually a strong requirement, so it is important to allocate enough time to it. This paper provides insights into modeling security requirements and transforming them into the final solution.
Contents

About this document
Chapter 1 Designing Security in the GSA
  Overview
  Information Gathering
  Content Acquisition
  Single vs. Multiple identities
  Selecting an authorization mechanism
Chapter 2 Using Out of box features
  Silent authentication
  SAML
  Early binding with Per-URL ACL
  Connectors using Per-URL ACL
  Connector 4.0 (beta)
  Security in Windows environments
  Perimeter security
  Secure Search Example
Chapter 3 Authentication for Developers
  Forms authentication with cookie cracking
  SAML
  Cookie cracking vs.
Chapter 1 Designing Security in the GSA

Overview

Enterprise search projects integrate data from different sources to enable users to find information easily. In most cases, especially in intranet projects, access to documents in source applications is protected. To provide relevant and secure results to users, the corporate search engine must apply the same authorization policies as the sources where documents are stored.
accommodate different applications when acquiring contents. The process generally involves using a system or super user account with broad access to the content source so that all the documents can be indexed by the GSA.

Serve time authentication

Serve time authentication is the integration between the search appliance and the end user. It can be the same authentication protocol as used by one of the content sources, but it doesn’t have to be.
Use the following table to model each content source. Include information about security in the Security Mechanisms field.
Content Acquisition

Content acquisition generally takes the following forms. Note that the authentication protocol must be one that the content source supports. The acquisition method, however, usually allows different authorization mechanisms to be used.
Selecting an authorization mechanism

Serve time authentication and authorization are tightly connected. As mentioned previously, although serve time authentication happens before authorization during serving, you should evaluate the authorization options FIRST. This point is important enough to repeat here. This chapter describes the connections between these two processes in detail. Authorization is always considered on a per content source basis.
With early binding, authorization is fully managed by the search appliance itself. Early binding requires the authorization rules to be known to the GSA. The appliance doesn’t have to contact an external security component, such as the content source, at serve time to validate whether the user has the right to access a document. The GSA supports the following two types of ACLs:

Per-URL ACLs

With Per-URL ACLs, each document in the index can have its own authorization rules.
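As a rough illustration of early binding, the sketch below evaluates an ACL stored alongside each URL without contacting the content source. The `UrlAcl` type, the `INDEX_ACLS` table, the example URL, and the rule that a deny entry takes precedence over a permit entry are all assumptions made for this sketch. It does not show how the appliance is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UrlAcl:
    """Hypothetical in-memory stand-in for the ACL indexed with one URL."""
    permit_users: frozenset = frozenset()
    permit_groups: frozenset = frozenset()
    deny_users: frozenset = frozenset()
    deny_groups: frozenset = frozenset()


# Hypothetical index content: URL -> ACL captured at index time.
INDEX_ACLS = {
    "http://intranet.example.com/hr/policy.html": UrlAcl(
        permit_groups=frozenset({"hr"}),
        deny_users=frozenset({"contractor1"}),
    ),
}


def evaluate_acl(url, user, groups):
    """Return PERMIT, DENY, or INDETERMINATE using only indexed data."""
    acl = INDEX_ACLS.get(url)
    if acl is None:
        # No ACL was indexed for this URL, so this mechanism cannot decide.
        return "INDETERMINATE"
    group_set = set(groups)
    # Assumption for this sketch: deny entries win over permit entries.
    if user in acl.deny_users or group_set & acl.deny_groups:
        return "DENY"
    if user in acl.permit_users or group_set & acl.permit_groups:
        return "PERMIT"
    return "INDETERMINATE"
```

The point of the sketch is that the decision needs only the user ID, the resolved groups, and data already in the index. This is why early binding depends on an authentication mechanism that yields a verified user ID.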
Although not as commonly used as Per-URL ACLs, it is a very flexible tool that can come in handy in unique situations. For example, if there is a globally defined group that should be denied access to an easily identifiable content source, defining a single Policy ACL entry could be the right option. Another case is when the content system uses coarse-grained permission rules. For example, CA SiteMinder allows the definition of access control based on URL patterns.
SAML authorizations can be managed in batches: the search appliance can send a list of URLs for authorization in a single request, which can speed up the process. You can activate this option in the GSA Admin Console, but your SAML authorization provider has to support it.

Head requests

Finally, it’s also possible to send an HTTP HEAD request to the content source to validate authorizations.
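The HEAD request approach can be sketched as follows. The mapping from status codes to decisions (2xx permits, 401 and 403 deny, anything else is indeterminate) is an assumption for illustration, and `head_request_decision` is a hypothetical helper rather than an appliance API.

```python
import urllib.error
import urllib.request


def decision_from_status(status):
    """Map an HTTP status code to an authorization decision (assumed mapping)."""
    if 200 <= status < 300:
        return "PERMIT"
    if status in (401, 403):
        return "DENY"
    return "INDETERMINATE"


def head_request_decision(url, cookie_header=None, timeout=5.0):
    """Send a HEAD request with the user's cookies and interpret the status."""
    request = urllib.request.Request(url, method="HEAD")
    if cookie_header:
        # Forward the end user's session cookie to the content source.
        request.add_header("Cookie", cookie_header)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return decision_from_status(response.status)
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError by urllib.
        return decision_from_status(err.code)
    except OSError:
        # Network failure or timeout: no decision can be made.
        return "INDETERMINATE"
```

Note that `urlopen` follows redirects, so a content source that redirects unauthorized users to a login page returning 200 would be misread as a permit. A real deployment would need to account for that behavior.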
All authorization mechanisms except Head Requests require a User ID. The following table lists authentication mechanisms that result in a User ID:

Authentication mechanisms when user ID is required

HTTP Basic/NTLM
The mechanism is listed as HTTP Basic/NTLM, but these are the protocols used between the GSA and a back-end server to verify the user’s credentials. To the end user, it is forms authentication.
Authentication mechanisms when user ID is not required (Head Requests)

Cookie
This is the most common situation: the search appliance forwards user cookies to validate access rights. Forms authentication is required. Cookie cracking is not needed because a user ID is not required. The rule is configured under Universal Login Authentication Mechanism > Cookie.

HTTP
An HTTP Basic/NTLM rule must be configured.
there are clear rules on which mechanisms can or cannot be used together:
● Per-URL ACL
  ○ The ACLs are part of the index and cannot be added or removed on the fly. If URLs don’t have ACLs attached, Per-URL ACL can’t be used as a mechanism for those URLs. The Credential Group associated with the ACLs is also determined at index time and cannot be changed in the Flexible Authorization settings.
Chapter 2 Using Out of box features

In this chapter, we will look at the details of some of the authentication and authorization mechanisms. We will also discuss common scenarios that are supported by Google Search Appliance and related products offered by Google. We will focus on scenarios that don’t require writing code.

Silent authentication

IT security aims to protect applications and data, providing accurate information to users, but in a secure manner.
Kerberos

The Kerberos protocol is used by default in Windows networks. The search appliance can be configured to enable Kerberos so that the authentication is transparent to users.

SAML

Many SSO systems support the SAML protocol and provide a silent authentication process. Note that the SAML protocol is a way for an external service to securely assert the user’s identity to the GSA.
● Groups database. Starting from release 7.2, the search appliance includes an internal database that stores ACLs. This is still a beta feature that has limited functions and scalability. Group memberships must be fed to the appliance, similar to how documents can be fed to the appliance’s index.
● Connectors. The Connector Framework provides an interface to resolve groups. It’s up to the connector developer to decide whether Per-URL ACL or group resolution is implemented.
John Smith's first identity, jsmith, is from the company-wide Active Directory. Of course, there are AD Groups that jsmith is a member of. Let's say one of the content sources is Plone, which is integrated with Active Directory, but has its own groups defined. How do we avoid conflicts when there are groups with the same names in both Active Directory and Plone? Groups from Active Directory will have namespace CG1. We can give groups from Plone a different namespace such as plone_space.
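The effect of namespaces can be sketched by treating a group as a (namespace, name) pair instead of a bare name. The `Principal` type and the group names `staff` and `editors` are hypothetical. Only `CG1` and `plone_space` come from the example above.

```python
from typing import NamedTuple


class Principal(NamedTuple):
    """A group qualified by the namespace it was resolved in."""
    namespace: str
    name: str


def resolve_groups(ad_groups, plone_groups):
    """Qualify each group with its namespace so equal names stay distinct."""
    qualified = {Principal("CG1", g) for g in ad_groups}
    qualified |= {Principal("plone_space", g) for g in plone_groups}
    return qualified


def is_permitted(user_groups, acl_groups):
    """Permit only when a namespace-qualified group appears in the ACL."""
    return bool(set(user_groups) & set(acl_groups))


# jsmith is in an AD group and a Plone group that share the name "editors".
jsmith_groups = resolve_groups(ad_groups=["staff", "editors"],
                               plone_groups=["editors"])

# An ACL that grants access to the Plone "editors" group only.
plone_acl = {Principal("plone_space", "editors")}
```

Because the two `editors` groups compare as different values, an ACL entry for the Plone group cannot be satisfied by membership in the Active Directory group of the same name.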
Connectors using Per-URL ACL

Local Namespace

The Connector Framework introduced the concept of "Local Namespace." Note that this is a connector concept. For ACL definition, there is only one namespace attribute. In connector configuration, there are two namespace fields: one is "Global namespace", which is equivalent to the Credential Group in Authentication. The other is "Local namespace", which will be the name of the connector (or the name of another configured connector, selectable in the dropdown).
Connector 4.0 (beta)

Working with Per-URL ACL

The indexing of ACLs by Connector 4.0 differs from that of previous versions:
● ACLs are not sent in via feeds. Instead, they are indexed as HTTP headers.
● If ACLs are hierarchical, they won’t be flattened. Inheritance will always be used.
● Namespace needs to be handled by each connector. The File system connector and SharePoint connector use the name “adaptor.namespace” as the configuration entry.
Authorization

“Authorization” in this section refers to late binding when using Connector 4.0. To configure it, in the Admin Console, under Search > Secure Search > Flexible Authorization, set the Authorization service URL to: https://connector-hostname:port/saml-authz.

Security in Windows environments

The majority of appliance deployments that incorporate secure content search occur in a Microsoft Windows environment.
Here are some unique behaviors and deployment best practices:
● The connector will run for a long time. It could be days if the Active Directory has a lot of users and groups. It’s recommended to:
  ○ Use dedicated AD Groups connector instances. This is true even for the SharePoint connector, which has an embedded Active Directory Groups connector capability and can index both SharePoint content and Active Directory groups.
  ○ Increase the traversal time out.
Public document
● Public crawled document
● Feed document with no security
● Content from a secure content source that has been marked as public by using the GSA Admin Console

Secure document
● Securely crawled document
● Feed document declared as secure

Users can search and get to public documents without authentication. However, there is an exception. GSA 6.
Authorization

When designing a solution, we need to start with authorization. Per-URL ACL is the obvious choice for SharePoint and Salesforce content. Because the GSA’s connector for Database supports authorization using a query, we can use connector authorization for this content. We will have to use Head Request for the custom IIS web site. Since it uses Kerberos, we can use Head Request with Kerberos.
Flexible Authorization Rules

For most deployments, we can leave the first three entries of Flexible Authorization alone: PER_URL_ACL, CACHE, and POLICY. This also applies to this particular deployment. The PER_URL_ACL rule will kick in for SharePoint and Salesforce content because ACLs are indexed with the documents. We do have to make some changes to the CONNECTOR rule because the default configuration is only associated with the “Default” Credential Group.
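One way to picture how an ordered rule list behaves is the sketch below. It assumes that rules are tried in order and that the first rule returning something other than INDETERMINATE decides. The rule functions, URLs, and user name are hypothetical stand-ins, and only the rule names PER_URL_ACL, CACHE, POLICY, and CONNECTOR come from the text above.

```python
INDETERMINATE = "INDETERMINATE"


def per_url_acl(url, user):
    """Decide only for URLs that were indexed with an ACL (hypothetical data)."""
    indexed_acls = {"http://sharepoint.example.com/doc1": {"jsmith"}}
    if url not in indexed_acls:
        return INDETERMINATE
    return "PERMIT" if user in indexed_acls[url] else "DENY"


def cache(url, user):
    """Stand-in: nothing cached in this sketch."""
    return INDETERMINATE


def policy(url, user):
    """Stand-in: no Policy ACLs defined in this sketch."""
    return INDETERMINATE


def connector(url, user):
    """Stand-in for connector authorization of database content."""
    if url.startswith("http://db.example.com/"):
        return "PERMIT" if user == "jsmith" else "DENY"
    return INDETERMINATE


RULES = [("PER_URL_ACL", per_url_acl), ("CACHE", cache),
         ("POLICY", policy), ("CONNECTOR", connector)]


def run_rules(url, user, rules=RULES):
    """Return (rule name, decision) from the first rule that decides."""
    for name, rule in rules:
        decision = rule(url, user)
        if decision != INDETERMINATE:
            return name, decision
    return None, INDETERMINATE
```

In this sketch the SharePoint URL is decided by PER_URL_ACL because an ACL was indexed with it, while the database URL falls through to CONNECTOR.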
Chapter 3 Authentication for Developers

Whenever possible in your deployments, you should try to use existing products, either supported by Google, provided by Google’s partners, or other third-party off-the-shelf products. In general, following this guidance minimizes project risks and reduces overall ownership costs. However, there may still be some requirements for which you have to develop external custom applications or processes in order to fully implement security or content integration with the GSA.
Key considerations

If you want to achieve a silent authentication experience with your SSO system, consider the following items:
● A session cookie must not be restricted to the same user IP. Some SSO systems support this restriction as a security measure.
● The GSA must be part of the same domain as the session cookie used by the SSO system. For instance, if a cookie is using domain “foo.com”, the GSA must be configured as part of that domain, for instance “gsa.foo.com”.
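The domain requirement can be checked with a small helper. This is a simplified sketch of cookie domain matching, and `cookie_domain_covers` is a hypothetical name. It ignores public-suffix rules and IP-address hosts.

```python
def cookie_domain_covers(host, cookie_domain):
    """Return True when a cookie set for cookie_domain would be sent to host."""
    host = host.lower().rstrip(".")
    # A leading dot on the cookie domain is tolerated and ignored.
    domain = cookie_domain.lower().strip(".")
    # The host must equal the domain or be a subdomain of it.
    return host == domain or host.endswith("." + domain)
```

For the example in the text, `cookie_domain_covers("gsa.foo.com", "foo.com")` is true, so the browser would send the SSO session cookie to the appliance. A host such as `gsa.bar.com` would not receive it.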
SAML

The search appliance supports SAML 2.0, an XML-based protocol for an external identity provider. There might be cases where you will need to develop a custom SAML IdP. Note that building a SAML IdP from scratch is time-consuming. You should start with an existing code base, such as OpenSAML. Google also provides an open source project, SAML Bridge, for silent authentication with Windows technologies.
binding from scratch, it could be more complex as it requires an extra service (Artifact Resolver URL). There are some open source frameworks like OpenSAML and also many code samples for this on the Internet. requires managing Digital Signatures in your code. In general it’s more desirable to use SAML HTTP post binding because it provides a stronger and simpler solution, mainly in terms of high availability.
Cookie cracking vs. SAML

If you need to customize your authentication process, it’s important to differentiate between cookie cracking and SAML so that you can plan the best approach before starting the project.
When the connector is intended to provide both authentication and group resolution, the implementation can ignore what the GSA passes to it through the AuthenticationIdentity object and provide a verified username and groups back to the GSA through AuthenticationResponse.
(beta) Trusted Application

A very common use case is for the GSA to be deployed behind a portal to provide a search service. The search UI is provided in the portal and users don’t interact with the appliance directly. A challenge in such a use case is how to pass the user’s credentials to the GSA without asking the user to log in.
8. When the trusted user session expires (cookie expired based on Session timeout setting under Secure Search -> Access Control), the GSA will return an error: "The remote server returned an error: (502) Bad Gateway."
9. When the trusted user session is valid (didn't exceed session timeout value), but the authN mechanism's trust duration expires, the appliance performs another authentication using the trusted user credential.
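A client can handle the expired-session case in step 8 by dropping the stored session and authenticating again. The sketch below assumes a hypothetical `send` callable that performs one request and returns a `(status, session_id, body)` tuple. It sends the session cookie when a session ID is given and the trusted user credentials otherwise. None of these names are appliance APIs.

```python
def search_with_retry(send, session_id=None, max_retries=1):
    """Call send(), re-authenticating once if the stored session has expired.

    Returns (session_id, body) so the caller can reuse the session cookie.
    """
    attempts = 0
    while True:
        status, new_session_id, body = send(session_id)
        if status == 502 and session_id is not None and attempts < max_retries:
            # Expired trusted session: forget it so the next call
            # authenticates again with the trusted user credentials.
            session_id = None
            attempts += 1
            continue
        if status != 200:
            raise RuntimeError(f"GSA request failed with HTTP {status}")
        return new_session_id or session_id, body


def fake_send(session_id):
    """Test double: the session "expired" is rejected, anything else works."""
    if session_id == "expired":
        return 502, None, ""
    return 200, "fresh", "<results/>"
```

With the test double, `search_with_retry(fake_send, "expired")` retries once without the stale session and returns the new session ID together with the response body. Bounding the retries avoids looping when the 502 has another cause.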
Chapter 4 Authorization for Developers

Overview

An enterprise search engine must return relevant results to the user, but only those that the user has access to. This is managed through the authorization process that applies to every secure document in the index. In this chapter we focus on custom solutions when designing the authorization process in your enterprise search project with Google.
The attribute “inheritance-type” makes it possible to model the different security mechanisms of various content systems. In an inheritance chain, the permission check always traces back to the top, and permissions are evaluated according to the inheritance type that was set:
● PARENT_OVERRIDES
  ○ The permission of the parent ACL dominates the child ACL, except when the parent permission is INDETERMINATE. In this case, the child permission dominates.
● CHILD_OVERRIDES
  ○ The permission of the child ACL dominates the parent ACL, except when the child permission is INDETERMINATE. In this case, the parent permission dominates.
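The combination rules can be sketched as a small function. The PARENT_OVERRIDES branch follows the description above. The CHILD_OVERRIDES branch is written as its mirror image, which is an assumption of this sketch. Letting the parent's inheritance type govern each step of the chain is also an assumption, and all function names are hypothetical.

```python
PERMIT, DENY, INDETERMINATE = "PERMIT", "DENY", "INDETERMINATE"


def combine(parent, child, inheritance_type):
    """Combine a parent decision and a child decision into one."""
    if inheritance_type == "PARENT_OVERRIDES":
        # Parent wins unless it has no opinion.
        return child if parent == INDETERMINATE else parent
    if inheritance_type == "CHILD_OVERRIDES":
        # Assumed mirror image: child wins unless it has no opinion.
        return parent if child == INDETERMINATE else child
    raise ValueError(f"unsupported inheritance type: {inheritance_type}")


def evaluate_chain(chain):
    """Evaluate a chain of (local decision, inheritance type) pairs.

    The chain is ordered from the top-most parent down to the document.
    The document's own inheritance type is unused because nothing
    inherits from it.
    """
    result = chain[-1][0]
    # Walk from the document's direct parent back up to the top.
    for local_decision, inheritance_type in reversed(chain[:-1]):
        result = combine(parent=local_decision, child=result,
                         inheritance_type=inheritance_type)
    return result
```

For example, a top-level ACL that denies with CHILD_OVERRIDES, a middle ACL that is INDETERMINATE with PARENT_OVERRIDES, and a document ACL that permits evaluate to PERMIT in this sketch, because neither ancestor overrides the document's own decision.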
“Free” ACL example edward william ben nobles playwrights ... ... In this example, http://dummyhost.corp.
Connector Framework for Authorization

Another option for modeling security is implementing a custom connector. As explained in this paper and in the GSA documentation, a connector can be created to “traverse” or feed public or secure content into the search appliance, as well as to support serve time authentication and authorization. We have discussed connectors using Per-URL ACL. Here we will discuss using connectors to perform authorization as a late binding mechanism.
Web proxy

The options described above are the most common platforms used to implement the security side of the interconnection with a content source. There are others, such as using a web proxy to manage the authorization. In this case, the authorization is centralized in a web proxy that requires all URLs to be rewritten to go through it. The search appliance then sends HTTP HEAD requests to validate security before serving results.
Summary

In this paper, we have reviewed the process of designing security for your enterprise search project with the Google Search Appliance. This requires a solid understanding of security in your organization, as well as the related content sources that will be part of the project. You need to invest quality time in analyzing this scenario and modeling authentication and authorization in the search appliance.
Appendix A

Sample Trusted Application client code in C#

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.IO;
using System.Text;

namespace TrustedApp
{
    class GSAClient
    {
        String GSA_SESSION_ID = "GSA_SESSION_ID";
        String _gsaSessionId = null;
        String _trustedUser;
        String _trustedPwd;
        String _gsaHostName;
        String _endUser;
        String _credentialGroup;

        static void Main(string[] args)
        {
            String gsaHostName = "gsa.acme.
            request.ContentType = "application/x-www-form-urlencoded";
            ServicePointManager.ServerCertificateValidationCallback = new System.Net.Security.RemoteCertificateValidationCallback(AcceptAllCertifications);
            request.Proxy = WebRequest.DefaultWebProxy;
            request.CookieContainer = new CookieContainer();
            if (_gsaSessionId != null)
            {
                request.CookieContainer.Add(new Cookie(GSA_SESSION_ID, _gsaSessionId) { Domain = _gsaHostName });
            }
            else
            {
                string authInfo = _trustedUser + ":" + _trustedPwd;
                authInfo = Convert.
                    iRetry++;
                    goto Initiate;
                }
                else
                    throw e; //if still fails, it might be some other cause.
            }
            return strRsps;
        }

        public static bool AcceptAllCertifications(object sender, System.Security.Cryptography.X509Certificates.X509Certificate certification, System.Security.Cryptography.X509Certificates.X509Chain chain, System.Net.Security.