Google Search Appliance Administrative API Developer’s Guide: .NET Google Search Appliance software version 7.
Google, Inc. 1600 Amphitheatre Parkway Mountain View, CA 94043 www.google.com GSA-CSAPI_100.05 December 2013 © Copyright 2013 Google, Inc. All rights reserved. Google and the Google logo are registered trademarks or service marks of Google, Inc. All other trademarks are the property of their respective owners. Use of any Google solution is governed by the license agreement included in your original contract.
Contents
Administrative API Developer’s Guide: .NET
Administration
    License Information
    Import and Export
    Event Log
    System Status
    Shutdown or Reboot
Index
Administrative API Developer’s Guide: .NET Introduction This guide provides .NET programming information about how to use the Google Data API to create, retrieve, update, and delete information for one or more Google Search Appliance devices. Use the information in this guide to create or learn about coding .NET applications that programmatically set the administrative functions for the Admin Console of a search appliance. This document uses C# examples, but you can use any .
The information in this section helps you understand how to write your own applications based on the C#.NET client library and how to run the provided open source sample applications. Before starting, you need the following:
• Microsoft Visual C# 2008 Express Edition (http://www.microsoft.com/exPress/download/#webInstall), which includes a free version of Visual Studio so that you can work with the .NET client library.
Building Your Applications
This section explains how to build your own applications using the client library outside the solution file provided by the ZIP archive.
To build an application:
1. Copy the following client library DLL files from the cs\lib folder to your development folder and add them in the reference path:
   • Google.GData.Apps.dll
   • Google.GData.Client.dll
   • Google.GData.Extensions.dll
   • Google.GData.Gsa.dll
2. In Visual Studio, create or open a new project.
3.
• “Document Status” on page 21
Crawl URLs
Retrieve and update crawl URL patterns on a search appliance using the crawlURLs entry of the config feed.
Property          Description
doNotCrawlURLs    Do not crawl URLs that match the following patterns. Separate multiple URL patterns with newline delimiters.
followURLs        Follow and crawl only URLs that match the following patterns. Separate multiple URL patterns with newline delimiters.
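Because both crawl URL properties take several patterns in one newline-delimited string, a small helper can keep the delimiter handling in one place. The following is a minimal sketch; the class name UrlPatternHelper and the method name JoinPatterns are hypothetical and are not part of the client library.

```csharp
using System;

// Hypothetical helper, not part of the Google client library.
public static class UrlPatternHelper
{
    // Joins individual URL patterns into a single newline-delimited
    // string, the form described for followURLs and doNotCrawlURLs.
    public static string JoinPatterns(params string[] patterns)
    {
        if (patterns == null)
        {
            throw new ArgumentNullException("patterns");
        }
        return string.Join("\n", patterns);
    }
}
```

The result could then be passed as the value of a property, for example as the second argument of an AddGsaContent call for followURLs.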
Data Source Feeds Retrieve, delete, and destroy data source feed information for the search appliance using the feed feed. The following parameters let you search for a string and retrieve source statements. Parameter Description query The query string. When used to retrieve all feed information, the query parameter is overloaded to mean the feedDataStore. When getting information about a single feed, the parameter is a query. Each log statement contains a query string to be retrieved.
Retrieving Data Source Feed Information
Retrieve all data source feed information from a search appliance using the feed feed:
// Send a request and print the response
Dictionary<string, string> queries = new Dictionary<string, string>();
queries.Add("query", feedDataSource);
GsaFeed myFeed = myService.QueryFeed("feed", queries);
foreach (GsaEntry myEntry in myFeed.Entries)
{
    // Get information on each myEntry
    Console.WriteLine("Feed Name: " + myEntry.GetGsaContent("entryID"));
    Console.
Destroying Data Source Feeds After deleting a data source feed, you can destroy the feed so that the feed no longer exists on the search appliance: myService.DeleteEntry("feed", FEED_NAME); Trusted Feed IP Addresses Retrieve and update trusted feed IP addresses using the feedTrustedIP entry of the config feed. Retrieve the IP addresses of trusted feeds using the trustedIPs property. Property Description trustedIPs Trusted IP addresses. This value is a list of one or more IP addresses.
Crawl Schedule Retrieve and update the crawl schedule for a search appliance. Property Description crawlSchedule The crawl schedule is only available in scheduled crawl mode. The value of crawlSchedule has the format: Day,Time,Duration Where: isScheduledCrawl • Day is a number representing the days of a week: 0 means Sunday and 1 means Monday. • Time is a 24-hour representation of time. The time pertains to the search appliance and not the computer running the application to set the value.
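The Day,Time,Duration layout can be split on the client before it is displayed or validated. The sketch below rests on several assumptions: that the string holds a single schedule entry, that day numbers continue from 0 (Sunday) and 1 (Monday) through 6 (Saturday), and that Time and Duration are best kept as raw text because their exact formats are not restated here. The type name CrawlScheduleEntry and the sample value used with it are hypothetical.

```csharp
using System;

// Hypothetical type, not part of the Google client library.
public struct CrawlScheduleEntry
{
    public DayOfWeek Day;     // Assumes 0 = Sunday through 6 = Saturday
    public string Time;       // Kept as raw text; format is not assumed
    public string Duration;   // Kept as raw text; unit is not assumed

    // Splits one "Day,Time,Duration" value into its three fields.
    public static CrawlScheduleEntry Parse(string value)
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        string[] parts = value.Split(',');
        if (parts.Length != 3)
        {
            throw new FormatException("Expected Day,Time,Duration");
        }
        int day = int.Parse(parts[0].Trim());
        if (day < 0 || day > 6)
        {
            throw new FormatException("Day must be between 0 and 6");
        }
        CrawlScheduleEntry entry = new CrawlScheduleEntry();
        entry.Day = (DayOfWeek)day;
        entry.Time = parts[1].Trim();
        entry.Duration = parts[2].Trim();
        return entry;
    }
}
```

System.DayOfWeek already numbers Sunday as 0 and Monday as 1, which is why the cast is used instead of a lookup table.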
Crawler access rules instruct the crawler how to authenticate when crawling the protected content. Property Description domain Windows domain for NTLM, or empty for HTTP Basic authorization. isPublic Indicates whether to allow users to view results of both the public content (normally available to everyone) and the secure (confidential) content. The value can be 1 to enable users to view content as public, or 0 to require users to authenticate to view secure content.
Retrieve an individual crawler access rule as follows:
// Send the request and print the response
GsaEntry myEntry = myService.GetEntry("crawlAccessNTLM", "urlPattern");
Console.WriteLine("URL Pattern: " + myEntry.GetGsaContent("urlPattern"));
Console.WriteLine("User Name: " + myEntry.GetGsaContent("username"));
Console.WriteLine("Order: " + myEntry.GetGsaContent("order"));
Console.WriteLine("Domain: " + myEntry.GetGsaContent("domain"));
Console.WriteLine("Is Public: " + myEntry.
Host Load Schedule Retrieve and update host load schedule information from the search appliance using the hostLoad entry of the config feed. Property Description defaultHostLoad The default web server host load, a float value. This value measures the relative load on the search appliance based on the number of connections that a search appliance can handle.
Updating the Host Load Schedule
Update the host load schedule setting in a search appliance as follows:
// Create an entry to hold properties to update
GsaEntry updateEntry = new GsaEntry();
// Add a property for the Host Load Schedule to updateEntry
updateEntry.AddGsaContent("defaultHostLoad", "2.4");
updateEntry.AddGsaContent("exceptionHostLoad", "* 3 5 1.2 \n www.example.com 1 6 3.6");
updateEntry.AddGsaContent("maxURLs", "3000");
// Send the request
myService.
Recrawling URL Patterns
If you discover that a set of URLs that you want to have in the search index are not being crawled, you can inject a URL pattern into the queue of URLs that the search appliance is crawling. URLs may not appear in the index because changes were made to the web pages, or because a temporary error or misconfiguration was present when the crawler last tried to crawl the URL.
Property       Description
recrawlURLs    URL patterns to be recrawled.
Retrieving a List of Connector Managers
Retrieve a list of connector managers as follows:
// Send the request and print the response
GsaFeed myFeed = myService.GetFeed("connectorManager");
foreach (GsaEntry myEntry in myFeed.Entries)
{
    Console.WriteLine("Status: " + myEntry.GetGsaContent("status"));
    Console.WriteLine("Description: " + myEntry.GetGsaContent("description"));
    Console.WriteLine("URL: " + myEntry.
The properties for retrieving a OneBox are as follows:
Property      Description
maxResults    Maximum number of results.
timeout       OneBox response timeout in milliseconds.
Updating OneBox Module Settings
Update the OneBox settings for a search appliance as follows. In this example, three results are requested and the timeout is set to 2000 milliseconds.
// Create an entry to hold properties to update
GsaEntry updateEntry = new GsaEntry();
// Add properties for the OneBox settings to updateEntry
updateEntry.
Retrieve an individual OneBox module’s log information from a search appliance as follows:
// Send the request and print the response
GsaEntry myEntry = myService.GetEntry("onebox", ONEBOX_NAME);
Console.WriteLine("OneBox Log: " + myEntry.GetGsaContent("logContent"));
Note: You can only retrieve OneBox log entries individually.
Deleting a OneBox Module
Delete a OneBox module from a search appliance as follows:
myService.
Document Status
Retrieve document status using the properties that follow.
Property              Description
crawledURLsToday      The number of documents crawled since yesterday. (Note that the time pertains to the search appliance, not the computer sending this request.)
crawlPagePerSecond    Current crawling rate.
errorURLsToday        The number of document errors since yesterday.
filteredBytes         The number of document bytes that have been filtered.
foundURLs             The number of URLs found that match crawl patterns.
Collections Retrieve, update, create, or delete the collections of documents on the search appliance. Property Description collectionName The name of the collection to create, which is only required when creating a new collection. doNotCrawlURLs The URL patterns of content that you want to exclude from this collection. followURLs The URL patterns of content that you want to include in this collection.
Retrieving All Collections
Retrieve a list of collections as follows:
// Send the request and print the response
GsaFeed myFeed = myService.GetFeed("collection");
foreach (GsaEntry myEntry in myFeed.Entries)
{
    Console.WriteLine("Collection Name: " + myEntry.GetGsaContent("entryID"));
    Console.WriteLine("Follow URLs: " + myEntry.GetGsaContent("followURLs"));
    Console.WriteLine("Do Not Crawl URLs: " + myEntry.
Document Status Values
The following tables list the document status values.
Note: Use the value all to indicate any status value.
Excluded    Description
6           Long redirect chain
8           Infinite URL space
9           Unhandled protocol
10          URL is too long
13          The robots.txt file indicates to not index
18          Rejected by rewrite rules
19          Unknown extension
20          Disallowed by a meta tag
24          Disallowed by the robots.txt file
26          Unhandled content type
27          No filter for this content type
34          robots.txt forbidden
Listing Documents
Query parameters:
Parameter         Description
collectionName    Name of a collection that you want to list. The default value is the last used collection.
flatList          Indicates: false: (Default) List the files and directories specified by the URL. true: List all files specified by a URL as a flat list.
negativeState     Indicates: false: (Default) Return documents with a status equal to view. true: Return documents with a status not equal to view.
pageNum           The page you want to view.
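The listing parameters are plain string key/value pairs, so they can be collected in a dictionary of the kind the query calls in this guide accept. The following is a minimal sketch that covers only the four parameters named above; the class name DocumentListQuery and the method name Build are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the Google client library.
public static class DocumentListQuery
{
    // Builds the query map for a document listing request.
    // collectionName is omitted when empty so that the appliance
    // falls back to its documented default.
    public static Dictionary<string, string> Build(
        string collectionName, bool flatList, bool negativeState, int pageNum)
    {
        Dictionary<string, string> queries = new Dictionary<string, string>();
        if (!string.IsNullOrEmpty(collectionName))
        {
            queries.Add("collectionName", collectionName);
        }
        queries.Add("flatList", flatList ? "true" : "false");
        queries.Add("negativeState", negativeState ? "true" : "false");
        queries.Add("pageNum", pageNum.ToString(CultureInfo.InvariantCulture));
        return queries;
    }
}
```

Booleans are written out as lowercase true and false to match the values in the parameter table, rather than relying on the capitalized output of bool.ToString().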
Directory status entry properties:
Property              Description
Entry Name            The URL of the directory.
numCrawledURLs        The number of crawled documents in this directory.
numExcludedURLs       The number of excluded URLs in this directory.
numRetrievalErrors    The number of retrieval error documents in this directory.
type                  DirectoryContentData or HostContentData.
Document status entry properties:
Property      Description
Entry Name    The URL of the document.
docState      The status of this document.
Viewing Index Diagnostics for a Document Retrieve detailed information about a document by sending an authenticated GET request to a document status entry of the diagnostics feed. The parameter is as follows. Parameter Description collectionName Name of the collection for which you want to view crawl diagnostics. A detailed document status entry is returned. Detailed document status entry properties: Property Description Entry Name The URL of the document.
GsaEntry entry = myService.GetEntry("diagnostics", "http://server.com/secured/test1/doc_0_2.html");
Console.WriteLine("Collection List: " + entry.GetGsaContent("collectionList"));
Console.WriteLine("Forward Links: " + entry.GetGsaContent("forwardLinks"));
Console.WriteLine("Backward Links: " + entry.GetGsaContent("backwardLinks"));
Console.WriteLine("Is Cached: " + entry.GetGsaContent("isCached"));
Console.WriteLine("Document Date: " + entry.GetGsaContent("date"));
Console.
A list of content statistics entries is returned.
GsaFeed myFeed = myService.GetFeed("contentStatistics");
foreach (GsaEntry entry in myFeed.Entries)
{
    Console.WriteLine("Entry Name: " + entry.GetGsaContent("entryID"));
    Console.WriteLine("Maximum Size: " + entry.GetGsaContent("maxSize"));
    Console.WriteLine("Minimum Size: " + entry.GetGsaContent("minSize"));
    Console.WriteLine("Total Size: " + entry.GetGsaContent("totalSize"));
    Console.WriteLine("Average Size: " + entry.GetGsaContent("avgSize"));
    Console.
Resetting the Index
Reset the index as follows:
// Create an entry to hold properties to update
GsaEntry updateEntry = new GsaEntry();
// Add a property to updateEntry
updateEntry.AddGsaContent("resetIndex", "1");
myService.
Get information about a front end as follows:
// Send a request and print the response
GsaEntry myEntry = myService.GetEntry("frontend", FRONTEND_NAME);
Console.WriteLine("Front End OneBox: " + myEntry.GetGsaContent("frontendOnebox"));
Console.WriteLine("Remove URLs: " + myEntry.
Output Format XSLT Stylesheet
Retrieve and update the XSLT template and other output format-related properties for each language of each front end using the frontend entry of the outputFormat feed.
Parameter    Description
language     Specify a language for the output format properties that you want to retrieve. Each front end can contain multiple languages, and each language has its own output format properties. Each combination of front end and language can have its own XSLT stylesheet.
Retrieving the Output Format XSLT Stylesheet
Retrieve the output format stylesheet information from a search appliance as follows:
Dictionary<string, string> queryMap = new Dictionary<string, string>();
// Initialize the query map
queryMap.Add("language", "en");
GsaEntry myEntry = myService.QueryEntry("outputFormat", "default_frontend", queryMap);
Console.WriteLine("Language: " + myEntry.GetGsaContent("language"));
Console.WriteLine("Default Language: " + myEntry.
Parameter    Description
startLine    The starting line number of a result. The default value is 0.
maxLines     The number of result lines in a response. The default value is 50 lines.
Use the following properties to set KeyMatch configurations.
Property       Description
line_number    The line number of the KeyMatch configuration rule.
newLines       The new KeyMatch configuration to update. This value may include multiple KeyMatch statements. The line delimiter is \n.
numLines       The total number of result lines.
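KeyMatch statements are passed as one string with \n between statements. The update example later in this section writes each statement as four comma-separated fields (search term, match type, URL, title), for example image,KeywordMatch,http://images.google.com/,Google Image Search. The sketch below assembles such strings under that reading. The class name KeyMatchLines and its methods are hypothetical, and no escaping is attempted for fields that themselves contain commas.

```csharp
using System;

// Hypothetical helper, not part of the Google client library.
public static class KeyMatchLines
{
    // Formats one KeyMatch statement as term,matchType,url,title.
    // Assumes that none of the fields contains a comma.
    public static string FormatLine(
        string term, string matchType, string url, string title)
    {
        return string.Join(",", new string[] { term, matchType, url, title });
    }

    // Joins several statements with the \n line delimiter.
    public static string JoinLines(params string[] lines)
    {
        if (lines == null)
        {
            throw new ArgumentNullException("lines");
        }
        return string.Join("\n", lines);
    }
}
```

The joined string is the kind of value that the newLines property holds.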
Retrieving KeyMatch Settings
Retrieve KeyMatch settings as follows:
Dictionary<string, string> queryMap = new Dictionary<string, string>();
// Initialize the query map
queryMap.Add("query", "myQuery");
queryMap.Add("startLine", "0");
queryMap.Add("maxLines", "50");
// Send the request and print the response
GsaEntry myEntry = myService.QueryEntry("keymatch", "myFrontend", queryMap);
foreach (KeyValuePair<string, string> kvp in myEntry.GetAllGsaContents())
{
    if (Regex.IsMatch(kvp.Key, @"^\d+$"))
    {
        Console.
The following example updates KeyMatch settings:
// Create an entry to hold properties to update
GsaEntry updateEntry = new GsaEntry();
updateEntry.AddGsaContent("updateMethod", "update");
// Set the start line number
updateEntry.AddGsaContent("startLine", "0");
// Provide the original content
string originalLines = "image,KeywordMatch,http://images.google.com/,Google Image Search\n" +
    "video,KeywordMatch,http://www.youtube.com/,Youtube\n" +
    "rss feed,PhraseMatch,http://www.google.
Use the following properties to access related queries. Property Description line number The line number of the related query configuration rule (in all the rules). newLines The new related query configuration to add. This value may include multiple lines of related query statements. The delimiter is \n. numLines The total number of result lines. originalLines The original related query configuration to change. This value may include multiple lines of related query statements. The delimiter is \n.
Changing Related Queries
The following example appends related queries:
// Create an entry to hold properties to append
GsaEntry appendEntry = new GsaEntry();
appendEntry.AddGsaContent("updateMethod", "append");
// Prepare new content
string newLines = "airplane,aircraft\n" +
    "google,googol\n" +
    "stock,security";
appendEntry.AddGsaContent("newLines", newLines);
// Send the request to the search appliance
myService.
Query Suggestion Blacklist
The query suggestion blacklist supports the /suggest feature described in the “Query Suggestion Service /suggest Protocol” chapter of the Search Protocol Reference. This feature uses the suggest feed to retrieve and update the query suggestion blacklist entries.
Property            Description
suggestBlacklist    Content of the suggest blacklist file.
The query suggestion blacklist supports the regular expressions in the re2 library (http://code.google.com/p/re2/wiki/Syntax).
Search Status
Retrieve the search status for the search appliance using the servingStatus entry of the status feed.
Property            Description
queriesPerMinute    Average queries per minute served recently on the search appliance.
searchLatency       Recent search latency in seconds.
Retrieving Search Status
Retrieve the current search appliance serving status as follows:
GsaEntry myEntry = myService.GetEntry("status", "servingStatus");
Console.WriteLine("Queries Per Minute: " + myEntry.
Property Description reportState (Read only) The status of a search report: • 0: The search report is initializing. • 1: The search report is generating. • 2: The search report is complete. • 3: A non-final complete report is generating. • 4: The last report generation failed. topCount The number of top queries to generate. withResults Indicates if a query should only count searches that have results. The default value is false.
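Because reportState comes back as a numeric code, client code usually needs to translate it before showing it to a user or deciding whether to keep polling. The following is a minimal sketch of such a lookup, using only the five codes listed above. The class name ReportStates and its methods are hypothetical, and the value is assumed to arrive as a string, as other property values in this guide do.

```csharp
using System;

// Hypothetical helper, not part of the Google client library.
public static class ReportStates
{
    // Translates a reportState code into a short description.
    public static string Describe(string reportState)
    {
        switch (reportState)
        {
            case "0": return "initializing";
            case "1": return "generating";
            case "2": return "complete";
            case "3": return "non-final complete report generating";
            case "4": return "last generation failed";
            default: return "unknown (" + reportState + ")";
        }
    }

    // True only for state 2, the final complete report.
    public static bool IsComplete(string reportState)
    {
        return reportState == "2";
    }
}
```

A caller might pass the result of a GetGsaContent("reportState") call to Describe before printing it.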
Purpose       Format
Year          year_year
Date range    range_month_day_year_month_day_year
For example, to specify the range of dates from 2 January 2009 to 23 September 2009, use this statement:
insertEntry.AddGsaContent("reportDate", "range_1_2_2009_9_23_2009");
A new search report entry will be generated and returned as follows.
GsaEntry insertEntry = new GsaEntry();
insertEntry.AddGsaContent("reportName", "bbb");
insertEntry.AddGsaContent("collectionName", "default_collection");
insertEntry.
Deleting a Search Report Delete a search report by sending an authenticated DELETE request to a search report entry of the searchReport feed. The search report entry will be deleted: myService.DeleteEntry("searchReport", "bbb@default_collection"); Search Logs Generate, update, and delete a search log using the searchLog feed. A search log lists all search queries for a specified time frame in a format similar to a common log format (CLF).
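The reportDate values described for search reports, and the entry names used in the examples (such as bbb@default_collection, which pairs the report name bbb with the collection default_collection), are easy to mistype by hand. The sketch below builds them from typed values. The class name SearchReportNames and its methods are hypothetical. The name@collection pattern is inferred from the examples rather than stated as a rule, and the year form is assumed to be the literal prefix year_ followed by the four-digit year.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the Google client library.
public static class SearchReportNames
{
    // Builds a year value, for example year_2009 (assumed reading
    // of the year_year format).
    public static string YearValue(int year)
    {
        return "year_" + year.ToString(CultureInfo.InvariantCulture);
    }

    // Builds range_month_day_year_month_day_year without zero
    // padding, matching the range_1_2_2009_9_23_2009 example.
    public static string RangeValue(DateTime start, DateTime end)
    {
        if (end < start)
        {
            throw new ArgumentException("end precedes start");
        }
        return string.Format(CultureInfo.InvariantCulture,
            "range_{0}_{1}_{2}_{3}_{4}_{5}",
            start.Month, start.Day, start.Year,
            end.Month, end.Day, end.Year);
    }

    // Builds an entry name of the form reportName@collectionName.
    public static string EntryName(string reportName, string collectionName)
    {
        return reportName + "@" + collectionName;
    }
}
```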
Listing a Search Log
List search log entries by sending an authenticated GET request to the root entry of the searchLog feed.
Parameter         Description
collectionName    Collection name of a search log. The default value is all.collections.
A list of search log entries will be returned.
GsaFeed myFeed = myService.GetFeed("searchLog");
foreach (GsaEntry entry in myFeed.Entries)
{
    Console.WriteLine("Entry Name: " + entry.GetGsaContent("entryID"));
    Console.WriteLine("Report State: " + entry.
A search log entry with logContent, if content is ready, is returned.
Dictionary<string, string> queries = new Dictionary<string, string>();
queries.Add("query", "User");
queries.Add("startLine", "1");
queries.Add("maxLine", "10");
GsaEntry entry = myService.QueryEntry("searchLog", "bbb@default_collection", queries);
Console.WriteLine("Entry Name: " + entry.GetGsaContent("entryID"));
Console.WriteLine("Report State: " + entry.GetGsaContent("reportState"));
Console.WriteLine("Report Creation Date: " + entry.
GSA Unification is also known as dynamic scalability. This section describes use of the federation feed. Configuring a GSA Unification Network Retrieve, update, create, or delete the GSA Unification node configuration and retrieve the node configuration of all nodes in the network on the Google Search Appliance. Property Description applianceId The ID of the search appliance, required to identify the node during node operations.
Adding a GSA Unification Node
Add a GSA Unification node as follows:
// Create an entry to hold properties to insert
GsaEntry insertEntry = new GsaEntry();
// In the following example code, add a secondary
// node with arbitrary values for the various settings.
// Add properties to insertEntry
insertEntry.AddGsaContent("entryID", "node_appliance_id");
insertEntry.AddGsaContent("nodeType", "SECONDARY");
insertEntry.AddGsaContent("federationNetworkIP", "10.0.0.2");
insertEntry.
Updating a Node Configuration
Update the configuration of a node as follows:
// Create an entry to hold properties to update
GsaEntry updateEntry = new GsaEntry();
// Add properties to updateEntry
updateEntry.AddGsaContent("entryID", "applianceId");
updateEntry.AddGsaContent("nodeType", "PRIMARY");
updateEntry.AddGsaContent("federationNetworkIP", "10.0.0.3");
updateEntry.AddGsaContent("secretToken", "new_secret_token");
updateEntry.AddGsaContent("hostname", "new_hostname");
updateEntry.
Retrieving License Information Retrieve license information using the following properties. Property Description applianceID Provides the identification value for the Google Search Appliance software. This value is also known as the serial number for the search appliance. licenseID Provides the unique license identification value. licenseValidUntil Identifies when the search appliance software license expires. maxCollections Indicates the maximum number of collections.
Exporting a Configuration
Export a search appliance configuration by sending an authenticated GET request to the importExport entry of the config feed. The following importExport entry is returned:
Dictionary<string, string> queries = new Dictionary<string, string>();
queries.Add("password", "12345678");
GsaEntry entry = myService.QueryEntry("config", "importExport", queries);
Console.WriteLine("XML Data: " + entry.
Retrieving an Event Log
Retrieve the event log information from a search appliance as follows:
Dictionary<string, string> queries = new Dictionary<string, string>();
queries.Add("query", "User");
queries.Add("startLine", "10");
queries.Add("maxLine", "2");
GsaEntry myEntry = myService.QueryEntry("logs", "eventLog", queries);
Console.WriteLine("Log Content: " + myEntry.GetGsaContent("logContent"));
Console.WriteLine("Total Lines: " + myEntry.GetGsaContent("totalLines"));
Console.
Shutdown or Reboot Shut down or reboot the search appliance. Property Description command Command sent to the search appliance. The command can be shutdown or reboot. runningStatus Indicates the search appliance status: • shuttingDown: If you sent the shutdown command. • rebooting: If you sent the reboot command. • running: If the search appliance is operating normally.
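Since a mistyped command value could have disruptive effects, it may be worth checking the value on the client before sending it. The following is a minimal sketch that uses only the two command values and the three status values listed above; the class name ApplianceCommands and its methods are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of the Google client library.
public static class ApplianceCommands
{
    // True only for the two documented command values.
    public static bool IsValidCommand(string command)
    {
        return command == "shutdown" || command == "reboot";
    }

    // True when runningStatus reports normal operation.
    public static bool IsRunning(string runningStatus)
    {
        return runningStatus == "running";
    }
}
```

The comparison is case-sensitive on purpose, so that a value such as Reboot is rejected rather than silently normalized.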