Administrator's Guide

8 Regional configuration
8.1 Overview
This chapter describes the configuration needed to support High Availability (HA) between SDN
controllers and OpenFlow switches. You enable HA by creating region configurations in the controllers
using the REST APIs provided by the Role Orchestration Service (ROS).
Putting the region configurations in place in a controller team ensures seamless failover and failback
among the configured controllers for the specified network devices in a region. That is, when a
master controller experiences a fault, the Role Orchestration Service ensures that a slave controller
immediately assumes the master role over the group of network devices for which the failed controller
held the master role. Once the failed controller recovers and rejoins the team, the Role
Orchestration Service restores that controller's configured role. If the controller was configured to
operate as the master in a region, it is restored to the master role. If it was configured to operate
in the slave role, it resumes operation in the slave role.
Once the region definition(s) are in place, the ROS ensures that a master controller is always
available to the respective network element(s) even if the configured master fails or there is a
disruption of the communication channel between the controller and the network device(s).
NOTE: All region configuration operations (create, update, refresh, and delete) using the REST
API require that every controller specified in the region, including the master controller and all
slave controllers, be in an active state. If any controller in the region is in a "down" state, the
region configuration operations are disallowed.
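The precondition in the NOTE above can be sketched as a simple membership check. This is a minimal illustration only: the `Region` type, the `region_operation_allowed` function, and the idea of an `active` set of controller identifiers are assumptions made for the example and are not part of the ROS REST API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical in-memory view of a region definition."""
    master: str                # identifier of the configured master controller
    slaves: tuple[str, ...]    # identifiers of the configured slave controllers


def region_operation_allowed(region: Region, active: set[str]) -> bool:
    """Return True only if every controller named in the region is active.

    Mirrors the rule in the NOTE: create, update, refresh, and delete are
    disallowed as soon as one member controller is down.
    """
    members = (region.master, *region.slaves)
    return all(controller in active for controller in members)
```

For example, a region with master `c1` and slaves `c2` and `c3` passes the check only while all three identifiers are present in the `active` set.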
8.1.1 Failover
ROS triggers the failover operation in two cases:
• Controller failure: The ROS detects a controller failure in a team through notifications from the
  teaming subsystem. If ROS determines that the failed controller instance was a master for any
  region, it immediately elects one of the backup (slave) controllers to assume the master role
  over the affected region.
• Device disconnect: The ROS instance in a controller is notified of a communication failure
  with network device(s) through the Controller Service notifications. It instantly communicates
  with all ROS instances in the team to determine if the network device(s) in question are still
  connected to any of the backup (slave) controllers within the team. If that is the case, it elects
  one of the slaves to assume the master role over the affected network device(s).
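The two failover triggers above can be sketched as follows. This guide does not state how ROS chooses among several eligible slaves, so the sketch assumes, purely for illustration, that the first eligible slave in configured order is elected. All function and parameter names are hypothetical.

```python
from __future__ import annotations


def elect_on_controller_failure(failed: str, master: str,
                                slaves: list[str],
                                active: set[str]) -> str | None:
    """Return the master for a region after controller `failed` goes down.

    If the failed controller was not the region's master, nothing changes.
    Otherwise the first slave that is still active takes over (the selection
    order is an assumption). Returns None if no slave is available.
    """
    if failed != master:
        return master
    for slave in slaves:
        if slave != failed and slave in active:
            return slave
    return None


def elect_on_device_disconnect(device: str, slaves: list[str],
                               connections: dict[str, set[str]]) -> str | None:
    """Return a slave that still has a connection to `device`, if any.

    `connections` maps each controller to the set of devices it reports as
    connected, standing in for the answers gathered from the other ROS
    instances in the team.
    """
    for slave in slaves:
        if device in connections.get(slave, set()):
            return slave
    return None
```

In the device-disconnect case, a return value of `None` corresponds to the situation where no slave in the team still reaches the device, so no new master is elected.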
8.1.2 Failback
When the configured master recovers from a failure and rejoins the team, or when the connection
from the disconnected device(s) with the original master is resumed, ROS initiates a failback
operation in which the master role is restored to the configured master as defined in the region
definition.
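The failback rule can be sketched in the same style. The function and parameter names are hypothetical, and the two boolean inputs stand in for whatever state ROS actually tracks about team membership and device connectivity.

```python
def failback_master(configured_master: str, current_master: str,
                    configured_master_in_team: bool,
                    device_connected_to_configured_master: bool) -> str:
    """Return which controller should hold the master role.

    The configured master takes the role back only when it is a member of
    the team again and the device connection to it has been re-established.
    Until then the currently acting master keeps the role.
    """
    if configured_master_in_team and device_connected_to_configured_master:
        return configured_master
    return current_master
```

With configured master `c1` and acting master `c2`, the function returns `c1` once both conditions hold and `c2` otherwise.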
The next section provides details about the various REST operations that can be used to create,
update, and delete region configurations.
NOTE: Examples of cURL commands in this guide use the --noproxy option, which is appropriate
where execution of cURL commands does not need a proxy to access controllers. If your network
is set up such that a proxy is needed to access controllers, use the --proxy option. For details
on cURL proxy options, visit http://curl.haxx.se/docs/manpage.html.
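As an illustration of the choice between the two options, the following sketch builds a cURL argument list for either case. The `build_curl_args` helper is hypothetical, and any controller URL or proxy address passed to it is a placeholder, not a documented endpoint. The sketch relies only on the cURL `--noproxy` option (a list of hosts that bypass the proxy) and the `--proxy` option (the proxy to use).

```python
from __future__ import annotations

from urllib.parse import urlsplit


def build_curl_args(url: str, proxy: str | None = None) -> list[str]:
    """Build a cURL command line for a controller URL.

    With no proxy given, the controller host is passed to --noproxy so that
    cURL contacts it directly. Otherwise the given proxy is passed to --proxy.
    """
    args = ["curl"]
    if proxy is None:
        host = urlsplit(url).hostname or ""
        args += ["--noproxy", host]
    else:
        args += ["--proxy", proxy]
    args.append(url)
    return args
```

The returned list can be passed to `subprocess.run` or joined for display. Other options your commands need, such as headers or request data, would be added to the list in the same way.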