Problem Statement
What is the most suitable deployment mode for vCenter Single Sign-On (SSO) in an environment with a single physical datacenter and multiple vCenter servers?
Requirements
1. The solution must be a fully supported configuration
2. Meet or exceed an RTO of 4 hours
3. Support single pane of glass management
4. Ability to scale for future vCenters and/or datacenters
Assumptions
1. All vCenter instances can access the same Authentication source (Active Directory or OpenLDAP)
2. The average number of authentications per second for each SSO instance is <30 (Configuration Maximum)
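Assumption 2 can be validated against real login data before committing to the design. The following is a minimal sketch, assuming authentication events for one SSO instance are available as a list of timestamps over a known observation window. The function names and the sample data are hypothetical; only the limit of 30 comes from the assumption above.

```python
MAX_AUTHS_PER_SECOND = 30  # limit taken from assumption 2 above


def average_auths_per_second(timestamps, window_seconds):
    """Average authentication rate over an observation window of window_seconds."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return len(timestamps) / window_seconds


def within_assumption(timestamps, window_seconds, limit=MAX_AUTHS_PER_SECOND):
    """True if the average rate stays strictly below the limit (<30 per second)."""
    return average_auths_per_second(timestamps, window_seconds) < limit


# Hypothetical sample: 120 login events observed during a 60-second window.
sample_events = [i * 0.5 for i in range(120)]
rate = average_auths_per_second(sample_events, 60)
print(f"average rate: {rate:.1f}/s, within assumption: {within_assumption(sample_events, 60)}")
```

If the measured average for any instance approaches the limit, the assumption no longer holds and the design should be revisited.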
Constraints
1. vCenter servers reside in different network security zones within the datacenter
Motivation
1. Future proof the environment
Architectural Decision
1. Use “Multi-site” SSO deployment mode
2. Use one SSO instance per vCenter
3. Each SSO instance will reside with its vCenter on a Windows Server 2008 R2 x64 virtual machine in a vSphere cluster with HA enabled
4. Each SSO instance will use the bundled SQL database
5. (Optional) For greater availability, vCenter Heartbeat can be used to protect each SSO instance along with vCenter and the bundled SSO database
6. The virtual machine hosting vCenter/SSO will have 2 vCPUs and 10GB RAM to support vCenter, SSO and Inventory Service, plus an additional 2GB RAM to support the bundled SSO database
7. Using the bundled SSO database ensures that only a single vCenter Heartbeat deployment is required to protect each vCenter/SSO instance, and reduces Windows licensing costs
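The decisions above can be sketched as a small sizing model: one combined vCenter/SSO virtual machine per vCenter, each in Multi-site mode with the bundled database. This is an illustration only. The VM names and security zone labels are hypothetical, and the RAM total simply adds the two figures from decision 6.

```python
from dataclasses import dataclass

VCPUS = 2                          # from decision 6
VCENTER_SSO_INVENTORY_RAM_GB = 10  # from decision 6
BUNDLED_SSO_DB_RAM_GB = 2          # from decision 6


@dataclass(frozen=True)
class VCenterVm:
    """One VM hosting vCenter, its own SSO instance and Inventory Service."""
    name: str            # hypothetical VM name
    security_zone: str   # hypothetical network security zone label
    vcpus: int = VCPUS
    sso_mode: str = "multi-site"
    bundled_sso_db: bool = True

    @property
    def ram_gb(self):
        # Base RAM plus the extra allowance when the SSO database is co-located.
        extra = BUNDLED_SSO_DB_RAM_GB if self.bundled_sso_db else 0
        return VCENTER_SSO_INVENTORY_RAM_GB + extra


def plan(vcenters):
    """Build one VM definition (and therefore one SSO instance) per vCenter."""
    return [VCenterVm(name=name, security_zone=zone) for name, zone in vcenters]


vms = plan([("vc-prod", "internal"), ("vc-dmz", "dmz")])
for vm in vms:
    print(vm.name, vm.security_zone, vm.vcpus, "vCPU", vm.ram_gb, "GB RAM")
print("total RAM:", sum(vm.ram_gb for vm in vms), "GB")
```

Adding a future vCenter is then one more entry in the list passed to `plan`, which mirrors the scaling requirement.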
Justification
1. To simplify the maintenance/upgrade process for vCenter/SSO, as different versions of vCenter cannot co-exist with the same SSO instance
2. Using “High Availability” mode would prevent single pane of glass management
3. “High Availability” mode currently requires an SSL load balancer to be configured, as well as manual intervention, which can be complicated and problematic to implement and support
4. “Basic” mode prevents the use of Linked Mode, which in turn prevents single pane of glass management of the environment
5. Where vCenter servers reside in different network security zones, Multi-site mode allows each SSO instance to use authentication sources that are as logically close as possible while supporting single pane of glass management. This should provide faster access to authentication services, as each SSO instance is configured with Active Directory servers located in the same or the logically closest network security zone(s).
6. If one SSO instance goes offline for any reason, it will only impact a single vCenter server. It will not prevent authentication to the other vCenter servers.
7. Reduce the licensing costs for Microsoft Windows 2008 by combining the SSO and vCenter roles onto a single OS
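Justification 6 is a statement about failure domains, which can be illustrated by comparing two topologies. The sketch below is a simplified model with hypothetical instance names; it only maps each SSO instance to the vCenter servers that authenticate through it.

```python
def affected_vcenters(sso_to_vcenters, failed_sso):
    """Return the vCenter servers that lose authentication when failed_sso is offline."""
    return sorted(sso_to_vcenters.get(failed_sso, []))


# Chosen design: one SSO instance per vCenter (hypothetical names).
per_vcenter = {
    "sso-a": ["vc-a"],
    "sso-b": ["vc-b"],
    "sso-c": ["vc-c"],
}

# Alternative: a single SSO instance shared by all vCenters.
shared = {"sso-shared": ["vc-a", "vc-b", "vc-c"]}

print("per-vCenter SSO outage affects:", affected_vcenters(per_vcenter, "sso-a"))
print("shared SSO outage affects:", affected_vcenters(shared, "sso-shared"))
```

In the per-vCenter layout an outage is contained to one vCenter, whereas the shared layout loses authentication to all of them at once.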
Alternatives
1. Use “Basic” Mode, resulting in a standalone version of SSO for each vCenter server with no single pane of glass management
2. Use “High Availability” mode per vCenter
3. Use a shared “High Availability” mode for all vCenters in the datacenter
4. In any SSO configuration, host the SSO database (per vCenter) on an Oracle or SQL Server
5. Run SSO on a dedicated Windows 2008 instance with or without the SSO database locally
6. Run a single SSO instance in “Multi-Site” mode, use vCenter Heartbeat to protect SSO (including the database) and share the SSO instance with all vCenters
Implications
1. Where SSO is not protected by vCenter Heartbeat (optional), the SSO instance for each vCenter is a single point of failure; if it fails, authentication to the affected vCenter will fail
2. “Multi-Site” mode requires the installable version of SSO, which is Windows only. This prevents the use of the vCenter Server Appliance (VCSA), as it only supports Basic mode.
Related Articles
1. vSphere 5.1 Single Sign On (SSO) deployment mode across Active/Active Datacenters
2. vSphere 5.1 Single Sign On (SSO) Architectural Decision Flowchart
3. Disabling Single Sign On – Dont Do It! – Michael Webster (VCDX#66) @vcdxnz001
Hi Josh. I wanted to hear your thoughts on this. I have reached out to many colleagues who are VMware engineer peers of mine on SSO and I saw your blog so I thought I would reach out to you. I think SSO has caused some “issues” from a design perspective for many people so I wanted to give my scenario to see what your feedback might be. I currently have a mid-size ESX environment (100 nodes across 15 sites) that spans multiple geographic locations. We have a primary DC that has virtual center supporting a group of clusters and then a remote DR DC (several states away) that has separate virtual center servers for those clusters. vCenter is set up in linked mode between the sites as of now. We run 4.1 but will build brand new vCenter and ESX for 5.1. We also have 30 or so remote sites that connect back to central vCenter in HQ DC for other ESX hosts. We have View as well, but the plan will be to keep that vCenter separate and it will not be linked to the HQ vCenter in future. We don’t use SRM. We replicate LUNs and I have automated scripts that will autoscan LUNs, add LUNs to ESX hosts and autostart VMs in the event of disaster and failover to our remote DR site.
I looked at doing multi-site SSO, centralizing it in one DC and using an F5 LB. I would then point all vCenters to that SSO cluster. I am leaning away from that since if I locate that in the primary DC, in the event of DR, while the VM running SSO is replicated, the DB that would support SSO may not be up on the remote DR side, nor will the F5, for several days. That means we would have an SSO cluster that we can’t connect to, authenticate against or use until the DB is up and running. Because of that, I am thinking of having separate basic SSOs in each DC and just forgetting about using linked mode for the primary and DR DC. Any remote sites running ESX will still connect back to central virtual center, which would use the primary site SSO, and the virtual center server supporting View will also use SSO in the primary site. No plans to care about View in the event of DR for now. The idea is that if we have a disaster, we can still have the remote site fully independent and able to support all the VMs that would fail over. I am sure we can wait a couple of days to have the replicated VM running vCenter in the primary site come up on the DR side, as vCenter is not required to keep remote site ESX hosts running.
Ideas???
I would suggest that Multi-Site mode is the obvious choice, as you use Linked Mode and have many sites, and Multi-Site mode will help future proof the solution. In my opinion, a central SSO instance is potentially risky, as one of your requirements is for remote sites to remain functional independently of the primary SSO site.
I would deploy an SSO instance in each data centre where a vCenter exists. For sites where a vCenter does not exist, you will need to accept that they will be unmanageable until their vCenter and SSO are available over the WAN.
As for a central SSO instance per site, I would say it is better to run a single SSO instance per vCenter to simplify the upgrade process and remove the dependency on the WAN.
If high availability of SSO is required, consider vCenter Heartbeat and a dedicated Windows instance running both SSO and the database, as this is currently the only way to make the DB highly available. It would also improve performance for vCenter servers servicing the larger clusters.
Hope that helps.