
Federated Enterprise Authentication

How can services that are dispersed in the cloud authenticate a federation of users (and other consuming services) using X.509 certificates in a way that supports performance and Continuity Of Operations (COOP) requirements?

Problem

Services need to authenticate user (and service) credentials from various organizations while meeting performance requirements, even when communications to those organizations' enterprise credential validation mechanisms are disrupted.

Solution

Federated authentication allows services to authenticate users (and other services) through local certificate status checking validators. To meet performance requirements, these validators need the autonomy to generate status responses based on enterprise validation mechanisms. They must also support authentication independently when communications to the source validation mechanism are disrupted.

Application

A Federated Enterprise Authentication Service is used to authenticate users (and services) to other services.

Impacts

The service consumer must have an X.509 certificate credential issued by a trusted identity provider. Identity Providers may be from more than one PKI. The service provider infrastructure must provide credential validation on its network local to the services it supports and have access to the PKI(s)' CRL(s). The validation service must have the ability to sign responses independently from access to the source validation mechanism.

Architecture

Service Oriented, Domain Inventory, Hybrid, Service

Status

Under Review

Contributors

Robert Cope

Problem

A goal of a federated enterprise SOA is to share data and allow applications to be made of composite services integrated across the enterprise including externally among business partners. This requires that the architecture support authentication from individuals and services from diverse organizations. The trust of participating organizations' processes for vetting their individuals in order to issue credentials must also be addressed.

The need to integrate composite services across the enterprise and share data among consumers from diverse organizations drives the requirement for authentication mechanisms that adapt to Disconnected, Intermittent and Limited (DIL) connectivity and to varying performance and scaling requirements. When X.509 certificates are used for consumer claims without some form of local authentication service, relying parties normally download and check a Certificate Authority's (CA) Certificate Revocation List (CRL) for themselves. For large or multiple CRLs this can cause performance problems for the source repository and for network bandwidth. If a CRL has expired, the relying party application normally rejects the consumer's authentication request, which can disrupt service.

In the case of a Public Key Infrastructure (PKI) issuing millions of X.509 certificates, a local validation service allows large CRLs, sometimes on the order of hundreds of megabytes, to be managed by providing individual responses based on a downloaded CRL rather than requiring the relying services to download and manage the CRL directly. With an Online Certificate Status Protocol (OCSP) validation service, relying party services can query it to validate credentials rather than individually downloading CRLs to check certificate revocation. The relying party contacts a local OCSP certificate validator, which can continue to put valid signatures on OCSP responses even when its CRLs are stale. The validator can notify administrators, who can then manage the risk while keeping mission critical systems operating reliably, which allows a smooth transition to operating on stale CRLs under DIL conditions.

Solution

Federated Enterprise Authentication Service

Service View

Authentication of X.509 certificates is accomplished through a local Validation Authority (VA) OCSP Service. The VA Service provides status responses to requests for validation of individuals' and services' certificates by checking if their submitted certificate has been revoked.

Table 1, Service Capability Definitions for the VA Service, describes the functions contained in the VA service. The functions are derived from Internet Engineering Task Force (IETF) Request for Comments (RFC) documents as the authoritative source.

img

Table 1 - Service Capability Definitions for the VA Service

Figure 1, Summary of VA Service, illustrates a summary of the service functions.

The VA Service provides the following capabilities:

  1. Certificate revocation information via OCSP - Relying parties use this capability to consume real time certificate status. The VA Service accepts a validation request for a certificate or group of certificates and provides a signed validation response (GOOD, REVOKED or UNKNOWN) based on the relevant CRL.
  2. Audit information - Audit logs are available for administrators that utilize this capability for compliance and incident requirements.
  3. CRL Requests - The VA uses this capability to request CRLs from the certificate authorities for which the VA provides validation. The CRLs are the basis for the individual certificate revocation status information served via OCSP. This function strives to keep CRLs with current validity in its data store.
  4. Administration - Administrators utilize this capability to configure and manage the service. This capability sends out alerts when operating off of a stale CRL.
img

Figure 1 - Summary of VA Service
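
The capabilities above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the VA implementation: the names `CachedCrl`, `ValidationAuthority` and `check` are hypothetical, the CRL is modeled as a plain set of revoked serial numbers, and response signing is omitted entirely. It shows only how a status is derived from a cached CRL, how a stale CRL raises an alert without blocking the response, and how each decision is written to an audit log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum


class CertStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


@dataclass
class CachedCrl:
    """Hypothetical in-memory model of a downloaded CRL."""
    issuer: str
    next_update: datetime                      # after this time the CRL is stale
    revoked_serials: set = field(default_factory=set)

    def is_stale(self, now):
        return now > self.next_update


class ValidationAuthority:
    """Sketch of a local VA; signing of responses is not modeled."""

    def __init__(self):
        self.crls = {}        # issuer name -> CachedCrl
        self.alerts = []      # messages for administrators
        self.audit_log = []   # (time, issuer, serial, status) tuples

    def load_crl(self, crl):
        self.crls[crl.issuer] = crl

    def check(self, issuer, serial, now):
        crl = self.crls.get(issuer)
        if crl is None:
            # No CRL for this issuer: the VA cannot vouch for the certificate.
            status = CertStatus.UNKNOWN
        else:
            if crl.is_stale(now):
                # Keep answering, but tell the administrators about the risk.
                self.alerts.append(f"stale CRL for {issuer}")
            if serial in crl.revoked_serials:
                status = CertStatus.REVOKED
            else:
                status = CertStatus.GOOD
        self.audit_log.append((now, issuer, serial, status))
        return status
```

A relying party would call `check` with the issuer and serial number taken from the presented certificate. In a real deployment that exchange would be an OCSP request and a signed OCSP response rather than a direct method call.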

Enterprise View

Figure 2, Enterprise View of Certificate and Validation Services, illustrates how the Federated Enterprise Validation Services operate in the enterprise. The Global VA Service consumes CRLs that the Certificate Authority (CA) normally publishes to a Global Directory Service as part of the PKI. The directory is not shown; instead the diagram shows CRLs going directly from the CA to the VAs. The Global VA provides validation via OCSP responses for those services that do not have access to a local VA and serves as a backup for local VAs that fail.

Validation Services are deployed to the local nodes to preserve operations during DIL and to ensure reliable performance. They consume CRLs from the participating CAs when connectivity is available. When connectivity is not available the local Validation Services can validate based on stale CRLs and automatically notify administrators without disrupting operations. When local node VAs fail they can fail over to the Global VA.
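
The failover behavior described above might look like the following sketch. It is an assumption about how a relying party could order its validators, not a description of any particular product: `ValidatorUnavailable` and `check_with_failover` are hypothetical names, and each validator is represented by a plain callable standing in for an OCSP query.

```python
class ValidatorUnavailable(Exception):
    """Raised when a validator cannot be reached (hypothetical)."""


def check_with_failover(validators, issuer, serial):
    """Query validators in order and return (name, status) from the first
    one that answers. `validators` is a list of (name, callable) pairs,
    local VA first and Global VA last."""
    errors = []
    for name, query in validators:
        try:
            return name, query(issuer, serial)
        except ValidatorUnavailable as exc:
            # Remember why this validator failed and try the next one.
            errors.append(f"{name}: {exc}")
    raise ValidatorUnavailable("; ".join(errors))
```

Listing the local VA first keeps normal traffic on the local network, and the Global VA is contacted only when the local one fails.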

Application

Implementation of the Federated Enterprise Validation Authority Service allows service providers' services to reside in a SOA Cloud and allows them, acting as relying parties, to validate X.509 credentials by OCSP locally in the same cloud. Federation in this case means consuming a CRL from a Global PKI and using it locally to issue status responses to relying parties on the local network. Validation by OCSP means checking the status of the certificate to see whether it has been revoked by virtue of being on the CRL. The relying party must also determine whether the certificate's validity period is current, build and verify a certificate trust chain by checking the status of each certificate involved, and verify that the certificate is trusted. In the case of the cloud, the service owner must confirm that the VA is available on the local infrastructure topology that is common to the service in question.

img

Figure 2 - Enterprise View of Certificate and Validation Services
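
The relying party checks described in this section (validity period, chain building, status of each certificate, and trust) can be sketched as follows. This is a simplified illustration: the `Cert` record and `validate_chain` function are hypothetical, certificates are matched by name only, and cryptographic signature verification, which a real relying party must perform, is left out. The `status_of` argument stands in for a query to the local VA.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    """Hypothetical, heavily reduced view of an X.509 certificate."""
    subject: str
    issuer: str
    serial: int
    not_before: datetime
    not_after: datetime


def validate_chain(chain, trust_anchors, status_of, now):
    """Validate a chain ordered leaf first. Returns (ok, reason).

    trust_anchors: set of trusted issuer names.
    status_of:     callable returning "good", "revoked" or "unknown".
    """
    if not chain:
        return False, "empty chain"
    for i, cert in enumerate(chain):
        # 1. The validity period must be current.
        if not (cert.not_before <= now <= cert.not_after):
            return False, f"{cert.subject}: outside validity period"
        # 2. Every certificate in the chain gets a status check.
        if status_of(cert) != "good":
            return False, f"{cert.subject}: status is not good"
        # 3. Each certificate must be issued by the next one in the chain.
        if i + 1 < len(chain) and cert.issuer != chain[i + 1].subject:
            return False, f"{cert.subject}: issuer does not match next certificate"
    # 4. The chain must end at a trusted anchor.
    if chain[-1].issuer not in trust_anchors:
        return False, "chain does not end at a trusted anchor"
    return True, "ok"
```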

Impacts

The additional infrastructure increases the size, complexity and overall management costs of the IT environment. However, mission critical applications must account for DIL and performance issues on their local network infrastructure, and local OCSP validators can address both.

Global infrastructure issues include:

  1. CRL Access - For PKIs with large CRLs, without OCSP the relying parties must all access the CRLs directly to meet validation requirements. This can mean tens of thousands of applications on thousands of networks regularly downloading large CRLs, on the order of 500 MB, from PKI repositories (usually directories). This can quickly cause repositories and networks to bog down. Policies and procedures need to be in place requiring relying party applications to normally use local OCSP validators and limiting CRL access to OCSP validators. Strict enforcement would not be warranted, because unanticipated users who cannot contact a local service, and relying parties in COOP scenarios, need to fail over to the global validators. However, monitoring of the global infrastructure performance is a requirement.
  2. Global Validator Access - Procedures and profiles need to be in place to allow relying parties to configure local OCSP validators to fail over to the Global Validator. Procedures and profiles also need to be readily available for the type of user/relying party that needs to routinely use the Global Validators such as the person in their hotel room processing secure email.
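
A rough calculation illustrates the CRL Access concern. Only the 500 MB CRL size and the phrase "tens of thousands of applications on thousands of networks" come from the text above; the specific counts (20,000 relying parties, 2,000 networks with one local validator each) and the one download per day are assumptions chosen for illustration, and the function name is hypothetical.

```python
def daily_crl_traffic_gb(downloaders, crl_size_mb, downloads_per_day):
    """Total CRL download volume per day, in gigabytes (1 GB = 1000 MB)."""
    return downloaders * crl_size_mb * downloads_per_day / 1000


# Assumed: every relying party fetches the CRL itself once a day.
direct = daily_crl_traffic_gb(20_000, 500, 1)

# Assumed: only one local validator per network fetches the CRL once a day.
via_local_validators = daily_crl_traffic_gb(2_000, 500, 1)

print(direct, via_local_validators)
```

Under these assumptions the load on the PKI repository drops by the ratio of relying parties to validators. The relying parties themselves exchange only small OCSP messages with their local validator.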

Local infrastructure issues include:

  1. Selection of Mode of Validation - There are two modes of operation: pre-signed validations and validations signed as they are issued. Procedures and profiles need to be developed to guide the relying party on which mode to select based on requirements and how to configure the validator. Server-based Certificate Validation Protocol (SCVP) is also an option to consider.
  2. Trust Model - There are three trust models for trusting the OCSP responder. They need to be profiled and selected for the requirements of the network.
  3. DIL Procedures - Procedures and profiles need to be developed for the case of wide area communications failure and inability to access a valid CRL. In this case the local administrators need to be notified and mission critical systems need to continue to operate normally with an awareness of the risk and procedures activated for locally revoking certificates if required.
  4. Federation of Users - OCSP Validators have the ability to trust more than one PKI and therefore allow users from diverse organizations to share data by validating all their certificates. Procedures and profiles need to be in place to evaluate assurance policies of other PKIs for appropriateness for access to resources and configure OCSP Validators to validate their certificates as required.
  5. Connection to Service - Procedures and profiles need to be developed for local relying parties to connect to the local validation service.
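
The DIL procedure in item 3 mentions locally revoking certificates while the validator operates on a stale CRL. One way this could be modeled is sketched below. The function `effective_status` and the idea of a separate local revocation set that takes precedence over the CRL are assumptions made for illustration; the pattern itself does not prescribe a mechanism.

```python
def effective_status(serial, crl_revoked, locally_revoked, crl_is_stale):
    """Return (status, basis) for a certificate serial number.

    crl_revoked:     serials listed on the last CRL that was downloaded.
    locally_revoked: serials revoked by local administrators (assumed
                     mechanism for use while the CRL cannot be refreshed).
    """
    if serial in locally_revoked:
        # A local revocation wins even if the CRL says nothing.
        return "revoked", "local"
    if serial in crl_revoked:
        return "revoked", "crl"
    # Still answer "good" on a stale CRL, but record what the answer rests on.
    return "good", ("stale-crl" if crl_is_stale else "crl")
```

Returning the basis alongside the status lets administrators and audit logs distinguish answers given under normal conditions from answers given under DIL conditions.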

Relationships

This pattern raises a number of architectural considerations that establish relationships with a variety of other patterns. Figure 3, Federated Enterprise Validation Architectural Relationships, shows how Federated Enterprise Validation relates to industry patterns and principles from Thomas Erl's Pattern Language and SOAPatterns.org.

img

Figure 3 - Federated Enterprise Validation Architectural Relationships

  • Federated Identity is a brokered validation pattern designed for SAML. Certificates can be used to authenticate to the SAML Security Token Service, which in turn can validate them using the validation service and issue a token for Single Sign-On.
  • Policy Centralization supports Federated Enterprise Validation since policies with regard to PKI assurance level, usage policies and other security related issues associated with certificate validation can be centrally established and managed at the enterprise level.
  • Service Discovery supports this pattern since it can house the policy describing the service and the service endpoints.
  • Validation Abstraction says that detailed validation constraints should be abstracted away from the service contract. From the service provider's point of view, relying on the Enterprise Validation Service abstracts the status checking policies away from the service. Service policies affecting other aspects of validation can be implemented in the service contract, which decreases validation abstraction, or carried out by an agent or a different service, which increases it.
  • Federated Enterprise Validation increases service reusability by establishing a standard means of validating any X.509 certificate. A standard security infrastructure extends the reusability of a service by providing standard security policies and services for use in any service composition.
  • Service Loose Coupling and Service Abstraction are both reduced since services will need to specify PKI tokens for identification.
  • Service Autonomy is reduced since it must rely on an external infrastructure for identification and authentication.

Case Study Example

Budget Airlines needs a maximum uptime reservation system and has decided on an architecture that places the reservation system in a number of off-premises commercial clouds. Budget provides a ticket clearance system to a number of other airlines. It requires X.509 certificates for authentication to its clearance service. Since it receives a critical portion of its business during the Thanksgiving season, even a reliability of five 9s is of little use if the downtime it does allow falls over Thanksgiving.

Budget Airlines develops an architecture that uses X.509 certificates and a Federated Enterprise Validation scheme. The number of instances of the reservation system is varied based on projected loads, and those instances are brought up and down in two separate clouds as needed. Each instance has a corresponding validation authority local to the service in the cloud. The airline also configures the validation authorities to fail over to a global validator to validate certificates in the event of a local responder failure. The airline decides to trust the credentials of the two largest third party PKIs and gives its customers the option of using either one. Budget configures its VAs to pull both PKIs' CRLs at the intervals established by PKI policy.

The airline plans to validate certificates from both PKIs locally for performance reasons and for the certain eventuality of DIL conditions. The VAs are configured to continue validating even if DIL conditions cause any of the CRLs to go stale. In that event the responders will continue validating certificates based on the stale CRL and will notify the responsible authorities by email so they can monitor the risk.