
Load Balanced Stateful Services

How can services that must maintain state across multiple load-balanced service requests share context and conversational state with predictable scalability and low-latency access to that state?

Problem

Context data associated with a particular service activity can impose significant runtime state management responsibilities upon composition controllers and other involved services, thereby reducing their scalability.

As service execution consumes computing resources and approaches the limitations (memory, CPU, disk I/O throughput) of a given machine, a typical scalability solution is to virtualize the service by deploying multiple instances of it across multiple machines and dispatching service requests among those instances using a load balancer.

A composition coordinator may execute a process flow that needs to span multiple requests going into a service instance (re-entrancy). The service may need to maintain state that spans beyond the life of an individual service request. A load balancer may direct requests to a different instance of a service on each invocation.

Solution

Context data and rules are managed and stored by dedicated system services that are intentionally stateful. Using the Fault Tolerant Collection, a service instance may put() and get() its transient state data reliably, via familiar data structures such as a Java HashMap or a .NET Dictionary, at near in-memory speeds while still achieving predictable scalability and high availability of services. This is achieved through the FT Collection system service, which provides shared, location-transparent access to state data for all services in the load-balanced service farm.
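As a minimal sketch, a hypothetical FtCollection client might expose HashMap-like put() and get() semantics (the class and method names are illustrative, not a real API; a plain ConcurrentHashMap stands in here for the replicated, fault-tolerant collection so the sketch is runnable):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for the FT Collection system service: every service instance in the
// farm sees the same logical map, regardless of which machine it runs on.
class FtCollection {
    // In the real pattern this data is synchronized with a backup copy on
    // another node; a ConcurrentHashMap is used here only to keep the sketch runnable.
    private final Map<String, Object> shared = new ConcurrentHashMap<>();

    // Store state and return a unique key the consumer can carry between requests.
    public String put(Object state) {
        String key = UUID.randomUUID().toString();
        shared.put(key, state);
        return key;
    }

    // Any instance in the farm can retrieve the state by key.
    public Object get(String key) {
        return shared.get(key);
    }
}
```

A service instance would put() its conversational state at the end of one request and get() it back (possibly on a different instance) at the start of the next.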

Application

A shared or dedicated FT HashMap or FT Dictionary utility is made available as part of the inventory or service architecture. Multiple instances of a load-balanced service may act as one by transparently sharing state across service requests.

Impacts

Performance bottlenecks due to disk I/O are dramatically reduced or eliminated. Synchronization of primary and backup data between network nodes may be affected by network latency. The requirement for special programming models and best practices for state management is reduced or eliminated.

Architecture

Inventory, Service

Status

Under Review

Contributors

David Chappell

Problem

Context data and rules associated with a particular service activity can impose significant runtime state management responsibilities upon composition controllers and other involved services, thereby reducing their scalability.

As service execution consumes computing resources and approaches the limitations (memory, CPU, disk I/O throughput) of a given machine, a typical scalability solution is to virtualize the service by deploying multiple instances of it across multiple machines and dispatching service requests among those instances using a load balancer.

A composition coordinator may execute a business process that needs to span multiple requests going into a service instance (re-entrancy). The service may need to maintain state that spans beyond the life of an individual service request.

How can state data be shared without passing it all over the wire, without disk I/O bottlenecks, and without sacrificing high availability?

A common solution to this problem is load-balancing affinity, or "sticky" load balancing, which ensures that the multiple service requests belonging to a particular logical conversation are always routed to the same instance of a service, preserving continuity of the transient state data left in memory by previous requests. This requires additional work in the load balancer, thereby increasing latency, and can sometimes result in unbalanced workloads across service instances.
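Sticky routing is often implemented by hashing a conversation identifier so that every request in the same conversation maps to the same instance. A minimal sketch (illustrative only; real load balancers must also handle instance failure and rebalancing):

```java
// Affinity-based ("sticky") routing: requests carrying the same conversation
// identifier always map to the same service instance.
class StickyBalancer {
    private final String[] instances;

    StickyBalancer(String[] instances) {
        this.instances = instances;
    }

    // Deterministic hash of the conversation id picks the instance, so state
    // left in that instance's memory is found again on the next request.
    String route(String conversationId) {
        int idx = Math.floorMod(conversationId.hashCode(), instances.length);
        return instances[idx];
    }
}
```

Note how the uneven distribution of conversation ids, not the actual load on each instance, decides where work lands, which is exactly how workloads can become unbalanced.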

The service may also need to isolate state data among multiple instances of a business process that share a given instance of a re-entrant service. The service container may implement its own thread pooling and service instance affinity in order to isolate individual conversational contexts. This may still require sticky load balancing, however.

Sometimes sticky load balancing is not the preferred method, and simpler round-robin, weighted round-robin, or work-queue pull-based load balancing is desirable. In these situations the load balancer may dispatch subsequent requests belonging to the same logical conversation to a different instance of a service each time.

One solution for this situation is to pass all of the state data back and forth as part of the message payload of the service request and response. In this model the services themselves can be written to be completely stateless: they reconstruct the state entirely from the message payload content via APIs, tear it down when the work is done, and push the current state of the data back into the payload of a response message. This practice is common, yet has significant drawbacks.

It requires that a particular architectural style, along with best practices for stateless service development and message payload constructs, be created, imposed, and enforced across a development organization.

Under high-volume throughput conditions, this can incur significant overhead for the serialization and deserialization of data objects on each service request. Latency can increase due to network bottlenecks, the additional CPU workload of serialization and deserialization, and the disk I/O of store-and-forward messaging.
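The payload-carrying stateless model described above can be sketched as follows (a hypothetical running total stands in for the conversational state; in practice the payload would be a larger XML or JSON document that must be serialized and deserialized on every hop):

```java
// Stateless style: all conversational state travels in the message payload.
// Each request carries the full state; the service reconstructs it, applies
// this request's work, and writes the updated state back into the response.
class StatelessCounterService {
    public String handle(String payload, int increment) {
        int state = Integer.parseInt(payload); // reconstruct state from the payload
        state += increment;                    // perform this request's work
        return Integer.toString(state);        // push state back into the response payload
    }
}
```

The service instance retains nothing between calls, so any instance can serve any request, but the full state crosses the wire twice per invocation.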

As an alternative, the services themselves can be stateful and make use of the State Repository and/or other deferred-state patterns to store the contextual state data. In this model, the network and serialization overhead is reduced to passing a key to the data held in the repository.


In high-volume throughput scenarios, conventional means of storing data such as file persistence or relational database storage can become a bottleneck for processing (illustrated in FT Stateful Services), increasing the latency of service requests beyond the acceptable limits of Service Level Agreements (SLAs).


Solution

Context data and rules are managed and stored by dedicated system services that are intentionally stateful. Using the Fault Tolerant Collection, a service instance may put() and get() its transient state data reliably, via familiar data structures such as a Java HashMap or a .NET Dictionary, at near in-memory speeds while still achieving predictable scalability and high availability of services. The FT Collection system service provides redundancy and failover of in-memory application data by synchronizing in-memory data structures with a duplicate copy that exists on another machine on the network. The real key to this pattern is that the FT Collection provides shared, location-transparent access to state data for all services in the load-balanced service farm.


1st invocation - load balancer routes to service 'A', which puts state data in the FT Collection and returns a unique hash key as a WS-Addressing reference property.


2nd invocation - load balancer dispatches to service 'A1', which gets the state data from the FT Collection using the WS-Addressing reference property.
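The two invocations above can be sketched end to end. As before, a ConcurrentHashMap stands in for the replicated FT Collection, and a plain String key ("conv-42", chosen here for illustration) stands in for the WS-Addressing reference property:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LoadBalancedFlow {
    // One logical FT Collection visible to every instance in the farm
    // (a ConcurrentHashMap is a runnable stand-in for the replicated collection).
    static final Map<String, String> ftCollection = new ConcurrentHashMap<>();

    // 1st invocation: the load balancer routes to instance "A", which puts the
    // conversation state and returns its key as a WS-Addressing reference property.
    static String firstInvocationOnInstanceA(String state) {
        String refProperty = "conv-42"; // illustrative; a real key would be generated
        ftCollection.put(refProperty, state);
        return refProperty;
    }

    // 2nd invocation: the load balancer dispatches to a *different* instance "A1",
    // which never saw the first request but recovers the state by key.
    static String secondInvocationOnInstanceA1(String refProperty) {
        return ftCollection.get(refProperty);
    }

    public static void main(String[] args) {
        String key = firstInvocationOnInstanceA("items=3;status=open");
        System.out.println(secondInvocationOnInstanceA1(key)); // prints "items=3;status=open"
    }
}
```

Only the small reference property crosses the wire between requests; the state itself stays in the shared collection, which is what makes the farm's instances interchangeable.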

Impacts

Performance bottlenecks due to disk I/O are dramatically reduced or eliminated. Object-to-relational mappings are not required. Synchronization of primary and backup data between network nodes may be affected by network latency. The workload on the service coordinator for managing state on behalf of coordinated services is reduced.

Relationships
