Building planet-scale distributed apps (1/2)
What happens if one morning you are struck by the idea of building something planet-scale and distributed, like a WhatsApp or an Uber? Many consumer applications share properties with these.
In this two-part series of blogs we explore an approach to building applications that are planet-scale. Any architect using new technologies is bound to gain from the patterns and practices discussed here.
These blogs discuss an approach to tame the chaos associated with applications at such scale and complexity. The approach can be broken down as follows:
- Design and architect - Identifies the parameters that matter for the application and the technological components or services it could use, converging on a reference architecture and its considerations.
- Build and deploy - Addresses implementing the reference architecture, application development practices, continuous innovation, robustness, and scaling.
- Operate and stabilize - Discusses the elements of operating a distributed system: monitoring, tracing, measuring user experience, and monitoring infrastructure.
This blog covers the first of these three parts - Design and architect.
The design phase identifies priorities for building an Internet-scale, distributed application where user experience is paramount. It identifies user/customer priorities and maps them to technical components/services.
To make things easier, let's pretend we're building an application for a taxi booking service. We expect tens of thousands of taxis, drivers, and riders in most major cities to use this service. Users should be able to book a taxi from a mobile app; the backend should find the geographically nearest taxi, book it, and have it drive towards the rider who booked it (you know the drill). In this design and architect exercise, we map these requirements to technological services or components - using those already available and creating new ones where we find a need. The objective is also to explore the marriage between application architecture, container architecture, and cloud platform architecture. It is important to note that these do not exist in isolation; it is a well-choreographed dance of the three, and the individual(s) responsible for the design of such systems should recognize this fusion of technologies and factor it into the overall architecture.
The broad category of technologies chosen for this solution are:
- Cloud platform (hosting and PaaS services) - Google Cloud, Microsoft Azure
- Container and container orchestration (microservices architecture) - Docker and Kubernetes
- Actor framework (high-scale, distributed computing) - Microsoft Orleans
- Mobile app dev platform - Xamarin Forms
The most important and exciting part of this is designing the application, the container architecture, and the orchestration using Kubernetes and Google Cloud. Before we delve into the finer details, we'll do a component-wise breakdown of what makes up our application (here we refer to the high-scale distributed backend, not the mobile app). Although not with absolute strictness or completeness, we'll discuss this application in line with the 4+1 architecture model.
The scenario is that of a taxi booking service, one that we're all familiar with, and from it we can deduce the physical components and technologies. A key requirement is that different parts of the application will have varying load and performance profiles, differing development languages/frameworks, and independent upgrade cycles.
Consider the diagram below to get a sense of the physical components at play.
Core Infrastructure, Technologies & Kubernetes
Kubernetes is the primary compute tier for almost all of the services that we develop, and we rely on PaaS services for specialized capabilities that we do not want to reinvent. Any code that we write runs as a container/service in the Kubernetes cluster; any code we avoid writing is because a PaaS service exists that is readily usable, technologically capable of meeting our scaling/performance requirements, and cost effective. The Kubernetes cluster is hosted in Google Cloud.
The above diagram indicates our deployment view of the system. In simple terms this indicates:
- (1) - All mobile and web clients requesting a taxi. Each request carries the geolocation and an ID representing the signed-in user, along with an auth token.
- (2) - A Google Cloud L7 load balancer, provisioned as a result of creating a Kubernetes Ingress, does the application routing (refer to the ‘microservices’ section below).
- (3) - All of the code and services developed for our application are hosted in a managed Kubernetes environment (GKE) in Google Cloud.
- (4) - All of the taxis on the ground send their geographical location to the system. Once a rider books a taxi, the location of that taxi is tracked in real-time by an API that we host in Kubernetes (more about this in application architecture)
- (5) - A Google Cloud Pub/Sub service that can absorb massive amounts of incoming data; all the taxi location data is consumed through this PaaS service. Alternatives include Event Hubs in Azure, Kinesis in AWS, or self-hosted Kafka in Kubernetes. Note that this pub/sub is used only by taxis that are looking for a ride. Once a ride is booked, calls go to a separate API that streams real-time location updates between the mobile device and the microservices backend (see the ‘gRPC’ section below).
- (6,7,8,9) - Represent application components exposed as Kubernetes services. Each service has its own set of pods, which can auto-scale based on load (more about this in the ‘microservices’ section below). All the services are packaged as Docker containers running .NET Core on Linux as the managed runtime.
- (10) - The architecture proposes to implement a virtual actor framework using Microsoft Orleans. All of the riders, drivers, and cars are represented as single-threaded entities within Orleans Silos that we create (read more in the ‘Virtual Actors’ section below).
- (11) - Azure Table storage holds the membership information and state of the Orleans Silos, used for the election of nodes/silos.
- (12) - The location of each taxi is updated in Azure Search to enable a geographical search for the taxi nearest to the rider.
The diagram does not attempt to detail monitoring, logging, CI/CD pipelines etc.; those will be covered in the subsequent parts of this blog on the deployment and management of this system.
Microservices: The application design follows a microservices pattern, and Kubernetes makes it easy to implement using constructs like services, pods, Ingress, and load balancers. Each functionality of our taxi service (booking a taxi, searching for a taxi, applying discounts, etc.) is baked into and exposed as a service in Kubernetes. Each service has its corresponding pods with their own container image. All the services are fronted by an Ingress and, in turn, a Google Cloud application-aware load balancer, which routes requests to the appropriate service exposed via Kubernetes.
[The yaml config is truncated to suit ease of explanation]
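Since the original config is not shown here, a minimal, illustrative Ingress along these lines might look as follows. The service names (`booking-service`, `search-service`) and paths are hypothetical, and the `extensions/v1beta1` API version matches the Kubernetes releases of this era (current clusters use `networking.k8s.io/v1`):

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: taxi-app-ingress
spec:
  rules:
  - http:
      paths:
      # Each path prefix is routed to its own microservice
      - path: /api/booking/*
        backend:
          serviceName: booking-service
          servicePort: 80
      - path: /api/search/*
        backend:
          serviceName: search-service
          servicePort: 80
```

On GKE, creating an Ingress like this is what provisions the Google Cloud L7 load balancer referred to above.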
The above diagram indicates that the Ingress routes API calls to the services provisioned within Kubernetes. Each of these services is independently scalable and upgradable without degrading the user experience (CI/CD and rolling/canary upgrades will be discussed in part 2/2 of this series). Services may also be versioned, with a version tag passed via the URI or an HTTP header and routed using Istio or a similar service mesh [this is cleaner, more elegant, and extensible - IMO].
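As a sketch of the header-based version routing mentioned above, an Istio VirtualService could divert traffic carrying a hypothetical `api-version` header to a v2 subset, with everything else falling through to v1 (the subsets themselves would be defined in a companion DestinationRule, not shown):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: booking-service
spec:
  hosts:
  - booking-service
  http:
  # Requests with the api-version: v2 header go to the v2 pods
  - match:
    - headers:
        api-version:
          exact: "v2"
    route:
    - destination:
        host: booking-service
        subset: v2
  # Default route for all other traffic
  - route:
    - destination:
        host: booking-service
        subset: v1
```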
Virtual Actors and Microsoft Orleans: So far, we've discussed the application components at an API level and service routing. The architecture has a lot of ‘state’ to maintain at high scale and distribution: at any point in time, the location of each taxi in the fleet, every rider requesting a taxi, every driver, every ride so far, the demand in a region, and so on. All of this has to be handled with low latency and high accuracy to provide a premium user experience. Be aware, too, that the system faces ‘race conditions’ and ‘concurrency issues’, such as multiple users attempting to book the same taxi.
Considering the high scale and distribution, we've chosen Microsoft Orleans for its simplicity in providing scalability and its ways of handling/avoiding concurrency issues, fault tolerance, and resource management. Orleans implements the concept of 'Grains' based on the Actor model; a Grain is the atomic unit of isolation, distribution, and persistence. As in OOP, a Grain encapsulates the state of an entity and encodes its behavior in code. Every Grain in the system has a unique identifier, also referred to as a Grain Key. Orleans avoids concurrency issues and race conditions by ensuring that data is shared between Grains only by message passing, and by providing a single-threaded execution guarantee per Grain. Every taxi, driver, or rider can thus be represented as a Grain within Orleans.
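To make the Grain idea concrete, here is a minimal sketch of what a taxi Grain might look like, assuming the Microsoft.Orleans packages. The interface and member names (`ITaxiGrain`, `UpdateLocationAsync`, `TryBookAsync`) are illustrative, not from a real codebase:

```csharp
using System.Threading.Tasks;
using Orleans;

// A taxi is addressed by its string identifier (the Grain Key)
public interface ITaxiGrain : IGrainWithStringKey
{
    Task UpdateLocationAsync(double latitude, double longitude);
    Task<bool> TryBookAsync(string riderId);
}

public class TaxiGrain : Grain, ITaxiGrain
{
    private double _latitude;
    private double _longitude;
    private string _bookedBy;   // null while the taxi is free

    public Task UpdateLocationAsync(double latitude, double longitude)
    {
        _latitude = latitude;
        _longitude = longitude;
        return Task.CompletedTask;
    }

    // Orleans executes a Grain's methods one at a time, so two riders
    // can never book the same taxi concurrently: the second call is
    // queued and sees _bookedBy already set.
    public Task<bool> TryBookAsync(string riderId)
    {
        if (_bookedBy != null) return Task.FromResult(false);
        _bookedBy = riderId;
        return Task.FromResult(true);
    }
}
```

The single-threaded guarantee is what lets the booking check-and-set above stay lock-free.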
Additionally, Orleans implements the concept of Silos and Clusters. Silos (or nodes) are where the Grains reside. Our architecture proposes to run Silos as pods within Kubernetes; their configuration (detailed in later blogs) allows them to join the same cluster and perform elections. In our scenario, the taxis publish their geolocation to the pub/sub, and a service reads from the queue and updates the location of each taxi on its Grain, using the taxi identifier as the Grain Key. See the diagrammatic representation and code snippet below for more clarity:
- (1) Indicates a Kubernetes service that wraps multiple pods that are silos in the Orleans cluster
- (2) Each silo registers itself to the cluster based on the configuration dictated in code
- (3) The membership table records the state of all silos in the cluster whether active or dead. This table is used for fault tolerance within the cluster.
- (4) If a silo stops functioning, the cluster redistributes its resources, Grains, and entities across the remaining silos.
- (5) All entities, in our case the taxis and users, are represented as Grains within the cluster.
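The queue-to-Grain flow described above can be sketched as follows. This is hypothetical wiring, assuming a taxi Grain interface `ITaxiGrain` keyed by the taxi's string identifier; the class and method names are illustrative:

```csharp
using System.Threading.Tasks;
using Orleans;

// Assumed grain interface: one Grain per taxi, keyed by taxi ID
public interface ITaxiGrain : IGrainWithStringKey
{
    Task UpdateLocationAsync(double latitude, double longitude);
}

// A worker that reads taxi location messages off the pub/sub
// subscription and forwards each one to the matching Grain
public class LocationUpdatePump
{
    private readonly IClusterClient _client;

    public LocationUpdatePump(IClusterClient client)
    {
        _client = client;
    }

    public Task HandleAsync(string taxiId, double latitude, double longitude)
    {
        // The taxi identifier doubles as the Grain Key, so no lookup
        // table is needed; Orleans activates the Grain on demand and
        // routes the call to whichever silo currently hosts it.
        var taxi = _client.GetGrain<ITaxiGrain>(taxiId);
        return taxi.UpdateLocationAsync(latitude, longitude);
    }
}
```

Because Grains are virtual, the pump never needs to know which silo a taxi lives on; placement and failover are handled by the cluster via the membership table.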
You may also choose to implement the actor pattern using Akka for .NET/Java, Electronic Arts' Orbit framework for Java, or Service Fabric Reliable Actors.
gRPC, HTTP/2, Binary Serialization and Streaming Data - The system uses gRPC for communication between the clients and the backend. The communication endpoints use the .NET Core 2.2.0 preview, which supports HTTP/2 and thus gRPC. The reasons for this are HTTP/2's multiplexing and gRPC's support for streaming data. Streaming gives users a far superior experience, particularly when showing, in real time, a booked taxi approaching the rider; with HTTP/1.1 this would mean repeated calls to an API endpoint, which falls short of the user experience we're targeting. All communication between Actors/Grains within Orleans uses binary serialization, which is extremely fast and highly efficient compared to juggling JSON between microservices or endpoints.
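As an illustration of the streaming approach described above, a gRPC service for tracking a booked taxi could be defined with a server-streaming RPC. This is a hypothetical proto, with invented message and service names:

```protobuf
syntax = "proto3";

package taxi;

message TrackRequest {
  string ride_id = 1;
}

message Position {
  double latitude = 1;
  double longitude = 2;
  int64 timestamp_ms = 3;
}

service RideTracker {
  // Server-side streaming: the rider's app sends one request and
  // receives a stream of positions over a single HTTP/2 connection,
  // instead of polling an endpoint repeatedly.
  rpc TrackTaxi (TrackRequest) returns (stream Position);
}
```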
So far, we've covered the high-level architecture, technology choices, and major components. Microservices-based architecture, high-scale constructs, and streaming data are first-class citizens in the applications we will build and use in the future. The concepts covered here draw on some of the newest technological capabilities, giving application services tremendous possibilities; in turn, they provide exceptional value to customers at marginal cost and development effort.
In the subsequent part of this two-part blog series, we'll look at the implementation details of this system: the CI/CD tools involved (Spinnaker, VSTS/Travis), scaling in Kubernetes, deploying the Kubernetes Ingress and a service for each microservice, and structuring the microservices and the Orleans silo implementation.
Note: Selection of products used is not the focus here. Similarly, geographical distribution, business continuity, disaster recovery, backup etc., are intentionally excluded from the scope of this article.