Shared-nothing Architectures — the cloud native way
A resilient system architecture makes users happier.
Why is the system down? The database connection has crashed, the host machine has rebooted, the proxy is down, the application server is overloaded with too many requests… Software failures remain an important cause of service outages, alongside infrastructure and network failures.
An analogy with building design is useful here: for better resiliency, we have to interconnect software layers and components wisely.
In the following lines, we will see some guidelines, based on the similarities between shared-nothing architecture concepts and cloud native principles, for achieving better application resiliency.
The shared-nothing concept is built on top of a common technique in distributed systems design named partitioning.
Partitioning enhances scalability: the system is split into components that each perform a well-scoped logical task. The components are then built to support concurrency so that they can be scaled out when needed.
Partitioning improves performance: delegating specific tasks to dedicated components leads to lower latency in interactions and higher throughput.
Partitioning helps build fault-tolerant and resilient systems: splitting a system into independent components helps reduce the impact of failures. One component can fail without affecting the overall system or service.
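To make the three properties above more concrete, here is a minimal sketch in Python. It assumes a simple hash-based routing scheme; the names `Partition` and `Router` are hypothetical and only serve the illustration. Each partition owns a private store, and a failed partition only affects the keys routed to it.

```python
import hashlib
from typing import Dict, List


class Partition:
    """One shared-nothing node: it owns its data store and shares it with no one."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True
        self._store: Dict[str, str] = {}  # private to this partition

    def put(self, key: str, value: str) -> None:
        self._ensure_up()
        self._store[key] = value

    def get(self, key: str) -> str:
        self._ensure_up()
        return self._store[key]

    def size(self) -> int:
        return len(self._store)

    def _ensure_up(self) -> None:
        if not self.healthy:
            raise ConnectionError(f"partition {self.name} is down")


class Router:
    """Sends every key to exactly one partition, chosen by a stable hash."""

    def __init__(self, partitions: List[Partition]) -> None:
        if not partitions:
            raise ValueError("at least one partition is required")
        self._partitions = list(partitions)

    def partition_for(self, key: str) -> Partition:
        # sha256 is stable across processes, unlike Python's built-in hash().
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._partitions)
        return self._partitions[index]

    def put(self, key: str, value: str) -> None:
        self.partition_for(key).put(key, value)

    def get(self, key: str) -> str:
        return self.partition_for(key).get(key)
```

Adding partitions spreads the keys over more nodes (scalability and throughput), and marking one partition as unhealthy leaves the keys of the others readable (fault isolation). Note that plain modulo hashing remaps most keys when the number of partitions changes; consistent hashing is the usual way to limit that, and it is left out of this sketch.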
Only share the components which are natively built to support partitioning or to be distributed.
Cloud native principles are compatible with shared-nothing architecture insofar as they imply both logical (function, object, …) and physical (resource, infrastructure) partitioning.
Shared-nothing architecture requires that each bounded context has its own data stores and resources (code base, configuration, computing context, development team, …).
How can we apply the shared-nothing architecture philosophy to our cloud native solutions?
1 — Use microservice architectural patterns for partitioning => CQRS (Command Query Responsibility Segregation), Database per Service and Domain Events in a microservice architecture are made for that purpose.
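As an illustration of guideline 1, the following Python sketch combines CQRS with a domain event. It is a simplified, in-process example: `EventBus` stands in for a real message broker, and `OrderPlaced`, `OrderCommandService` and `CustomerSpendingView` are hypothetical names. The point is that the write side and the read side each own their store and only communicate through events.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class OrderPlaced:
    """Domain event: an immutable fact published by the command side."""
    order_id: str
    customer: str
    amount: float


class EventBus:
    """In-process stand-in for a message broker (assumption for this sketch)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[OrderPlaced], None]] = []

    def subscribe(self, handler: Callable[[OrderPlaced], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: OrderPlaced) -> None:
        for handler in self._subscribers:
            handler(event)


class OrderCommandService:
    """Write side: validates commands and owns the write store."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._orders: Dict[str, OrderPlaced] = {}

    def place_order(self, order_id: str, customer: str, amount: float) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if order_id in self._orders:
            raise ValueError("duplicate order id")
        event = OrderPlaced(order_id, customer, amount)
        self._orders[order_id] = event
        self._bus.publish(event)


class CustomerSpendingView:
    """Read side: owns a separate, denormalized store fed only by events."""

    def __init__(self, bus: EventBus) -> None:
        self._totals: Dict[str, float] = {}
        bus.subscribe(self._on_order_placed)

    def _on_order_placed(self, event: OrderPlaced) -> None:
        current = self._totals.get(event.customer, 0.0)
        self._totals[event.customer] = current + event.amount

    def total_for(self, customer: str) -> float:
        return self._totals.get(customer, 0.0)
```

Neither class reads the other's dictionary, which is the "database per service" idea in miniature. With a real broker the read model would be updated asynchronously, so queries would be eventually consistent rather than immediately up to date.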
2 — Promote immutable components and infrastructure as a build pattern that helps partitioning and thus provides a native barrier against common infrastructure and operational concerns:
- Turn variability into repeatability => Improve the capacity to provide identical environments for development and production (Dev/Prod parity) using the capabilities of cloud infrastructure providers.
- Turn change from an exception into a rule => Replace environments instead of upgrading them whenever adaptation is needed (regular framework and platform changes, or compliance changes).
- Turn risky deployments into safe ones => Reduce deployment risk by using an immutable deployment model, which cloud providers already support in their continuous deployment strategies.
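The "replace instead of upgrade" idea behind guideline 2 can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a cloud provider API: `Environment`, `Platform`, `roll_out` and `roll_back` are hypothetical names, and the image tags are invented for the example.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Environment:
    """An immutable environment description: it can be replaced, never edited."""
    name: str
    image: str           # hypothetical image tag, e.g. "shop:1.0"
    config_version: int


class Platform:
    """Hypothetical platform that keeps a pointer to the live environment."""

    def __init__(self, initial: Environment) -> None:
        self.live = initial
        self._retired: List[Environment] = []

    def roll_out(self, image: str) -> Environment:
        """Build a new environment from the live one and switch over to it."""
        candidate = replace(self.live, image=image)  # new object, old one untouched
        self._retired.append(self.live)
        self.live = candidate
        return candidate

    def roll_back(self) -> Environment:
        """Switch back to the previous environment, which was never modified."""
        if not self._retired:
            raise RuntimeError("nothing to roll back to")
        self.live = self._retired.pop()
        return self.live
```

Because every environment is a frozen value built from the same template, a rollback is just a pointer switch to a known-good object. That is what makes the deployment safer than patching a running environment in place.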
3 — Deploy with containers to use the application-first approach provided by cloud native platforms instead of the common infrastructure-first hosting (IaaS). That approach enhances the partitioning of computing resources.
As we can see, shared-nothing architecture and cloud native platforms have key mechanisms in common:
- Distributed load => Health checks and heartbeats as a control process.
- Distributed computing or data processing over consensus => State store and control process.
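The health-check and heartbeat mechanism mentioned above can be sketched as follows. This is a minimal, single-process Python illustration; `HeartbeatMonitor` and its methods are hypothetical names, and the clock is injectable only so that the behaviour can be checked without waiting. Real platforms implement this in their control plane.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat of each node and routes only to live ones."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def beat(self, node: str) -> None:
        """Called by a node (or a probe) to signal that it is alive."""
        self._last_seen[node] = self._clock()

    def healthy_nodes(self) -> List[str]:
        """Nodes whose last heartbeat is within the timeout, sorted by name."""
        now = self._clock()
        return sorted(node for node, seen in self._last_seen.items()
                      if now - seen <= self._timeout)

    def route(self, request_id: int) -> str:
        """Spread requests over the healthy nodes only."""
        nodes = self.healthy_nodes()
        if not nodes:
            raise RuntimeError("no healthy node available")
        return nodes[request_id % len(nodes)]
```

A node that stops sending heartbeats simply drops out of the routing set after the timeout, so the load is redistributed over the remaining nodes without any shared state between them.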
4 — Avoid the known pitfalls of the shared-nothing architecture. Yes, there are some, and cloud native applications and platforms address them by providing control and management tools and functions:
- Complexity => APIs for service topology specification and for monitoring and management. Orchestration tools like Kubernetes or OpenShift are well designed for that purpose.
- Flexibility => Design for automation with APIs that support the entire SDLC (software development lifecycle) in every aspect (infrastructure provisioning, backup, elasticity, CI/CD pipelines).
- Cost => Simplify cost evaluation with more elementary pricing models behind a full range of options: on-demand instances, spot instances / virtual machines, and FaaS or serverless implementations (Lambda functions, Azure Functions, …).