An Introduction to Cloud-Native Design Principles

The cloud empowers engineers to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Cloud-native techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil. What are cloud-native design principles and what value do they bring?

The Three Categories of Cloud-Native Design Principles

At Xebia, we have established design principles that act as guidelines and a practical checklist for our engineers and developers. One may wonder what the value of these cloud-native principles is. The pursuit of quality is, of course, one answer. But more importantly, these principles can help organizations maximize the benefits of the cloud!

“Cloud-native design principles give our engineers and developers tools to get the most out of the cloud and maximize quality.”

Our cloud-native design principles can be divided into three categories:

Architecture
Configuration management
Quality assurance

In this article, we will discuss all of them, with the first two leaning more toward guidelines and the last taking the form of a checklist.

The Fundamental Organization of a System — Architecture

Our architecture principles concern the fundamental organization of a system. They focus on the design phase, where you have significant influence on quality because you determine what the product will look like. Our guiding principles include simplicity, minimizing waste and manual labor, and designing for failure.

Clean and Straightforward

We believe that high quality is achieved through simplification: the simpler a system, the more reliable it is. This is also reflected in principles such as less waste and fewer manual operations. By automating more and eliminating unnecessary steps, you keep processes clean and straightforward. But as simple as this sounds, the world is not perfect, which is why we design for failure. This enables us to detect errors, degrade the service gracefully, and recover from errors automatically.
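Designing for failure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular tooling: the function names, the recommendation service, and the retry settings are all hypothetical. It shows the three steps named above: detecting an error, retrying to recover automatically, and falling back to a simpler result when recovery fails.

```python
from __future__ import annotations

import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
) -> T:
    """Try `primary` up to `attempts` times, then degrade to `fallback`."""
    for attempt in range(attempts):
        try:
            return primary()
        except Exception as exc:  # detect the error
            print(f"attempt {attempt + 1} failed: {exc}")
            if attempt < attempts - 1:
                # Automatic recovery: wait with exponential backoff, then retry.
                time.sleep(base_delay * 2 ** attempt)
    # Graceful degradation: serve a simpler answer instead of failing outright.
    return fallback()


# Hypothetical services, for illustration only.
def flaky_recommendations() -> list[str]:
    raise ConnectionError("recommendation service unavailable")


def default_recommendations() -> list[str]:
    return ["bestsellers"]
```

Calling `call_with_fallback(flaky_recommendations, default_recommendations)` keeps the page working with a generic list while the failing service is unavailable.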

Shared Responsibility

During the architecture phase, we also pay attention to how teams work. Autonomy, for example, is critical, especially for Agile businesses. So is “you build it, you run it,” which means the team shares responsibility for the development, continuous improvement, reliability, and availability of the business capability in production.

Buy Over Build

Finally, our principles determine our choices. For example, we are not opposed to building, but we prefer a SaaS solution if one is available. Why reinvent the wheel, right? To avoid vendor lock-in, we also favor open platforms.

Reliable Performance — Configuration Management

We want to ensure that systems continue to perform as intended over time, which is why we established our configuration management principles. In particular, we focus on version control, infrastructure as code, and immutable infrastructure.

The first, version control, means changes are never made manually. You make them through code, which makes them repeatable.
The second, infrastructure as code, comes with benefits like better traceability, testability, and built-in disaster recovery.
The third is immutable infrastructure. When errors occur, manually implemented modifications are usually to blame, so making the infrastructure immutable (unable to be changed manually) makes it far more reliable.
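The idea behind these three principles can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a real infrastructure tool: `ServerSpec`, its fields, and the image names are hypothetical. The specification lives in code (and so in version control), it cannot be edited in place, and a change is expressed as a new version whose difference from the old one is traceable.

```python
from dataclasses import FrozenInstanceError, dataclass, fields, replace


@dataclass(frozen=True)  # frozen: instances cannot be modified after creation
class ServerSpec:
    """Hypothetical description of the desired infrastructure."""
    image: str
    instance_count: int


def plan(current: ServerSpec, desired: ServerSpec) -> dict:
    """Return the fields that differ as {field: (old, new)}."""
    return {
        f.name: (getattr(current, f.name), getattr(desired, f.name))
        for f in fields(current)
        if getattr(current, f.name) != getattr(desired, f.name)
    }


v1 = ServerSpec(image="web:1.0", instance_count=2)

# A manual, in-place change is rejected.
try:
    v1.image = "web:hotfix"
except FrozenInstanceError:
    print("in-place change rejected; roll out a new version instead")

# A change is a new version derived from the old one.
v2 = replace(v1, image="web:1.1")
print(plan(v1, v2))
```

Because `v1` is never altered, rolling back means redeploying the earlier specification rather than undoing manual edits.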

Our Commitment to Superb Quality — Quality Assurance

We have captured our ultimate commitment to superb quality in our principles too. Think of them as a checklist for engineers and developers.

Underlying most of these principles is automation. The less you do manually, the less room there is for human error and the smoother your processes will run. For example, we deploy all changes via an automated pipeline (Continuous Delivery Build Pipelines) and automatically test all services (Automated System Test). If there is an error, nothing is deployed to production.
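The rule that nothing reaches production after an error can be sketched as a simple gate. This is an illustrative Python outline, not an actual pipeline definition: the stage names and the `deploy` callback are hypothetical, and real pipelines are usually configured in a CI/CD system rather than written by hand.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage], deploy: Callable[[], None]) -> bool:
    """Run stages in order and deploy only if every one of them passes."""
    for name, check in stages:
        if not check():
            print(f"stage '{name}' failed; nothing is deployed")
            return False
        print(f"stage '{name}' passed")
    deploy()
    return True


# Hypothetical usage: the system test fails, so deploy is never called.
released: List[str] = []
ok = run_pipeline(
    [("build", lambda: True), ("automated system test", lambda: False)],
    deploy=lambda: released.append("production"),
)
```

Here `ok` is `False` and `released` stays empty, because the pipeline stops at the first failing stage.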

We only build applications and infrastructure that are self-healing and capable of resolving everyday errors on their own. We continuously monitor business services rather than just systems: availability is often understood as systems being up and running, but if you can’t live up to your customers’ expectations, that kind of availability doesn’t matter. Lastly, we believe in the power of an active-active high-availability setup with two active datacenters to ensure seamless disaster recovery.
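One way to make the difference between "up and running" and "meeting customer expectations" concrete is to measure availability per request. The Python sketch below is an assumption-laden illustration: the `Request` record, the 500 ms latency expectation, and the choice to report full availability when there is no traffic are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical record of one customer request."""
    succeeded: bool
    latency_ms: float


def business_availability(requests: List[Request], max_latency_ms: float = 500.0) -> float:
    """Fraction of requests that succeeded within the expected latency."""
    if not requests:
        return 1.0  # assumption: no traffic means no expectation was missed
    good = sum(1 for r in requests if r.succeeded and r.latency_ms <= max_latency_ms)
    return good / len(requests)


# The system answered every request, yet only half met expectations.
sample = [
    Request(succeeded=True, latency_ms=120),
    Request(succeeded=True, latency_ms=900),   # too slow for the customer
    Request(succeeded=False, latency_ms=50),   # fast, but failed
    Request(succeeded=True, latency_ms=300),
]
print(business_availability(sample))
```

A system-level uptime check would report this service as available, while the business-level measure shows that half of the customers were let down.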