At Highwing, we're building a data platform for the insurance industry using an entirely serverless architecture. We've learned many lessons from this architectural choice. One of the more interesting ones has been how the functions-as-a-service (FaaS) model increases the degree of coupling between application and infrastructure concerns. In this article, I will explore some ways to reason about this coupling and a few strategies we've adopted to mitigate its effects.
Let’s get specific about what we mean when we talk about coupling (and its contrast, cohesion) in a software system. Coupling refers specifically to the degree of interdependence between modules. Cohesion refers to the relatedness between functions within a single module. The two are typically inversely correlated: highly cohesive systems exhibit less coupling, and less cohesive systems exhibit higher coupling. Systems with high coupling and low cohesion tend to be difficult to reuse and challenging to change, which increases their cost of maintenance over time.
The study of connascence provides a helpful toolkit for measuring coupling in software systems. Developers can use it both quantitatively and qualitatively to discuss coupling in a precise fashion. By studying connascence in the context of a software system, we can find ways to reduce its impact over time, ultimately reducing the cost of making changes to that system.
There are three axes used to describe the nature of coupling: Strength, Degree, and Locality. Strength describes how readily one can change a particular coupling (that is, stronger connascences require more effort to refactor). Degree is the extent to which the coupling occurs: is it egregious, such as a function with ten positional parameters, or less so, such as a function with only three? Locality describes the breadth of a particular coupling: whether, for instance, it's confined to a single module or spread across the entire codebase.
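To make the positional-parameter example concrete, here is a minimal sketch in Python. Every name in it (`quote_positional`, `QuoteRequest`, `quote_named`, and the field names) is hypothetical and chosen only for illustration. It contrasts connascence of position, where every caller has to agree on argument order, with the weaker connascence of name, where callers only have to agree on field names.

```python
from dataclasses import asdict, dataclass


def quote_positional(holder, carrier, premium, deductible, state):
    """Connascence of position: every caller must know the argument order."""
    return {
        "holder": holder,
        "carrier": carrier,
        "premium": premium,
        "deductible": deductible,
        "state": state,
    }


@dataclass(frozen=True)
class QuoteRequest:
    """Connascence of name: callers agree on field names, not on order."""

    holder: str
    carrier: str
    premium: int
    deductible: int
    state: str


def quote_named(request: QuoteRequest) -> dict:
    """Build the same quote from a named request object."""
    return asdict(request)


# Both calls describe the same quote.
positional = quote_positional("Ada", "Acme Mutual", 1200, 500, "OH")
named = quote_named(
    QuoteRequest(
        holder="Ada",
        state="OH",  # order no longer matters
        carrier="Acme Mutual",
        deductible=500,
        premium=1200,
    )
)

# Swapping two positional arguments is a silent bug: nothing fails,
# but premium and deductible have traded places.
swapped = quote_positional("Ada", "Acme Mutual", 500, 1200, "OH")
```

The positional version grows harder to change with every parameter added (higher degree), while the named version keeps the same five pieces of shared knowledge but in a form that is cheaper to refactor (lower strength).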
There are also two significant categories of connascence: static, which can be determined just by reading the codebase, and dynamic, which is observable only at runtime. Many individual forms have more detailed names and descriptions, which I won't rehash here, but there's a review of the existing literature at https://connascence.io/ that's worth a read.
Historically, the coupling between code and the infrastructure on which it runs is a form of dynamic connascence that has received relatively little attention. The two have always been coupled to some degree, but traditional deployment methods made that coupling largely invisible to developers. You generally care about how fast your code runs, how much memory it uses, and other operational parameters. If you're deploying code to server instances or containers, though, the coupling is spread across the entire codebase. It's not that the coupling doesn't exist (think about the effort required to get local dev servers configured, or the many things that are impossible to test except on live infrastructure). It's that the only controls you can adjust are global to the entire application. As a result, you're unlikely to be highly concerned about the degree to which an individual object or execution path is coupled to the infrastructure supporting it.
With the advent of serverless architectures and application-level managed services, coupling quickly becomes a more pressing concern because the execution of an individual function is usually tightly coupled to the infrastructure on which it runs. Creating cohesion between infrastructure and the code it directly interacts with is critical to avoid strong connascence with broad locality (an undesirable combination). Left unchecked, that combination leads to "infrastructure drift," which can cause bugs, downtime, and worse. In the language of connascence, this coupling is a "connascence of infrastructure," where the configuration of the infrastructure for a piece of code is essential to its proper functioning.
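As a rough illustration of connascence of infrastructure, consider a function whose batch size only makes sense given the timeout configured for it. The sketch below is an assumption-laden example, not a description of any real platform API: `FunctionConfig`, `SECONDS_PER_RECORD`, `SHUTDOWN_RESERVE_SECONDS`, and the specific numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionConfig:
    """Hypothetical per-function infrastructure settings."""

    timeout_seconds: int
    memory_mb: int


# Assumed worst-case processing time for one record (illustrative).
SECONDS_PER_RECORD = 2
# Assumed time held back for cleanup before the platform kills the function.
SHUTDOWN_RESERVE_SECONDS = 5


def max_batch_size(config: FunctionConfig) -> int:
    """Largest batch that fits inside the configured timeout."""
    usable_seconds = config.timeout_seconds - SHUTDOWN_RESERVE_SECONDS
    return max(usable_seconds // SECONDS_PER_RECORD, 0)


def check_batch_fits(batch_size: int, config: FunctionConfig) -> None:
    """Fail loudly when code and infrastructure have drifted apart."""
    limit = max_batch_size(config)
    if batch_size > limit:
        raise ValueError(
            f"batch of {batch_size} exceeds the limit of {limit} "
            f"for a {config.timeout_seconds}s timeout"
        )


# The configuration lives next to the code that depends on it.
EXPORT_CONFIG = FunctionConfig(timeout_seconds=30, memory_mb=512)
```

If the timeout were defined in a distant file and the batch size hardcoded here, lowering one without the other would fail only at runtime. Deriving the limit from the configuration turns that hidden dependency into something a test can check.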
Within our codebases at Highwing, we've implemented several strategies to help manage this phenomenon. Our architecture is somewhat unorthodox: we run a serverless monorepo using Ruby/Rails with a system for dynamic event dispatch (more on this in future posts). We manage our infrastructure as code using Terraform and keep our Ruby codebase and Terraform roots in the same Git repositories, so those two things move in lockstep. When we deploy, we build and test our application code, then deploy it alongside other infrastructure changes in a single pass.
In our case, this strategy helps us keep individual deployable units relatively self-contained and focused on specific tasks, such as handling a particular event or a request from a user-facing application. It also helps us keep cloud tools front of mind and easy to integrate, helping us avoid the temptation to build things from scratch that we should be buying instead.
We believe that our current approach is just starting to scratch the surface of what’s possible here. There are many great ideas in the space that we think will make this approach even more seamless in the future. Ruby on Jets has an excellent DSL for incorporating specific cloud resources directly into application code. The Chalice framework for Python does some similar things using decorators to wire up functions directly to cloud services in a concise way. The AWS CDK is blurring the lines between infrastructure-as-code and traditional object-oriented software development. Since the deployment model for FaaS implicitly creates strong connascence of infrastructure, these tools allow us to locate application code and infrastructure near each other, enabling the strong localized cohesion that's desirable while avoiding the strong coupling at a distance that is not.
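The decorator-based style mentioned above can be sketched in plain Python. To be clear, this is not the Chalice, Jets, or CDK API: `serverless_function`, `Deployment`, `REGISTRY`, and `render_infrastructure` are hypothetical names invented for this example. It only shows the general idea of declaring a function's infrastructure needs right next to the function and generating deployment settings from that declaration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record tying a handler to its infrastructure needs."""

    handler: Callable[[dict], dict]
    event_source: str
    timeout_seconds: int
    memory_mb: int


# Module-level registry, filled in as decorated functions are defined.
REGISTRY: Dict[str, Deployment] = {}


def serverless_function(*, event_source: str, timeout_seconds: int = 30,
                        memory_mb: int = 128):
    """Declare infrastructure settings right beside the handler code."""

    def decorator(handler: Callable[[dict], dict]) -> Callable[[dict], dict]:
        REGISTRY[handler.__name__] = Deployment(
            handler=handler,
            event_source=event_source,
            timeout_seconds=timeout_seconds,
            memory_mb=memory_mb,
        )
        return handler  # the function itself is left unchanged

    return decorator


@serverless_function(event_source="policy-uploaded", timeout_seconds=60,
                     memory_mb=512)
def ingest_policy(event: dict) -> dict:
    """Example handler; the event shape is an assumption."""
    return {"status": "ingested", "policy_id": event["policy_id"]}


def render_infrastructure(registry: Dict[str, Deployment]) -> dict:
    """Produce deployment settings from the same source as the code."""
    return {
        name: {
            "event_source": deployment.event_source,
            "timeout": deployment.timeout_seconds,
            "memory_size": deployment.memory_mb,
        }
        for name, deployment in sorted(registry.items())
    }
```

Because the handler and its settings are defined in one place, a change to either shows up in the same diff, which is the localized cohesion described above.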
Ultimately, we think that the future is very bright for tools that make it easier for developers to orchestrate high-level infrastructure and service primitives while keeping their code and infrastructure in lockstep. We're bullish on this approach and are excited to see what the future of converged code and infrastructure holds!