Posted on 2025-01-28
software-engineering
complexity
All software systems are inherently complex. Essential complexity is attributed to the business requirements, and
incidental complexity is attributed to everything else. I wrote a blog post,
Complexity in Software Engineering, explaining the types of complexity and how they arise (I would highly
recommend reading it to familiarise yourself with the fundamentals). Once you've grasped the basics, the real
fun begins: managing complexity. In the race to deliver features, how do we ensure we add only the necessary
complexity, so that the system can evolve sustainably and not come to a screeching halt? Let's dive into
principles for managing complexity in software systems, including within your code.
To illustrate the concepts, I will use the following example throughout the article.
Digital Transformation for a Hire Shop
Imagine you have been tasked with solving a problem for a local hire shop. The team members currently
use paper-based forms to maintain customer details and contracts manually for renting out
different tools such as rug cleaners, sanders, etc. They are keen to move to a digital platform. The goal
is to transition from the manual system to a digital platform that empowers team members to:
- Book Equipment: Allow team members to record new hires and create digital contracts.
- Manage Hires: Provide tools to update or modify ongoing hires and customer details.
- Process Payments: Support charging relevant security deposits and hire fees.
- Release Hires: Enable the seamless conclusion of hires, including return confirmations.
These stated objectives are "essential complexity", i.e. inherent in the problem as seen by the user.
To build a usable digital application for them, other underlying requirements would be
acceptable performance, ease of use, and the ability for multiple team members to access the system concurrently
without compromising data integrity. These are all accidental complexity, i.e. not inherent in the problem as
seen by the user.
Principle No 1 - Assisting with essential complexity
As the definition goes, "Essential Complexity" is inherent in, and the essence of, the problem. Listening
to the audiobook "Alchemy" by Rory Sutherland taught me that it is sometimes okay to rephrase the brief. You
just need to ask the right questions.
How about we dive into an example from the hire shop? The client wants to implement a live chat feature on the
website to cope with high call volumes from their end users into the call center. As the client explains the
solution, the development team asks more questions with the intent of understanding the problem better.
They discover that the heavy call volumes cost the business a lot of money. Investigating further, the team
discovers that 80% of the calls are from end users who want to confirm details about their bookings a day before
the collection date. While the team could implement a live chat feature by leveraging an off-the-shelf
product or building a chat system from scratch, they propose an alternative solution: send an email as soon
as the customer makes a booking and an SMS with the order details a day before the collection date. This
alternative solution helps to reduce call volumes, and thus the cost to the business. (This is a real-life
example from a project I worked on. Here the intention was NOT to recommend against live chat,
but to truly understand the problem and propose a solution which would assist with the essential complexity).
In terms of complexity, for the first solution the team would have needed to either buy an off-the-shelf
product or build a complicated chat system from scratch. Live chats often rely on protocols such as WebSockets,
MQTT, etc., which add another layer of complexity. Automated integration testing of such systems could
also pose challenges. In our scenario, by rephrasing the brief, we could leverage existing SMS and emailing
systems to schedule an SMS and an email with a considerable reduction in overhead.
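As a rough illustration of how little logic the alternative needs, the sketch below only works out when the two notifications would go out. The `Booking` type, its fields and `reminderSchedule` are hypothetical names made up for this example, and the fallback for same-day bookings is an assumption; actual delivery would be handed to whatever email and SMS systems already exist.

```go
package main

import (
	"fmt"
	"time"
)

// Booking is a hypothetical record of a hire booking.
type Booking struct {
	CreatedAt    time.Time // when the customer made the booking
	CollectionAt time.Time // when the equipment is due to be collected
}

// reminderSchedule returns when the confirmation email and the reminder SMS
// should be sent: the email immediately, the SMS one day before collection.
func reminderSchedule(b Booking) (emailAt, smsAt time.Time) {
	emailAt = b.CreatedAt
	smsAt = b.CollectionAt.AddDate(0, 0, -1)
	// Assumption: if the booking is made less than a day ahead,
	// send the SMS straight away instead of in the past.
	if smsAt.Before(b.CreatedAt) {
		smsAt = b.CreatedAt
	}
	return emailAt, smsAt
}

// exampleBooking is sample data used by main.
var exampleBooking = Booking{
	CreatedAt:    time.Date(2025, time.January, 20, 10, 0, 0, 0, time.UTC),
	CollectionAt: time.Date(2025, time.January, 28, 9, 0, 0, 0, time.UTC),
}

func main() {
	emailAt, smsAt := reminderSchedule(exampleBooking)
	fmt.Println("email at:", emailAt.Format(time.RFC3339))
	fmt.Println("sms at:  ", smsAt.Format(time.RFC3339))
}
```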
The above may not be achievable with all requirements, but even assisting with a few can have a substantial
impact on the overall complexity posture.
Principle No 2 - Eliminating or Reducing incidental complexity
In my blog post Complexity in Software Engineering, I categorised incidental complexity as arising from 3
buckets: Cross-functional requirements, Tech Debt and Code Mess (i.e. unintentional and/or careless cruft
incurred). We will dive into strategies for managing the complexity arising from each of the buckets:
Eliminating or Reducing incidental complexity - Cross-Functional Requirements
Performance, security, resiliency, and scalability are common cross-functional requirements (aka CFRs) which
can lead to accidental complexity. As a first rule of thumb, we should attempt to make these
informal requirements formal. If a performant product is the ask, question and understand why performance is
important: is it because we are expecting heavy traffic on the website, or is it because it's a system where
completing an operation within a few milliseconds can make or break the outcome (like a trading system)? Once we
have the answers, understand what acceptable performance means and record it as SLAs for the product.
This is now a formal requirement and thus an essential one. Furthermore, having it written down allows us as the
development team to measure valuable metrics against it.
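Once an SLA is written down, it can be checked in code. The sketch below is one minimal way to do that, assuming a made-up latency target of 200 ms; `latencySLA`, `measure` and `withinSLA` are illustrative names and not part of any library.

```go
package main

import (
	"fmt"
	"time"
)

// latencySLA is an assumed target for illustration only;
// the real figure would come from the agreed SLA.
const latencySLA = 200 * time.Millisecond

// measure runs op and reports how long it took.
func measure(op func()) time.Duration {
	start := time.Now()
	op()
	return time.Since(start)
}

// withinSLA reports whether an observed latency meets the target.
func withinSLA(observed time.Duration) bool {
	return observed <= latencySLA
}

func main() {
	elapsed := measure(func() {
		time.Sleep(10 * time.Millisecond) // stand-in for real work
	})
	fmt.Printf("took %v, within SLA: %t\n", elapsed, withinSLA(elapsed))
}
```

In a real system the measurement would more likely be exported to a metrics platform than printed, but the point stands: a formal target turns "performant" into something that can be asserted.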
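Here is an example from the hire shop where, to ensure fault tolerance against a downstream API, we build a
retry mechanism with backoff. We try the call up to 3 times, waiting 1 second, 3 seconds and 5 seconds after
successive failures (the delay grows by a fixed 2 seconds each time, so strictly speaking this is a linear
rather than an exponential backoff). The code shows essential and incidental complexity coupled together.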
func addCustomerReview(review string) error {
	maxRetries := 3          // accidental state
	delay := 1 * time.Second // accidental state
	for i := 0; i < maxRetries; i++ {
		// Perform the operation - essence
		err := performOperation(review)
		if err != nil {
			// Perform backoff - accidental logic
			time.Sleep(delay)
			delay = delay + 2*time.Second
			continue
		}
		return nil
	}
	return errors.New("failed to add customer review")
}
To eliminate this complexity we can leverage one of the following two options:
- Aspect-oriented programming is one way to separate such aspects. Revisiting the example above, rather than
  muddying the essential logic with retry logic (which is incidental), we can move the retry logic into a
  separate function:
// accidental logic: retries aFunction, adding 2 seconds to the delay after each failure.
// Despite its name, this sketch only retries; it does not enforce a timeout.
func withRetryAndTimeout(aFunction func() error, maxRetries int, delay time.Duration) error {
	for i := 0; i < maxRetries; i++ {
		err := aFunction()
		if err != nil {
			time.Sleep(delay)
			delay = delay + 2*time.Second
			continue
		}
		return nil
	}
	return errors.New("operation failed after all retries")
}
// essential logic
func addCustomerReview(review string) error {
	return performOperation(review)
}
// usage, from within a calling function
err := withRetryAndTimeout(func() error {
	return addCustomerReview("This is a great product")
}, 3, 1*time.Second)
- Leverage event-driven architectures. In the above example, an event can be published to add a
  customer review. The underlying eventing platform provides fault tolerance.
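The event-driven option can be sketched as follows. Everything here is hypothetical: the `ReviewSubmitted` event, the `Publisher` interface and the in-memory publisher stand in for a real eventing platform, which is assumed (not shown) to take care of retries and redelivery.

```go
package main

import "fmt"

// ReviewSubmitted is a hypothetical event describing a new customer review.
type ReviewSubmitted struct {
	Review string
}

// Publisher abstracts the eventing platform. Retries and redelivery are
// assumed to be the platform's responsibility, not the caller's.
type Publisher interface {
	Publish(event ReviewSubmitted) error
}

// inMemoryPublisher is a stand-in so the example runs without a broker.
type inMemoryPublisher struct {
	published []ReviewSubmitted
}

// Publish records the event in memory and never fails.
func (p *inMemoryPublisher) Publish(event ReviewSubmitted) error {
	p.published = append(p.published, event)
	return nil
}

// addCustomerReview holds only essential logic: it announces the review.
func addCustomerReview(p Publisher, review string) error {
	return p.Publish(ReviewSubmitted{Review: review})
}

func main() {
	p := &inMemoryPublisher{}
	if err := addCustomerReview(p, "This is a great product"); err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	fmt.Println("events published:", len(p.published))
}
```

The trade-off is that the retry concern has not disappeared; it has moved out of the application code and into the platform.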
In scenarios where we cannot convert an informal requirement into a formal requirement, it is essential to
understand its value. For example, for ease of expression we can write more descriptive code with intermediate
variables. This increases the lines of code and the incidental complexity, but it helps with readability and
context.
For example, cost calculation in the hire shop can be done in 2 ways. Which one would you prefer? The second
one bears more complexity but is far more readable and would be the preferred approach.
// Option 1: a single dense expression
func costCalculator(daysOfHire int, dailyRateInCents int, securityInCent int) int {
	return int(float64(daysOfHire*dailyRateInCents)*(1-calculateDiscountRate(daysOfHire))) + securityInCent
}
// Option 2: the same calculation with named intermediate values
func costCalculator(daysOfHire int, dailyRateInCents int, securityInCent int) int {
	baseCost := daysOfHire * dailyRateInCents
	discountRate := calculateDiscountRate(daysOfHire)
	discountedCost := float64(baseCost) * (1 - discountRate)
	return int(discountedCost) + securityInCent
}
Eliminating or Reducing incidental complexity - Tech Debt
Tech Debt is incidental complexity which is consciously incurred by a team or a developer as a trade-off
against a constraint such as tight deadlines, missing platform capability, etc. When incidental complexity is
incurred in this context, the team must maintain a register to log it, and pay the debt down
once the constraint has been lifted. We must not let tech debt pile up; this can have
disastrous consequences. Beyond a certain point, piled-up tech debt is nothing but a Code Mess. There is no
silver bullet for eliminating the incidental complexity of Tech Debt other than addressing it in a timely manner.
Eliminating or Reducing incidental complexity - Code Mess
Code Mess is very different from Tech Debt: it is the unconscious or careless accumulation of code cruft
over a period of time. There are ways to keep code mess in check and not let accidental complexity slip in
(pretty much through the rest of the article I have used the word incidental and not accidental, but in the
context of Code Mess it is a conscious choice to use the negative connotation over the neutral one).
- Using linters to enforce code consistency is a small step but brings massive benefits in the long
  run.
- Test Driven Development (TDD) encourages writing testable code from the get-go, which in turn
  encourages design patterns such as dependency injection, strategy pattern, etc.
- Code scanning tools (static and dynamic) give insights into the quality of the code and adherence to good
  practices.
- Many modern IDEs provide tools to identify duplicate or problematic code snippets.
- Last, but one of the most important: being mindful of separation of concerns (logic, state and
  control).
Testable vs Non-Testable Code
Let's compare a piece of untestable and testable code from our hire shop example. Customers may pick up
equipment no earlier than 30 minutes before their booking time. Let us see how this could be implemented. It
would be difficult to write tests for the first version below, and any such tests are highly likely to be
flaky because of the call to the time.Now() function. For the second version, we can inject any desired
time value and easily write test cases around all scenarios. The second version also explicitly specifies its
inputs as arguments rather than hiding them within the code.
// Hard to test: the current time is a hidden input
func isPickUpAllowed(bookingTime time.Time) bool {
	currentTime := time.Now()
	earliestAllowedPickUp := bookingTime.Add(-30 * time.Minute)
	return currentTime.After(earliestAllowedPickUp)
}
// Testable: the pick-up time is an explicit input
func isPickUpAllowed(bookingTime time.Time, pickUpTime time.Time) bool {
	earliestAllowedPickUp := bookingTime.Add(-30 * time.Minute)
	return pickUpTime.After(earliestAllowedPickUp)
}
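To show what injecting the time buys us, here is a small table of cases run against the two-argument version. The booking time and the cases are arbitrary values chosen for illustration. In a real project these would live in a `_test.go` file using the standard `testing` package; a plain `main` is used here only to keep the sketch in one runnable file.

```go
package main

import (
	"fmt"
	"time"
)

// isPickUpAllowed is the testable version: both times are explicit inputs.
func isPickUpAllowed(bookingTime time.Time, pickUpTime time.Time) bool {
	earliestAllowedPickUp := bookingTime.Add(-30 * time.Minute)
	return pickUpTime.After(earliestAllowedPickUp)
}

// booking is a fixed, arbitrary booking time used by the cases below.
var booking = time.Date(2025, time.January, 28, 10, 0, 0, 0, time.UTC)

// cases lists pick-up times relative to the booking and the expected result.
var cases = []struct {
	name   string
	offset time.Duration
	want   bool
}{
	{"an hour early", -60 * time.Minute, false},
	{"exactly 30 minutes early", -30 * time.Minute, false}, // After is strict
	{"29 minutes early", -29 * time.Minute, true},
	{"on time", 0, true},
}

func main() {
	for _, c := range cases {
		got := isPickUpAllowed(booking, booking.Add(c.offset))
		fmt.Printf("%-26s got=%t want=%t\n", c.name, got, c.want)
	}
}
```

Note the boundary case: because `After` is a strict comparison, a pick-up exactly 30 minutes before the booking is rejected. Whether that is the intended business rule is the kind of question such a table makes visible.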
Separation of concerns: logic, state and control
A software system consists of logic, state and control where logic is the rules and computations performed
by the system, state represents the data and control is the flow or the order of execution of operations.
Logic and state can each be either essential or incidental. This principle suggests that they must all be
separated, with some base rules at hand:
- The essential state is the foundation of the system and should not reference either logic or
  control.
- Essential logic (business rules) does not have a say in how, when and why the essential state might
  change.
- The essential logic specification will make no reference to any part of the incidental state, logic
  or control.
- Accidental state and control are conceptually the least important. Changes to them should not impact any
  other specification, but they may change if either the essential state or the essential logic changes.
Separating the function withRetryAndTimeout from addCustomerReview above is
a good example of separating accidental state and control from essential state and logic.
The code below, a refactored version of the earlier costCalculator named calculateCost, demonstrates the
separation of essential state and logic. The function now takes HireDetails and HireCost objects for a piece of
equipment, which represent the state. The function itself is a representation of essential logic.
func calculateCost(hireDetails HireDetails, hireCost HireCost) int {
	baseCost := hireDetails.daysOfHire * hireCost.dailyRateInCents
	discountRate := calculateDiscountRate(hireDetails.daysOfHire)
	discountedCost := float64(baseCost) * (1 - discountRate)
	return int(discountedCost) + hireCost.securityInCents
}
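To make the refactored function runnable, the sketch below fills in the pieces the article leaves out. The struct shapes for `HireDetails` and `HireCost` are assumed from the field names used above, and `calculateDiscountRate` is a made-up rule (25% off hires of 7 days or more) chosen only so the arithmetic is easy to follow.

```go
package main

import "fmt"

// HireDetails and HireCost are assumed shapes for the essential state.
type HireDetails struct {
	daysOfHire int
}

type HireCost struct {
	dailyRateInCents int
	securityInCents  int
}

// calculateDiscountRate is a hypothetical rule: 25% off hires of 7+ days.
func calculateDiscountRate(daysOfHire int) float64 {
	if daysOfHire >= 7 {
		return 0.25
	}
	return 0
}

// calculateCost is the essential logic; it reads state but owns none.
func calculateCost(hireDetails HireDetails, hireCost HireCost) int {
	baseCost := hireDetails.daysOfHire * hireCost.dailyRateInCents
	discountRate := calculateDiscountRate(hireDetails.daysOfHire)
	discountedCost := float64(baseCost) * (1 - discountRate)
	return int(discountedCost) + hireCost.securityInCents
}

func main() {
	cost := HireCost{dailyRateInCents: 1500, securityInCents: 5000}
	fmt.Println(calculateCost(HireDetails{daysOfHire: 2}, cost))  // 2*1500 + 5000 = 8000
	fmt.Println(calculateCost(HireDetails{daysOfHire: 10}, cost)) // 15000*0.75 + 5000 = 16250
}
```

One caveat worth knowing: `int(discountedCost)` truncates rather than rounds, so a discount rate that is not exactly representable as a float could leave the result a cent short. The rate here was picked to avoid that.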
Summing it up!
When solving interesting problems, we will inevitably work on systems which get complex. A complex software
system can still have a high-quality codebase. A high-quality codebase is the outcome of deliberate effort,
achieved through principles such as those above, which keep us in check on the degree of complexity we incur.
When we look around, we will find codebases which turned legacy a bit too soon as well as codebases which have
stood the test of time. We choose, by our very actions, the course of our codebases.