Garima Singh - Blog - Principles for Managing complexity in Code

All software systems are inherently complex. Essential complexity is attributed to the business requirement, and incidental complexity is attributed to everything else. I wrote a blog post, Complexity in Software Engineering, explaining the types of complexity and how they arise (I would highly recommend reading it to familiarise yourself with the fundamentals). Once you've grasped the basics, the real fun begins: managing complexity. In the race to deliver features, how do we ensure we are adding only the necessary complexity, which allows the system to evolve sustainably rather than come to a screeching halt? Let's dive into principles for managing complexity in software systems and within your code.

To illustrate the concepts, I will use the following example throughout the article.

Digital Transformation for a Hire Shop

Imagine you have been tasked with solving a problem for a local hire shop. The team members currently use paper-based forms, on which customer details and contracts are maintained by hand, to rent out tools such as a rug cleaner, a sander, etc. They are keen to move to a digital platform. The goal is to transition from the manual system to a digital platform that empowers team members to:

Book Equipment: Allow team members to record new hires and create digital contracts.
Manage Hires: Provide tools to update or modify ongoing hires and customer details.
Process Payments: Support charging relevant security deposits and hire fees.
Release Hires: Enable the seamless conclusion of hires, including return confirmations.

These stated objectives are the "essential complexity", i.e. they are inherent in the problem as seen by the user.

For the digital application to be usable, it has other underlying requirements: acceptable performance, ease of use, and the ability for multiple team members to access the system concurrently without compromising data integrity. These are all accidental complexity, i.e. not inherent in the problem as seen by the user.

Principle No 1 - Assisting with essential complexity

As the definition goes, "Essential Complexity" is inherent in, and the essence of, the problem. Listening to the audiobook "Alchemy" by Rory Sutherland, I learned that it is sometimes okay to rephrase the brief. You just need to ask the right questions.

How about we dive into an example from the hire shop? The client wants to implement a live chat feature on the website to cope with the high volume of calls from end users into the call center. As the client explains the solution, the development team asks more questions with the intent of understanding the problem better. They discover that the heavy call volumes end up costing the business a lot of money. Investigating further, the team discovers that 80% of the calls are from end users who want to confirm details about their bookings a day before the collection date. While the team could implement a live chat feature by leveraging an off-the-shelf product or building a chat system from scratch, they propose an alternative solution: send an email as soon as the customer makes a booking, and an SMS with the order details a day before the collection date. This alternative solution helps to reduce call volumes, and thus the cost to the business. (This is a real-life example from a project I worked on. Here the intention was NOT to recommend against live chat, but to truly understand the problem and propose a solution which would assist with the essential complexity.)

In terms of complexity, for the first solution the team would have needed to either buy an off-the-shelf product or build a complicated chat system from scratch. Live chats often rely on protocols such as WebSockets, MQTT, etc., which add another layer of complexity. Automated integration testing of such systems could also pose challenges. In our scenario, by rephrasing the brief, we could leverage the existing SMS and emailing systems to schedule an SMS and an email, with a considerable reduction in overhead.

The above may not be achievable for every requirement, but even assisting with a few can have a substantial impact on the overall complexity of the system.

Principle No 2 - Eliminating or Reducing incidental complexity

In my blog post Complexity in Software Engineering, I categorised incidental complexity into 3 buckets: Cross-functional requirements, Tech Debt and Code Mess (i.e. cruft incurred unintentionally and/or carelessly). We will dive into strategies for managing the complexity arising from each of the buckets:

Eliminating or Reducing incidental complexity - Cross-Functional Requirements

Performance, security, resiliency, and scalability are common cross-functional requirements (aka CFRs) which can lead to accidental complexity. As a first rule of thumb, we should attempt to make these informal requirements formal. If a performant product is the ask, question and understand why performance is important: is it because we are expecting heavy traffic on the website, or is it because it's a system where completing an operation within a few milliseconds is make or break (like a trading system)? Once we have the answers, understand what acceptable performance means and record it as SLAs for the product. This is now a formal requirement and thus an essential one. Furthermore, having it written down allows us as the development team to measure valuable metrics against it.

Here is an example from the hire shop where, to ensure fault tolerance against a downstream API, we build a retry mechanism with backoff. We attempt the operation up to 3 times, waiting 1 second, 3 seconds and 5 seconds after successive failures (the delay grows by a fixed 2 seconds each time, so strictly speaking this is an additive rather than an exponential backoff). The code shows essential and incidental complexity coupled together.

               
// Requires the "errors" and "time" packages.
// performOperation stands for the call to the downstream API.
func addCustomerReview(review string) error {
   maxRetries := 3 // accidental state
   delay := 1 * time.Second // accidental state
   for i := 0; i < maxRetries; i++ {
      // Perform the operation - essence
      err := performOperation(review)
      if err != nil {
         // Perform backoff - accidental logic
         time.Sleep(delay)
         delay = delay + 2*time.Second
         continue
      }
      return nil
   }
   return errors.New("failed to add customer review")
}

To eliminate this complexity we can leverage one of the following two options:

Aspect-oriented programming is one way to separate out such aspects. Revisiting the example above, rather than muddying the essential logic with retry logic (which is incidental), we can move the retry logic into a separate function that wraps the essential one.

                        
// accidental logic: calls aFunction up to maxRetries times,
// waiting 2 seconds longer after each failure
func withRetryAndTimeout(aFunction func() error, maxRetries int, delay time.Duration) error {
   for i := 0; i < maxRetries; i++ {
      err := aFunction()
      if err != nil {
         time.Sleep(delay)
         delay = delay + 2*time.Second
         continue
      }
      return nil
   }
   return errors.New("operation failed after all retries")
}

// essential logic
func addCustomerReview(review string) error {
   return performOperation(review)
}

// usage, from within the calling function
err := withRetryAndTimeout(func() error {
   return addCustomerReview("This is a great product")
}, 3, 1*time.Second)

Leverage event-driven architectures. In the above example, an event can be published to add a customer review, and the underlying eventing platform provides the fault tolerance.

In scenarios where we cannot convert an informal requirement into a formal one, it is essential to understand its value. For example, for ease of expression we can write more descriptive code with intermediate variables. This increases the lines of code and the incidental complexity, but it helps with readability and context.

For example, cost calculation in the hire shop can be done in 2 ways. Which one would you prefer? The second one bears more complexity, but it is far more readable and would be the preferred approach.

                    
func costCalculator(daysOfHire int, dailyRateInCents int, securityInCent int) int {
   return int(float64(daysOfHire * dailyRateInCents) * (1 - calculateDiscountRate(daysOfHire))) + securityInCent
}

                    
func costCalculator(daysOfHire int, dailyRateInCents int, securityInCent int) int {
   baseCost := daysOfHire * dailyRateInCents
   discountRate := calculateDiscountRate(daysOfHire)

   discountedCost := float64(baseCost) * (1 - discountRate)
   return int(discountedCost) + securityInCent
}
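Both versions rely on calculateDiscountRate, which is not shown in this article. To make the readable version runnable, here is a small sketch with a stand-in discount rule. The rule (25% off hires of 7 days or longer) and the sample prices are assumptions made purely for illustration.

```go
package main

import "fmt"

// calculateDiscountRate is a hypothetical stand-in: the real rule is not
// shown in the article. Assumed here: 25% off hires of 7 days or longer.
func calculateDiscountRate(daysOfHire int) float64 {
	if daysOfHire >= 7 {
		return 0.25
	}
	return 0
}

// costCalculator is the readable version from above, unchanged in behaviour.
func costCalculator(daysOfHire int, dailyRateInCents int, securityInCent int) int {
	baseCost := daysOfHire * dailyRateInCents
	discountRate := calculateDiscountRate(daysOfHire)

	discountedCost := float64(baseCost) * (1 - discountRate)
	return int(discountedCost) + securityInCent
}

func main() {
	// 3 days at 1500 cents/day, no discount, plus a 5000 cent deposit.
	fmt.Println(costCalculator(3, 1500, 5000)) // 9500
	// 8 days at 1500 cents/day, 25% off, plus a 5000 cent deposit.
	fmt.Println(costCalculator(8, 1500, 5000)) // 14000
}
```

Note that int(...) truncates towards zero, so a discount rate that cannot be represented exactly as a float64 may leave the result a cent short; rounding explicitly may be preferable in real pricing code.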

Eliminating or Reducing incidental complexity - Tech Debt

Tech Debt is incidental complexity which is consciously incurred by a team or a developer as a trade-off against a constraint such as tight deadlines, missing platform capability, etc. When incidental complexity is incurred in this context, the team must maintain a register to log it, and pay the debt down once the constraint has been lifted. We must not let tech debt pile up on top of itself; this can have disastrous consequences. Beyond a certain point, piled-up tech debt is nothing but a Code Mess. There is no silver bullet for eliminating the incidental complexity of Tech Debt other than addressing it in a timely manner.

Eliminating or Reducing incidental complexity - Code Mess

Code Mess is very different from Tech Debt: it is the unconscious or ignorant accumulation of code cruft over a period of time. There are means by which we can keep a check on code mess and not let accidental complexity slip in. (Through most of the article I have used the word incidental rather than accidental, but in the context of Code Mess it is a conscious choice to use the negative connotation over a neutral one.)

Using linters to enforce code consistency is a small step but yields massive benefits in the long run.
Test Driven Development (TDD) lets us write testable code from the get-go, which in turn encourages design patterns such as dependency injection, the strategy pattern, etc.
Code scanning tools (static and dynamic) give insights into the quality of the code and adherence to good practices.
Many modern IDEs provide tools to identify duplicate or problematic code snippets.
Last, but one of the most important: being mindful of separation of concerns between logic, state and control.

Testable vs Non-Testable Code

Let's compare a piece of untestable and testable code from our hire shop example. Customers are allowed to pick up equipment up to 30 minutes before their booked time. Let us see how this could be implemented. For the first version below, tests are difficult to write and/or highly likely to be flaky because of the dependency on the time.Now() function. For the second version, we can inject any desired time value and easily write test cases around all scenarios. The latter also makes its inputs explicit as arguments rather than hiding them within the code.

                    
func isPickUpAllowed(bookingTime time.Time) bool {
   currentTime := time.Now()
   earliestAllowedPickUp := bookingTime.Add(-30 * time.Minute)
   return currentTime.After(earliestAllowedPickUp)
}

                    
func isPickUpAllowed(bookingTime time.Time, pickUpTime time.Time) bool {
   earliestAllowedPickUp := bookingTime.Add(-30 * time.Minute)
   return pickUpTime.After(earliestAllowedPickUp)
}

Separation of concerns: logic, state and control

A software system consists of logic, state and control: logic is the rules and computations performed by the system, state represents the data, and control is the flow, or the order of execution, of operations. Logic and state can each be either essential or incidental. This principle suggests that they must all be kept separate, with some base rules at hand:

The essential state is the foundation of the system and should not reference either logic or control.
Essential logic (business rules) does not have a say in how, when and why the essential state might change.
The essential logic specification will make no reference to any part of the incidental state, logic or control.
Accidental state and control are conceptually the least important. Changes to them should not impact any other specification, but they may themselves change if either the essential state or the essential logic changes. Separating the function withRetryAndTimeout from addCustomerReview above is a good example of separating accidental state and control from essential state and logic.

The code below is a refactored version of costCalculator, renamed calculateCost, which demonstrates the separation of essential state and logic. The function now takes HireDetails and HireCost objects for a piece of equipment, which represent the state. The function itself is a representation of the essential logic.

                
func calculateCost(hireDetails HireDetails, hireCost HireCost) int {
   baseCost := hireDetails.daysOfHire * hireCost.dailyRateInCents
   discountRate := calculateDiscountRate(hireDetails.daysOfHire)

   discountedCost := float64(baseCost) * (1 - discountRate)
   return int(discountedCost) + hireCost.securityInCents
}
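The HireDetails and HireCost types are not defined in this article. A minimal sketch of what they might look like follows; the struct shapes are inferred only from the fields that calculateCost reads, and the discount rule and prices are assumptions for illustration.

```go
package main

import "fmt"

// HireDetails is an assumed shape for the essential state describing a hire.
type HireDetails struct {
	daysOfHire int
}

// HireCost is an assumed shape for the pricing state of a piece of equipment.
type HireCost struct {
	dailyRateInCents int
	securityInCents  int
}

// calculateDiscountRate is a hypothetical rule: 25% off hires of 7 days or longer.
func calculateDiscountRate(daysOfHire int) float64 {
	if daysOfHire >= 7 {
		return 0.25
	}
	return 0
}

// calculateCost is the essential logic from above. It only reads the
// state passed in and never decides how or when that state changes.
func calculateCost(hireDetails HireDetails, hireCost HireCost) int {
	baseCost := hireDetails.daysOfHire * hireCost.dailyRateInCents
	discountRate := calculateDiscountRate(hireDetails.daysOfHire)

	discountedCost := float64(baseCost) * (1 - discountRate)
	return int(discountedCost) + hireCost.securityInCents
}

func main() {
	sander := HireCost{dailyRateInCents: 2000, securityInCents: 10000}
	weekend := HireDetails{daysOfHire: 2}
	fortnight := HireDetails{daysOfHire: 14}

	fmt.Println(calculateCost(weekend, sander))   // 14000
	fmt.Println(calculateCost(fortnight, sander)) // 31000
}
```

The structs hold only data and reference neither logic nor control, in line with the first rule above.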

Summing it up!

When solving interesting problems, we will inevitably work on systems which get complex. A complex software system can still have a high-quality codebase. A high-quality codebase is the outcome of deliberate effort, achieved through principles such as those above, which keep us in check on the degree of complexity we incur. When we look around, we will find codebases which turned legacy a bit too soon as well as codebases which have stood the test of time. We choose, by our very actions, the course of our codebases.