The Viable System Model


The Viable System Model is a model for autonomous systems. What’s neat about it is that it works at an abstract level; it applies just as well to biological entities, computer systems and, as we’ll explore in this post, software engineering teams.

What’s interesting for me about models like this is that they provide an alternative to top-down hierarchical models. As we saw in the previous post, software’s complex! Managing it in a simple cause-and-effect way won’t work.

It’s easy to look at diagrams like this and completely bounce off.

Let’s take it from the top!

A system lives in an environment (for example competitors, customers, the economy). For a system to be considered viable it must be able to survive in this environment and be long-lived.

These systems (let’s imagine a company) are recursive. For example, most companies have sales, marketing and development. These themselves are systems, and the components within those areas are also systems (e.g. business development team, marketing outreach, or software development team). You can continue this recursive pattern all the way until you get to biological cells, but let’s not do that!

Each viable system has to be composed of five systems.

They are as follows:

Primary Activities (System 1)
Information Channels (System 2)
Structure and Control (System 3)
Environment Monitoring (System 4)
Policy and Values (System 5)
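
To make the recursion concrete, here is a minimal Python sketch of the structure described above. The class and field names (ViableSystem, operations, coordination and so on) are my own illustrative assumptions, not terminology from the model; the only idea taken from the text is that each System 1 unit is itself a viable system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ViableSystem:
    """One level of recursion. Field names are illustrative assumptions."""
    name: str
    operations: List["ViableSystem"] = field(default_factory=list)  # System 1
    coordination: List[str] = field(default_factory=list)           # System 2
    control: str = ""                                               # System 3
    intelligence: str = ""                                          # System 4
    policy: str = ""                                                # System 5

    def depth(self) -> int:
        """Number of recursion levels at and below this system."""
        return 1 + max((unit.depth() for unit in self.operations), default=0)


# The company example from the text: departments contain teams.
team = ViableSystem("software development team")
development = ViableSystem("development", operations=[team])
company = ViableSystem(
    "company",
    operations=[ViableSystem("sales"), ViableSystem("marketing"), development],
)

print(company.depth())  # 3: company -> development -> team
```

The point of the sketch is only that the same shape repeats at every level; a real organization would obviously carry far more than a few strings per subsystem.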

Let’s dive into each of the subsystems of a software engineering organization. For each one, I’ll try and explain the Viable System Model perspective.

Primary Activities (System 1) represent the operational units that directly deliver the organization’s primary purpose. These are autonomous, self-regulating subsystems that interact directly with their local environment and have the authority to make decisions within their domain.

In software engineering teams, System 1 is represented by the development teams. Development teams are the primary value-creating units that build and ship software products. Each team should have clear ownership over specific product areas or services, with the autonomy to make technical decisions, interact directly with users/stakeholders, and adapt their approaches based on feedback from their local environment.

So, what if each team doesn’t have clear ownership or the autonomy to make technical decisions? Well, then you don’t have a viable system.

Information channels (System 2) manage the coordination and communication between System 1 units, preventing them from interfering with each other and ensuring their activities are properly synchronized. They handle conflicts and maintain system-wide coherence without centralized control.

In software engineering terms, I think of System 2 as the communication protocols, shared standards and coordination mechanisms between development teams. This might include tools like pull requests, processes such as shared stand-ups, or technical mechanisms such as APIs between parts of the system.

Again, if you don’t have mechanisms to ensure coordination and communication between your System 1 teams, you weaken the system’s ability to self-regulate.

System 3 is focused on operational management and resource allocation. The role of system 3 is to ensure that each system 1 unit is performing effectively and efficiently. Its job is to monitor performance, allocate resources and intervene when necessary whilst maintaining autonomy of the operational units.

From a software perspective, this is the engineering managers and directors who oversee team performance, allocate budget and headcount, set operational standards and metrics, and ensure teams have the resources they need. This includes sprint planning oversight, performance management, capacity planning, and ensuring engineering practices meet organizational standards.

This is a really interesting challenge and links to the law of requisite variety I mentioned last time. System 3 needs to find the right level of attenuation so that it only gets involved when needed. Setting operational standards lowers the level of variety. Limit it too much and System 1 can’t respond to change; limit it too little and the system can’t be controlled.

System 4 scans the external environment for threats, opportunities, and changes that could affect the organization's future viability. It provides intelligence about trends, competitors, and emerging conditions that require strategic adaptation.

This work is forward-looking leadership that monitors industry trends, emerging technologies, competitive landscape, and changing user needs. This includes maintaining a technology radar, competitive analysis, research and development activities, proof-of-concept projects, and strategic technical planning to ensure the organization remains competitive and technically current.

The lack of an effective System 4 is another angle of looking at the Innovator’s Dilemma. Failing to recognize changes to your environment and adapt accordingly is indicative of poor communication from System 4.

Finally, we get to system 5. This is the organizational identity, formed of values, principles and ultimately purpose. It balances the day-to-day operational concerns of System 3 against the future-focused intelligence from System 4, making policy decisions that maintain organizational coherence and direction.

This is the interplay of policy, ethos and purpose over time. It doesn’t boil down to a single role or person; it’s the organizational identity as shaped by leadership behaviour (in the broadest sense), culture and strategic decisions over time.

Hopefully I’ve managed to give at least a rough sense of how this model maps to software teams. So, what does this give you?

To be honest, I’m still finding out and suspect I will be for a long time, but I’m certainly finding some quotes that allow me to look at things from a different perspective! Here are some examples:

"The purpose of a system is what it does" (POSIWID)

If your development teams consistently miss deadlines, your system's actual purpose is missing deadlines, regardless of stated goals. If your architecture reviews consistently reject proposals, your system's purpose is rejection, not enabling good architecture.

What I take from this is when a system produces a given result each time, it’s a systems problem, not an individual problem!

"Every good regulator of a system must be a model of that system" (Good Regulator Theorem)

If you’re managing the work of a system, then you must understand the work of the system. A monitoring system is only valuable if it can reflect the system it is monitoring. Generic monitoring is not sufficient!

Similarly, an engineering manager must understand the work of the team. Applying generic management techniques is not enough to effectively support and guide them (see here).

"Only variety can destroy variety" (Variety)

Complex software systems require equally complex management approaches; simple hierarchical control cannot handle the variety present in modern software development.

Team autonomy and distributed decision-making are necessary to match the complexity of the problems being solved. Without team autonomy or distributed decision making, System 3 must be as complex as System 1.
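
A small worked example may help here. The sketch below uses the usual counting form of Ashby's law of requisite variety: the number of distinct outcomes a regulator is left with is at least the number of disturbances divided by the number of responses available. The function name and all the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def min_outcome_variety(disturbances: int, responses: int) -> int:
    """Lower bound on distinct outcomes, in the counting form of Ashby's law."""
    if disturbances < 1 or responses < 1:
        raise ValueError("variety counts must be positive")
    # Ceiling division, because outcomes come in whole numbers.
    return -(-disturbances // responses)


# Hypothetical numbers, for illustration only.
situations = 120        # distinct situations the teams can run into
central_responses = 6   # responses available to a single central controller
team_responses = 24     # responses available when teams decide locally

print(min_outcome_variety(situations, central_responses))  # 20
print(min_outcome_variety(situations, team_responses))     # 5
```

With only six responses, at least twenty distinguishable outcomes remain outside the controller's reach. Increasing the variety of responses is the only way to bring that bound down, which is the arithmetic behind "only variety can destroy variety".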

Algedonic Signals

When creating the Viable System Model, Stafford Beer emphasized the importance of "pleasure and pain" signals that indicate system health without requiring detailed analysis. In software engineering contexts:

Build failures, test failures, and production incidents serve as immediate pain signals requiring attention
Successful deployments, positive user feedback, and team satisfaction metrics serve as pleasure signals indicating system health

These signals should flow rapidly through all system levels, allowing quick response to both problems and successes.
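
As a rough sketch of the idea, the Python below sorts incoming events into pain and pleasure signals without any detailed analysis. The event names are hypothetical labels I made up for the examples listed above, and a real pipeline would take them from CI, deployment and feedback tooling rather than a hard-coded set.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, Optional


class Signal(Enum):
    PAIN = "pain"
    PLEASURE = "pleasure"


# Hypothetical event names, based on the examples in the list above.
PAIN_EVENTS = {"build_failure", "test_failure", "production_incident"}
PLEASURE_EVENTS = {"successful_deployment", "positive_user_feedback"}


def classify(event: str) -> Optional[Signal]:
    """Map an event name to a signal, or None if it is neither."""
    if event in PAIN_EVENTS:
        return Signal.PAIN
    if event in PLEASURE_EVENTS:
        return Signal.PLEASURE
    return None


def summarize(events: Iterable[str]) -> Counter:
    """Count pain and pleasure signals, ignoring unclassified events."""
    counts: Counter = Counter()
    for event in events:
        signal = classify(event)
        if signal is not None:
            counts[signal] += 1
    return counts


day = ["build_failure", "successful_deployment", "code_review", "production_incident"]
summary = summarize(day)
print(summary[Signal.PAIN], summary[Signal.PLEASURE])  # 2 1
```

The classification is deliberately crude: the value of an algedonic signal is that anyone at any level can read it at a glance.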

I’m just scratching the surface of understanding the concepts here, but now (thanks to recency bias), I’m seeing viable system models everywhere. When I look at Team Topologies, I now see it as a viable system model: the interactions represent System 2 and the types of teams represent System 1 (with platform teams acting as System 3). Similarly, the Spotify model represents a viable system (and so on).

There’s not really a conclusion here - just me writing down some thoughts to get things clearer in my mind. More reading to do on this subject - any recommendations gratefully received!