Home Bookclub Episodes Software Archite...

Software Architecture for Software Developers

Simon Brown | Gotopia Bookclub Episode • March 2021

You need to be signed in to add a collection

Software architecture concepts will help software developers not only advance their careers but also do a better job in their current work. Simon Brown, the creator of the C4Model talks to Stefan Tilkov about why software architecture is something that every developer should understand, how the C4 Model can help with that and why diagrams are so useful in software development.

Share on:

Copied!

Transcript

Listen to this episode on:

Apple Podcasts | Google Podcasts | Spotify | Overcast | Pocket Casts

If you are not sure how much architectural work needs to be done before you actually start coding check the first episode with Simon Brown and Stefan Tilkov.

The C4Model

Stefan Tilkov: Let's talk a bit about your efforts here. I think there are two things that we should talk about. One is C4 and one is Structurizr. Maybe we can start with the C4 model, and you can go into a bit more detail of what that is and how it differs from UML.

Simon Brown: The C4 model is essentially a formalization of how I've always drawn software architecture diagrams. So in that kind of post-UML phase where the organizations I was working for and with didn't want to use UML and so they've thrown us back to Visio. I had a specific way that I do software architecture diagrams. So if I'm building a software system, I want a box in the middle of my diagram saying, "This is the thing I'm building." I want to list out all the different types of users, the roles, the actors, the personas, and then I want to also show my key system dependencies because, of course, as you well know, anywhere you've got a system dependency, there's some sort of interface and there's always some sort of risk associated with doing that.

That was always really my starting point for drawing out an architecture. Then I wanted to zoom into the system I was building and show deployable, runnable things. Like, if we were building a web application talking to a database, I would literally draw two boxes. Here's the web application and here's the database with an arrow between them. And again, this is really reflecting how I thought about the architecture from a developer's perspective, you know, as someone who ends up coding on those sorts of projects.

When I started teaching people how to do software architecture, I was quite focused on getting people to do the stuff we talked about before. So understanding architectural principles and guidelines and constraints and quality attributes. I had a little case study in my workshop where people would group together. They'd go and do a small round of up-front design for like an hour, and the output was one or more diagrams to describe their solution, essentially. After doing that a bunch of times, I realized that I couldn't understand any of the diagrams and neither could anybody else in the workshop. So now I thought to myself, "Well, I know how I do this. Why don't I figure out how to teach that to other people?" Because I just naively thought that everybody else did the same thing but it turns out they don't.

So that's where the C4 model essentially came from. It was formalized in the latter half of like 2005 up to 2010, something like that. And the C4 model is a hierarchical set of diagrams to describe software architecture. C4 stands for context, containers, components and code. So the context is a system context diagram, and that's basically what I described already. It's a very high-level diagram. It's a single box in the middle representing the system you are building. So maybe like an internet banking system. It's got the different types of users and actors and roles with personas around it with arrows, you know, using the system. And then a set of other boxes representing your system dependencies.

Then you zoom into the system boundary and you show what I'm calling containers, and I'll come back to that in a second because it's a little bit controversial. So I call these things containers, but basically, they are deployable and runnable applications and data stores. If you're building, you know, an angular single-page app that's sending JSON across the internet to a backend Ruby on Rails app which is sticking stuff in a MySQL database, you draw those three boxes and the arrows between them. So again, it's really reflective of the realities.

If you want to, you can zoom into an individual application or data store and show components inside them. What do I mean by components? For me, it's just a grouping of stuff. Usually, you know, a grouping of stuff, nice, well-defined interface. The components often or usually relate to how you are structuring your code from a high level. So if you think about your codebase as a set of components in a layered architecture, then your component diagram essentially represents boxes in layers. If you're doing ports and adapters or hexagonal architectures or package by feature, then that's what your component diagram is essentially showing you. And then if you want to get really into detail, you can zoom in to an individual component to show the code inside it.

So there are a few things to unpack here, I guess. Number one, why did I choose the term container? So, I think I got there before Docker. That doesn't make this right, but I tried to find a term that didn't have many associations, and I obviously failed very badly. So I didn't want to use the word process because that's not what I was trying to show. I didn't want to use the word application because that's not what I wanted to show. At the time, I was big into JHEE, and we talked about server containers and EJB containers. I just liked that container, kinda, metaphor. It's like, "Here's the thing that runs and stuff goes inside it." So that's why I chose the term containers.

And although the C4 model has the number four in the name, I definitely don't recommend doing all four levels. So for most uses and most teams, the top two levels are more than sufficient, and if you want to get more detailed, you can. Then you get into a whole question of should you start automating diagram generation and so on and so forth but again. That's a bit abstract for this discussion. So that's the C4 model in a nutshell. It's a hierarchical set of diagrams to describe software architecture at different levels of abstraction, and those different levels of abstraction allow you to tell different stories to different audiences.

The difference between UMLs and the C4Model

Stefan Tilkov: Okay. So my understanding… and I'll be interested to see what you think of that. My understanding is that the major difference between that and what UML does, for example, is that UML is far more generic, right. It's got all of those things somehow. You can absolutely 100 percent use UML to do the exact same thing. It just doesn't recommend anything, right. It just gives you tons of options, and you have to decide what to do with them. So you could use a component diagram and some sort of package diagram and the packages have sub-packages and you could drill down all the same… in the same way finally ending at classes or whatever it is.

And I've seen you actually recommend combining the C4 model with other diagram types that are in UML like a sequence diagram or a collaboration diagram. So it could totally do that. What I do see as the major thing that people find attractive, that people find useful beyond the UML part, is that you actually made some decisions, not all of them but some decisions to say, "This is the useful level," right? Just not any arbitrary number, it's four. Not any level, it's these. Right?

Simon Brown: Yes

Stefan Tilkov: This may really be a stretch but I wanted to run this analogy by you. It reminded me, weirdly enough, of something Eric Evans did in the DDD book because even before the DDD book… and that obviously was also one of those, "Oh, I'm doing that already while I'm reading this book," for me. He sort of proposed a number of, what in UML I would call stereotypes, right. Like these are the kinds of things beyond just classes, right. We have value objects and we have entities and we have services and we have those things. And he sort of made a very explicit choice to include these, whatever, 10 things.

It's not a real difference from just saying you have stereotypes and you can actually customize the generic UML class diagram or the class concept to mean something more specific but still, it's immensely useful because it gives people more to hang onto. It just gives them a starting point as opposed to the super generic thing that then somebody else would have to customize to actually become useful. Is that a fair analogy or am I totally off?

The difference between UMLs and the C4 model

Simon Brown: No, that's a completely fair analogy. I was just gonna ask you do you remember Peter Coad's "UML Modeling in Color," I think it was called?

Stefan Tilkov: Yes.

Simon Brown: It was the same thing. It's like you have these four different types of classes, and you make them different colors. So what's really interesting is the C4 model does not prescribe a notation. There is no C4 model notation, and you can do whatever you want and you can upload whatever visual semantics to whatever notation you choose. You know, different shapes and colors and all that sort of thing. Something I always do in my workshops is I say to teams, "You could use UML to do these diagrams." You're exactly right. You come up with a set of conventions and rules and maybe you use a use case diagram for system context diagram or maybe you have a component diagram and you apply appropriate stereotyping and that sort of thing.

I do offer this as a thing teams can do, and no one does it. And I still find that surprising. I think you're right. When I introduce the C4 model, I often get a lot of skepticism because… and again, I kind of have to be careful how I phrase things initially because it tends to put a blocker up immediately. If I say, "I'm gonna teach you a diagramming technique and it's a framework and it has a bunch of rules and it'll make drawing diagrams really easy," everyone will instantly switch off. So that's not the approach I can go in with.

If we have the discussion after the workshop and I say, "So, you've gone through this, you've created a bunch of useful diagrams. I can see you really like them. What feedback do you have? What do you think is the selling point of this?" A lot of the people actually say, "Well, you've given us a framework to work in." And it's the same thing. It's like, I've given them something to hand their ideas off, I've taken away a lot of the silly decisions that really don't matter all too much. It's just a way to draw a bunch of pictures. We know what the pictures are, and we can move on and we can get focused on something more important.

The difference between UMLs and the C4 model discussion

Stefan Tilkov: That's awesome. I think that the key point here is there are tons of places where your creativity is way more useful than in the kind of lines or the way you draw a box, right. I mean, that's...

Simon Brown: Like, who cares?

Stefan Tilkov: You have a choice with your architecture diagram notation, but no user is gonna be happier because you decided to use color this way. Just getting it out of the way is a very useful thing there.

Simon Brown: Yes

Should UMLS be part of a CS education?

Stefan Tilkov: Ok. Very good. So, when you think about UMLs, should we still be learning it and teaching it? Should it be part of the CS education?

Simon Brown: I'm going to say yes, actually. I think it's very useful. So again, if you go back to the early UML 1.0, 1.1, which was '98-ish, maybe something around that, sort of, time scale...

Stefan Tilkov: The '90s, yeah.

Simon Brown: Yes. You can actually still find some videos of Grady Booch and the others teaching UML and I think the best version of UML was actually those early versions because they were nice and lightweight.

Stefan Tilkov: They were simple.

Simon Brown: There was a smaller number of diagrams. Right, it was much, much simpler. And I've seen this complaint lots of times that over the years the vendors have all got involved, they've all wanted their specific features for their specific tooling, they're all focusing on UML and defining complete architectures and you magically press a button and all your code pops out. The whole UML thing has become sidetracked by vendors, essentially.

I think if you look at the core of UML, there's a lot of really good stuff in there. I don't like the fact that you have to read 750 pages to understand the entirety of it. The biggest thing I don't like about UML, aside from some tooling which is awful… different story. But the biggest thing I don't like about UML is it doesn't help you. And this is exactly what you were saying earlier. If you fire up a UML tool and you say, "Right, I want to describe an architecture," it gives you no assistance. Furthermore, there are no good examples or case studies about if you have a modern Mac services-based application, think about using these sorts of UML diagrams in this sort of way to describe what it is you're actually building and make it reflect reality and reflect the code and all that sort of stuff.

Should UMLS be part of a CS education? discussion

Those are the main problems with UML but having said all that, I still think it's a useful thing to teach. The C4 model, it doesn't cover state diagrams, it doesn't cover activity diagrams, it doesn't cover business process diagrams or entity-relationship diagrams. So even if you use C4 model diagrams I still think there should be supplemented where necessary with a whole bunch of other stuff from UML, from ArchiMate, from, you know, good old-fashioned entity-relationship diagrams. So I definitely think we should still be teaching it, but maybe with the emphasis that it's a tool in a toolbox. There's some useful stuff and there's some maybe less useful stuff.

Stefan Tilkov: It's almost a little annoying how much we agree, right. Maybe we have to talk about microservices at some point to find something we disagree with.

Simon Brown: The most boring discussion is when people agree.

Stefan Tilkov: No, I totally agree with everything you said. I think it's useful, I think it's way too big and way too complicated. I think the mere fact that at least with a lot of people, you can draw a diagram in UML notation, and a significant number of people will understand the semantics of what you just drew. I mean, that's an extremely useful thing, right.

Simon Brown: In theory.

Stefan Tilkov: It's kind of obvious, right? It's having a common language for those things and there are no points to be gained by, having a different one than all the others because your class diagram is super special or your entity relationship diagram, which is typically shown as a class diagram these days so...

Simon Brown: Yes

Stefan Tilkov: Yes.

Simon Brown: Just to actually pick up on something you said there, one of my recommendations is even if people are using UML, they should still stick a diagram key, a legend, to explain their notation because not everybody knows it. That's one of the biggest things. Once they see that complicated set of boxes and lines they've never seen before, they're like, "That's too much for me."

Stefan Tilkov: Yes, very good point. I also think that even within those diagram types, there are certain things that you might not want to use for certain audiences, right, like using inheritance in a diagram intended to communicate with a businessperson is not gonna help you in any way, right. Maybe it's a refinement if you wanna do it at some later point in time but it's not gonna ease things at that stage. Neither are UML stereotypes or, you know, the types of attributes or whatever.

How do you keep diagrams in sync with your code?

Stefan Tilkov: So, one of the things that comes up I think quite often is the question of how do you keep documentation in general and diagrams just specifically in sync with the rest, with the actual code because the code is obviously what matters and there's nobody… nobody gains anything if you have a diagram that's out of sync with actual reality. What are your strategies there?

Simon Brown: You've got two basic options and, in many cases, option one is the most simple. And unfortunately, at the moment, the most effective. So option one is just update it. So if you've got a definition for your tasks or your work items, you add a line at the bottom and it says, "Have you updated the diagrams and documentation given this, hopefully, small change that you've made to the code base?" And that's often the easiest way. It's just a process thing and it's done and we can forget about it.

Option two is much more complicated and that's really, "Well, let's autogenerate the diagrams and the documentation of the thing we're actually building." And that thing we're building could be the code itself. We could auto-generate diagrams off the code or we could perhaps auto-generate diagrams of built scripts or infrastructure provisioning scripts or maybe the live infrastructure or maybe things like distributed logging and tracing. So you've got a bunch of different inputs you can use to potentially automate and auto-generated diagrams. However...and this is where things start to get a little bit complicated. And I'm sure you've seen this yourself. If you open up an ID with like a million lines of code and you ask it to draw you a picture, what do you get? Like, just an utter mess.

Stefan Tilkov: Things like crossing lines and...

How do you keep diagrams in sync with your code? discussion

Simon Brown: Yes. It is not because your code's a mess, hopefully. It's because it's trying to show you too much. It's trying to show you exactly what the code looks like. It's reflecting that low level of detail. That's not useful to us as humans. We need to chunk it up and zoom out, which is where the C4 model comes into play again. You often get the same thing if you saw autogenerating for built scripts and running infrastructure as well because if you've got like 100 versions of a microservice running, do you want to show 100 things, or do you want to show 1 thing with some variability somewhere else?

So, the whole autogeneration thing is a bit complicated and it's a bit tricky at the moment. Also, a typical codebase doesn't tell you the whole picture. So if you think about the system context diagram I briefly outlined earlier, you've got a box in the middle representing the system, different users, different system dependencies. It's really hard to generate that high-level diagram just from using the code as an input, as a source. So that's some of the real-world challenges we face here. There's not enough metadata in most code bases to generate high-level pictures and if you generate the low-level pictures from codebases, you often get too much and there's often no easy way to get that happy medium. So that's why option one is often the easiest, unfortunately.

Stefan Tilkov: Yes. I'm old enough, as you probably are, to remember things like Together, it started out Together, C++, Together, Java, whatever. The Together case tool that simply you can't really say reverse engineered… simply showed you your code as a UML.

Simon Brown: It was a roundtrip, wasn't it?

Stefan Tilkov: Just complete roundtrip engineering where you could actually edit the code within the diagram. You could switch… you could seamlessly switch between the IDE because, essentially, the case was your IDE. I never found that an appealing thing because it didn't give me the abstraction that I want from a model, right. I want the model to focus on one aspect. Like, for example, the structure that you mentioned. If I wanted to focus on all of the details, I would look at the code and not at a high-level diagram, so...

Simon Brown: Yeah, me too.

Stefan Tilkov: Another, I think, extremely important point that you made that I also want to highlight, which is the one that the code, while being super important, is not everything, especially not in modern architectures, right. Maybe, I highly doubt it even then, but maybe back then like 20, 30 years ago when you wrote a program that essentially was a standalone thing that you ran on the command line and then produced whatever, a file, took a file as an input

, produced another file, then maybe the whole truth was within the code of that particular program.

But these days, everything is a complicated collection of independent, deployable units or containers in your terminology if remembered correctly. So all of those things communicate and the communication paths are sometimes only discoverable at runtime. They're not even in any configuration file or in anything that resembles code, right. It's really something you could only discover if you could look into the minds of the architects or actually… or the coders, developers, or actually observe or surveil the system while it runs, so...

Simon Brown: Yes.

How can you visualize software diagrams and documentation?

Stefan Tilkov: That is definitely a very complicated thing. Have you, by the way, found a solution to I think a related problem, even if you can figure out, like, say, dependencies between artifacts of some kind between deployable, for example, once you visualize them using something like GraphThis or any sort of thing, I think while the quality has gotten better in recent years, it's never as perfect as something that you would have drawn yourself, right. If you can actually realign the boxes and move this line a little bit there and this box a bit over here then it actually looks better. The problem, of course, is it's lost once you regenerate the whole thing where actually you have to do that over and over again. Has anybody yet come up with a solution that allows you to, you know, apply the visual changes like a DIF to the auto-generated graphic? Am I making any sense? Do you have any idea what I mean?

Simon Brown: Yes. So I must say I'm not a huge fan of the auto-layout algorithms like you get in GraphThis. I always find they put things in the wrong places and when I'm telling a story if I'm doing a presentation or something, I want to point things out or I want to group things visually because they kind of belongs together. They might be the same type but from a kind of nodes and edges graph perspective, you know, you've got your typical all dependencies to flow downwards, with things like GraphThis and PlantUML and they give you this weird layout.

Something I did a while ago was I started quoting some tooling called Structurizr, and I wanted to build something that would let people draw diagrams easily and, again, would take some control away from them. Does that make sense? No. To take some freedom away from them in terms of the notation. So, kinda, a fixed notation. But I still wanted the ability to move boxes around myself. So as part of the Structurizr tooling, you can upload a new definition of your model and the views in the model and the Structurizr tooling will attempt… and it's not perfect by any means but it will attempt to retain the existing layout. So I'm trying to do a blended approach there.

Stefan Tilkov: That's basically what I was asking for. So that sounds very interesting. So can you tell us a little bit more about the Structurizr? Just maybe a brief intro?

Simon Brown: So when I was doing my workshops on the C4 model, up until about five years ago, people would always ask me, "What tooling do you recommend?" My answer was always, "Just use Visio." You have no idea how irritating and frustrating that is. So I figured, "Well, I should try and do something. If I'm promoting this way of drawing diagrams, I should try and build some tooling myself." I sat down and I started to put together… and this sounds horrible, and it was. I started to put together an HTML web-based modeling tool specifically targeted around the C4 model, and it was horrible. It was the worst UX you've experienced in your life.

But the interesting thing about that was the stuff behind it, the actual framework and code I'd written, was a really nice way to succinctly define a model and different views onto that model, and that was essentially a set of Java classes. There were a set of Java classes sitting behind this web application representing people and software systems and containers and components.

I figured out, well, let's ditch the UI. Let's only have the UI for drawing the diagrams and let's allow people to describe their architecture using code. So you basically create a bunch of objects in memory, wind them together with a nice little API and that would create you, essentially, a directed graph. Once you have that directed graph, you can export it to different formats and visualize it in different ways. And that's essentially where the Structurizr tooling came from.

How to visualize software diagrams and documentation? discussion

So that was about five years ago. There's a whole bunch of different ways you can use the tooling and there's a community of different tools that have, kind of, popped up around it. So my original version of tooling was you write some Java code, you get some diagrams. Now there are libraries for all sorts of different languages that people have written and they're open-sourced. When we were locked down last April, I put together… so if you're familiar with something like PlantUML where you write text and get diagrams, I created something I call the Structurizr DSL, domain-specific language, and it allows you to define a model and a bunch of views as a single DSL file.

Also, whilst we were in lockdown and I wasn't doing very much traveling, I came up with a bunch of open-source tooling and there's something called the Structurizr CLI. So what you can do is you can define a C4 model, so to describe your architecture. And you can describe a set of views in that single DSL file. Then using the CLI, you can export it to my Structurizr tooling, which is available at structurizer.com, or you can export the views in that DSL file to PlantUML or to Mermaid or to WebSequenceDiagrams. So what it's become now is a way to describe an architecture using the hierarchical diagrams and the hierarchical viewpoints and stuff but with independence, in the way, you actually visualize that model. So that's kinda what it is today.

Resources for developers getting into architectural work

Stefan Tilkov: Very cool. Okay. So before we wrap up, one last question I have is obviously, we're going to point people to your books and to Structurizr, to your website, which has lots of resources. Are there any other resources that you think are useful for developers getting into architectural work?

Simon Brown: Developers getting into architecture work? There's a bunch of really good books that have come out recently. So Neil Ford and Mark Richards have a book. And the name completely escapes me.

Stefan Tilkov: I think it's "Fundamentals of Software Architecture."

Simon Brown: Yes, "Fundamentals of Software Architecture," correct, yeah. I can visualize it with the picture on the front. So yeah. That's a very good book for people looking to get into architecture. Greg Hope has got a book out called "The Software Architect Elevator," and that's a really interesting book because it talks about the software architecture role not just from a technical perspective but in terms of what that role means to the entire organization and it focuses a bit more on soft skills and presentation skills and influence and that sort of thing. So that's another really good resource I'd point people to. Plus, you've got Michael Keeling's "Design It." You've got George Fairbanks' "Just Enough Software Architecture," Eoin Woods, Nick Rozanski's book, "Software Systems Architecture." There's lots of really good stuff out there, kind of, focused now on software developers.

Resources for developers getting into architectural work - book examples

Stefan Tilkov: Excellent. Well, I'll definitely put them all in the show notes and then our listeners can definitely check them out. Ok. So, thank you very much for all your time, Simon. It was an awesome conversation. I really, really enjoyed it. Thanks for being with me, thanks for being with us, and enjoy the rest of your day.

Simon Brown: You too. Thank you.

If you are not sure how much architectural work needs to be done before you actually start coding check the first episode with Simon Brown and Stefan Tilkov.

About the speakers

Simon is an independent software development consultant specializing in software architecture; specifically technical leadership, communication, and lightweight, pragmatic approaches to software architecture. Simon is the author of "Software Architecture for Developers", a developer-friendly guide to software architecture, technical leadership, the balance with agility and communicating software architecture with sketches, diagrams, and models. He is also the creator of the C4 model and the founder of Structurizr, a collection of tooling to help teams visualise, document and explore their software architecture.

Stefan is a co-founder and principal consultant at INNOQ, a technology consulting company with offices in Germany and Switzerland. He has been involved in the design of large-scale, distributed systems for more than two decades, using a variety of technologies and tools. He has authored numerous articles and a book (“REST und HTTP”, German), and is a frequent speaker at conferences around the world