Expert Talk: DevOps & Software Architecture

As software architecture continues to evolve rapidly, we are constantly confronted with new challenges. Simon Brown, Dave Farley and Hannes Lowette cover some of the recent trends in software architecture touching on terms such as DevOps and how to deal with complexity. They also reference concepts that have stirred debates forever and are still not done right, like bounded context and continuous delivery.

Intro

Hannes Lowette: Hi, we're here at GOTO Copenhagen today. My name is Hannes, and I'm here with Dave and Simon. Maybe you guys can introduce yourselves for a little bit?

Simon Brown: I'm Simon Brown. I'm an independent consultant specializing in software architecture. I like to talk about software architecture and how software architecture fits into the modern agile ways we work, especially diagramming. That's one of my big things.

Dave Farley: Hi, I'm Dave Farley. I'm also interested in software architecture, and software development in general. My big thing at the moment is thinking in terms of software engineering. What does it take to build better software? But some part of that is certainly software architecture.

Evolution in software architecture

Hannes Lowette: All right. Maybe a quick one to start with. What do you think the most positive evolutions have been in the last maybe decade in terms of software architecture? What has enabled you to do better things? What are the big parts that have changed?

Simon Brown: I'm gonna say DevOps and pass it over to you immediately because I think that's a huge one for me.

Dave Farley: Well, I think without being immodest, I'm connected with continuous delivery and DevOps to some extent. I think that is one of the things. In part because of the stuff that I touched on in my introduction, which is that I think it's real engineering for software. And if it were really engineering it would allow us to build better software faster, and it does. That's what the data says. So we can do more sophisticated things, I think, using those kinds of approaches, those kinds of tools, that kind of thinking. So I would agree with that, although it sounds rather self-serving coming from me.

Simon Brown: No, I'd definitely echo the same thing. If you look back 10 or 20 years, teams were really struggling just to get software into production and there were lots of manual steps. Continuous integration, continuous delivery, and all the DevOps stuff have optimized and automated a whole ton of that. I think that's one of the biggest things you can point to over the past couple of decades that's really allowed teams to move faster. And with more of an engineering discipline, you're actually doing things that are a little bit more formalized and structured than they perhaps were at the start of the 2000s.

Dave Farley: I think that's true. In part, the way that I think about this is that, no doubt, agile was a step forward. It also allowed us to make some missteps, but I think bringing it in...

Hannes Lowette: Are you talking about SAFe?

Dave Farley: Not only SAFe, but certainly SAFe is a culprit. But I think some of the missteps, I think Simon would agree with this, is that it made everybody think that you gotta throw your brain out and just start from scratch with everything. It doesn't stop us designing. It shouldn't stop us designing. It shouldn't stop us thinking deeply about the systems that we're gonna build.

Hannes Lowette: So you see lots of developers that think, "We don't need an upfront design or architecture anymore. Because we're working agile we can push features out."

Dave Farley: There's a very big difference between the failure of big, upfront design, which agile countered, and design. To my impression, agile is about doing lots more design and taking design more seriously, not less. I don't think you build a complex system in one giant leap springing from your brain. It's an iterative, incremental approach to evolving complexity and systems over time. And part of DevOps and continuous delivery is to allow for that evolution safely and in a controlled manner.

Hannes Lowette: Yeah. Is supporting this iterative approach, the quicker delivery of new stuff to production what good architecture does, or is there more to it than that for you?

Simon Brown: Essentially, that's exactly what good architecture does. People say that the Agile Manifesto doesn't talk about design, and therefore, you should not do upfront design, to kind of echo the same thoughts. And I've seen teams go from big, upfront design to basically nothing, and I've realized that's now also a bad idea. And in order to move fast, in order to embrace change and deliver stuff quickly and use all of the DevOps tools and CI/CD tools to move faster and deliver stuff properly in a structured, more engineering-based way, you need a good design.

One of the principles behind the Agile Manifesto actually says, "Continuous attention to technical excellence and good design enhances agility." You don't get a good design for free just by hacking code. You have to put some thought into it. And although I completely agree that we need to think about stuff in an evolutionary way because we're gonna get changes and we need to pivot and change direction, I think you still need a starting point, not all of the design, but a starting point with some principles in place. That allows you to create that good structure and higher degrees of modularity so you can move fast. So yeah, it's a blended approach for me.

Hannes Lowette: I feel that you're saying that your design should evolve with your product and with the things that you learn from pushing stuff out. But you mentioned you want to get some stuff in place in the beginning already.

Dave Farley: Hold on, he didn't quite say that. Forgive me if I'm putting words in your mouth. What he said is that you start off with a model, with an idea of what your design might be. I would add that, as an engineering principle, you start off with a model like that and assume that it's wrong. That's the step to engineering for me. So you assume that it's wrong and then you work in a way so that when you find out where it's wrong you can correct it.

Simon Brown: Yeah, right. And that's very different to big design up front, because when people did big design up front all those years ago they assumed they were right, and they assumed that the blueprint they came up with was the thing they should always aim for. So I think we're saying, have a starting point and be prepared for that to change, and of course, DevOps and CI and CD give us the tools to make those changes easier if you have good architecture in the first place.

Managing complexity

Hannes Lowette: Okay, so what is the stuff that you would focus on first? Say you know where you wanna go in the long term and what kind of architecture and design you would need to support the final product, but you're not gonna build all of it at once, right? What are the non-negotiables? What's the stuff that you always need, even if you start out with your first version that you're pushing out?

Dave Farley: Do you mind if I take that first? Because I think I can set you up for fleshing more detail.

Simon Brown: Yeah, that's fine. Go for it.

Dave Farley: So from an engineering point of view, the things that I would describe are all about managing complexity. I would start to try and identify ways of compartmentalizing the system so that I'm able to understand the pieces and change them without affecting other parts of the system. I would say that's a deeply profound and important aspect of architecture and design. And then, if you're able to do that, if you're able to build systems that are more modular, more cohesive, with good separation of concerns, good lines of abstraction, and tending towards being more loosely coupled between those pieces, that's the kind of defense that allows you to find out you screwed up and made a mistake and change things, and to make the code a habitable space that you can change. And I think Simon's stuff, as I understand it, takes that and gives you tools that allow you to achieve those kinds of ends.

Simon Brown: Yeah, I was gonna say, it's literally the same thing. So Grady Booch has a great definition of software architecture where he says, "Software architecture is about the significant decisions." All of that stuff is significant decisions. It's your overriding modularity strategy, whether you're building a monolith or some microservices, something in between. And again, it's, how do we make this thing so that we can change it in the future without having this whole blast radius effect that you might change it and everything blows up?

Dave Farley: And I'd argue to some degree that architecture is nearly all about that management of complexity. It allows us to build systems that are beyond the scale that we can hold in our heads.

Hannes Lowette: Or at least a part of the system that you can hold in your head, right?

Dave Farley: Yeah, you compartmentalize it so that each piece fits in your head.

Hannes Lowette: Yeah, okay. We've seen a lot of the practices from the big players move into the common domain. If we look at technologies like containers and orchestration, and more than ever the ways that we can do pipelines, they have been commoditized. You can do that on so many platforms now. With great power also comes great responsibility. Do you think that people are hurting themselves with these technologies as well?

Simon Brown: I do.

Dave Farley: Yeah, I do, too. And there's an elephant in the room, so let's name the elephant. I think that people get microservices wrong all of the time. Like Simon, I'm an independent consultant, and most of the teams that I see that claim to be adopting microservices aren't, by the definition of microservices, doing so. If they're starting something new they start by assuming that they understand what the services are, creating a separate repo for each, and then start working on each of those things.

What they've just done is build in latency at the point at which they want to iterate quickly in order to be able to learn. So the other aspect of engineering is to optimize for learning. So you want really fast, clear feedback. If my service is in one repository and Simon's is in another, every time the conversation between those services changes to the smallest degree...

Hannes Lowette: You're changing them in two places.

Dave Farley: ...either I've gotta go and dip into his, or he's gotta dip into mine, and it's a nightmare. If we put them all in one big repo we can still have nice, service-oriented designs. But probably 90% of the time, my IDE will tell me that I've screwed up his service.

Hannes Lowette: Can you take that a step further? When you're starting out, the team is still small, the company is still small, can you not just put it all in one repo but also host it all in one process, or do you think that's a horrible idea?

Simon Brown: No, that would be my recommended starting point for 95% plus of teams out there. I did a talk at a GOTO conference a few years ago called "Modular Monolith," same thing.

Recommended talk: Modular Monoliths • Simon Brown • GOTO 2018

Hannes Lowette: I have a very similar talk.

Simon Brown: I think there are a few people now with talks. They're finally becoming fashionable again. I was seeing the same thing, a whole bunch of people have got this, like, 10, 15-year-old Java legacy application. It's a horrible mess, it's brittle, they can't change it. And they say, "We're gonna convert to microservices." And what they do is they take their existing design thinking, their approach to modularity, which is not very good because that's what governs the mess, and they're basically sticking JSON over HTTPS network links between things in their monolith.

Hannes Lowette: That could have been in-process calls.

Simon Brown: Yeah, right. And now the boundaries are wrong. The boundaries are hard to change and you've got something which is lock-step deployable, brittle, fragile, and slow. And they just don't get that there's a very different mindset shift there.

Hannes Lowette: Don't get me wrong at a certain scale you're gonna want separate services.

Simon Brown: Maybe. I mean, Facebook, Shopify, there are some big modular monoliths out there. Shopify has got a huge, big thing on their engineering blog over the past few years about how they've changed their Ruby on Rails monolith to become much more modular because they were running into issues.

Dave Farley: I'd argue that modularity is always good but you don't necessarily need inter-process communications all over the place, and often you don't need multi-threading in lots of places where people put it. Both of those things amp up the complexity by an order of magnitude at least.

Hannes Lowette: Not just the complexity, also how hard it will be to debug, how hard it will be to trace.

Simon Brown: And even just deploy.

Dave Farley: Yeah.

Hannes Lowette: Just to figure out, what is my software doing in production, that becomes extremely hard. Is that something that you take on from the get-go, like, visibility of your systems? To me, that always felt like one of the most important issues that a lot of people seem to be forgetting.

Simon Brown: This is why some big organizations who are very microservice focused give their teams autonomy but they have internal engineering and platform teams that bootstrap the product teams and service teams. So literally you can pull something out of their internal repo, bootstrap your service, and you get observability and monitoring and deployability for free into the production GCP environment. And all of that stuff is taken care of in a standardized way, and that's fabulous.

Dave Farley: The other thing that microservices give you if you do it well and at scale is it's the most scalable way of building big systems. Because you trade off consistency for independence. So this is the most distributed approach to development, but it means that if I'm writing a microservice and Simon's writing a microservice, I can deploy mine without testing it against his. That's how good the abstraction is between them, and that's kind of table stakes. You can't really count it as microservices if you can't do that, because that's the decoupling step. That’s the point at which we no longer care about the details. That means the protocol has gotta be stable between it. So you've got to be fairly sophisticated in design terms to be able to get to those stable protocols.
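
One way to picture the stable protocol Dave describes is a consumer that states which message fields it depends on and ignores everything else, so the provider can evolve without a joint test run. The sketch below is only an illustration under assumptions: the `ORDER_PLACED_CONTRACT` message shape and both function names are hypothetical and do not come from the conversation.

```python
# Hypothetical contract: the fields (and types) the consumer relies on.
# The provider is free to add more fields without breaking the consumer.
ORDER_PLACED_CONTRACT = {"order_id": str, "total_cents": int}


def satisfies_contract(message: dict, contract: dict) -> bool:
    """Return True if every contracted field is present with the expected type."""
    return all(
        name in message and isinstance(message[name], expected)
        for name, expected in contract.items()
    )


def handle_order_placed(message: dict) -> str:
    """Consumer logic reads only the contracted fields and ignores the rest."""
    if not satisfies_contract(message, ORDER_PLACED_CONTRACT):
        raise ValueError("message breaks the agreed protocol")
    return f"order {message['order_id']} for {message['total_cents']} cents"


if __name__ == "__main__":
    v1 = {"order_id": "o1", "total_cents": 500}
    v2 = {**v1, "currency": "EUR"}  # provider added a field; consumer unaffected
    print(handle_order_placed(v1))
    print(handle_order_placed(v2))
```

If the provider renames or removes a contracted field, the check fails on the provider's side of the boundary, which is the kind of feedback that makes deploying one service without testing it against the other plausible.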

Hannes Lowette: But that requires some really competent architects because that requires both the business knowledge and technical knowledge to define those boundaries in the right places. Because otherwise, it will be working against you, am I right?

Dave Farley: Yes.

Simon Brown: And that's why most teams should not do this because it's hard.

Dave Farley: Yes.

Hannes Lowette: Exactly. Are there any tips that you can give people who are starting out in this?

Simon Brown: I'd go talk to Sam Newman. He's got a ton of tips out there, a couple of books.

Recommended talk: When To Use Microservices (And When Not To!) • Sam Newman & Martin Fowler • GOTO 2020

Bounded context

Dave Farley: I'd talk about a few things. As well as independently deployable, the other defining characteristic of microservices is that they're aligned with the Bounded context, and there's a good reason for that. So Bounded context is an idea from Domain-Driven Design, in which there's an area of the problem in which a concept is distinct. You might understand, let's imagine we're buying books. The shopping cart is gonna have the notion of a book and the inventory control is gonna have a notion of the delivery.

Hannes Lowette: Yeah, but it's a different book for everyone.

Dave Farley: It's different.

Hannes Lowette: It has different traits.

Dave Farley: The delivery probably needs to know the weight of it. The shopping cart probably wants to have the picture of it or the detail of the author or something. They're different but they're talking about the same thing. So that's the difference between two different Bounded contexts. If you align your services with a Bounded context, those are naturally more decoupled points in the architecture of the system. It's not 100% but it's a good starting point. You're likely to get away with it more if each of these services is aligned with these Bounded contexts rather than not, I would argue. And then you need to still go back to what we were saying earlier. You need to iterate fast to find out all the ways where you screwed up and you got it wrong and you got your communications too tightly coupled until it's stable, and then you can break them out.
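
The book example can be sketched in code. This is a minimal illustration, assuming two hypothetical contexts that share nothing but an identifier; the class names, fields, and the shipping rate are made up for the sketch.

```python
from dataclasses import dataclass


# Shopping-cart context: a "book" is something to display and price.
@dataclass(frozen=True)
class CartBook:
    isbn: str
    title: str
    author: str
    cover_image_url: str
    price_cents: int


# Delivery context: a "book" is a parcel to be shipped.
@dataclass(frozen=True)
class DeliveryBook:
    isbn: str
    weight_grams: int


def cart_line(book: CartBook) -> str:
    """Cart logic depends only on display and price data."""
    return f"{book.title} by {book.author}: {book.price_cents / 100:.2f}"


def shipping_cost_cents(book: DeliveryBook, cents_per_100g: int = 50) -> int:
    """Delivery logic depends only on weight, never on title or cover art."""
    steps = -(-book.weight_grams // 100)  # round up to the next 100 g step
    return steps * cents_per_100g


if __name__ == "__main__":
    isbn = "978-0-00-000000-2"  # made-up identifier shared by both contexts
    cover = "https://example.invalid/cover.png"
    print(cart_line(CartBook(isbn, "Example Book", "A. Author", cover, 2999)))
    print(shipping_cost_cents(DeliveryBook(isbn, 430)))
```

Each context can change its own notion of a book, for example adding a cover variant to the cart or parcel dimensions to delivery, without touching the other. Only the shared identifier crosses the boundary.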

Hannes Lowette: Is that something you look for in your architecture as well, because you brought up DDD, this alignment between what the business domain is and what you actually see represented in code?

Simon Brown: Sometimes. So I'm the creator of the C4 model, which is a hierarchical way to draw architecture diagrams, and one of the questions I always get is, is there a one-to-one mapping between things like a Bounded context and things on the software architecture diagram? And sometimes there is. Sometimes those concepts, as you say, do map really nicely, and other times you'll find a Bounded context spans multiple software systems, or perhaps multiple C4 model containers, like different services.

Recommended talk: Why Architectural Work Comes Before Coding Part 1/2 • Simon Brown & Stefan Tilkov • GOTO 2021

So that whole mapping between the domain world and the technical aspects of the architecture world often doesn't line up, and I think that's okay. But I think people perhaps need to appreciate that a clean mapping is not always the case, and there are too many people saying, "I must have a complete one-to-one mapping between stuff in DDD and stuff in my architecture."

Hannes Lowette: Well, it's a nice utopian image but I've never seen it in reality.

Simon Brown: It is. No, me neither.

Dave Farley: I think there's another aspect to this. I think a lot in terms of the separation of accidental and essential complexity in architecture, and in my ideal architecture, I'm going to have little bubbles of domain logic that know nothing about the accidental complexity of the system. And then I'm gonna build the accidental complexity, persistence, clustering, durability of messaging, all those sorts of things, into the infrastructure that supports those services, and therefore, they're isolated from it. If you can get to that, that gives you a fantastic degree of flexibility and freedom because you're simulating the problem domain in your domain logic.

I was involved in writing something called "The Reactive Manifesto" describing reactive systems, and I was involved in building a financial exchange on the model I just described. And it was the most beautiful big system to work on that I've ever seen. Because each of these little bubbles of domain logic was single-threaded, they were dead simple. They were stateful. You didn't have to worry about anything, and everything else, persistence, clustering, scalability, resilience, all of those things, was outside.

Hannes Lowette: Were they out of the codebase?

Dave Farley: They were not out of the codebase, but out of the domain part of the codebase.

Hannes Lowette: Oh, the domain part. Okay. One of the things I've been playing with is the Distributed Application Runtime, Dapr. I don't know if you've looked at that?

Dave Farley: I haven't looked at that. No.

Simon Brown: I haven't, no.

Dave Farley: The closest commercially available stuff to it that I've seen are actor-based systems of one form or another, Erlang, Akka, Orleans, those kinds of things.

Hannes Lowette: I just did a talk about Akka.

Dave Farley: Yeah.

Hannes Lowette: The cool thing they do in Dapr is that you write your services against a package that has a couple of abstractions. You have abstracted key-value storage, you have abstracted messaging based on logical endpoints. All of it is abstracted in your code, and you can run the services on your development machine. It works on Kubernetes with sidecar containers, so you always talk to the sidecar container and that talks to the other services through gRPC.

Dave Farley: I think what I'm describing is a step further in abstraction than that.

Hannes Lowette: Even further than that?

Dave Farley: Yeah.

Hannes Lowette: Okay.

Dave Farley: So for example, I could have an account service that took a message that said, "Create an account," and it would create an account in the internal state of the actor. And it would send out a message maybe saying, "Account created," or something like that. No storage, no persistence anywhere else, no notion of a database. But the persistence was in the messaging, so I could just record the messages and then replay them to get back to the same state.
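
Dave's account example can be sketched roughly as follows. This is not the exchange's actual design, only a minimal illustration under assumptions: the message classes, `AccountService`, and `RecordingInfrastructure` are hypothetical names, and the journal here is an in-memory list standing in for durable messaging.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CreateAccount:
    account_id: str
    owner: str


@dataclass(frozen=True)
class AccountCreated:
    account_id: str


@dataclass
class AccountService:
    """Single-threaded domain logic: state lives in memory, no database."""

    accounts: dict[str, str] = field(default_factory=dict)

    def handle(self, message: CreateAccount) -> AccountCreated:
        if message.account_id in self.accounts:
            raise ValueError(f"account {message.account_id} already exists")
        self.accounts[message.account_id] = message.owner
        return AccountCreated(message.account_id)


class RecordingInfrastructure:
    """Stand-in for the infrastructure: records each handled message."""

    def __init__(self) -> None:
        self.journal: list[CreateAccount] = []

    def deliver(self, service: AccountService, message: CreateAccount) -> AccountCreated:
        reply = service.handle(message)
        self.journal.append(message)  # recorded only once handling succeeded
        return reply

    def replay(self) -> AccountService:
        """Rebuild state by feeding the journal to a fresh service instance."""
        service = AccountService()
        for message in self.journal:
            service.handle(message)
        return service


if __name__ == "__main__":
    infra = RecordingInfrastructure()
    live = AccountService()
    print(infra.deliver(live, CreateAccount("a1", "Ada")))
    print(infra.replay().accounts == live.accounts)
```

The domain class never mentions storage. Recovering state after a restart is the infrastructure's job and amounts to replaying the recorded messages in order.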

Hannes Lowette: Okay, so thinking about things in an event-sourced way, and the state as a result of events.

Dave Farley: Yeah, that's part of the persistence of state. It's part of the accidental complexity. It's not the only architecture, of course. That'd be a ridiculous thing to say, but it's a very nice one. I've just made a video on my YouTube Channel about this, but I think it's an interesting idea in terms of further abstracting the cloud. There are moves towards stateful serverless. They're not doing it very well yet, in my impression, but if you could imagine something like that, you could imagine offloading a lot of the accidental complexity to the cloud services, sort of further raising the bar. Because not everybody is an expert architect, and not everybody needs to think about all of those things at the level of detail that we do at the moment, potentially, I think.

Simon Brown: But perhaps they should do. I'm gonna rein this conversation back in a second because although I'm a big fan of abstraction, and particularly abstractions leading to high modularity, do you remember Enterprise JavaBeans all those years ago? So we had remotable Enterprise JavaBeans, and then somebody said, "Well, we could put that same thing locally and you could use the same interface, and you could have a local and a remote call." And that abstraction was fantastic, but many people tended to forget that there could actually be a network call here at runtime, which would have a massive performance impact.

So I think abstractions are great but development teams do need to understand what those abstractions are, and what the tradeoffs of those abstractions are. And to come full circle, back to the earlier conversation, that's exactly what your architecture is about up front. It's like, where do we want these network calls to be, and what are the tradeoffs?
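
The tradeoff Simon describes can be illustrated with one interface and two implementations. This is only a sketch under assumptions: the `Inventory` interface and both classes are hypothetical, and the remote variant merely counts calls where a real client would make a network round trip.

```python
from typing import Protocol


class Inventory(Protocol):
    def stock_level(self, sku: str) -> int: ...


class LocalInventory:
    """In-process implementation: a dictionary lookup."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = stock

    def stock_level(self, sku: str) -> int:
        return self._stock.get(sku, 0)


class RemoteInventory:
    """Same interface, but every call stands for a network round trip."""

    def __init__(self, backend: Inventory) -> None:
        self._backend = backend
        self.round_trips = 0

    def stock_level(self, sku: str) -> int:
        # A real client would serialize, send, and wait for a reply here.
        self.round_trips += 1
        return self._backend.stock_level(sku)


def total_stock(inventory: Inventory, skus: list[str]) -> int:
    """Caller code is identical for both implementations."""
    return sum(inventory.stock_level(sku) for sku in skus)


if __name__ == "__main__":
    local = LocalInventory({"a": 1, "b": 2})
    remote = RemoteInventory(local)
    print(total_stock(local, ["a", "b", "c"]))
    print(total_stock(remote, ["a", "b", "c"]), remote.round_trips)
```

Nothing in `total_stock` reveals that the loop costs one round trip per item when it is handed the remote implementation. That hidden cost is the kind of decision that needs to be made deliberately rather than discovered in production.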

Dave Farley: Absolutely. I wouldn't disagree with that at all.

Hannes Lowette: I would've done the same thing but in WCF and not Enterprise Java.

Simon Brown: WCF, same thing.

Hannes Lowette: It's the same story. I'm a C# kind of guy. From the recent projects that you've done, what are some takeaways that we could get, above all of the "make sure you don't do this" kind? What is the single move that you see a lot that ends up hurting people, apart from microservices?

Let’s talk about design more often

Simon Brown: From my perspective it's people just jumping on the first solution they come to and not really considering the tradeoffs. So that's why I'm a big fan of doing some design up front and also having a good, simple way to visualize your potential starting point in that potential architecture. Because then you can evaluate it, dry run it, review it without writing lots of code. So yeah, just make sure you have a clearer idea of your starting point before writing tons of code.

Dave Farley: I think I've got two. The first one is, I think, echoing what you've just said, Simon. My perception of most software systems that I see is not that they're badly designed, it's that there is no design. They're just, kind of, big balls of stuff. There's no obvious organizing principles that you can discern.

Hannes Lowette: File, New Project, and let's start hacking away.

Dave Farley: Yeah, and that's problematic. There's no management of complexity. You've just got a big ball of mud. And so I think as an industry we don't talk about design enough. We don't do enough design. We don't think about it. We argue all the time about whether it's C#, or Java, or whatever else. That doesn't matter as much as good design versus no design, or bad design.

Hannes Lowette: Is there maybe a problem in our industry then?

Dave Farley: Not one.

Hannes Lowette: Yeah. I agree, but let me finish the question. It's about the way we communicate about things.

Dave Farley: Yes.

Hannes Lowette: You see lots of conferences where we talk about the latest version of tool X, or this one small building block. And then you have the very high-level stuff, where on a very big scale we want to be doing this. In the lifetime of a developer, from when they graduate college until the point where you guys are at, like, expert architects, there are so many steps to take and there are skills that you need to learn. But I feel like the path to get there is not something that is often communicated. And a lot of the blocks in the middle, like the run-of-the-mill solution architecture for small-to-medium enterprises, are something a lot of people have to learn on their own because there's not a lot of communication about it. Because those are not sexy subjects to give conference talks about, for instance.

Simon Brown: Yes, that's why we have so much work. There's a complete lack of any fundamental teaching, and training, and skills in most organizations. And you're right, when you come to conferences, all the new hyped trendy stuff sells. Our stuff, it's not boring, but it's kind of the essential fundamentals that people really need to know, but it doesn't sell seats in cinemas.

Continuous Delivery done right

Dave Farley: If you'll forgive me mentioning one of your competitors, I won't name them. One of your competitors in organizing conferences.

Hannes Lowette: I don't work for Trifork.

Dave Farley: No, he does over there. One of your competitors had an adoption graph I saw a few years ago and it was talking about all of these new, sexy technologies, and all this kind of stuff, and early adopters doing these sorts of things. And the late majority was where continuous delivery was, and I just kind of fell on the floor laughing because I was just saying, "Really? You think that's the state of the industry, that continuous delivery is normal?" It's not.

Hannes Lowette: It's not.

Simon Brown: No.

Dave Farley: And so what most organizations talk about in terms of these engineering disciplines in continuous delivery is that...

Hannes Lowette: What do they say then, because they have a pipeline to deploy their stuff they're doing continuous delivery?

Dave Farley: Yes. They think that if you're using Jenkins you're doing continuous delivery, and you're not. Continuous delivery is working so that your software is in a releasable state all of the time, that's it. And if you adopt that discipline it drives you in the direction of being much more disciplined, caring much more about design, and architecture, and all those sorts of things. Because you can't do it if you don't do those things. It's just not feasible. And we don't have many things like that. If we came up with something that genuinely counted as software engineering it would make us build better software faster. Because if it didn't, it wouldn't count as engineering. Continuous delivery does that. That's what the data says.

Hannes Lowette: But you can't do continuous delivery without having your team involved in running the system in production.

Dave Farley: Well, you can do that but it's not optimal. It's better if they are involved. What you need to do is close the feedback loop, so you need the team to be monitoring and understanding what's happening in production, even if there are other people that are looking after it.

Hannes Lowette: Yeah, they don't have to set everything up but at least they have to not, like in the old days, develop a system, throw it over the wall and it's ops' problem now.

Dave Farley: So my advice, I'm steering us off topic, which I apologize for, but my advice for continuous delivery is that you aim to get to a releasable state once every hour. You can't do that if you don't have a great architecture. And you can't do that if there's waste in the process. You've got to remove waste out of the process. So every time there's a hand over between different groups of people and stuff like that, that's waste. So you have to get down to these small, focused teams to be able to do this, and they need to know a lot of stuff. Trying to drag this back to architecture for a minute, I think that one of the problems in our industry is that we get so blind-sided by sexy technology that we lose our focus.

Hannes Lowette: I saw some speakers call it, and it's a term I've always used as well, magpie development.

Dave Farley: Yes.

Simon Brown: Shiny things.

Hannes Lowette: Yeah, like the shiny thing. "Oh, shiny, I wanna have that. I'm gonna use it whether it's appropriate for my problem or not."

Dave Farley: I did a little exercise recently for a book that I just finished writing where I wrote a simple CRUD application in the latest sexy web technologies, and I wrote one in the technologies of 1995. To do the same job, the 1995 version needed about a quarter of the code that the version in Angular and that tool set did. Now the Angular stuff gave me stuff that I didn't have in the 1995 version. It gave me more browser independence and all those sorts of things. There were some benefits. I'm not saying there's no progress, but there's nowhere near the progress that we assume there is. We assume it because we're so close to the hardware, and hardware has had this exponential progress. We don't. Software development doesn't move at that pace, and there are fundamentals that matter.

DevOps: just a title?

Hannes Lowette: Something that has bothered me in the consulting business in Belgium is that DevOps has become a job title or a function description for someone. I mean, "We need a DevOps guy." What's your take on that?

Simon Brown: Oh, it's the same old story in IT. Something comes along and there's a lot of good stuff behind it. The masses get ahold of it and don't really understand it and just copy what they think it means. We've seen this with agile. We've seen it with everything, haven't we?

Dave Farley: Funnily enough, if you forgive me touting my YouTube Channel, I'm publishing a video on that topic this evening on my YouTube Channel, 7pm UK time. But one of the things that I say in that is I've been fortunate in my career to be close to the birth of some reasonably significant ideas. I was involved with the people that invented DDD, and DevOps, and a whole bunch of things. And my biggest takeaway from what's going through those inventors' heads, those creators' heads is we didn't mean that. There's this, kind of, dilution effect. As ideas become more popular, everybody just reads the words and assumes they know what it's talking about.

Everybody thinks that continuous delivery is about deploying stuff into production all of the time, frequently. It's not. It's working so your software is always in a releasable state.

Hannes Lowette: Yeah, and getting the whole team to care about that.

Dave Farley: Everybody thinks that microservices is about having little things talking over REST APIs, each in a separate repo, and it's not. It's nothing to do with those things, and they're all like that. And I think that's where we fall down. I'm sometimes referred to as one of the people that helped popularize and create DevOps, and I do dislike the DevOps term because it's so easily misunderstood. But the idea is absolutely spot on. The practices are extremely good. Continuous delivery started a little bit earlier than DevOps, but they co-evolved from different angles, DevOps coming from operations and continuous delivery coming from development. And in the middle they're talking about exactly the same ideas to a large degree.

Hannes Lowette: I think they only meet in the middle if you understand both sides of the problem.

Simon Brown: Which was what the whole thing was about in the first place.

Dave Farley: It was.

Hannes Lowette: Exactly. It's like bringing the two sides together.

Simon Brown: Throwing things over the wall. That's exactly what it's designed to stop.

Hannes Lowette: Yeah, like, break down the wall, make it one team. That's what it's about, not about, like, we need to hire a guy that knows about Puppet and Kubernetes.

Dave Farley: No, no, it's not that.

Simon Brown: Which is what this has turned into in any case.

Similarities between software architecture and architecture

Hannes Lowette: This is what it has turned into, sadly, sadly so. So in Alan Kay's opening keynote he was speaking about and drawing some similarities between the software world and the world where they build big structures. You have an architect that designs the whole thing up front, but then you have the engineering teams that have to come in and actually execute on the architecture and make it so that the bridge doesn't collapse when you're driving your car across it. In that world, those things go really well most of the time, and in software we seem to be doing a terrible job. Are there any engineering takeaways from an architecture standpoint that we can give the audience to incorporate in their software systems?

Recommended talk: Is Software Engineering Still an Oxymoron? • Alan Kay • GOTO 2021

Dave Farley: I think that there are. That is the theme in one of my talks at the conference, to be honest, so I'm touting my own wares. But I think that as a discipline of discovery and learning, we should be optimizing to be experts at that, and as a discipline of managing complexity, we should optimize for that. Good architecture plays its part significantly in both of those things. We need to be able to get fast, efficient feedback on the quality of our work. We need to work so that we can make progress incrementally and iteratively. And we need to build modular systems so that we can work on one part of the system without breaking other parts of the system, and so on.

Part of Alan Kay's definition of software engineering is that engineering, in general, is about making things and repairing things in principled ways. And I quite liked that as a takeaway. And I think that the things that I'm talking about, optimizing for learning and optimizing for managing complexity, are principles on which we could start to build a genuine discipline for software engineering in our profession, and I think we need it.

Simon Brown: From my perspective, it's really tempting to apply that same technique to software where we do all the design and we have a very predictable way to build the thing that we want to build. And you're right, that's how many building projects work. There's a great talk, again, I think at a GOTO conference actually by Mary Shaw, and Mary works for the Software Engineering Institute. And she does this whole interesting comparison between the building world and the software world, and she basically says, "Software is not engineering yet," cue all your stuff. And she says, "We're still in the early craft and artisanal phases." And again, a lot of that is because it's a very immature industry.

We are learning a lot as we go along. Technologies are changing and I don't think we've found some of those underlying fundamentals and principles yet that are broadly applicable regardless of the technologies and the techniques that we're using to build our systems.

I'm not a building architect or a structural engineer so I might have this completely wrong, but my idea is that when you design a structure you have to do the modeling on the structure around stresses, and strains, and load weights, and all that sort of stuff to make sure it doesn't fall down.

I think what's interesting is that once you start to factor in things like continuous integration, continuous delivery, DevOps, continuous testing, and this whole concept of fitness functions from the "Building Evolutionary Architectures" book, that's the same thing. For me, fitness functions are a way to start doing some of that. So if you want to build something that's very low latency, you build yourself a bunch of fitness functions that assert whether you can hit those latency targets, for example. So yeah, I'd like to see more teams doing stuff like that. The downside of it is it's hard, and it takes effort, and it costs money, and you have to get some benefits of doing it, I guess.
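To illustrate the latency example, here is a minimal sketch of what such a fitness function might look like in Python. It is not from the conversation or the book: the function names, the nearest-rank percentile method, the 99th-percentile choice, and the 50 ms budget are all assumptions made for illustration.

```python
import math
import time
from typing import Callable, Sequence


def nearest_rank_percentile(samples: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile of samples (nearest-rank method)."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_target(samples_ms: Sequence[float],
                         budget_ms: float,
                         pct: float = 99.0) -> bool:
    """Fitness function: is the chosen percentile within the latency budget?"""
    return nearest_rank_percentile(samples_ms, pct) <= budget_ms


def measure_ms(operation: Callable[[], object], runs: int = 200) -> list[float]:
    """Time repeated calls to operation, returning durations in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


if __name__ == "__main__":
    # Hypothetical stand-in for the real request handler under test.
    def handle_request() -> int:
        return sum(range(1000))

    timings = measure_ms(handle_request)
    # Assumed target: 99% of calls complete within 50 ms.
    print("latency target met:", meets_latency_target(timings, budget_ms=50.0))
```

A check like `meets_latency_target` could run in a delivery pipeline alongside the other tests, so that a change which breaks the latency target fails the build instead of being discovered in production.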

Dave Farley: I think we in our industry have a dramatic advantage. If you are building a bridge or something like that, you'd probably computer model it to do all that stress calculation, all that sort of stuff. And you'd test the model and all that kind of stuff. Our model is the real thing, so there's no empirical discovery that's required after that. The other significant, magnificent advantage that we have is that production for us is free. So once you've done all of that discovery, and learning, and design, and you've got your sequence of bytes that represents your system, however big, or complicated, or distributed it is, you press a button and you clone that sequence of bytes for essentially free. And I think that's a dramatic advantage that we have. And so I agree with Simon that software development in general is not yet an engineering practice, but I think that we do know some of those principles. We just haven't pulled them together.

Simon Brown: I think if you go back a number of years, people tried to create a model, so not the real code. They created a model of the system they wanted to build and they tried to run simulations on it. When I went to Dubai a number of years ago, the Burj Khalifa, big, tall building, there's a bunch of models in the basement that show you how they modeled things like wind flow around the various sides of the building. You need to do this stuff with buildings, because you can't stick up the Burj Khalifa and then stick it in a wind tunnel. It's too expensive. I mean, you could do it but it's not worth doing. And I think 20 years ago we tried to apply that same technique, let's model the software without building it.

Dave Farley: That's the wrong way around.

Simon Brown: And simulate it, and you're right, production is free. We can just test stuff.

Hannes Lowette: Because that's the biggest difference between the two worlds, right? If your building fails, you're looking at the cost of constructing a new one, whereas in software, you can make changes to something you already built without sinking the whole cost again into the same project.

Simon Brown: Hopefully.

Hannes Lowette: Hopefully. Well, most of the time.

Dave Farley: That's the other key difference, to my mind, between software engineering and development. Software engineers usually start off by going, "Oh, shit, this is gonna go wrong." Sorry, I shouldn't have sworn.

Hannes Lowette: It's usually going wrong.

Dave Farley: "Oh, damn, this is gonna go wrong, because it usually is." And so starting off trying to think about the ways in which our system can go wrong, I think, is really important because that's how you do a good job.

Hannes Lowette: I think that's the best tip anyone has ever given me. Whenever there is a fire in production, step up and be part of the team that investigates, because those are the days that you learn so much about what you shouldn't have been doing.

Dave Farley: I did some consultancy for a bank a few years ago and they asked me to advise them on building resilient systems, and we walked into this room and all sat there. We hadn't met before, this roomful of people, and everybody was a bit reticent to start talking. So I just said, "Well, if we're building resilient systems we're gonna assume that everything is gonna go wrong from the start." And they said, "What? We assumed we had to make sure it didn't go wrong." I think engineering is about assuming stuff will go wrong, and then you learn from that, predict where it might go wrong, and defend against that. That's another callback to Alan Kay's presentation, where he talked about Facebook and their outage, and the kind of things that they should've been doing to avoid that kind of outage.

Hannes Lowette: Well, as with any outage, you only learn what you should've been doing when it goes wrong. And luckily, most of the time it's fixable, just not always behind the scenes so that nobody notices.

All right, I think we can wrap up here. Thank you so much for joining us today and for the interesting takes on a lot of things, and enjoy the rest of your conference.

Simon Brown: Thank you.

Dave Farley: Pleasure. Thank you very much.