Expert Talk: Continuous Architecture
What is continuous architecture and how does it fit in today’s world? Has the role of a software architect changed over the last few years, and what are the main skills you need to be good at architecting software? Pierre Pureur, co-author of “Continuous Architecture in Practice,” and Kurt Bittner, Enterprise Solution at Scrum.org, give an overview of what software architects — or those who dream of becoming one — should consider across each of these questions.
Quality attributes in software architecture
Kurt Bittner: Hi, everyone. I'm Kurt Bittner, and I'm here today with my friend and colleague, Pierre Pureur. We're going to talk about architecture, specifically continuous architecture, but we should lay some foundation first. Pierre, people tend to throw around the word architecture rather loosely, and I'm curious about what the word architecture means to you.
Pierre Pureur: Good morning, Kurt. Thank you for this conversation, this is going to be great. So for me, architecture is really... I mean, there's the classical definition, right, where it's defined as the fundamental concepts or properties of the system, and we think of it as elements, relationships, and design principles. So, I think of architecture at a conceptual level. Typically, the objectives in architecture are to:
- achieve some quality attribute requirements,
- define some principles, standards, and blueprints (of course, blueprints are what people think of most when they think about architecture),
- build some usable, and if we are ambitious, even reusable services, and lastly, but definitely not least,
- create a roadmap for some kind of future state.
So, those four things define what architecture is to me.
Kurt Bittner: So, it's interesting, the quality attributes that you mentioned. I know we're sort of jumping into something fairly directly here, but how do those quality attributes...what are they, and how do they shape the architecture?
Pierre Pureur: I think of them as more important than functional requirements because you can really design any kind of blob of software that meets functional requirements, and yes, the good news is it will meet your functional requirements; the bad news is it will not work very well. So, quality attributes are well-defined in architecture, and the importance of a lot of them tends to change over time. At this point, in this age of cloud and agile, we think the most important ones are: scalability, because your system needs to be able to scale to a level where it can handle your production volumes and number of users; performance, because if it doesn't perform well, basically, nobody's going to use it, so you're dead on arrival; security, because security is, as we all know, becoming more and more critical; and then resilience. Resilience is making sure your system goes to production and stays in production without having to resort to extreme measures to keep it running. So those are the four quality attributes we are thinking of right now. They change all the time, as I said.
Kurt Bittner: That resilience one is really interesting because I think a lot of people think of architecture as being an upfront sort of thing, and in reality, at least in my experience, architecture has a lot more to do with the overall lifetime of an application or a system or a product. So, the resilience of it over time to changes in the environment, to changes in requirements, to changes in the people working on it ends up being interesting and drives a lot of interesting decisions.
Pierre Pureur: Yes. Yes. And resilience you're right, resilience has become very, very important lately. A long time ago, when we used to design, I don't even think we used the word architect at the time, we used to design a system on mainframes, resilience was not a big concern, mainframes were considered almost... they didn't break down, you could do anything you wanted, and the thing would just run. Nowadays, that's not the case anymore. I mean nowadays, systems, you're going to have some hardware issues, some network issues, some software issues, so resilience has become very, very important. It's not a case of trying to prevent a failure, it is more a case of, well, what do you do if a failure occurs? Because a failure will occur, it's just a matter of when. And being able to gracefully recover from failure is becoming extremely important.
The fifth quality attribute element to good software architecture
Kurt Bittner: So, you mentioned that these quality attributes tend to change over time, and so if you were to add a fifth one, is there a particular attribute that you think is sort of lurking out there on the sidelines, that you might think of adding to this list of four that you mentioned?
Pierre Pureur: Yes. So, in retrospect, of course, we wrote a book about a year ago, so we can look back and say, well, maybe I would do things a little bit differently if I had the chance to do it again. We probably would have added sustainability. If I think of sustainability, the first thing that comes to mind is resilience because to be sustainable, a system has to be able, of course, to stay in production, but it's much more than that. Sustainability is really about your software system being resilient to changes. As things change, and things will always change, you need your system to adapt, and hopefully, and that's an important consideration, at a manageable cost. Sustainability cannot mean adaptation at any cost; it has to be adaptation at, basically, an affordable cost.
Cost is kind of an invisible, or stealth, quality attribute. It is always present, and you need to look at the cost of anything you do in architecture.
One thing about sustainability: I've read quite a few articles lately that talk about environmental sustainability. Some cloud vendors especially are trying very hard to prove that their cloud is sustainable, that it does not affect the environment too badly. Now, of course, with bitcoin, that becomes even more important. I'm not thinking of sustainability in those terms, though environmental sustainability is important. I'm thinking more of sustainability in terms of, over time, can your system react nicely to changes, and at an affordable cost?
What is continuous architecture?
Kurt Bittner: So, the principles are really interesting. What I would like to dive into next is basically, how do you get started with this? And for the listener who's not familiar, Pierre has written a book on continuous architecture, along with two of his colleagues. This notion of what is continuous architecture, and how do you get started with it, it's really interesting because I think also, my experience with architecture in the past is that people tend to think of it as being something that's upfront, and then you go develop it and you deploy the system, and the architectural work is done. So there's continuous from the standpoint of the development cycle but also continuous from the standpoint of the actual lifetime of the system, and I think both of those attributes, in this sense, are interesting.
Pierre Pureur: Yes. So broadly speaking, at this point, there are two schools of thought about architecture. One is the one you describe, Kurt, which is the upfront architecture, which is the traditional one, we try to do everything up front, and we don't write a line of code until all the architecture is nailed down. The problem with that approach is, of course, it's very hard to know everything at once. I think that Kurt, you did mention that some people believe that something like 40% to 60% of requirements, and that includes quality attribute requirements, are not true, not real. So, designing an architecture when more than half your requirements are incorrect is not going to give you good results. Also, it's really hard to predict how a system is going to evolve. So if you try to design everything up front, you are in, I think, for a bad surprise as time goes by. Plus, the reality is people are going to lose patience a little bit with the time it takes to get a design together, and people are going to start kind of trying to get out of the starting gate and write some code. So for all these reasons, more than 20 years ago, people moved to an agile view.
The idea of the agile view was the architecture was going to emerge, and you write code, and somehow the architecture, like a whale out of the water, kind of emerges. The problem with that is that if you do that, you're going to end up doing a lot of refactoring, you're going to do a lot of, basically, adjustment because there is some planning you can do upfront. Architecture is planning, and you should attempt to do some planning.
So when we started thinking about what continuous architecture could be, we started thinking about getting some principles. None of the principles are new or earth-shattering, but I think together, they make a lot of sense. There are six of them:
- The first one is to architect products rather than projects. And again, quite obvious, but people don't necessarily do that, okay?
- The second one is to focus on quality attributes, and not worry too much about functional requirements. Yes, you need to have a system that meets these functional requirements, but again, if your quality attribute requirements are not met, then you're not going to have a very efficient system.
- The third one is really to attempt to delay decisions until you're sure that you have to make the decision. Don't make decisions too early, try to make decisions based on facts.
- The fourth one is to architect for change, because things will change. So, try to do things small. That doesn't necessarily mean using microservices, but think small design, if you can.
- The fifth one: architecting for build is fine, because most people architect to build a system, but don't forget you're going to have to test, deploy, and operate that system. And that's almost more important than building the system.
- And the last one is the well-known principle of modeling the organization of your teams after the design of the system you're working on. If you try to organize your groups along the layers, front, middle, back, that's not going to work very well. The idea is to cut across layers, and have one kind of nice, homogeneous team that deals with all aspects of the system.
So, those were the six principles.
Kurt Bittner: Those reflect a lot of similar experiences that I've had too. Having an empowered, self-managing, cross-functional team that's focused on delivering value is sort of one of the real foundational aspects of agile approaches, and a lot of people lose sight of that. One of the things that was interesting in that is the notion of delaying decisions until absolutely necessary. And so, one of the questions I think teams would ask themselves is how do we know when a decision is necessary? What kind of questions should we be asking ourselves about decisions, to be able to say we don't need to make that decision right now, we can delay it until later? Or the converse of that is to say we need to make this decision right now because this is going to fundamentally affect a lot of work going forward. I'm sure you have some thoughts about that. We've had some conversations, I know, about those things.
Pierre Pureur: Yes. Yes. I mean, look, I think one of the key attributes of an architect is to be patient, okay? Don't rush into making decisions too early. In my experience, I've seen teams that start with the end in mind, and they know they want to bring in some technology. And the reason could be because someone told them, well, we know we need to be on the Amazon cloud. The answer is going to be Amazon cloud, and you don't even ask the question, right? If you do that, the problem is you're going to miss a lot of things. The idea is to be patient, and make your decisions when you have all the facts. And that's important: try not to make decisions based on guesses because at least half the time, your guess is going to be wrong. If your guesses are wrong, your decisions are going to be wrong, okay? Try to make decisions when there is no other choice, when you have to, okay? Don't wait too long, but don't rush into making your decisions. One thing I will offer as a piece of advice is beware of bringing in products, especially the ones that may hijack the architecture, okay? Make sure technology doesn't take over the architecture because that's a problem, okay?
Then we get into cost, back to cost, right? Some decisions are going to be not accurate, some decisions are going to have to be reversed. So for each decision, there is a cost of reversing the decision. So I think by looking at, basically, the cost of that decision, of reversing the decision when you make that decision, you can get to a point where you can ask yourself, do I need...am I so sure that I have to make a decision right now, or can I wait a little bit longer?
Recommended talk: Continuous Architecture in Practice Part 1/2 • Eoin Woods & Simon Brown • GOTO 2021
The unacceptable cost of rework
Kurt Bittner: I think, when we were talking about this last week, it occurred to me that if I put this into Scrum terms, then I might say something like: during sprint planning, the team should ask itself, is anything that we're planning to do during this sprint going to have an unacceptable cost if we have to redo the work at some point in the future? So inevitably, as you said, architecture work involves rework. You want to minimize that, and yet you don't want to lock yourself into particular technologies or decisions if you find facts in the future that tell you that you need to do something different. So, I think that notion of unacceptable cost is an interesting question for people to keep in the back of their mind anytime they're planning work, and say, if we have to redo this in the future, does it have an unacceptable cost? In that case, you need to make a decision now, and you need to, you know, understand the consequences of that decision. Would you agree with that?
Pierre Pureur: Yes. Yes, absolutely. I feel that the continuous flow of decisions is more and more important because that's really... I mean, people think of architecture as blueprints, nice diagrams, and so on. Yes, you have to have those things, which you have to communicate as part of the architecture, but most importantly, we think the basic unit of work of the architect is the decision, and the architect polishes decisions: doesn't necessarily make decisions, but facilitates decisions. As you make decisions continuously, you're going to learn, you're going to have this continuous loop of learning, and you're going to learn from your mistakes. Because you're going to make mistakes, everyone does, and as you learn, you're going to go back and revisit your decisions continuously. So, you kind of have this full feedback loop where you gain knowledge, you make a decision, you gain knowledge, and you revisit that decision constantly.
Recommended talk: Organization: A Tool for Software Architects • Eberhard Wolff • GOTO 2021
Kurt Bittner: Yes, that's interesting. One of the things that I've always tried to practice, and if I were to add a principle to your principles, I might say, make everything replaceable, or make everything replaceable at relatively low cost. That argues for very modular code, very isolated services with loose connections between them, and lots of things that are sort of becoming either widely adopted now or more prevalent. So, one of the things that sort of relate to that, and relates to this question of how to get started, is something that I know you and I have talked about, this concept of a minimum viable architecture, in a sense. You know, what does the first iteration of that architecture look like, at least at a conceptual level, and then from there you can start iterating and improving and learning and replacing. But you know, how much architecture do you need to get started?
Minimum viable architecture
Pierre Pureur: Yes. So, the concept of minimum viable architecture came out of the minimum viable product concept, and that kind of follows principle number one, which is think products, don't think projects. So, your first step in minimum viable architecture is going to be to support the implementation of a minimum viable product, something which is small, but good enough to be put in production. And that's a key word. Your first iteration has to be something you can put in production because the whole point of the minimum viable product is really to get feedback from customers. Well, you can't get feedback from customers if you don't have something in production. So, the first iteration of the MVA is really to have something that supports a minimum viable product functionally. Your MVA is going to be good enough to be put in production; however, you're not going to design an MVA based on guesses, and we go back to facts versus guesses. You're going to design your MVA based on what you know. If you're thinking about performance or scalability, you're not going to worry too much about scalability because the MVP is probably not going to be released to a lot of customers, so you're going to have a small customer base. Performance is going to be good because, typically, if you don't have a lot of users, performance is good. And resilience is also going to be acceptable. Security, however, is still important because you don't want to put something out that is not secure.
Then I happen to believe that if there's one thing you should do initially, it is to put in place some kind of instrumentation framework so that you can monitor the performance and scalability of your system. That way, as your system and your MVP become popular and successful and expand, and we hope they do, you're not caught flat-footed, and you're not facing a big problem. So, you know your instrumentation framework is going to tell you, basically, oops, be careful, and this is all part of the feedback loop, it may be time to start considering performance and scalability because your system is bigger, being used by a lot of people, more than you expected, so time for a security measure.
Kurt Bittner: Yes, that's a really interesting concept because I think a lot of people, in that minimum viable product, tend to focus on functional requirements. So they'll have a nice sort of, in the old days, we would call it a demo, you know, they'd have a nice set of visual capabilities that would be very exciting, and get everybody interested. But your discussion reminds me of something that I did with a team years ago. We had an online transaction processing system, and we had certain, in a sense, service level agreement kind of things about how scalable the distributed system had to be, and so, there were certain requirements around latency, and certain requirements around response time, and certain requirements around the number of concurrent users. Our first iteration on this particular application had virtually no functional requirements, but what it did was simulate the workload that would be happening in that system. We knew approximately what kinds of things would be exchanged back and forth, and so we wrote our first sort of minimum viable architecture, in your terms, to basically start up that many concurrent users, start up that many concurrent processes, simulate the distribution and the lag time and latency in the system, and prove that some of our most basic choices could be satisfied. Because if we couldn't do that, then we just really needed to cancel the whole initiative, or significantly rethink it. So, that gave us a lot of confidence, and then from then on, you know, we could start replacing the fake transactions with real transactions, and gradually building the system out. It very much worked in the way that you described, with making these decisions over time. We weren't concerned about the functional requirements initially, it was more about the scalability, security, reliability, and latency kinds of requirements. So, very interesting.
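A workload simulation of the kind Kurt describes could look roughly like the Python sketch below: it starts a number of simulated concurrent users, has each run fake transactions that only mimic lag, and checks the measured response times against a target. The function names, the user counts, and the 95% threshold are all assumptions made for illustration; the interview does not say how the original simulation was built.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transaction(network_lag_seconds):
    """Stand-in for a real transaction: sleeps to mimic lag, returns elapsed time."""
    start = time.perf_counter()
    time.sleep(network_lag_seconds)
    return time.perf_counter() - start


def simulate_workload(concurrent_users, transactions_per_user, max_lag_seconds, seed=0):
    """Run fake transactions for many simulated users at once; collect response times."""
    rng = random.Random(seed)  # seeded so repeated runs draw the same lags
    lags = [rng.uniform(0, max_lag_seconds)
            for _ in range(concurrent_users * transactions_per_user)]
    # One worker thread per simulated concurrent user.
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        return list(pool.map(fake_transaction, lags))


def meets_target(response_times, target_seconds, required_fraction=0.95):
    """Check that enough transactions finished within the response-time target."""
    within = sum(1 for t in response_times if t <= target_seconds)
    return within / len(response_times) >= required_fraction
```

For example, `meets_target(simulate_workload(50, 4, 0.02), target_seconds=0.5)` gives a first yes-or-no answer. As the real system grows, `fake_transaction` is the piece that would be swapped for real transactions while the harness stays the same.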
Pierre Pureur: There is one more point on MVA and MVP. Traditionally, an MVP system, a minimum viable product, is a throwaway system. That's how people think of it. That's where I differ from that school of thought because I believe that the idea of an MVA is building something you're going to be able to grow and build upon, and I think that's a key thing. Yes, you initially, roll out a minimum viable product, but you're going to build that product, and you're going to build the architecture in tandem. If you are lucky, you're going to end up with a successful full-fledged system. If you're unlucky, the system will not go anywhere, but you still will have learned a lot, which is important.
Recommended talk: Continuous Architecture in Practice Part 2/2 • Eoin Woods & Simon Brown • GOTO 2021
How to apply continuous architecture to existing systems
Kurt Bittner: Right. So now, the question I think in a lot of people's minds, certainly in mine, is that architecture work is really...it's not easy, but it's less challenging on a new product. But if you have existing products, or you have a new product that relies upon existing products for some transactions or services that it provides, and that existing architecture is just often a mess, you know, it's just a tangled web of, you know, dependencies and other things. Somebody I talked to one time said if you use the building analogy, it resembles a ruin rather than a functioning building, a ruin that people are squatting in and living in. So, how can an organization apply continuous architecture concepts to a bunch of existing systems that they already have, that they need to extend or that they need to build on top of? Let's start with that question, and then we can move on to a related one.
Pierre Pureur: Yes. So, it's interesting, Kurt, on Friday I was talking to a common friend of ours and we were talking about exactly that topic. Murat, of course, I'm talking about Murat, and Murat right now works for a large international bank. They are going exactly through that, they're going through, basically, reengineering of systems. When I asked him how those decisions are made, decisions to start over versus to try to keep on updating and enhancing, his answer was very, very direct. He said those decisions often have more to do with politics than technology. What happens is you get someone at a very high level that says, well, you know, I've had it with systems working on a mainframe, and so on, time to move to a modern cloud, time to move to Amazon, Google, or pick one, right? Nothing really to do with technology. So kind of leaving that aside, okay, I think that the main issue with a system that has been around for a long time is that technical debt builds over time. Technical debt is like entropy, okay? It's very hard to avoid technical debt. Every decision we make kind of involves some technical debt. If you are lucky, when you make decisions, you avoid adding technical debt. But that's luck. Most of the time, you end up being saddled with more and more technical debt.
Recommended talk: Continuous Architecture • Murat Erder • GOTO 2016
So, what happens over 10 years, 20 years, 30 years, basically, at some point, your technical debt becomes impossible to be paid, and you need to do something. You get to an inflection point where you say, well, do I want to keep on kind of modifying the system, enhancing our system, and maybe spending a lot of money on trying to make a system a bit better, or just keeping it above water, afloat above water, or do I just want to give up and start again? Another factor in this, in addition to cost, is some systems become impossible to use to support the business strategy, and it's becoming impossible to roll out new products, okay? If you reach a point where the system stands in the way of a business strategy, then you have no choice but to scrap it and start over.
If you haven't reached that point yet and if you think that the system, and there is this almost zero investment discussion, because at some point, the cost of physically making a system is going to be too high, and you don't get enough return on investment. So, there is a way to start replacing your system kind of piece by piece, service by service, and an example of that is the so-called strangler pattern, where you strangle a service in your system, and you replace that service with a new one. That can work to a point, except that some older systems are far too compact and monolithic to be modernized service by service. If that's the case, you have no choice but to really go back to square one, and start again.
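The piece-by-piece replacement described above can be sketched as a routing facade: callers always talk to the facade, which sends a request to a new service if that capability has been migrated and otherwise falls back to the legacy system. This is a minimal Python sketch under assumed names (`LegacySystem`, `StranglerFacade`, and the "quote" and "billing" capabilities are all hypothetical); in practice the facade is often an API gateway or proxy rather than an in-process class.

```python
class LegacySystem:
    """Stand-in for the old monolith; handles every capability it was built with."""

    def handle(self, capability, payload):
        return f"legacy:{capability}:{payload}"


class StranglerFacade:
    """Routes each request to a new service if one is registered, else to legacy."""

    def __init__(self, legacy):
        self._legacy = legacy
        self._replacements = {}

    def migrate(self, capability, new_handler):
        # Register a replacement; callers of the facade do not change.
        self._replacements[capability] = new_handler

    def handle(self, capability, payload):
        handler = self._replacements.get(capability)
        if handler is not None:
            return handler(payload)
        return self._legacy.handle(capability, payload)


def new_quote_service(payload):
    # Hypothetical replacement for the legacy "quote" capability.
    return f"new:quote:{payload}"


facade = StranglerFacade(LegacySystem())
facade.migrate("quote", new_quote_service)
```

After the `migrate` call, `facade.handle("quote", ...)` reaches the new service while `facade.handle("billing", ...)` still reaches the legacy system. The approach only works when capabilities can be separated like this, which is the limit Pierre points out for very monolithic systems.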
Kurt Bittner: One thing that I find sort of exacerbates the dependencies between systems, which is often...well, you mentioned monolithic, it's because there are too many dependencies between things. But one of the things that I've noticed is that it helps to start abstracting the data, and not be tied to specific database schemas, and not interacting...basically, use services between different parts of the application rather than common data. That's a problem in a lot of older systems, is that they synchronize, basically, by using common data and common databases, and then that tends to ripple.
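To make the contrast with shared databases concrete, here is a small Python sketch in which a billing function depends on a service contract rather than on another component's tables or column names. The `CustomerService` protocol, the in-memory implementation, and the sample data are all hypothetical and exist only to illustrate the idea.

```python
from typing import Protocol


class CustomerService(Protocol):
    """The contract other parts of the application are allowed to depend on."""

    def email_for(self, customer_id: str) -> str: ...


class InMemoryCustomerService:
    """Owns its data; the storage layout is private and free to change."""

    def __init__(self):
        # Hypothetical internal schema; no other component reads it directly.
        self._rows = {"c1": {"email_addr": "ada@example.com"}}

    def email_for(self, customer_id: str) -> str:
        return self._rows[customer_id]["email_addr"]


def send_invoice(customers: CustomerService, customer_id: str, amount: float) -> str:
    # Billing depends on the service contract, not on table or column names.
    return f"invoice for {amount:.2f} sent to {customers.email_for(customer_id)}"
```

If the customer component later renames `email_addr` or moves to a different store, `send_invoice` is unaffected, so the change does not ripple the way it would with a shared schema.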
How to become a software architect
Kurt Bittner: So switching gears a little bit, I'm sure a lot of the people listening to this are interested, perhaps, in how do I become an architect? How do I approach this from a career standpoint? So one of the questions that we've talked about before is the question of is architecture a role, or is it a skill set? Then if it's a skill set, what are the skills of the architect?
Pierre Pureur: Oh, yes. So absolutely, architecture is becoming a skill set, not a role, okay? I predicted, a few years back, that architect as a job description will disappear. At the time, I was the chief architect in a large insurance company, so that didn't make me exactly popular, but unfortunately for lifelong architects, I think that's happening now. In terms of skills, you need innovation...you're not born an architect. Some people are born with specific skills, like the ability to conceptualize, which helps, but I don't think there's anything magic about being an architect. To be an architect, you have to, first of all, be able to design.
I mean, architecture is a design-oriented activity, so if you can't design, you're not going to be a very successful architect. So, designing is important. Leadership: architects are technical leaders. I know a friend of mine who used to say that 50% of architecture is communication. So, you have to be able to lead and communicate clearly. Communication skills are critical. Stakeholder focus: architecture is about really serving your stakeholders. It's about serving a business. If you're only focused on technology, you're not going to be a very successful architect. So, it's all about focusing on...number one, knowing who your stakeholders are, and number two, focusing on those stakeholders, which goes back to products versus projects.
Conceptualization, concept, architecture is all about concepts, right? You need to be able to conceptualize and address system-wide concepts. Architects are not just concerned about one little module, they are concerned about the whole thing. They have to be able to look at one module, but they are also able to look at the whole story.
They have to be able to focus on systemic qualities rather than little functions, that's very important, back to quality attributes. Lifecycle: remember principle number five, right? You have to be able to architect for build, test, deploy, and operate. You have to be involved with the whole lifecycle. You shouldn't just think that you're going to build something, and throw it over the wall to some other people. No, it doesn't work that way. Then the last one is the ability to balance concerns and compromise. Architecture is about compromising as well, making decisions, and compromising. Most often, you're going to have a lot of, or at least two of, your quality attribute requirements that are going to conflict. Performance may conflict with usability, security may conflict with a lot of quality attributes. You have to find a good enough solution, you have to be able to compromise, and that's easier said than done. That would be kind of my quick list of skills.
Recommended talk: Expert Talk: DevOps & Software Architecture • Simon Brown, Dave Farley & Hannes Lowette • GOTO 2021
Kurt Bittner: Building on that a little bit, too, I think about experiences that I've had, and that people I know, who I respect as architects, seem to have had. This is more just adding a little bit of detail to some of the things you mentioned. But having to support an application in production, understanding what happens with load, understanding what happens with security, understanding how to instrument the application to anticipate failures, knowing the kinds of things you have to do quickly when you need to patch something and get it running again, I think helps you understand what happens, and how to make that application more supportable. At the start of my education around things like architecture, I was both building the application and supporting it in production, so I learned a lot from that.
The other thing that I think helped me early on was a stint that I did in a job where I was focused a lot on performance and scalability, and it gave me some insights that I would not have had if I was just building and deploying applications, so I was doing a lot of performance tuning. Then I found I learned a lot as a developer, from having to maintain and modernize somebody else's code. You start going into someone else's code, and you look at it and you realize this is just a mess, and you start seeing mistakes that you make that other people are making, and you start recognizing, oh, wait, I make that same error, whether it's from having various kinds of services that have some weird side effects, or other things, and isolating different parts of the code. I find those things, if I was recommending to someone, who's maybe just a developer starting, I would say try to get those kinds of experiences, because you'll learn a lot from having to maintain someone else's code, you'll learn a lot from supporting an application, you'll learn a lot from, you know, maybe even doing some work in operations. I know a lot of developers don't like the idea of having to do that, that's maybe not the fun part, but it's something that pays a huge dividend over a long period, to understand what happens to things in production. That's some advice I would add to what you said.
Pierre Pureur: Absolutely. I mean, I think, you know, Kurt, when we started, basically, when you were in development, you didn't worry too much about how the application was going to do in production. You didn't have to worry about scalability because, mainframes, right? It was great, the mainframe was there to scale. If you didn't scale, you just bought a new mainframe, one that would hopefully scale better. Security was the domain, and I think it was fairly standard, of some people in rooms which usually were locked. You couldn't get into the location where the security engineers were sitting. All you knew was that from time to time, they would get out of that cave, and run some kind of mysterious code against your code that would flag all your security problems, and then they would disappear again. This has changed a lot. Nowadays, security is everybody's problem, and especially it's the architect's problem. So if you want to be an architect, you have to get involved in performance, scalability, resilience, security, that's critical.
Testing in a continuous architecture framework
Kurt Bittner: I was thinking security is an interesting thing. In the old days, everything was physically secured, so you know, you had to have a badge to get into the room with the computer. Once you opened everything up to the internet, then it's a whole different set of problems. When I think about continuous architecture, I also think about continuous delivery, and one of the keys to making continuous delivery work is having a very high level of automated testing coverage. I'm wondering if there are some examples and some other things you can share on that. How do you start testing those quality attributes early, and build that into your automated delivery pipeline so that you can have automation support, and more than just sort of thinking...? So, continuous architecture isn't just the way you think about the problem, it's the way that you start instrumenting your delivery pipeline. Are there any things that you can suggest there?
Pierre Pureur: Yes. When people talk about testing, which is another shift-left moment where you try to test as early as you can, they usually talk about functional testing. That's great, but we are back to the blob, right? You can functionally test a blob of software, and yes, it's going to meet your functional requirements, but it won't perform at all in production. So, the idea is really to try to test your quality attributes as early as possible in the process, kind of shifting left in automated testing of quality attributes.
The one that comes to mind is performance. I think that performance testing should be part of every automated functional test. You should have a test, or a series of tests, that checks performance as you integrate your piece of code into the system. For scalability, you are probably not going to test it every time you run a functional test, but you probably want to test it quite often. Again, automate your scalability testing. There are a lot of possible scalability tests, but the idea is you need to be able to do some of those tests. Which gets us into resilience, okay? At some point, you're going to stress test your system. And you're going to do that automatically, and you need to do that often, you should...
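As a rough illustration of what a performance check inside an automated test run could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `handle_request` is a hypothetical stand-in for the code under test, and the latency budget is an arbitrary number, not a figure from the conversation.

```python
import statistics
import time


def handle_request(payload):
    # Hypothetical stand-in for the code under test.
    return sorted(payload)


def measure_latencies(func, payload, runs=50):
    # Time repeated calls and return one duration (in seconds) per call.
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func(payload)
        samples.append(time.perf_counter() - start)
    return samples


def check_latency_budget(samples, budget_seconds):
    # Summarize the samples and compare the 95th percentile to the budget.
    p50 = statistics.median(samples)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(samples, n=20)[-1]
    return {"p50": p50, "p95": p95, "within_budget": p95 <= budget_seconds}


if __name__ == "__main__":
    samples = measure_latencies(handle_request, list(range(1000, 0, -1)))
    report = check_latency_budget(samples, budget_seconds=0.05)  # assumed budget
    print(report)
```

A pipeline step could fail the build when `within_budget` is false, so a regression shows up at integration time rather than on the Friday before release.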
I still remember times when performance testing was something you did at the end of a life cycle. So typically, two or three days before going into production, you performance tested your system. What's wrong with that? Well, what's wrong is obvious, right? Usually, when you expect something to be in production on Monday, you'll performance test your system on Friday, and more often than not, you're going to find out your system isn't performing. What do you do next? Do you call off the implementation? You're not going to have a lot of friends, doing that. Do you go in with a system that doesn't perform? That's not going to work very well. Scalability is the same. So, to avoid this Friday kind of crisis, you have to continuously test your performance and scalability, as well as your resilience. Google, I believe, came up with the concept of the Simian Army, okay, where the whole idea was you start disrupting your system...you don't do it, of course, every day, but you start disrupting your system because it's much better to disrupt your system in test than it is to be at the receiving end of a disruption in production. Then, of course, security, that's probably the one that people are most familiar with. But again, the whole concept is don't run a security inspection code...I'm sorry, a code inspection for security at the end of a process. Do it at least, you know, once a week or from time to time to make sure that you don't have problems in your codebase that cannot be resolved quickly.
Kurt Bittner: You mentioned the Google concept of the Simian Army, and I also remember there was a similar kind of thing at...I believe it was Netflix, and they had this Chaos Monkey idea, where literally, they inject random failures into the system, and then force the developers to deal with that. In the security area, something that comes to mind is the idea of ethical hacking, literally having people trying to hack into your system to expose security flaws, but of course, people that, in a sense, work for you, or at least are working against you, but ultimately, in your long-term interest. I think that is an interesting technique to throw in, because the problem with many of these things is you don't know what you don't know.
Pierre Pureur: Correct.
Kurt Bittner: Having someone else coming in, injecting real-world kind of errors or real-world kind of hacking or security breaches into your system is ultimately a good thing because developers can very often have kind of a tunnel vision, where they can only see the kinds of problems that they have encountered before, they're not familiar with, perhaps, new kinds of ways to get into the system. I think those are really interesting ways to make sure that the architecture is evolving as you're developing, and as you're developing the rest of the application.
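The idea of injecting failures in a test environment can be sketched in a few lines of Python. This is only an illustrative sketch and not how Chaos Monkey itself works: `FaultInjector`, `fetch_price`, and `get_price_with_fallback` are hypothetical names invented for the example, and the failure rate is an assumed parameter.

```python
import random


class InjectedFailure(Exception):
    """Raised by the injector to simulate a failing dependency."""


class FaultInjector:
    # Wraps a callable and makes it fail at a configurable rate (test use only).
    def __init__(self, func, failure_rate, seed=None):
        self._func = func
        self._failure_rate = failure_rate
        self._rng = random.Random(seed)
        self.injected = 0  # how many failures were injected so far

    def __call__(self, *args, **kwargs):
        if self._rng.random() < self._failure_rate:
            self.injected += 1
            raise InjectedFailure("injected failure")
        return self._func(*args, **kwargs)


def fetch_price(symbol):
    # Hypothetical downstream dependency.
    return {"symbol": symbol, "price": 100.0}


def get_price_with_fallback(fetch, symbol, retries=2):
    # Code under test: retry a few times, then degrade instead of crashing.
    for _ in range(retries + 1):
        try:
            return fetch(symbol)
        except InjectedFailure:
            continue
    return {"symbol": symbol, "price": None, "degraded": True}


if __name__ == "__main__":
    flaky = FaultInjector(fetch_price, failure_rate=0.5, seed=42)
    print(get_price_with_fallback(flaky, "ABC"), "injected:", flaky.injected)
```

A resilience test can then run the calling code against the wrapped dependency and assert that it degrades gracefully instead of propagating the failure.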
Pierre Pureur: I was going to say the message here is don't wait for the Friday before you roll things into production. Make sure that your performance, scalability, security, and resilience tests are run often so that you can learn from them, and hopefully, adjust your decisions.
When is software architecture “good enough”?
Kurt Bittner: One of the things that architects have been accused of doing in the past is, in a sense, continuing to polish something and perfect it, to overinvest in perfection. I think an interesting question is how does an architect decide what is good enough? Because, you know, it's never going to be perfect, and you know, you can waste a lot of time and effort, in a sense, developing a solution to a problem that doesn't exist. So, how does continuous...I can see how continuous architecture might work that way, but I think...could you say a little bit more about that?
Pierre Pureur: Yes. So, we have principle number one, right, you're trying to develop a product, not a project. The difference is that a product has a long-term view, but also, you want the product to be in the hands of your customers early, a minimum viable product, so that you can get feedback. The whole point is to get feedback on what you have done. So, I think that waiting for a system to be perfect is pointless. What you do is you build a system that is good enough, that meets your quality attribute requirements as you know them. Not the ones that may happen a year from now, but the quality attribute requirements as you know them now, okay? And you move the thing into production, and most importantly, you instrument it so you can collect feedback both on a functional basis, do your customers like it or not, do they use it, and on an operational basis, which is, does the system really perform to expectations, does it scale, and so on. Based on that feedback, your choice of decisions just keeps on going and going and going, so the number of cycles is infinite. I mean, you're going to be cycling forever. But the thing is, you build a system that is very, very resilient to change, and at the end of the day, it is sustainable, which is the goal.
Kurt Bittner: Are there potential traps that people can run into, that they go too far down a particular path, and how could you recognize that you, in a sense, you're going too far down the path? So, and what I see in what you just said is that there's this balance between meeting the needs that you have today, and anticipating the potential growth tomorrow. So, I don't know if this happened to anybody. Yesterday was the Super Bowl, and you know, there were ads run on the Super Bowl. I remember having a conversation with someone one time, and they used this phrase that I love, called "catastrophic success." And what that was is that...
Pierre Pureur: Good one.
Kurt Bittner: You're a little startup, your company buys time, runs an ad, and gets a lot of interest, and all of a sudden you have orders of magnitude more interest in your product or your website or your service than you've ever had before. So I see that there's a certain amount of meeting the needs today, but anticipating the possible growth of that product over time so that you're not caught up in a catastrophic success kind of thing, where all of a sudden your product takes off, and you have a lot more demand. Maybe that would be one thing that we could talk about, and then I want to get to one other last question.
Recommended talk: “Good Enough” Architecture Part 1 • Stefan Tilkov • GOTO 2020
Pierre Pureur: Yes. So very quickly, because I think we're running a bit out of time, I think one of the best examples, Kurt, was what happened in 2020 with the COVID situation, where a lot of people worked from home, and they got into retail trading of securities, and a lot of companies got very badly surprised by the volume. I'm not just talking about small companies, not just the Robinhoods of this world, but also large brokers and large investment companies. They started having some outages, and the outages would happen at the worst possible time, typically when the markets were gyrating crazily, and when people wanted to trade, they could not trade anymore. How do you prevent that? There is no perfect recipe, but at least as part of your resilience strategy, you can build enough guardrails so that, number one, you know what's going on. This is where instrumentation becomes so important, so that you know exactly what's going on.
Number two is that you recover from failures. There are strategies for that, there are tactics for that. Actually, in the book, we talk about quite a few of them: bulkheads, circuit breakers, and so on. The idea is really to try to plan for failure, to realize that failure will happen. Your ad will be catastrophically successful, to use your term, and basically, what's going to happen is you're going to have volume which you don't expect. What do you do at this point? Well, when you run on the cloud, some people believe that scalability is the cloud provider's responsibility. It's not, okay? It doesn't matter how much horsepower you have on your infrastructure, at some point, you're going to run out of something, and your systems will collapse. So, the whole point is to anticipate that potential collapse and to try to put measures in place that will keep the catastrophe from happening.
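To make the circuit breaker tactic mentioned above a little more concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than the book's implementation: the class name, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are all hypothetical choices made for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    # Stops calling a failing dependency for a while so it can recover.
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self._clock = clock                         # injectable so tests can control time
        self._failures = 0
        self._opened_at = None                      # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of piling load onto a struggling system.
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                # Open the circuit, or re-open it after a failed trial call.
                self._opened_at = self._clock()
            raise
        # Success: close the circuit and reset the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping calls to a dependency as `breaker.call(fetch, request)` means that during a surge, callers fail fast and can fall back to a degraded response while the dependency recovers.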
Kurt Bittner: Great. Well, I think just to wrap up, and this has been a great discussion...
Pierre Pureur: Absolutely.
Main takeaways about continuous architecture
Kurt Bittner: I always learn a lot from talking with you. I think to leave the listener with...could we leave the listener with a couple of ideas they could maybe just get started with tomorrow, regardless of what their role is, whether they're an architect or developer, or perhaps a manager who's got some oversight over teams that are dealing with architectural issues? What are some things that people could take away to get started? Other than reading your book, of course.
Pierre Pureur: Of course. So, there are two things I would advise our listeners to look at to get started. Number one, look at the principles, the six principles. I mean, they are very, very useful. Again, they are not revolutionary, but today, they really kind of define how we think about architecture. The second thing, I believe, is the four essential activities. Focusing on quality attributes is the first one. That's very important. Drive and revisit your architecture decisions. It's all about decisions again, decisions are important. Know your technical debt. Anything you do is going to affect the thing called technical debt, and probably have an impact on cost. The last one is to implement your feedback loops. Feedback loops are so important. It's important to know what happens in your system. Again, nobody is 100% right. Statistically, 50% of the time you're going to be wrong. It's important to learn from your mistakes, and make better decisions next time around.
Kurt Bittner: That's great advice. For those of you who want to learn more about this, I read the books, both the original Continuous Architecture book, and the Applied book, and I'd recommend reading both to learn more. So anyway, thank you, Pierre. Thank you for a great conversation.
Pierre Pureur: Thank you. That was great. Have a great day. Thank you.