Software Architecture for Tomorrow: Expert Talk
About the experts
Julian Wood ( interviewer )
Serverless Developer Advocate at AWS
Sam Newman ( expert )
The Microservices Expert. Author of "Building Microservices" & "Monolith to Microservices"
Challenges in Distributed Systems: Resiliency and Networking Failures
Julian Wood: Welcome to another episode of "GOTO Unscripted." We are here in the wonderful Amsterdam at GOTO Amsterdam. And I am joined by the awesome Mr. Microservices himself, Sam Newman. How are you today, Sam?
Sam Newman: I don't like being called Mr. Microservices.
Julian Wood: That sounds terrible, doesn't it?
Sam Newman: That's the thing I'm known for but it's not how I define myself. I'm an author. I like writing. I kind of work to write. I enjoy it. I consider myself a consultant, writer, and trainer. I'm just interested in technology. I don't like being known for one thing, that sort of boxes me in a little bit.
Julian Wood: Especially microservices. They're small boxes. So that's even worse.
Sam Newman: But lots of them.
Julian Wood: There we go.
Sam Newman: So there is that. There is that.
Julian Wood: So where does your inspiration for your writing come from? Is this sort of a progression that happens, or just in your consulting you come up with the right ideas and speak to customers, or where's the sort of...
Sam Newman: Well, I think for me it starts with the fact I quite enjoy writing. I was fortunate enough to have a chat with Kent Beck about six months ago and he said, "I work to write, I don't write to work." And I think very early on I enjoyed writing blog posts. When I was at Thoughtworks, I'd write these long rambling posts in various different... I just gravitate towards thinking in that way. It's not a case of do I need inspiration to write, it's like I want to write so I'm looking for things to write about, if that makes sense. I still really enjoy that process. I'm in the middle of a book at the moment. I say in the middle. We all know when developers say, "I'm about 80% done." When I say I'm in the middle, I mean I'm about 20% of the way into the book, I think, in the process, but I still...
Julian Wood: So you've done 80%, but you've still got 80% to go?
Sam Newman: I've done 80% of the visualization of the book being finished. But not actually the work of actually writing the book enough, but we're getting there. We're getting there.
Julian Wood: Can you say anything about the book or...?
Sam Newman: I'm trying to get out of the little box. So I'm doing a book more broadly on...it's gonna be called "Building Resilient Distributed Systems." So looking more broadly at distributed systems in general, which is obviously a much bigger topic than just microservices, but then also narrowing the focus by purely looking at that through the lens of resiliency. I'm kind of aiming the book at the increasing number of developers that find themselves being involved with or building a distributed system, because these idiotic things called microservices became a bit too successful. And having to answer questions like what is a timeout, and how do I set one, and should I retry, and all these sorts of, kind of... So it's not a book where I'm talking about some kind of consensus theory because that's what people with PhDs are for. This is a book about, like, how do you make your stuff more resilient, both the technical side and kind of the people side? So, I'm really looking at…
Julian Wood: So what are the challenges that people come across? Is it spinning up these services and having multiple copies of them, or is it the architectural practice of understanding how distributed systems work, or what do you think the biggest challenge that people would come to a resiliency point of view from?
Sam Newman: I think if I think about a lot of the developers that I chat to in an enterprise environment, it's not really understanding that a network exists. And I don't mean that in a way that they don't know networks. They know a network exists, but they're often so abstracted from what's happening that they're sort of unaware of the common things that can go wrong. And when you're doing a small number of calls across a network you see issues not that commonly, it's a bit rare. But as you start making more calls, you know, you're more likely to see the kind of nasty ways in which networks do fail. So I kind of distill it all down and say, look, all problems from distributed systems really come from two kinds of constraints. The first being that you can't beam information instantaneously between two points, right? Yes, quantum entanglement is a thing. No, quantum entanglement is not a networking protocol yet.
Julian Wood: Yet.
Sam Newman: Yet.
Julian Wood: Yet.
Sam Newman: Maybe, maybe, although apparently, so at least one of the physicists I spoke to said, "Even if it was it would only be for reads." So it's not as, like, you saw, right? And the other thing is sometimes the thing you want to talk to isn't there, and really all of our problems from distributed systems, certainly in the space of resiliency, but also just reasoning about them, kind of come from those two constraints that we can't escape from. Then really it's the implications of that. So what I'm trying to do is start really easy with saying, well, what happens if you talk to something and it doesn't respond quickly enough? That's a timeout. Okay, so you're gonna try again? How often are you gonna try again? If you're gonna retry, can you retry safely? So we need to talk about idempotency. So it's very much that kind of, I'm trying to write the book in a way which is probably the journey you want to go on in terms of how you think about these calls. And no, the answer isn't just stick a circuit breaker on it, or the service mesh will handle it, or I've got this amazing microservice framework and it just makes the network not exist. It's like, you know...
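The chain of questions Sam walks through here (set a timeout, decide how often to retry, make the retry safe) can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: `call_with_retries`, `TimeoutExceeded`, and `FakePaymentService` are hypothetical names, the backoff-with-jitter policy is one common choice among several, and the fake service stands in for a real network call whose response got lost.

```python
import random
import time
import uuid


class TimeoutExceeded(Exception):
    """Raised by the (hypothetical) transport when no reply arrives in time."""


def call_with_retries(send, payload, attempts=3, timeout_s=0.5,
                      base_delay_s=0.1, sleep=time.sleep):
    """Call send() with a timeout, retrying on timeout with backoff and jitter."""
    # One key for the whole logical operation, reused on every retry, so the
    # receiver can recognise a repeat and avoid doing the work twice.
    idempotency_key = str(uuid.uuid4())
    for attempt in range(1, attempts + 1):
        try:
            return send(payload, idempotency_key=idempotency_key,
                        timeout_s=timeout_s)
        except TimeoutExceeded:
            if attempt == attempts:
                raise  # out of attempts: the caller must decide what this means
            # Exponential backoff with full jitter, so many clients retrying
            # at once do not all hit the service at the same instant.
            sleep(random.uniform(0, base_delay_s * 2 ** (attempt - 1)))


class FakePaymentService:
    """Stand-in service: does the work, but 'loses' the first few replies."""

    def __init__(self, fail_first=1):
        self.fail_first = fail_first
        self.calls = 0
        self.processed = {}  # idempotency key -> stored result

    def send(self, payload, idempotency_key, timeout_s):
        self.calls += 1
        # The charge is applied at most once per key, however often we are called.
        if idempotency_key not in self.processed:
            self.processed[idempotency_key] = {"charged": payload["amount"]}
        if self.calls <= self.fail_first:
            raise TimeoutExceeded()  # the work happened, the reply did not arrive
        return self.processed[idempotency_key]
```

With `FakePaymentService(fail_first=2)`, the client makes three calls but the service records a single charge, which is the property that makes the retry safe. Without the idempotency key, each timed-out attempt would have charged again.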
Julian Wood: I think that's a super interesting thing. I mean, strange, because I work, you know, with serverless applications at AWS, and I came from an infrastructure background beforehand. And in fact, one of the things that attracted me to serverless was this abstracting the network away because we will look after, I'm gonna say AWS, but you know, other cloud providers. We will actually look after the networking component because people don't realize, you know, hot and cold mirroring, and sharding, and all those kinds of things are super difficult. Definitely from a serverless perspective, that's one of the jobs to be solved, well, improved on, I don't think you can even solve it but to help with is, yeah, this whole networking challenge. And it's service discovery, and it's DNS, and it's...yeah, but never mind availability, partitioning, backoff, jitter, all these kinds of weird words that developers are like, "Oh, what's going on?"
Sam Newman: But also in that environment, right? From a cloud vendor point of view, you are likely going to be able to give people quality infrastructure that will have lower failure rates than the things that most organizations can do...
Julian Wood: Can build themselves.
Sam Newman: ...themselves, right? However, fundamentally they're responsible for putting applications on top. So, while you might have a lower failure rate, you still have to ask yourself the question, well, maybe I've just written a bad bit of code and the service falls over, or it gets stuck in a deadlock... So this is, like, you're sort of getting a developer to realize, well, every time I ask a service to do something what's your approach if that service says no, or the service isn't there? And yes, you can talk about the CAP theorem, but 99% of the time actually the discussion about what you do next is really more about understanding your business context. That didn't work. What does that mean? Is it the end of the world or is it okay? So I think, you know, there's definitely things I talk about in the book, which is, by all means, do things to reduce the chance of failure. However, you can't eliminate the possibility of failure, so therefore, you do at least have to reason about what happens when things do fail because they will.
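The "is it the end of the world or is it okay?" question can be made concrete with a small sketch. The scenario below is an assumption made for illustration, not something from the conversation: an order page that cannot proceed without payment but can happily render without recommendations. `build_order_page`, `ServiceUnavailable`, and the two callables are hypothetical names.

```python
class ServiceUnavailable(Exception):
    """Raised when a downstream service says no or is not there."""


def build_order_page(fetch_recommendations, take_payment, order):
    # Business decision 1: without payment there is no order, so a failure
    # here is allowed to propagate to the caller.
    receipt = take_payment(order)

    # Business decision 2: recommendations are nice to have, so a failure
    # here degrades the page instead of failing the whole request.
    try:
        recommendations = fetch_recommendations(order["customer"])
    except ServiceUnavailable:
        recommendations = []

    return {"receipt": receipt, "recommendations": recommendations}
```

The two calls fail in exactly the same technical way. What differs is the answer to "what does that mean?", and that answer comes from the business context rather than from the network.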
The reason I wanted to write the book was I got interested in sort of David Woods's work, so the four concepts of resiliency, looking at resiliency engineering in the broader world. So looking at everything from biological systems to air travel to everything. I found some of his stuff would be a little bit academic and a bit dry, although also being really, really fundamentally useful. There's all kinds of great insights. So, the kicking-off point for me with the book was can I take these concepts and sort of explain how they might impact a normal developer working, like, on the project? So, you know, he talks about this idea of sort of robustness, right? Which is something bad happening, but you can only...being able to ignore it. But he talks about robustness as being something which you're prepared for. It's a known potential issue. You have multiple instances of a service because you are aware of the fact a service may die.
Julian Wood: One may fail.
Sam Newman: You build that in. So you're building robustness in, but that requires advance knowledge of these kinds of failure modes existing. But then he talks about rebound. Okay, so when something does go wrong, how quickly can you bounce back? And then it's how prepared are you to deal with surprise? And then it's how do you ensure you learn from the problems that went on to make yourself resilient? Those are the sort of four concepts of resiliency that he talks about. So for me it was about sort of starting, getting kind of a developer thinking in those terms, which is not just narrowly I'm going to stop a bad thing from happening, but actually, how do I do the other parts of it? But trying to take those ideas and sort of give people practical things they can do with it. As always, I like to try and write a short book, but my timeouts chapter is 7,000 words. So I failed already, but I also think it's about sometimes being... I try to be clearer with the things I'm saying nowadays. I'm not always succeeding. I'm enjoying the process. I'm pretty sure the combination of my reviewers and my editors will get that, the editor will get that chapter down a little bit. I'm really enjoying the process so far.
The Cognitive Load Dilemma: Microservices, Monoliths, and Developer Trade-offs
Julian Wood: Do you think this is sort of the next stage of what developers need to know about? Because things are at a higher scale and more business facing applications, and more places, you know, people understanding their code and understanding that kind of thing, but these further things of resilience, cost, availability, and scalability, is that the sort of next frontier that developers should be exploring?
Sam Newman: I think the problem with that is we have an ongoing cognitive load issue. So to expect every developer to be all over all of those things, I don't think it makes sense.
Julian Wood: The full stack developer that needs to magically do everything, full-stack, 10x developer.
Sam Newman: As Charity Majors once said, "You're not a full-stack developer unless you also design the chips that you run on," right? So, for me, it's an argument that what I'm doing is solving the wrong problem, which is people are overly distributing their workloads in situations where that is not justifiable. For many people, microservices have become the default architectural pattern, which was never what we intended, ever. It was a useful tool in the toolbox, right, for some people. People still get surprised when I say, you know, "You should start monolith first," right? It's easier. It's easy to work with, right? So to an extent me writing the book is, okay, well, I still keep telling people not to do microservices over here, and I'm having limited success, right, because market forces are working against me because there's lots more people out there that can monetize microservices than I can monetize doing the right thing. So with that going on, I think about the poor developer who's on a project where decisions have often been made for them. And so for me, I'm trying to almost create, like, that rubber ring that we put around somebody when they're drowning. It's like, let's just take you through a bit of a journey. Now if some of those developers eventually become people who are then able to make decisions and become those architects and tech leads, hopefully, they'll now be better armed with these kinds of things you need to get good at. So when they next make a decision, should I distribute this workload or not, they're kind of more aware of some of the challenges and downsides they're going to have to face. So, my initial hope is this: lots of developers are drowning, so let's help them not to drown, and then let's also hope they're a little bit more aware so they can make smarter trade-offs in the future.
I think, you know, you go to the average enterprise organization, which is just probably about two-thirds of the customer base I work with, they don't have scale issues. They have organizational scale issues. They might have domain complexity issues. They don't really have big data issues. They like to pretend they do, but they really don't. And so in a lot of those situations, the organizational challenges are pushing them into being more distributed, so it's like saying, "Fine, do it, but just be aware of the challenges you're creating by doing this."
Julian Wood: You mentioned trade-offs and I think I can see developers always wanting more information and it's always about trade-offs. There isn't a right or there isn't a wrong, you know, microservices or monoliths. I mean, I work in the whole serverless kind of thing and everyone says, "Oh, that's perfect for microservices." And then, you know, we talk about Lambda functions, and like, well, no, it's actually quite all right to have bigger Lambda functions. If that works for you and you've got everything in one place, brilliant. You want to use a container, brilliant. It's all understanding these trade-offs. I think, you know, having words that developers can learn and understand, even some of the terms, can help them to, you know, think about those distributed applications.
Recommended talk: Expert Talk: Are We Post-Serverless? • Julian Wood & James Beswick • GOTO 2024
Sam Newman: The analogy I often try to use, so I sort of talk about trade-offs all the time, is I say, well, think about going to a doctor, right? You've got an ailment and the doctor might think, "Well, the way I'm going to help deal with your ailment is I'm going to give you some medicine. I'm gonna give you a pill." I've got a bad knee at the moment. "I'm gonna give you this pill, it's gonna cure your bad knee, right? I think I'm pretty sure it's gonna cure your bad knee. By the way, there's quite a few side effects for this particular pill, right? You might grow a third leg or the legs you've got might drop off, right?" And then you're like, the doctor, right, in that situation is deciding are the side effects worse than the thing that cures you? And I think if you're going into a decision like should I use microservices or not thinking in those terms, I think it's absolutely fine because I think people will make rational judgments. I think the issue is people are not having a conversation. They're not having a discussion. They're not looking at the alternatives. The decision has been made somewhere. It's a convention over discussion. Everybody uses microservices.
It's become... I'm old enough to remember the phrase, "Nobody ever got fired for buying IBM." Right? And the idea being here that what everyone buys is IBM. So you just buy IBM as well. If you don't buy IBM, people are going to look at you weirdly. And if it turns out that not buying IBM was a mistake, it's your fault. But if you do buy IBM and the IBM software turns out to not be very good, it can't be your fault because everybody's doing it, right? I think microservices have become the new nobody getting fired for buying IBM. It's just almost this convention in the industry. So people aren't even having a conversation about trade-offs. And even within microservices, there's extents, right? There are organizations that have 20 developers per microservice at one end of the extreme, just very conservative ratios, to organizations like Monzo that have 20 microservices per developer, and there's a spectrum within that. But I think if we create the space to actually have a conversation, we can talk about the trade-offs. And for me, actually when you're making that decision, it all winds back to when I went to that doctor, I had a problem, my knee hurt. That's why I'm considering taking this pill. So when you're thinking about a shift to a microservice architecture or whatever else, it's like, what is it that's driving you? So thinking outcome-focused, right? What is the problem we're trying to solve? What journey are we trying to go on together? What outcomes are we trying to reach? Focus on the outcome. Microservices are an activity.
Julian Wood: An implementation detail.
Sam Newman: They are. This is the "Underpants Gnomes" thing from "South Park," right? Step 1, build microservices, step 2..., step 3, profit. It's like, that's just... Your users don't care about microservices. What you want to talk about is, well, these are the things, the changes we want to make to our system. That then gives us an array of ways to solve it. Microservices are now just one of a number of different options you can look at, but you should feel free to look at other things. But again, it's that people are often time-poor. They do microservices because everyone else is doing microservices. It's just a quick way of short-circuiting and moving on, but the implications of this kind of architecture across the whole development ecosystem are quite profound. The reason why the second edition of my book was so thick was, like, I tried to write a book which was, if you're going to do this architecture, here's how you do it well. And here's all the implications of that. Are you ready for it? I hoped to scare a few people off. I don't know if I succeeded, but…
Recommended talk: Martin Fowler and Sam Newman: When to Use Microservices (And When Not To!)
The Evolution of Microservices: Lessons from a Decade of Change
Julian Wood: You were writing your microservices book 10 years ago. The Version 2 has come out. What are the sort of...if you had to wind the clock, well, you have written Version 2, which is winding the clock back and rewriting... Or do you think...but in a nutshell, what are the issues that require a Version 2? And obviously, this realization that I've written this great book and people have actually gone too far on it. What are the things, if people are...maybe junior software developers or even senior software developers are figuring out how to do these kinds of things. The microservices story, how would you summarize what people should be thinking about?
Sam Newman: I think the core idea of how to do microservices well didn't really change for me, right? It's a focus on independent deployability, and I think that the shift in my brain was it took me about... I remember Martin Fowler was kind enough to review the first edition of the book. We had a conversation. He mentioned information hiding and Parnas, and I kind of was aware of Parnas's work, dimly. And Adrian Colyer had done some work looking at that in the context of microservices, and again, he's much smarter than I am. It took me about another eight years to really realize how important information hiding was as a concept. So the second edition, a big part of that was centering that concept a lot more.
Julian Wood: Can you explain the concept for...
Sam Newman: So it was a concept developed by David Parnas back in the 1970s. So Parnas was looking at modular architectures, modular software. Background at this point: mainframes getting bigger, more powerful, code bases getting bigger. How do we break code bases apart so we can make it all more manageable?
Julian Wood: Sounds very familiar.
Sam Newman: Very familiar. And this is where the concepts of modular software come from, with work by Constantine and Yourdon happening around the same sort of time around coupling and cohesion. So he was saying, "Well, how do you come up with a modular system where you allow people..." See, what you want with a module is you want people... Well, one of the three characteristics he talked about was allowing people to work in parallel. And the idea is you should be able to change one module without having a requirement to change another module. And so he looked through a bunch of techniques and the thing he highlighted is this concept called information hiding. The way he sort of describes it is you create a stable boundary. The code that changes more frequently is hidden behind that stable boundary. So you're literally hiding information. Some people view this as data hiding, which is overly narrow. I'd argue even encapsulation is an overly narrow interpretation of information hiding. He means hiding information. Parnas actually commented on Adrian Colyer's blog post back in 2016 saying that.... Because Adrian thought, "I wonder if Parnas thinks you shouldn't even be able to see the code of a module that you're using because that's knowledge coupling." And Parnas says, "Yeah, I don't think you see the code. I don't think you should see the code of a module you're using." If you're just using it. You know, if you're editing it, you should know what the code's like, right? So this idea is that you then are free to change the internals of that without it changing the interface between things.
That's one of the biggest challenges of microservices. A lot of organizations are picking those architectures up because, as I said, they are enterprise organizations and their main driver typically is that they're trying to encourage a degree of organizational autonomy. They've got a large number of developers, a lot of bureaucracy, too many meetings, too much coordination, unnecessary coordination. They've heard the word "autonomy". They looked at the blurb on Dan Pink's book; they haven't opened the book, but they've read the back of it. And they thought, "Oh, we should do that." And so they're now looking for a different way of working. So the key thing there is, well, how do you create teams with the right boundaries so you can make changes without it rippling? So talking about information hiding as being, you know, the way I interpret it in the services environment is don't expose anything unless someone needs it. Which is the opposite of what you often see people doing, which is someone might want this, I'm just gonna expose everything I've got over the interface. But if you want independent deployability, you have to get really good at backwards compatibility, ergo you need to get very, very good at being... You know, everything you hide you can always expose later, you can always change it later.
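A small Python sketch can illustrate the stable boundary Sam describes. Everything here is hypothetical and chosen for illustration: a `price_of()` method is the stable interface, and two classes hide different internal representations behind it. Python cannot enforce the hiding, so the leading underscores are a convention rather than a guarantee.

```python
from decimal import Decimal


class CentsCatalog:
    """First version: prices are hidden as integer cents in a dict."""

    def __init__(self, prices_in_cents):
        self._cents = dict(prices_in_cents)  # internal detail, free to change

    def price_of(self, sku: str) -> str:
        cents = self._cents[sku]
        return f"{cents // 100}.{cents % 100:02d}"


class DecimalCatalog:
    """Later version: same boundary, completely different internals."""

    def __init__(self, prices_in_cents):
        # Now a list of (sku, Decimal) pairs; callers cannot tell the difference.
        self._rows = [(sku, Decimal(c) / 100) for sku, c in prices_in_cents.items()]

    def price_of(self, sku: str) -> str:
        for row_sku, price in self._rows:
            if row_sku == sku:
                return f"{price:.2f}"
        raise KeyError(sku)


def receipt_lines(catalog, skus):
    """A caller that depends only on the stable boundary, price_of()."""
    return [f"{sku}: {catalog.price_of(sku)}" for sku in skus]
```

Because `receipt_lines` only touches `price_of()`, swapping `CentsCatalog` for `DecimalCatalog` requires no change on the calling side. A caller that had reached into `_cents` directly would have broken, which is the ripple that information hiding is meant to prevent.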
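The "don't expose anything unless someone needs it" rule can be sketched as an explicit allowlist at the service boundary. This is an illustrative assumption about how one might apply the idea, with hypothetical field names; it is not a prescribed technique from the book.

```python
# Fields that consumers have actually asked for. Everything else stays hidden
# and can therefore be renamed, restructured, or removed without breaking anyone.
PUBLIC_CUSTOMER_FIELDS = ("id", "name")


def to_public(customer_record: dict) -> dict:
    """Project an internal record onto the published contract."""
    return {field: customer_record[field] for field in PUBLIC_CUSTOMER_FIELDS}


def display_name(public_customer: dict) -> str:
    """A consumer that reads only the fields it needs and ignores the rest."""
    return f"{public_customer['name']} (#{public_customer['id']})"
```

Adding a field to `PUBLIC_CUSTOMER_FIELDS` later is backwards compatible for a consumer like `display_name`, whereas taking away a field that was exposed "in case someone wants it" is not. That asymmetry is why hiding by default keeps independent deployment possible.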
It starts for me with that. And so almost the reason to write the second edition was the second chapter of my book where I took coupling, and cohesion, and information hiding, talking about how they're all related, talking about modular systems, how microservices are a modular system. So big hat tips to Parnas, and Fowler, and Adrian Colyer for giving me those inspirations. And then say, "Oh, and by the way, how do you find your information hiding boundaries?" Let's talk about domain-driven design because it turns out, especially in enterprise organizations, there's this thing called bounded contexts, and what's this thing Eric Evans is talking about? Oh, the idea of a bounded context being like a cell where some information can pass and some information can't pass, the information hiding boundaries. So sort of trying to, from a modeling point of view, bring all of those things together. I'm just making those ideas hopefully a little bit more clear. So I think if I had my time over I would love to make those ideas more crisp. The other thing I would like to have done is make it a little bit more clear that I expected people to break their user interfaces apart. Right? And I didn't make that clear enough.
Julian Wood: So is this micro frontends, or is that part of it?
Sam Newman: No, I would just say just normal sensible user interface design. I mean, let's be really clear, the whole industry shifting over to single-page apps was an absolute disaster for component- and module-based interfaces because we had websites. But, well, we had web pages, you could break up a web-based user interface around pages and widgets. I mean, I was doing this stuff with Motif and CDE, like, in the late '90s, right? You could take a user interface and break that apart into components. It could be worked on by different people and different screens. The single-page app stuff took us down a wrong path. It was like there will be a monolithic interface. We'll dominate the browser pane, and now what we ended up with was people who broke apart the back end and now had 70 people working on a monolithic front end. And this was, "Oh, what if we invent this brand new concept called micro frontends where we break things down into things that could be worked on independently? Oh, like..."
Julian Wood: We should call them widgets.
Recommended talk: Building Micro-Frontends • Luca Mezzalira & Lucas Dohmen • GOTO 2022
Sam Newman: We call them widgets or web pages. This is all the stuff we've had before. So I think the micro frontends, I'm glad it's happened. And Luca Mezzalira's book, "Building Micro-Frontends" is a great place to kick off if you... He's working on a second edition at the moment. I think it's healthy that this has happened. It's frustrating that it took as long as it did and that the people who pioneered a lot of that initial single-page framework stuff, I know at least one of the people, should have known better, to be honest with you. But I'm glad things are sort of righting themselves. We could maybe have a different conversation about just how overly heavy single-page apps have now become whereas, you know, if you've got a micro frontend where you've got multiple different single-page apps all being blended together in a single browser pane, it could easily be 10 times the size of a Linux operating system worth of data.
Julian Wood: Just for the front end.
Sam Newman: Just for the front end which... But at a certain point, I have to be aware that I am now sounding increasingly like a grumpy old man who wants the kids to get off his front lawn. So, this is why I like writing about timeouts instead because it's a very easy conversation to have.
The Complexity of Asynchronicity and Communication Styles
Julian Wood: But when we were talking about words earlier, another thing you've been talking about at GOTO Amsterdam here is sort of being asynchronous, and asynchronicity, and all that kind of thing. And I know that's a loaded word with lots of weird meanings and everything. What are you talking about?
Sam Newman: Well, it's a question I... Actually, I ran a one-day masterclass here as well on Monday on how microservices communicate... One of the big questions I get is synchronous versus asynchronous communication. And I always struggled with it because I often thought it was the wrong question. So I tend to talk about event-driven versus request-response first. But I was kind of inspired to try and solve this issue because I realized I didn't like any of the definitions...
Julian Wood: So solving the definition issue, rather.
Sam Newman: Well, what does it even mean? And again, this was inspiration. There's a great piece by Pat Helland, who is always good value for money. He wrote it, I think, on Substack. There are a few different versions of it, where he revisited the term eventual consistency. And he sort of has his view that we should not use that term because we use the term consistency to mean lots of different things in computing. And so as a result, it leads people to be confused about what eventual consistency means and also confused about what consistency in an ACID transaction means. So he found a better term, so we could call it eventual convergence. Eventually, the replicas converge.
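To see what "eventually, the replicas converge" can look like in code, here is a minimal sketch using a grow-only counter (a G-Counter, one of the simplest CRDTs). The choice of data structure is an assumption made for illustration and is not taken from Pat Helland's writing; `GCounter` and its methods are hypothetical names.

```python
class GCounter:
    """Grow-only counter: each replica only ever increments its own slot."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments seen from that replica

    def increment(self, n: int = 1) -> None:
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Taking the per-replica maximum is commutative, associative, and
        # idempotent, so the order and number of merges do not matter.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


def demo():
    a, b = GCounter("a"), GCounter("b")
    a.increment(2)                      # writes land on different replicas...
    b.increment(3)
    diverged = (a.value(), b.value())   # ...so the replicas temporarily disagree
    a.merge(b)                          # state is exchanged at some later point
    b.merge(a)
    converged = (a.value(), b.value())
    return diverged, converged
```

`demo()` returns `((2, 3), (5, 5))`: the replicas disagree for a while and then end up in the same state once they have exchanged what they know. Nothing in this says anything about the "C" in an ACID transaction, which is the confusion the renamed term avoids.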
Julian Wood: Converged.
Sam Newman: It communicates a meaning, it avoids the problem around consistency. And so I thought, well, asynchronicity is really confusing to me. So why don't I look at all the definitions of asynchronicity out there and try to come up with a better term for it? Basically what I found is that there are a huge number of different definitions of what asynchronous means in computing, in general. And even if you just look at asynchronicity in terms of two computers talking to each other, there's asynchronicity in terms of how low-level networking protocols work. You speak to practitioners and experts, you read what's in the Reactive Manifesto. The Reactive Manifesto contradicts itself multiple times around this. Some developers think something's asynchronous when communicating with a service if they're doing non-blocking calls, right? Other people think you have to have an intermediary. You've got to have a message broker. So you start picking and peeling these things back. And then the issue is there is no definition of asynchronous. So all I got to in this talk... It still feels like it's a hanging thread. I'll have to revisit it in another 10 years from now. It's to just say don't use the term asynchronous, talk about what's important. Say, "We want non-blocking clients. We want to use an intermediary, like a message broker, for delivery." Because just saying it is asynchronous has zero meaning. The definition that always makes me chuckle is in the Reactive Manifesto, and I do kick the Reactive Manifesto a bit in my talk. This is largely because I always feel like I'm punching up because those are smart people that wrote it and I'm not as smart as them. But they say, you know, they basically say that something is... They talk about sort of, well, asynchronous communication as basically the server processes information after it has arrived, as opposed to what, before it arrives? How's that gonna work?
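One of the concrete properties Sam suggests naming instead of "asynchronous" is a non-blocking client. The sketch below shows only that property, using Python's standard `asyncio`: there is no message broker involved and each interaction is still plain request-response. `fetch` and the simulated delays are hypothetical stand-ins for real network calls.

```python
import asyncio


async def fetch(name: str, delay_s: float, log: list) -> str:
    log.append(f"sent {name}")
    # Stand-in for waiting on the network. While this call waits, the event
    # loop is free to run other work instead of blocking the thread.
    await asyncio.sleep(delay_s)
    log.append(f"got {name}")
    return name.upper()


async def main(log: list) -> list:
    # Both requests are in flight before either response has come back.
    return await asyncio.gather(
        fetch("stock", 0.02, log),
        fetch("price", 0.01, log),
    )


def run_demo():
    log = []
    results = asyncio.run(main(log))
    return results, log
```

In `run_demo()`, both "sent" entries are logged before any "got" entry, and the results come back in request order. Saying "we want non-blocking clients" describes exactly this behavior, which the word "asynchronous" on its own does not.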
Julian Wood: So even when it arrives is after it's arrived.
Sam Newman: Well, exactly. But an arbitrary point in time after it arrives, but, like, also it's no different from an HTTP call. Like, I know they are struggling to try and define it and they take an attempt at it, and they took a valid attempt at it, but it doesn't really work because we can't actually make much sense of the definition. And then they use a definition of asynchronous that comes from the "Oxford English Dictionary." It's actually what you first find... If you search Google, what does asynchronous mean, it's the first definition that comes back. The "Oxford English Dictionary" has over 40 different definitions of what asynchronous means. And the one they picked is not the one that's from computing. It's the one from medical contexts. So it's just like these words are overloaded, there are different meanings out there. So let's just not talk about it, let's talk about other things. Let's talk about more important things.
Julian Wood: Funny, in my mind, that links with the microservices where we often talk about the boxes, you know, Microservice A needs to talk to Microservice B. But we always draw that with the little line that connects the two kinds of things. And I think that line is probably fundamentally more important, definitely in terms of communication, than the actual boxes: which direction the line is going, and its coupling, and is it pull, you know, pub/sub, all these kinds of things. And so I think this asynchronous discussion is very much more about the lines rather than the actual boxes. So maybe it's a complementary part to your previous work.
Sam Newman: Typically when I have conversations, and this is what we did in the masterclass, the line is denoting a logical dependency, saying this thing depends on this at runtime. How you implement that, that's a separate set of conversations. I then try to start by talking about the stylistic differences, that whole request-response conversation, we are having a chat. Right. I am talking to you. Maybe I'm asking you to do things. You're letting me know if you can do those things or not. But even that nature of interaction is actually really familiar. For a lot of developers, that's their natural way of thinking about communication.
Julian Wood: Everything behind an API, get a request, that kind of thing?
Sam Newman: Absolutely. But then I'll talk to developers about that, and they'll say, "But what if we did it with a message broker instead?" So I could put a request in a message, put that in your queue. When you're ready you can pick that up. You can then send me a response back. They're like, "Oh, I can do that." "Yeah, of course, you can." That's request-response as well. The style is the same, but we've made some different implementation choices. Because often people say, "Oh, should I do gRPC or should I do, like, REST over HTTP?" Well, let's actually wind back and talk about whether you want to do request-response or event-driven. Because depending on those styles, that's probably going to take you down certain paths. So it's almost trying to wind people back a little bit to think what style of interaction is going to best suit your problem space, how you think about the world, you know, what's most familiar to you. And then we can have those other conversations off the back of it. And as I say, again, I think it's because people are time-poor. They join a project. They see they're using Kafka. They don't understand why. It's like trying to get back to fundamentals a little bit. But coming back to your point, there's an arrow. It's useful to know there's an arrow. Now let's talk about how the arrow is implemented and what that means for your system.
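The request-response-over-a-queue idea described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: Python's in-process `queue.Queue` stands in for a real message broker, and the `inventory_service` and `check_stock` names, the message fields, and the SKU values are all hypothetical.

```python
import queue
import threading
import uuid

# Stand-in for a broker queue that the service reads requests from.
request_queue = queue.Queue()


def inventory_service():
    # The service picks up requests when it is ready and sends each
    # response to the reply queue named in the request message.
    while True:
        message = request_queue.get()
        if message is None:  # shutdown signal used only by this sketch
            break
        reply = {
            "correlation_id": message["correlation_id"],
            "in_stock": message["sku"] == "book-1",
        }
        message["reply_to"].put(reply)


def check_stock(sku):
    # The caller still asks a question and waits for an answer,
    # so the style is request-response even though a queue sits in between.
    reply_queue = queue.Queue()
    correlation_id = str(uuid.uuid4())
    request_queue.put(
        {"correlation_id": correlation_id, "sku": sku, "reply_to": reply_queue}
    )
    reply = reply_queue.get(timeout=2)
    if reply["correlation_id"] != correlation_id:
        raise RuntimeError("reply does not match the request")
    return reply["in_stock"]


worker = threading.Thread(target=inventory_service)
worker.start()
results = [check_stock("book-1"), check_stock("pen-9")]
request_queue.put(None)
worker.join()
print(results)  # [True, False]
```

From the caller's point of view, `check_stock` behaves like any other request-response call. Only the implementation choice behind the arrow has changed.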
Julian Wood: I think there's even a lot more stylistically to add to that arrow, you know, once you've even got past, okay, this microservice needs to communicate with that: push-and-pull semantics, and, you know, all those kinds of things. And in fact, I like your request-response that can go via a queue or multiple queues on the way. Because people think, "Oh, you know, anything asynchronous is event-driven architecture." And I think they miss that little "driven" in event-driven architecture: that event needs to drive something, to cause something to react.
Sam Newman: There's sort of bits in between there. And I think the other confusion that people seem to have is as though your system has to be binary, as in, oh, it's all request-response, oh, it's all event-driven. This is why, like, I've seen people comparing event-driven architectures and microservice architectures. Like, no, no, no, a microservice, so this comes back to the information-hiding thing. I try to get people, when they're creating a microservice, to think about it fundamentally from the point of view of the consumer. Your consumers are other microservices, user interfaces, external parties. Think outside in about what they need to do their job, and use that to drive your API design, your event design, whatever else that might be. That also encourages information hiding because you're only exposing what's needed. You end up with an interface that's nice and easy to use, but different consumers may have different needs. They might need that information in different ways. It's not much of a step then to think, well, one microservice may want to expose its functionality or subsets of its functionality over different types of endpoint. So, say you might mostly be using request-response, but there are some problem spaces you've got that just fit events really nicely. Don't tie yourself in knots thinking, "Oh, I must do request-response." No, we'll just do an event there. One microservice can do some event stuff and it can do a request-response API. It's fine to have a house style for consistency to say this is mostly what we do within our architecture for convention and understanding, but dogmatically saying I must do this, it just sits uneasily with me, seeing people, you know, pretending they're doing events by sort of smuggling them over request-response and hoping their boss doesn't find out. It's all a little bit silly. But also you're getting away from this idea of...
Also talking about the idea that a microservice can have more than one endpoint. Like, talking about split-horizon APIs for public versus private stuff. And again, I think those things were a bit more familiar to me because I remember back when you'd have two different NICs in a server for different networks because this was the admin network and this was the one that went public. And you can definitely...
Julian Wood: Back up network and all the other networks, yeah.
Sam Newman: Exactly. But also, like, that got mapped onto that port on that NIC. So from a security point of view I had physical...
Julian Wood: Behind a different firewall literally by cable.
Sam Newman: That packet can't get across it. There's no physical... So, I think it actually then comes back to the fundamentals. I'm starting with logical relationships, then thinking about those stylistic decisions. Because once you've done that, what is a potentially massive solution space becomes more reduced. Okay, I want to do request-response. Well, actually, I'm probably looking into these types of services. Oh, I want to be event-driven. Well, I'm probably being guided towards these types of technical solutions, and I think that actually can help simplify people's work and the decisions that they make.
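The earlier point that one microservice can offer a request-response API and also emit events, while exposing only what consumers need, might look something like this. It is an illustrative sketch only: the `OrderService` class, its method names, and the `OrderPlaced` event shape are hypothetical, and plain callbacks stand in for a real event transport.

```python
class OrderService:
    """Hypothetical service offering two interface styles over the same state."""

    def __init__(self):
        self._orders = {}       # internal state, hidden from consumers
        self._subscribers = []  # callbacks standing in for event consumers

    def subscribe(self, handler):
        # Event-driven side: consumers register interest and are told later.
        self._subscribers.append(handler)

    def place_order(self, order_id, sku):
        # Request-response side: the caller gets a direct answer.
        self._orders[order_id] = {"sku": sku, "status": "placed"}
        # The event carries only what consumers need, not the whole record.
        self._publish({"type": "OrderPlaced", "order_id": order_id})
        return {"order_id": order_id, "status": "placed"}

    def get_status(self, order_id):
        # Another request-response endpoint over the same hidden state.
        return self._orders[order_id]["status"]

    def _publish(self, event):
        for handler in self._subscribers:
            handler(event)


service = OrderService()
received = []
service.subscribe(received.append)

response = service.place_order("o-1", "book-1")
print(response)
print(received)
```

The caller of `place_order` and the subscriber see different, deliberately narrow views of the same service, so neither depends on how orders are stored internally.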
Critical Thinking and Decision-Making in Software Architecture
Julian Wood: So other than buying your books, how would developers reason about this? What are they actually learning as a concept? Is it architectural patterns? Is it coding libraries, understanding distributed systems? If a developer is coming to a conference like today, or even just, you know, browsing YouTube videos on the internet, hopefully on the GOTO channel, what's the kind of thing they can even look for to explore this?
Sam Newman: I would say looking up anything on critical thinking I think would be an excellent first start. I also think having conversations, stopping and thinking about trade-offs is not something that comes naturally if you haven't done it before. I really do like the idea of architectural decision records because they become a forcing function for you to have a conversation with your colleagues, decide what you're going to do and actually write that down. So I think having conversations about decisions that you need to make, talking it through with your colleagues, and deciding and just trying that out. Someone says, "I want to use something shiny." Let's talk about it. What are the alternatives? I think it always comes back to starting with knowing what problem you're trying to solve together, and then running through the options. But it's about practice.
Julian Wood: There's a people element to it as well. We talk about the architectural practice from an individual developer, but there's wisdom in the crowd even if it's a small team. And you mentioned documenting these trade-offs as well. I mean, I love even internal things. I've seen that where, you know, people have written the reason they chose something and then all the reasons they didn't choose something, x, y, z, because, you know, the next developer or next team is going to come down the pipe, read this, and go, "Oh, that was a really good reason." Or that reason has changed. And so people think, we made a pragmatic decision because that was based on the inputs. Inputs change and you can evolve.
Sam Newman: I think something like an architecture decision record, 75% of its value is saying there's a point in time where we have to stop, have a conversation...
Julian Wood: Decide.
Sam Newman: ...make a decision, and decide. Twenty-five percent of the value is then you've created a record of that for future reference. But it does then become freeing in terms of changing your mind. Right? We will discover new things. We picked this because of x, y, and z. Well, later on, we're revisiting it because x, y, and z maybe don't apply anymore. Or we were wrong about x, y, z. That can help people be a bit more free. But I don't want people to start doing things like ADRs just as a box-ticking exercise. It's almost like what we say about stories in Agile: a story is a placeholder for a conversation. ADRs are a forcing function to make sure you've had the conversation. And it should be a conversation, not an architect in a room writing down what they think should happen and then telling the team they're going to do it, right? So, I think it's trying it. You will get better, and yeah. Book some critical thinking classes on those sorts of things. They're always worth going to if you get a chance to go to them.
Julian Wood: I think it's even interesting because these are all things that developers are grappling with, and in the future obviously, you know, with gen AI and the whole kind of AI thing going nuts at the moment, you know, those critical thinking skills and those fundamentals I think are going to be, you know, certainly as applicable, if not more applicable, if your gen AI assistant is helping you write code or even making architectural decisions. You as an architect decision-maker are still going to need to reason with the bigger problems.
Sam Newman: I think in terms of critical thinking and maybe being aware of what's happened in the past, it still amazes me how many people think gen AI is going to write my systems for me. We've been here before. There was a thing called Model Driven Development. There was a reason why UML 2 and beyond failed. It's because it turns out that code generation is not the same thing as code maintenance. So on that particular front, I don't think it's gonna happen anytime soon. Maybe I'll be proved wrong.
Julian Wood: I think it will help, but ultimately there's the decision-making matrix: before you've even written a line of code, there's been a huge amount of work that's happened beforehand. And yeah, at this stage, the machines haven't figured that out yet.
Sam Newman: One of my biggest hopes actually for the AI stuff is not about code generation, because I think that's one of the least interesting things it can do. I think I'm way more interested in trying to modernize quite an old code base and using these things to help explain what the hell's going on. That for me is significantly more valuable than, oh, it's like a slightly better version of templating I could do in IntelliJ, right? That's the more interesting stuff because...
Julian Wood: That's interesting because it's something I hadn't actually thought of before. I mean, we've got a tool at Amazon that does that as well where, you know, it's right-click to explain this code to me. And the uptake of that has been massive, and we were thinking, "Oh, well, that's all gonna be the cool kids writing new code." But yeah, never mind refactoring, it's understanding what old things do.
Sam Newman: But something as simple as, you know, you select a piece of information. It's like, where is this getting changed? Like, going into a C code base, for example, right, where you don't have that sort of strong static typing or background to be able to trace things through. Human beings can reason about it and step through, but being able to say, "Well, where's this coming from? Where's this getting set? Trace this through the system for me." You know, things like that. The majority of the work we do is maintaining code bases, and often not the code bases that we wrote in the first place. And so that's the stuff I'm hopeful we'll get. I think the code generation stuff is really uninteresting, to be honest with you, but explaining this code, or helping you when you're learning code, that's useful. One of the big challenges we've got with things like ChatGPT is there's no concept of truth. But one of the benefits of generating a piece of code for me is I can check to see if that code is correct.
Now interestingly, there was a study done. It was a paper I read. I think it was about a month ago. It was looking into whether or not ChatGPT is giving better answers than Stack Overflow. What they found is no, it's not. The answers on Stack Overflow were in general better than ChatGPT's answers. But people were more likely to believe ChatGPT than they were the Stack Overflow answers. And that's because on Stack Overflow people qualify: "I think this is the right way of doing it. This is what works best for me." ChatGPT has no concept of truth and yet speaks with authority, and people like to believe authority. And that's unfortunate, but at least with code, I can take the code it gives me and run it and see if it does the right thing. So I at least can validate that. So as a tool for helping learn a new programming language, these things can be really useful. Explaining a piece of code, absolutely. Generate this code for me to the point where I can rely on what it's written, maybe not.
Julian Wood: But maybe that's why, you know, it's all gen AI assistants. They are there to assist you with understanding and helpful things, you know, get rid of some of the grunt work that you don't need to do.
Sam Newman: Assistant is a much better word than co-pilot. Right? A co-pilot of a plane is somebody that can fly the plane, right? Not somebody who is, you know, making sure you're okay and bringing you a cup of tea or a refreshment, right? They are somebody who can fly the plane. When you call it co-pilot, we perceive it very, very differently. An assistant, absolutely fine. Words are important. It's how it impacts our perceptions around things. So, you know, maybe again having all of these things talking with authority and being called co-pilot, maybe that's a bit of a misstep that we'll maybe regret in the future.
Julian Wood: Interesting. Well, Sam Newman, thanks so much for spending time on "GOTO Unscripted." Hearing all about the new book coming up, good luck with the rest of the writing and the editing process. Looking forward to seeing that, and yeah, your history over many years of moving the software architecture, profession, discipline forward. Always useful to hear what you have to say. Thank you very much.
Sam Newman: Thank you.