State of the Art of DORA Metrics & AI Integration
Charles Humble: Hello and welcome to this new miniseries for GOTO Unscripted called The State of the Art. This series will explore generative AI, but also other emerging trends in platform engineering and security, new practices and new programming languages. I'm Charles Humble. I have about 30 years' experience as a programmer, architect and CTO, and I'm currently working as a freelance consultant, journalist, editor and podcaster.
I've just published a book for The New Stack called Kubernetes at the Edge, and I mention this because during the course of writing it, I found myself reflecting on the fact that during my career I've seen five major shifts in how enterprise software is built and deployed. The first was the adoption of high-level languages like Java, C#, and Visual Basic. The second would be the agile movement. Then we have public clouds, then DevOps and particularly continuous delivery, and also microservices.
Each of those succeeded because they reduced time to value. They enabled someone in a business to go from having an idea to getting the idea in front of a customer faster and at a lower cost. That combination allows businesses to try more ideas and use the market's response to determine which ones resonate and which ones don't. For all the hype and excitement, it remains to be seen whether generative AI will turn out to be a sixth shift. But many people are already persuaded that it is. Generative AI certainly holds out the promise of being able to write software faster. But if that's true, how do we avoid unintended consequences? And is that solving a problem we actually have?
My guest today, Nathen Harvey, is someone who has thought deeply about this topic. Nathen leads the team at Google Cloud's DORA, where he guides organizations in optimizing their software delivery. He leverages DORA's extensive research to help teams enhance developer experience and improve the speed, stability, and efficiency of software delivery. Nathen is also dedicated to fostering the growth of technical communities, such as the DORA Community of Practice, and has co-authored several influential DORA reports and contributed to the O'Reilly book 97 Things Every SRE Should Know.
He and I met up recently in London at the LeadDev conference, where he gave a fantastic keynote called Navigating the Generative AI Revolution in Software Development. Assuming the video of that is out, we'll link to it in the show notes. That, along with the DORA research that underpins it, is the underlying theme for this show today. Nathen, welcome to the show.
Nathen Harvey: Thanks so much, Charles. It's really good to see you. It was great to see you in London a few weeks back.
Charles Humble: Thank you. Lovely to see you as well. Despite my ridiculously long introduction, I managed to get all the way to the end of it without explaining what DORA is. I'm imagining all our listeners and watchers have some idea, but can you give me your description? What is DORA?
Understanding DORA: More Than Just Metrics
Nathen Harvey: DORA is a research program that's been running for well over a decade now. I think we're up to 12 or 13 years. This research looks into the metrics and the capabilities and conditions that teams need to understand how they're doing with software delivery performance and, probably more important than how they're doing, what things can they do to improve that software delivery performance.
Software delivery performance is the crux of being able to deliver software to your users so that you can test out whether this is a feature that your customers want, that the market wants. The more we're able to ship features, the more we're able to ship change, the better off we are in terms of being able to make decisions about what's the right thing to do for our customers and our users.
This research program runs an annual survey. It's an anonymous survey that goes out to organizations around the world. We're very lucky to get responses from organizations of every shape and size in every industry vertical. In 2025, we received nearly 5000 survey responses. We complement this survey data with some qualitative data as well. In addition to having these 5000 survey responses, we also had about 100 hours' worth of qualitative interviews that our researchers did one on one with practitioners and leaders.
The participants of this survey and our research program, many of them are developers of some sort, certainly technical practitioners, whether that's someone who identifies as a DevOps engineer or a software engineer. But we also hear from folks like product managers, team leads, engineering managers, CTOs, and leadership. It's a pretty broad perspective that we get each year, which really helps inform that research.
Charles Humble: How did you get involved in DORA? What was your starting point for that?
Nathen Harvey: My starting point—when the research program first launched, it launched out of a company at the time called Puppet Labs. They ran the first ever State of DevOps survey. At the time, I was a practitioner working in an organization where we were starting to adopt DevOps practices. We were moving from the data center into the cloud. I was leading a web operations team, and so I participated in the first survey. In that way, I like to say that I've been involved since the very beginning.
A few years on, I joined a company called Chef, where some of the lead researchers—in particular, Dr. Nicole Forsgren and Jez Humble—were both working at the same time that I was working at Chef. While they were there, they were continuing this DORA research program. They eventually left Chef and founded a company called DORA. I stayed at Chef until I joined Google, which acquired DORA shortly after I joined. So Nicole and Jez and I got to work together again. They've both since moved on to other things, and I've taken over leading the DORA program for the last four or five years.
Charles Humble: A question that comes up for me is how neutral the research actually is. Obviously this is now a Google project. Google has Google Cloud. I'm aware you don't promote Google Cloud in the reports. But Google has also been positioning itself very heavily as an AI-first company. Given all of that, how much does Google's agenda and influence affect what DORA covers or what it writes about, if at all?
Nathen Harvey: That's a completely fair question. First and foremost, the research has always been program and platform agnostic. We aren't asking about specific technologies, but we are asking about those capabilities and conditions that high performing teams have. I'm employed by Google, as are the researchers and a large number of people contributing to DORA. But we maintain a fair bit of independence in terms of what we research.
DORA has always been very interested in understanding what are the trends within industry and trying to spot those trends and follow them as they evolve. As an example, DORA started right at the early stages of cloud, and so DORA had research into this idea of whether public cloud was going to help organizations. Through that research, we were able to identify that it wasn't whether or not you were using cloud, but rather how you were using cloud. We've evolved that thinking into this idea of flexible infrastructure. When you have flexible infrastructure, teams tend to do better than teams without that flexible infrastructure. One of the ways to get flexible infrastructure is through public cloud.
We do maintain a good bit of independence. It is true that this year's report, the 2025 report, is called The State of AI-Assisted Software Development. We live and work within Google, so we and our colleagues are pushing AI and advancing the state of the art. At the same time, so many people across the industry are doing this. From a DORA perspective, we want to understand how AI is impacting software delivery and how it's impacting these teams. This is actually the third year that we've looked into AI, progressively going deeper and deeper. It would be false to say that there's absolutely no influence. There's certainly influence. But we remain fairly independent from Google Cloud.
From Four to Five Metrics: The Evolution of DORA
Charles Humble: Before we get into the AI report itself—there have been two of them now—I want to ask you a more general question. This is something that I've seen a bit in my own consulting practice, and it has to do with the DORA metrics. These are quite well understood. Obviously, this particular question is somewhat anecdotal, but I'm interested to know if it's something you see as well.
I've had a couple of clients who were having team performance issues. They'd adopted the DORA metrics, but they then seemed very unclear as to what improvements to make or where to focus. They could see what the metrics were telling them, but they weren't then doing the next bit, which was making a hypothesis and doing some validation. Obviously there's a lot that's very specific here, because if you're a young team, you probably want to be looking at different things than a huge team in a bank. But I wondered if you could tell me if this was something you've seen yourself, and if it is, do you have any general advice for people?
Nathen Harvey: Yes, it is certainly something that I've seen. Over time, the research has continued to evolve. While you mentioned the four DORA metrics, I'm happy to share that four have actually become five. For the past two years, we've looked at five different metrics for software delivery performance.
These metrics fall under two high level factors: throughput and stability. On the throughput side, we have three measures. Your lead time for changes—how long does it take for a change to go from committed to running in the production environment? Your deployment frequency—how frequently are you updating that production environment? And your failed deployment recovery time—when something goes wrong on a deployment, how long does it take you to get that deployment back to working?
On the stability side, we have your change failure rate. This is a measure of how many deployments require immediate intervention—something has gone wrong and we need humans to intervene on these deployments. And finally, our fifth metric, the newest one, is your deployment rework rate. If you make a deployment and then have to rework it in some way, for example with an unplanned deployment to recover from a failure, that counts as rework.
Those five measures taken together have a couple of really interesting things about them. First, they can be used for any type of software that you're shipping into production or making available to your users. Second, we oftentimes think of throughput and stability as trade-offs of one another. But over the years, over the decade-plus of this research, we've shown that they aren't trade-offs, but rather they tend to move together. Teams either have fast throughput and high stability, or low throughput and low stability. We probably want the former, not the latter. We want to be able to move fast and move in a way that is very stable.
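To make the five measures concrete, here is a minimal sketch of how a team might compute them from its own deployment records. This is purely illustrative: the log format, field names, and time window are invented for the example and are not anything DORA prescribes.

```python
import statistics
from datetime import datetime

# Hypothetical deployment log for one week. Each record notes when the
# change was committed, when it was deployed, whether it failed (needed
# immediate intervention), whether it was rework (e.g. an unplanned
# deploy to recover from a failure), and how long recovery took.
deployments = [
    {"committed": datetime(2025, 10, 1, 9),  "deployed": datetime(2025, 10, 1, 15),
     "failed": False, "rework": False, "recovery_hours": 0},
    {"committed": datetime(2025, 10, 2, 10), "deployed": datetime(2025, 10, 3, 11),
     "failed": True,  "rework": False, "recovery_hours": 4},
    {"committed": datetime(2025, 10, 3, 14), "deployed": datetime(2025, 10, 4, 9),
     "failed": False, "rework": True,  "recovery_hours": 0},
]
window_days = 7  # period the log covers

# Throughput measures
lead_times = [(d["deployed"] - d["committed"]).total_seconds() / 3600
              for d in deployments]
median_lead_time_h = statistics.median(lead_times)          # commit -> production
deploy_frequency_per_day = len(deployments) / window_days   # how often we ship
failed = [d for d in deployments if d["failed"]]
recovery_time_h = (sum(d["recovery_hours"] for d in failed) / len(failed)
                   if failed else 0.0)                      # failed-deploy recovery

# Stability measures
change_failure_rate = len(failed) / len(deployments)
rework_rate = sum(d["rework"] for d in deployments) / len(deployments)

print(f"median lead time:       {median_lead_time_h:.1f} h")
print(f"deployment frequency:   {deploy_frequency_per_day:.2f} / day")
print(f"failed-deploy recovery: {recovery_time_h:.1f} h")
print(f"change failure rate:    {change_failure_rate:.0%}")
print(f"deployment rework rate: {rework_rate:.0%}")
```

Note that throughput and stability fall out of the same log, which is part of why, as Nathen says, they can be observed moving together rather than traded off.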
When it comes to your question about how we go from understanding what these metrics are measuring to knowing what to do with them—these metrics serve two purposes. One purpose is that our data and research have shown these metrics to be leading indicators for the outcomes that teams care about, things like organizational performance and well-being for the people on your team. But these metrics are also lagging indicators for the capabilities and conditions that you have on your team. For example, you need a climate for learning. You need fast flow and fast feedback in order to improve those metrics.
With the DORA model and through our research, we investigate those capabilities over time to understand how those capabilities relate to software delivery performance. We find things like documentation quality, streamlining change approval processes, continuous integration—all of these contribute to better software delivery performance.
The thing that I really encourage teams to do is to use the DORA model of these capabilities together with these metrics. Use the metrics to get a baseline for how your team is doing today, and then take into consideration the capabilities and run a quick assessment across those capabilities. Where does our team have a weakness or an opportunity to improve? Then go run an experiment. Take DORA's findings as your hypothesis.
A good example of this: DORA has found that as you get better at continuous integration, that improves software delivery performance. Maybe your team isn't practicing continuous integration or has opportunities to improve it. Sit down and write that hypothesis: we believe that by improving our continuous integration practices, we're going to see software delivery performance improved as measured by these metrics. As a result of that, we're going to see happier customers.
Now that you've got that hypothesis written down, you can go back and start improving or changing how you do continuous integration, and you can validate that hypothesis. You can validate our research in the context of your team. That's what's super important—you're taking our research and contextualizing it for your team within your organization.
Charles Humble: That's a fantastic answer. Thank you.
The Surprising Impact of AI on Software Delivery
Charles Humble: It would be remiss of me not to start diving into the AI stuff at this point. There was a rather headline-grabbing finding in the 2024 DORA report, which basically said the more AI was used, the worse both stability and throughput became. More rollbacks and so on. That's essentially what we've been optimizing for most of the last 20 years. It's not a great finding. I should say the 2025 report suggests the opposite. But I was interested if you could give us some context. I want to talk about that finding and what maybe it's telling us.
Nathen Harvey: In 2024, we did see that as teams are adopting more AI, both throughput and stability were falling off. There was a small reversal in 2025. As teams are adapting to AI, we see throughput is actually improving as you use more AI, but we still see high levels of instability when you use more AI.
When I think about this, there are probably a couple of different explanations. The first explanation is probably the simplest, and that is we're using a new tool and using new processes as we're developing software. Any time you introduce something new, we should expect to see some fall off in performance as we learn how to adapt to this new technology or this new way of working, and how that might improve overall performance.
But the other thing that is really interesting here, and another hypothesis that I have, is that as an industry, one of the biggest ways and first ways that we've used AI within software development is to use it to generate more code. When you think about the entire software delivery process—everything from having an idea to getting that idea into customers' hands in a tangible way within your systems—it is rarely the case that writing code is the bottleneck in that process. And yet we've put a lot of emphasis on AI to help us generate or author more code.
Your bottlenecks are usually downstream from that. We know that in any system you're working in, if there's a flow of work through that system, somewhere there's a constraint. If you make an improvement that's not at that constraint, you're probably not improving the overall system performance. In fact, you may be harming it. In 2024, I think that's what we were seeing: we were generating more code and generating it faster, but the feedback and flow mechanisms that allow that code to get into the production environment hadn't adapted and weren't able to handle this new onslaught of additional code. I think that's a big reason why we're seeing that. We're starting to see some of that change now in 2025.
Charles Humble: I've been in this industry a long time and I feel like we've been over this so many times—the idea that typing is not the bottleneck. Which is why measuring lines of code or GitHub commits or any of those things are just terrible metrics. As an indicator of performance, they don't really tell you anything. It feels a bit to me like we're going round another of these loops of optimizing individual developer productivity but doing that at the expense of the system as a whole. It seems to me I'd have thought by now we should have moved away from that, but apparently not.
Nathen Harvey: That systems thinking is so important. You could level the same criticism at the software delivery performance metrics: they're not actually telling you whether or not you're delivering business value or customer value. I would agree with that. But our research has shown that as you improve software delivery performance, you're also improving those outcomes that matter.
I challenge every organization that's using these to put those software delivery metrics side by side with the business metrics that they actually care about. You should see them moving together. When you don't, that's cause for investigation, cause for inquiry. Let's figure out why these aren't moving together and see what we can do to change that.
AI as an Amplifier: For Better or Worse
Charles Humble: In the most recent report, and also I saw you emphasize this in your keynote, you talked about AI as an amplifier. Can you unpack that a bit for me? What does that mean?
Nathen Harvey: What we're seeing in the data is that AI amplifies the underlying systems that exist within organizations. In organizations where software changes can flow easily and there are good feedback mechanisms along the way, AI tends to amplify or speed up that flow of software. In contrast, in organizations that are disconnected, or where there's a lot of friction in getting changes through the system, AI tends to highlight where the pains are.
I'll give you a really good example, a crisp example that we've actually been researching for a couple of years. If we go back to 2023, which was really one of the first years that we investigated AI, one of the findings was that the code review process was often a bottleneck for teams, with or without AI. But we found that those teams that had faster code reviews had about 50% better software delivery performance. It's a one-to-one mapping—as you improve code review process and speed, software delivery speed also improves.
Now inject AI to help us create more code. If we haven't addressed code review as a bottleneck, we're now sending more code into that code review process, and so that's going to feel even more painful. That's a great example of AI amplifying some pain that we have there. The code reviews are actually going to take longer, get slower, because we're generating more code.
Contrast that with an organization that has a good, high quality, fast feedback process when it comes to code reviews—AI is going to help us improve that. In fact, what this may lead you to as an organization is understanding that maybe we'll get more benefit out of using AI in the code review process than we will in the code authoring process. That's one of the ways that you can contextualize these findings within your own team.
Charles Humble: What would you say are some of the things that technology organizations need to get into place before they try and scale AI adoption?
Nathen Harvey: This is a really good question, and it's one of the research questions that we had. Earlier in our research program, we looked at the cloud and it wasn't that you were using the cloud, but rather how you were using the cloud. This year, we wanted to understand what capabilities really enhance AI adoption and the impacts that AI adoption is having.
We investigated a bunch of capabilities based on the hypotheses that we had, but we narrowed it down to about seven capabilities, which we are now calling the DORA AI Capabilities Model. These capabilities and conditions, when matched with heavy AI adoption, amplify the impact of that adoption on the outcomes that you care about—things like reducing friction, things like better product performance.
The seven capabilities: First is a clear and communicated stance on how and where and when AI can be used within your organization. This is really important to have. The second is having a good data ecosystem—following the old adage, garbage in, garbage out, the AI has to have access to good data. That data has to be well connected, which leads us to the third: AI having access to that internal data.

Fourth, we have strong version control practices. If AI is generating a lot of code for us, we probably want to be checking in those changes very frequently so that when something goes wrong, we can rely on the rollback capabilities that version control gives us. Fifth is working in smaller batches instead of taking on a big change. As engineers, we take large changes and break them down into small ones. Doing so helps the AI have an even stronger impact on the things that we care about.
Number six is a user-centric focus. We are using AI for our users, using it to support their needs to help delight them. And then finally, having a good internal platform in place. We see that again as an amplifier of the AI adoption that happens within an organization.
Those are the seven DORA AI capabilities. We expect that those are going to change over time. Just like our four software delivery metrics are now five, we expect that these capabilities are going to change and evolve over time as we learn more, both as an industry and as a research program.
Charles Humble: It's worth saying, I think, some of those are very software-specific, but some of those apply equally well outside of a software organization. Having a clear AI policy, having a clear AI stance is something that every organization that's adopting AI in any context needs in place. I'm thrilled to see that. It's validating a lot of ideas that I think people have, and giving them weight. I think that's tremendous.
Nathen Harvey: When we look at AI adoption, we are looking at it beyond just using AI to write code or within the software delivery practice. We really are looking at AI across the organization.
The Documentation Paradox
Charles Humble: Earlier on you mentioned the importance of documentation. This has been a lovely thing for me because that's been a drum I've beaten for most of my career. I have to tell you, at times it's a lonely drum. Having the last few years being able to point to DORA and go, "Look, it turns out that documentation quality does actually have an influence on software delivery performance," has been nice.
In the most recent report, you say that AI adoption helps improve documentation quality. I'm going to be honest, I find that quite surprising. As an editor, all of my experience has been the opposite. Anything that's obviously written by a large language model is just painfully difficult to edit. It's partly because it breaks your heuristics. You can normally tell whether a human being is going off the edge of their knowledge and think, "Oh, I need to fact-check that a bit more." With an LLM, it's super hard to tell. Can you help my understanding of this? How are people using AI for documentation? How's that helping? Why is it improving?
Nathen Harvey: It's really interesting. Documentation, like you said, is something that we've looked at for many years. We have a saying that documentation is like sunshine. It really unlocks a lot of the things that matter to technology-driven organizations and teams. You're also correct that in 2024, in particular, what we saw was that as you increase your usage of AI, documentation quality tends to improve.
It's really interesting because we aren't asking about documentation quantity. You don't measure documentation by the pound. We do ask survey questions that get to the quality of the documentation, questions like, "If there's a problem, do you reach for the documentation?" That's a good indicator that you have good quality documentation.
While we saw those quality signals improving as you use more AI, we have a couple of hypotheses around those. First and foremost, maybe you're using AI to generate more documentation and perhaps it is helping improve some of that documentation quality. I'll speak from personal experience. I pointed an LLM at a project that I was working with that had no getting started documentation. I said, "Write documentation that helps a developer get started in this particular project." It did a fair job. Certainly it required some editing, but the reality is I wasn't going to write that from scratch myself. Even though maybe I should have, it's not fun to sit and look at a blank piece of paper. I'd much rather take even some poorly written words and clean those up, correct that, and make it better.
I think that definitely helps. More engineers are putting a little bit more time into documentation and leaning on the crutch that is the LLM. But I think there's another thing that we sometimes discount. One of the things that I think LLMs are pretty good at is summarizing existing documentation. I can hand it the DORA report from this year, 142 pages long. I don't have time to read that. Let me hand it to an LLM and get a summary of that, and then decide which parts I want to go and read. LLMs are very good at summarizing documentation, summarizing content, and giving us the key points.
While I think that LLMs and AI may be improving the documentation quality, they may also be helping teams get more utility out of their existing documentation. Perhaps you have documentation that's spread around your organization, or maybe even conflicting documentation. You can feed all of that to an LLM and try to draw some of the key points out. Maybe that gives you better utility out of those existing docs.
I think there are a couple of different ways to look at it. Is it improving the quality of our documentation or helping us get more utility out of the existing documentation? The answer is likely yes to both, or at least it's impacting both, and hopefully improving that. But again, we look at the world and we see these patterns that are emerging. Everything comes back to the context within your own team. I would encourage you to take our hypotheses, take our findings as findings that we see, but you have to prove them out within your own context as well.
Trust, Ownership, and the Human Element
Charles Humble: I want to move on to some of the more human aspects of this. One of them is trust, and how much you trust the output of the model. We touched on this a bit with documentation. I think it's entirely possible—in fact, I've had this experience—I've asked an LLM to document an API, and it's produced documentation which includes calls and features that don't exist at all. I've also done an experiment on one of these reports: I pointed it to a press release and asked what I'd missed, and it gave me six things it claimed I'd missed, and I couldn't find any of them in the report. I had that typical "I can't find this, maybe I made a mistake" type of response.
I guess what I'm getting at is there is this broad question of how much do you trust the technology? Maybe I'm just using it wrong, but how much do you trust the technology and what does that look like?
Nathen Harvey: This is a really fascinating question because for two years in a row, we've seen individuals reporting feeling more productive when they're using AI, that it's improving the underlying quality of their code base and their documentation. We ask about trust in both 2024 and 2025, and we see a trust paradox, if you will, where we see a significant portion of people that have very little or no trust in the output of AI. In fact, in 2025, it was 30% of our respondents had little or no trust in the output of AI. On the other hand, there were 4%, I believe, that trusted the output of AI a great deal.
I personally believe that you should not trust the output of AI a great deal, but you should not trust it 0% either. We're recording this in October of 2025, and right now I think either extreme is wrong.
This trust paradox really reminds us of some of the things that we've seen throughout our research, and both of us have experienced throughout our careers—this idea of getting fast feedback on whether it's software changes or a document that you're reading. You have to be able to get that feedback very quickly, and you want that feedback to be very good feedback. This does lean us toward this idea that we want to have those strong feedback mechanisms in place so that when we generate something with AI, we can test that. Is this valid? Is it truthful? Is it comprehensive? Is it doing the things that I like?
The other thing that we see in the data, which I don't think is a surprise, is that as you use more AI, you tend to trust it more. Usage and trust are kind of a virtuous cycle. You use the thing more, you trust it more. Because you trust it, you're more likely to use it. Because you're using it more, you're more likely to trust it.
But I think this also goes back to this idea that trust doesn't necessarily mean that it's accurate. Trusting the output of AI doesn't mean that the output of AI is accurate. It means that to me, trust really is about meeting my expectations. I expect you're going to give me mediocre responses, and I continue to use you and I continue to get mediocre responses. That improves my trust in you. Now, that's not to say that AI always gives you mediocre responses. I'm not saying that. But as it meets your expectations over and over again across time, that I think is the thing that really builds trust.
I know that if I ask it a question like "Find six things that are missing," it's going to find six things that are missing, whether or not they're valid, whether or not they're true. I asked it for six things; it's going to give me six things. That's the nature of AI today. It wants to please, it wants to answer the question that was asked. It's not as well tuned at saying, "You know what? You've done a great job here, Charles. I can't find anything that's missing."
Charles Humble: There's another question that I have here, and it has to do with ownership. This is probably a whole discussion in itself. But I suppose what I'm getting at is, does AI change our relationship with the code? If we've written some code with an AI assistant, who owns that code? Who is responsible for that code? What is our relationship to that code?
Nathen Harvey: That's a really interesting question. In fact, we looked into that a little bit and reported some of our findings in the 2025 report. There's a section there called The Socio-Cognitive Impact of AI on Professional Developers. We looked at the relationship that developers have with their code when using AI in terms of things like authentic pride, the meaning of the work, the need for cognition and existential connection to that code, to your point, the psychological ownership over that code, and skill reprioritization—what skills are our teams prioritizing?
I would say that we are very early in this evolution and the adoption of AI. We didn't find a whole lot of rich, meaningful data here. Responses were relatively close to one another whether you're using AI or not. Take a statement like "The code I wrote is my own": do you agree or disagree with that? The answers are usually pretty close either way. I think we're in the early days of answering some of these questions and, in fact, of individuals internalizing how we feel about them.
The Devaluation of Skills and the Future of Expertise
Charles Humble: There's another human question here, and it's an obvious question in a way, but it's this idea that AI devalues skills. For example, do I need a specialist? My AI can build the UI for me. Do I need a security team when I can automate that work? Do I need a documentation specialist when I can generate the documentation?
A lot of conversations I'm having with people—a lot of people are genuinely, I would say, grieving. A lot of people are genuinely feeling like they spent a whole bunch of time developing a set of skills that are no longer relevant, not just within IT, but much broader than that. I'm a not terribly successful musician in my spare time, and I know lots of people who are media composers who write music for TV and games and films. A lot of them are saying, "This is being taken away from us, basically."
Is this something we just have to accept, a bit like the Industrial Revolution or whatever, that some jobs are going to go away? Or do you think there is still a role within our industry for specialist skills?
Nathen Harvey: I think it's really interesting. I think that AI does open up opportunities for people to pick up new skills and become kind of novices at those. My biggest concern, honestly, is that AI enables this devaluing of the skills that other experts have. I think documentation writing, technical writing—yeah, I can generate docs. Do I need technical writers anymore? Well, if I'm a software engineer asking that, just turn that question around and think, do I need software engineers anymore?
The answer, if you're a software engineer using AI, the answer is obviously yes. We continue to need software engineers. I think it's really important that you consider technical writing as the example here. Do you have expertise in that skill? You don't. Maybe AI has made you a novice technical writer, but you're not an expert, certainly not today. Leaning on that expertise that an expert technical writer brings to the team, I think is really important.
I think on the one hand, AI is democratizing a bunch of different skills. A product manager can very quickly build a prototype that maybe they couldn't build before. It's pretty clear to anyone who looks at it that that prototype is not ready for production. It is really important that we, as we think about doing some of the things that other experts do, keep a couple of things in mind.
One, you might do those because you've always wanted to and just haven't felt that you have the skills to do so, and just doing that for fun. There's a lot of value in that. I can now go and compose some music that I couldn't compose before, and that's fun. But recognize that the work that you're doing is not expert work. Treat it as such. Maybe it helps you communicate better with those experts, because now you've learned a little bit more of the skill. But I think it's really important to remember and recognize that we still need the expertise, even if the AI allows us to scratch the surface in that field.
Charles Humble: I think that's a fantastic answer. Thank you.
Looking Forward: The Future of AI in Software Development
Charles Humble: DORA is very geared to where we are today. It's basically a snapshot of here is the state of the art right now. I'm going to ask you a desperately unfair question, which is what do you think the future looks like? Where are we going with this AI stuff?
Nathen Harvey: You've raised a great point. DORA does take a snapshot of the industry year on year. We do continuing research throughout the year. But DORA, just like most of us, Charles, is pretty bad at predicting the future. DORA tends to stay out of that game.
But I do think we're seeing some echoes here of things of the past. You mentioned these at the top of the show—whether it's agile or DevOps or the cloud. In all of these changes, in order for them really to stick, I think there are a couple of things that we should keep in mind that we've learned from the past, and we should probably expect these patterns to repeat in the future.
One that DORA has talked about a number of times is the J-curve of transformation. That J-curve of transformation is really when we have a new technology or a new way of working, we tend to initially see a dip in productivity. Then eventually that dip bottoms out and we start to see real changes or real improvements based on adopting that new technology or that new way of working.
I expect that we're going to see that. Our goal within DORA is really to understand how do you get out of that dip? What are teams doing that's different than other teams that helps them get on the upward trajectory? That's part of why we have the DORA AI Capabilities Model. I think that's one important thing to keep in mind.
The other thing that's important to keep in mind is that we have to think systemically, think about the entire systems that we're working within, and recognize that every single one of us is working in a complex system. This complex system has emergent behaviors. There are people involved and technology involved, and they all interact and respond differently to stimulus. We need to understand that, and know that we can't get rid of critical thinking skills. We can't get rid of experimentation. Experimentation is one of the best ways for us to continue to learn how and where to best use these technologies. I think we're going to continue to see some echoes of what we've seen in the past.
I do think that AI is here to stay. One piece of evidence that I'll bring to that—in 2024, we asked how is your organization prioritizing AI? It was very clear that almost every organization is prioritizing AI from the top down. We also see broad adoption at the practitioner level, so it's this top-down and grassroots. I think that provides a really strong signal here that this is here to stay. It probably in five years won't look like it looks today. But we will still be talking about AI. Or maybe we'll stop talking about it because it's just the way that we do work.
Charles Humble: Do you think—you mentioned you've now got another metric in the standard set of metrics. Do you think that those core metrics will need to change or evolve because the process of building software is changing?
Nathen Harvey: I think what we will see is a continuing evolution of the capabilities that drive the software delivery performance metrics. DORA has always had a center of gravity around software delivery. I do think that in the future, it will still matter how often and how safely a team can deliver new functionality, new changes to an application.
I believe that, as Dave Farley says all the time, one of the measures of quality of a software application is the ability to change it. We always need a way to measure that ability to change. I think that the software delivery performance metrics are a pretty good way of looking at that.
Will they change over time? Maybe. If you'd asked us five or six years ago, we probably wouldn't have said that. But we see that the science is always evolving. Today we have five, whereas a few years ago we had four. I think that they may continue to evolve, although I do think that they're durable enough that we can stick with these and get a lot of good utility out of these for years to come.
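To make the discussion above concrete, here is a minimal sketch of how a team might compute three of the classic software delivery performance metrics from deployment records. The data shape and field names are illustrative assumptions, not a DORA-defined schema; in practice these values would come from your CI/CD and incident systems.

```python
from datetime import datetime
from statistics import median

# Hypothetical deployment records (field names are illustrative assumptions).
deployments = [
    {"committed": datetime(2025, 1, 1, 9),  "deployed": datetime(2025, 1, 1, 15), "failed": False},
    {"committed": datetime(2025, 1, 2, 10), "deployed": datetime(2025, 1, 3, 11), "failed": True},
    {"committed": datetime(2025, 1, 4, 8),  "deployed": datetime(2025, 1, 4, 12), "failed": False},
]

days_observed = 7  # length of the observation window, in days

# Deployment frequency: deployments per day over the window.
deployment_frequency = len(deployments) / days_observed

# Lead time for changes: median time from commit to running in production.
lead_time = median(d["deployed"] - d["committed"] for d in deployments)

# Change failure rate: share of deployments that caused a failure.
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)

print(f"Deployment frequency: {deployment_frequency:.2f}/day")
print(f"Median lead time: {lead_time}")
print(f"Change failure rate: {change_failure_rate:.0%}")
```

The point of the sketch is that the metrics themselves are simple arithmetic; the durable value, as Nathen notes, is in tracking them over time as the capabilities that drive them evolve.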
Getting Started: Practical Steps for AI Integration
Charles Humble: Fantastic. My final question would be if team leaders are looking to integrate AI into their processes now, where do you recommend they start or what do you recommend that they do?
Nathen Harvey: There are a couple of different options here, and we talk about these a little bit in the report as well. The first option is really think about adapting your current systems to take advantage of AI. I think one of the best ways to adapt your current systems is to make sure that you understand your current systems or to gain a better understanding of them.
One of the best practices that I've seen and utilized over time is value stream mapping. If you get together with a cross-functional team and really map out what does it take to go from an idea all the way through to a thank you from a customer? How does that information flow through your organization? Where are all of the handoffs? Where are the friction points?
Identifying those friction points can give you a pointer into where might we inject some AI? Or more generally, think about how do we do that thing differently with or without AI? Having a value stream map and going through—importantly, the mapping exercise, it's the mapping that's more important than the artifact of the map—really helps us collectively understand our systems. I think that's one thing to adapt your systems with AI.
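The arithmetic behind a value stream map can be sketched in a few lines. This is a hypothetical stream: the stage names and hours are invented for illustration, and the real insight comes from the cross-functional mapping conversation, not from the numbers themselves.

```python
# A hypothetical value stream: each stage has active work time and wait
# time before it, both in hours. All names and numbers are illustrative.
stages = [
    ("Idea refined",      {"work": 4,  "wait": 40}),
    ("Code written",      {"work": 16, "wait": 8}),
    ("Code reviewed",     {"work": 2,  "wait": 24}),
    ("Tested & approved", {"work": 6,  "wait": 48}),
    ("Deployed",          {"work": 1,  "wait": 2}),
]

total_work = sum(s["work"] for _, s in stages)
total_wait = sum(s["wait"] for _, s in stages)

# Flow efficiency: fraction of elapsed time spent on actual work.
flow_efficiency = total_work / (total_work + total_wait)

# The biggest friction point is the stage with the longest wait.
bottleneck, _ = max(stages, key=lambda item: item[1]["wait"])

print(f"End-to-end time: {total_work + total_wait} hours")
print(f"Flow efficiency: {flow_efficiency:.0%}")
print(f"Largest friction point: waiting before '{bottleneck}'")
```

In a stream like this one, most of the elapsed time is waiting between handoffs rather than working, which is exactly the kind of friction point worth examining before deciding where, or whether, to inject AI.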
Then the other thing is to start evolving our systems and thinking in terms of AI-native. I think we're so early in this AI journey that it's hard to imagine what is AI-native software delivery? What is AI-native building an application from scratch and managing that application over its lifetime? What does that look or feel like? I think, like I said, it's very early, but this is another way that we can start thinking about how do we best leverage AI today.
I would say that the general physics of software delivery haven't actually changed. We still have to write code. We've got to validate that code. We've got to get that code approved and ship it off to production, where we're then going to monitor that code and so forth. Is that going to change in an AI-native way? I don't know. It certainly hasn't today. But let's all go figure that out together.
Charles Humble: Fantastic. Nathen, thank you so much for joining me on GOTO Unscripted.
Nathen Harvey: Thank you so much, Charles. It's been a lot of fun.