
CI/CD Evolution: From Pipelines to AI-Powered DevOps

Julian Wood sits down with CircleCI's Olaf Molenveld to explore the evolution of CI/CD, from managing deployment complexity to AI-powered DevOps.


In partnership with

CircleCI

CircleCI is the leading Continuous Integration and Delivery platform for software innovation at scale. With intelligent automation and delivery tools, CircleCI is used by the world's best engineering teams to radically reduce the time from idea to execution.


The Real Challenges of CI/CD

Julian Wood: Welcome to another episode of GOTO Unscripted, where we get to talk to interesting people across the industry about the cool things they're building and the stuff they're working on. I'm happy today to be joined by Olaf Molenveld from CircleCI. Tell us about yourself and what you do in our industry.

Olaf Molenveld: Thank you, Julian. My name is Olaf Molenveld. I live in the Netherlands, and I'm a Technology Officer at CircleCI. CircleCI acquired us, along with our canary-releasing platform Vamp, about four years ago. From that point on, I've been working within the CI/CD space as a technology advisor, looking ahead to see what's about to happen in this space and where we need to be. You need to be where the puck is going instead of where the puck is right now. I have 25 years of experience in the internet industry. You can see my gray hair, my beard, scar tissue all over the place, but I'm still enjoying it very much.

Julian Wood: Writing software is one of the bedrocks of our software industry, and today with AI and vibe coding and agents, everything is going wild. But getting software into production, or at least getting software in front of your customers, is the second part of it. It's great to write your software, but if you can't get it to run anywhere, that seems a bit useless. You're at the critical part of getting the software to people safely and securely through pipelines. Tell us about the challenges that people are having. Writing code is getting much easier. I was coding with AI systems this morning and building some scripts which blew my mind in terms of productivity as a terrible coder. What are you seeing out there that people are finding challenging or even delightful when they're trying to get the software out into the world?

Olaf Molenveld: That's a very good point, especially in this space and this area of machine learning and AI. Why do we write software? That's an interesting question. I always like to ask "why" three or five times. I think we write software because we want to create value for the users of the software. My background when I started was building things like content management systems and e-commerce platforms. One of the core things about those systems is that you allow non-technical people, the business, to do their thing with the software. They can create their own websites or web pages or product categories online.

I think it's the same way with writing software. You want to have it in the hands of users. You can write the most beautiful code, really awesome regular expressions that you're super proud of. I was a coder at some point and I still do it for fun. But if nobody is using that code or the application the code makes up, then what's the use? You designed this beautiful car with the most awesome features, but there's no factory line, no factory that churns out these cars, so nobody's driving them. Nobody's enjoying the experience. Getting the software into the hands of people is the most important mile. Otherwise it stays in your little computer domain and nobody's telling you it sucks or it's awesome, or can you change this for me because then it's even better. That's the feedback loop that every product person or software developer with a product mindset is looking for.

Julian Wood: That's interesting, you're talking about the factory and automation, which I think ties into CI/CD. I read an interesting social media post knowing we were going to chat. Somebody said that CI/CD is what everybody's working on, but before you get to CI/CD, there are so many things you need to do beforehand. You need to have robust tests, you need to write code with tests in it, you need to have governance, security, analysis of your code base, all in place. Then CI/CD is the last mile, as you said. But you need to have written your tests, built your pipelines, done all this stuff beforehand. When DevOps started, that was the genesis of all of this. Yes, you need to have automated tests. Yes, you need to be able to have small bits of code that you can iterate over quickly, small bits of changes through your pipelines. But that's a lot of work. Where do people start? What are you seeing in terms of where they go wrong or where the CI/CD becomes difficult and complex for the last mile when they haven't done maybe the preceding stuff?

Olaf Molenveld: That's a very interesting angle. In the DevOps space, we always talk about moving left or shifting left and shifting right. We did canary releasing for containers and microservices, which was typically about moving right. How do you get your stuff into production? The difference between deploying and releasing is also a very interesting one. Deploying is something technical, but releasing is more like opening it up to people who can start using it, and then you start getting the feedback.

Julian Wood: And that can be a business decision or can I allow something to be accessed?

Olaf Molenveld: Yes, it can be very gradual, staged. Then you want to understand how it's being used and if it's up to speed or if it's actually doing what it's supposed to do. But then you also have shift left, which is more the testing period: the unit tests, the integration tests, synthetic testing, all that stuff. I think the reason you want to automate is because before DevOps, and in my mind DevOps is a cultural thing, it's developers working with ops and ops working with developers, understanding each other so they can have the right handovers. But even before DevOps, if you want to have automation where you can deliver software that's not broken or buggy, you want to have your tests automated. When you change something, you want to make sure that it's not breaking something. Having this in a continuous loop is important.

You change something and you look at your dashboard. In the early days, like Cruise Control and Jenkins, I still remember people would say "you broke the build" and then the dashboard is red, or "wow, that's awesome" when it's green. But you know why it was awesome? Because people would look at this thing, not just me as a developer. Other people would also see that something is broken. Maybe I need to help, or it's more like a collaborative thing. I think the automation helps people to work together.

The complexity comes when we start introducing more services, microservices, or slicing up pieces of the monolith. Then you get dependencies outside of your monolithic code base. If you update this thing, how do you update your database? If you start doing mono repos, what tests do you need to run if you only touch this part of the code? That complexity means we're now applying the things we did to code, like automated testing and integration testing, to make the delivery layer itself more dependable. We're seeing this happening on a layer above, where you have the CI/CD or your DevOps automation. Infrastructure as code is automation. You write a program to automate things. But now we also have another layer, like a meta application that is building and testing and deploying and releasing and measuring.

You get these challenges: before I change something in my delivery pipeline, in my factory, how do I make sure that it's not breaking something somewhere else in my ecosystem? It's becoming an ecosystem, not only the big monolith that you deploy every three months.

Julian Wood: I like that view of the analogy of running the factory and then what the factory produces. In modern factories creating cars, for example, you do have the cars that you are creating, but the robots and the machines that are actually making those cars is another whole ecosystem itself. That's the care and feeding of your CI/CD system. If your robot that is punching some piece of metal breaks down or doesn't work, you don't get cars out.

Olaf Molenveld: I think it's a very interesting analogy. I'm remembering Northvolt, which was a battery producer, a manufacturer. They went bankrupt a few months ago. The reason they didn't survive was interesting because they were not able to manage the automation, the machines that allowed them to produce the batteries at scale. That's the challenge. We are in the business of building the machines that allow you to deliver your software at scale with the right quality. The machines are a totally different area of knowledge and experience and understanding.

Today, there are only a few Chinese companies that are able to build these machines that can produce these big batteries at scale, and they only have the engineers that are able to configure these machines. That's the big challenge: how can we allow companies and teams to also manage and set up their machines to create these pipelines that churn out their software in a scalable, effective way with the right quality that they require?

The Evolution of Pipeline Architecture

Julian Wood: The analogies I'm thinking of, I work on serverless applications in AWS, so it's often all about microservices, small little functions or APIs that you're delivering. The analogy that I've spoken about before is that you often had one pipeline. I remember looking over people's shoulders at Jenkins scripts and being like, "oh my word, how on earth does that happen?" One pipeline, everything goes through. But I think the modern reality today is you never have one pipeline. There can be many pipelines in every feature branch or every little component. Every little microservice can have its own pipeline that can maybe dynamically spin up to run something and shut down. From a serverless perspective on AWS, that's great because you want everything dynamic and you only want to pay for things you use, and have the dynamic creation of your pipeline.

How do you see that? How does CircleCI fit into that? I'm also thinking about how do you even codify these pipelines? Because that's software. It's instructions, it's YAML, it's Jenkins scripts, but it's evolved. Tell us the state of play of what that looks like: lots of different pipelines and how they are actually represented in code or infrastructure, this blurring of code, infrastructure, and pipelines.

Olaf Molenveld: That's very interesting. Those are two very interesting topics. Let me see if I can remember them. Initially, with CircleCI, we were very much mapped to the git repo. You do a pull request and merge, and that triggers something, it starts testing things. It was really focused on the code and the version control system. But right now anything can trigger a pipeline: maybe a Docker update somewhere, an S3 bucket that changed, your machine learning algorithm, a new model that you want to apply. That's something we at CircleCI are very focused on. We're saying anything should be able to trigger a pipeline. And a pipeline can be anything. It doesn't need to be kicking off integration tests or unit tests. It can be doing a security scan and then pushing a new container to some container registry. Then maybe that push can trigger another pipeline that does something else.

You get this chain of things which are, again, like microservices and event sourcing. It's just doing the same kind of things. That's a challenge. How do you keep track of how all these things, these pipelines, interact with each other? It's like, why did my build break? Now it's why did my pipeline break? I'm expecting something happening here. Why didn't this happen? Or maybe you got something else in the pipeline.

When we started in the CI/CD space, people were writing their own scripts, bash scripts. In Jenkins, you have these Groovy scripts that everybody loves to hate. At some point, it's hard to maintain. It's hard to understand what's happening. The same thing with software: if somebody else needs to take over, we need to manage this thing. It's better to have a more structured way of defining what's happening.

We went from codified pipelines to more descriptive, DSL-based, deterministic pipelines, which is basically YAML or JSON. These pipelines are more like reading a book. You read from top to bottom. That's easy to understand and also easier to maintain. Now lately, with the increasing complexity of CI/CD workflows, you see this move back like, "yeah, but I want to codify CI as code because then I can write tests and I can do all this automation that I want to do." We're moving back to where we came from. But on the other hand, we also have all these abstractions and native functionalities that we want to provide to people building the machinery, the pipelines.
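A declarative pipeline in that style reads top to bottom, as Olaf says. Here is a minimal sketch in CircleCI-flavored YAML (the job names, Docker image tag, and npm commands are illustrative, not from the conversation):

```yaml
version: 2.1

jobs:
  test:
    docker:
      - image: cimg/node:20.11   # illustrative image tag
    steps:
      - checkout
      - run: npm ci
      - run: npm test
  build:
    docker:
      - image: cimg/node:20.11
    steps:
      - checkout
      - run: npm run build

workflows:
  test-and-build:
    jobs:
      - test
      - build:
          requires:
            - test    # build only runs once test has passed
```

The `requires` key is what makes the "reading a book" quality concrete: the workflow states its ordering explicitly instead of burying it in script logic.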

What I'm seeing is that it's always grays, it's not black and white. It's combinations where maybe you use the CI/CD platform as the orchestrator with the dashboards and the APIs and the usage insights. But then you do the logic of your pipeline or your workflow maybe in Dagger or as a bash script. With CircleCI, we also have dynamic workflows where we have a TypeScript-based SDK that you can use to create dynamic pipelines.
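The dynamic-workflow idea Olaf mentions is typically expressed in CircleCI as a setup workflow that generates the downstream config and hands it off via the continuation orb. A sketch (the orb version, base image, and `generate-pipeline.sh` script are hypothetical placeholders for whatever logic, bash or TypeScript, produces the config):

```yaml
# Setup workflow: a first stage whose only job is to generate the "real" pipeline
version: 2.1
setup: true

orbs:
  continuation: circleci/continuation@1.0   # version is illustrative

jobs:
  generate-config:
    docker:
      - image: cimg/base:current
    steps:
      - checkout
      # Any script can emit the downstream pipeline definition
      - run: ./generate-pipeline.sh > generated-config.yml
      - continuation/continue:
          configuration_path: generated-config.yml

workflows:
  setup:
    jobs:
      - generate-config
```

This is the "gray" mix in miniature: the platform orchestrates and observes, while the logic that decides what the pipeline contains lives in your own code.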

You see that it's being mixed to manage the complexities, and debugging the pipelines becomes much more important. I think that's also where mature systems can create value, like SSH into the box, tracing, telling you why things are not working as expected, trace back, maybe step tracing through your pipeline. All these things that we did with software before, we started to move up the stack.

Julian Wood: That's an interesting thing you brought up of having different software that maybe generates and runs your pipeline. You mentioned Dagger, which seems super interesting as running your pipeline as a sort of compiled object kind of thing, as code. So there are different ways of generating a pipeline. Is CircleCI then able to have visibility over all of these? Is that the value you bring, that you have this observability and this operational overview of multiple different ways you may be running your pipelines, sort of one level below?

Olaf Molenveld: That's an interesting one. I think that has to do with how deep you want to integrate with those tools. One of the things that we're working on is, for example, the deploy and release story, which is what happens after the software is successfully built. We had, and we have, an agent that you install in your Kubernetes cluster, and you can use it to deploy your stuff into a Kubernetes cluster, use the Kubernetes deployment mechanisms or use Argo Rollouts. But at the same time, people are saying, "I don't want to install an agent." So we also allow you now to do agentless deploys. That means that we allow you to, in your pipeline definition, monitor certain ports or metrics that are available, and we can observe them.

I think it's a balancing act. You want integrations for things like observability, that's a no-brainer. But at the same time, you also want to allow people to customize what they want to observe because everywhere is different. What does it mean, a correct deployment or release? It can be anything: a ping to a port or something super complicated, like a synthetic test that you run for days.

Julian Wood: Is that what CircleCI provides, a sort of overview? Wherever you are, separating the deployment side from the test side, from the build side, and having this view. I think of tentacles, which is always a nautical theme with Kubernetes. Having these tentacles into all the other systems to find out where builds are happening, what's going on, doing some checking. All of that is getting the software out there rather than the software running itself. Because if something goes wrong, something needs to kick off a rollback or something. You mentioned Kubernetes, maybe you need to replace the pods or do blue-green deployments. All this kind of thing, it gets very complex very soon. You must have a lot of these hooks into all these kinds of things to make that a reality.

Olaf Molenveld: Yes, I think where our value lies is providing the unified orchestrator that allows you to do all these things from very simple, monolithic pipelines. Then you can grow into enterprise scale where you have tons of microservices over multiple environments, and you're chaining all these things, and you're observing OpenTelemetry things and feeding this back into your orchestration engine. You have inner loops which are closer to code where you want to move as fast as possible. But if you start to move into the release area, you can have an outer loop that maybe takes hours or days to have enough traffic to make an informed decision. Having these services that are running for hours or days without timing out and keeping track of the performance of what you need to observe, that's the big challenge.

From the CircleCI perspective, this orchestration layer, having a unified holistic view, a single pane of glass, and then hooking it into all these different observation layers, but also the test frameworks, the build tooling, caching, you name it. That's where we bring the value. It's also fine if people write their own logic or script their own pipelines, because the complexity might be too big for our DSL to handle what they want to do. That's totally fine. We still add a lot of value.

Local vs Remote Development

Julian Wood: You mentioned the inner and outer loop, and that makes me think of another tension that developers always have. Developers always want to do everything on their laptop. They want to run absolutely everything. They want to recreate all the clouds on the laptop at the same time, containers going up and down everywhere. What are your thoughts on how developers should think about that from a CI/CD pipeline? Because people are going to run some tests locally or they're going to do something quickly, and then they have this mental model of sending it off remotely to the pipeline, and the pipeline does stuff, and that takes much longer. How should developers think about what they can iterate on quickly on the local machine and what they put in their pipeline, or is there a blurring and a merging with that process?

Olaf Molenveld: I think we're moving in circles. But it's not really a circle. In an earlier talk, someone pointed out that if you look at a spiral from above, it just looks like a circle going round and round. You only see the single circle. But if you shift perspective, you see the spiral actually goes up. The complexity increases. We went from codified pipelines to more DSL-based pipelines, and now how we use them is becoming more and more complicated and complex. It's the same, I think, with this area where first you did it on your machine. At some point it became too hard or too large or too resource-intensive. The team grew and then you moved it into the cloud. At some point, you now see people saying, "yeah, but if I run it on my Mac M2 or M4, it's much faster than running on this infrastructure that we run our build infrastructure on."

We have a huge range of build infrastructure that we manage for you. It can be super large instances, GPUs, anything, but also very small. Right-sizing is an interesting challenge. Typically, optimization is not something that people focus on a lot. First it's more like, "this works." And then at some point it's like, "oh, maybe we can rightsize this a little bit," less memory and more CPU or something else.

But we also have self-managed runners. You can run on your local machine or on your own infrastructure, build infrastructure, or do hybrids. You can do certain things on our build infrastructure and some things on your own. What you see happening is that people start moving these self-managed runners into their own setup, basically running the entire pipelines on their own machines. That's not applicable to all scenarios. There are lots of dependencies to databases and whatever, it's hard, but you see that move. Also that means that the CLI becomes more important again, like it was in the past. You see tools around like, "can I run this pipeline on my local machine just to see if it works and then everything works as I expect it to work?" Again, hybrids, but the challenge is to provide for these different scenarios.
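The "run this pipeline on my local machine" workflow Olaf describes looks roughly like this with the CircleCI CLI (the command names come from the public CLI; exact flags vary between CLI versions):

```
# Check the pipeline definition is well-formed before pushing
circleci config validate .circleci/config.yml

# Run a single job from the pipeline locally in Docker
# (older CLI versions use: circleci local execute --job build)
circleci local execute build
```

Local execution has limits (it can't reproduce every dependency, as Olaf notes), but it shortens the loop for linting, config mistakes, and single-job debugging.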

Julian Wood: That sounds super useful because the speed is of the essence. When developers are coding, they're going to try something else. They want their linting and their stuff to happen. They want some of the security checking. They want to do this, they want to mock some things. So there's always this trade-off between local and remote. The more you can do locally and speed it up, the better. It sounds like a great, flexible way to do it.

Olaf Molenveld: Also, context switching. If you're working in your IDE, we have a VS Code extension. We have integrations like these days we have MCP servers that are kind of like the big hype, that you can integrate with Claude or any of those tools. The better you can reduce context switching and keep people in the flow, the better I think.

Common Mistakes and Optimization Strategies

Julian Wood: Before we talk about AI, one of the things that people get wrong when either building pipelines or shipping software or setting this all up or thinking about things, CI/CD, pipeline, getting things into production. What are the big mistakes you see, or obvious mistakes or silly mistakes or whatever kind of mistakes that people do that they can save themselves time and money by understanding and sorting out?

Olaf Molenveld: Right-sizing is a big area. We have, for example, a usage API. We have insights dashboards where you can see the memory usage, CPU usage, time taken, whatever. But the usage API allows you to pull very granular data from the system that you can parse and drill into. This gives you a ton of information about where to increase, maybe build infrastructure or reduce it. So right-sizing is definitely a thing where people can gain.

Caching, making smarter use of caching. Also optimizing parallel pipelines, matrix builds. When I started programming, we did a lot of performance-based optimization because computers were expensive, more expensive than our time. So we did stress testing and saw where we could shave off a few seconds. I think this optimization mindset can really be interesting: increase the speed, reduce the cost.
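Matrix builds and dependency caching combine naturally in the declarative style. A CircleCI-flavored sketch (the version strings, cache keys, and npm commands are illustrative):

```yaml
version: 2.1

jobs:
  test:
    parameters:
      node-version:
        type: string
    docker:
      - image: cimg/node:<< parameters.node-version >>
    steps:
      - checkout
      - restore_cache:
          keys:
            - deps-{{ checksum "package-lock.json" }}
      - run: npm ci
      - save_cache:
          key: deps-{{ checksum "package-lock.json" }}
          paths:
            - ~/.npm
      - run: npm test

workflows:
  matrix-tests:
    jobs:
      - test:
          matrix:
            parameters:
              node-version: ["18.19", "20.11"]
```

The matrix fans one job definition out into parallel runs per parameter value, and the checksum-keyed cache skips repeated dependency downloads across all of them.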

We can do better with the information that we provide. I think that's one of the most interesting things about the MCP server. Because when you're running software, you're not daily looking at the new features that we're delivering or the improvements that we're delivering. If you can ask a system like "where can we optimize?" with our observability applied, and it says "oh, there's this new feature that helps you to do this in a better way," or "there's a bottleneck that we observe," I think that's super interesting.

What I also see is more like a cloud thing. When cloud started happening, you had this discussion about TCO, total cost of ownership. What does it cost to manage your own infrastructure versus somebody else managing your infrastructure? We see the same thing happening with runners. With self-managed runners, depending on how your organization is structured financially or budget-wise, sometimes people think, "oh yeah, but I'm not paying for infrastructure because there's another department in our company. So if we switch to self-managed runners, it's cheaper because then we don't need to pay for what CircleCI does for us." But you do pay, obviously, because somebody still has to manage those machines.

We actually see people backtracking there. It's like, "yeah, we need to move everything to self-managed runners." Then it's like, "when we made the calculations, you're not that expensive, but we need to do the entire TCO calculations." So again, moving in circles.

Julian Wood: Any advice in terms of even building pipelines or defining them? What are the things that would help people with their CI/CD pipelines in the definition and architecture and deciding how to run their pipelines?

Olaf Molenveld: I think having the iteration, having the thing run and looking at what it's doing, the consumption that it's doing, what the speed is taking, and then really trying to dig into the metrics and figuring out where you can improve how it's working. I think that's one of the most obvious things.

Julian Wood: Does it help then to have smaller pipelines that do specific things that can be more efficient? When you're iterating on this new software layer of your pipelines, you're not breaking somebody else's microservice pipeline, and you concentrate on your one.

Olaf Molenveld: That's a very good one. Slicing it up, the strangler pattern. What we see is, CircleCI started in 2011, so it's been around a long time. Let's say you came onto the platform five, six years ago. Imagine the complexity of the pipelines. If you grow and grow and grow and more teams are added and more services are added, and they're all in one big chunky pipeline, that's super hard to maintain and also to debug and optimize. So yeah, it's the same thing as with software: slicing it up.

Julian Wood: Do you see developers managing the CI/CD pipelines more and more, or is it a platform team component or a bit of a blend?

Olaf Molenveld: The platform team thing is definitely a theme. That's happening. It depends on the scale of your organization. In my mind, eight out of ten software developers don't want to be bothered with the ops side of things.

Julian Wood: It's the fracturing of DevOps where you're like, "yeah, I do this. We tried, but we didn't. It's hard."

Olaf Molenveld: It's a different mindset. You need to really love that part, and that's all good. In smaller setups, you see the team owns everything: the software and the tests and the pipelines and the infrastructure management around it. That makes sense. But at some point you scale out and then you also see this separation of concerns. There are compliance things or security things, and then it becomes much more sensible to have standard templates and policies. We have an OPA-based policy engine that you can apply to your pipelines. We have config policies, we have overrides that you can apply. You see this almost like a library of templates that are managed by a platform team.
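As a generic illustration of the OPA idea (this is plain Rego, not CircleCI's actual config-policy schema; the package name, input shape, and registry list are all invented for this sketch), a platform team might deny pipelines that pull images from unapproved registries:

```rego
package pipeline.policy

import future.keywords.in

# Hypothetical input shape: the parsed pipeline config exposes the
# container images it uses as input.images.
approved_registries := {"cimg", "registry.internal.example.com"}

deny[msg] {
    some image in input.images
    registry := split(image, "/")[0]
    not approved_registries[registry]
    msg := sprintf("image %s is not from an approved registry", [image])
}
```

Evaluated against a config using `docker.io/library/node`, this rule emits a denial, while `cimg/...` images pass; the point is that compliance lives in versioned policy code rather than in tribal knowledge.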

Julian Wood: That's super useful because it's just so clear to me. We're taking the same software things and then applying it to the pipelines. So it is separation of concerns, being nimble, being efficient, doing things in your pipelines. But you need to consciously think of your pipeline as software, and it needs to be managed, curated, looked after, and loved in a way. Otherwise it's going to go wild and you're going to have conflicts and all kinds of things between different teams.

Olaf Molenveld: It's the same with the developers' backend code. Yeah.

Julian Wood: But you need to consciously think of your CI/CD pipeline as part of your software architecture and deal with it.

AI and the Future of CI/CD

Julian Wood: We delayed long enough. I've got to talk about the future of AI. You've sprinkled in some MCP acronyms and things. Where do you see the future of this AI-infused world we live in? With MCP, definitely agents. More software is being written, more autonomy in so many different ways. So pushing software out, defining pipelines, querying pipelines, doing that observability from natural language. What's bubbling up in your mind of the future of this AI-infused CI/CD?

Olaf Molenveld: The entire thing with natural language interfaces is a thing, like the vibe coding. I think you could also have a DevOps vibe at some point where people say, "just create a pipeline for me that does all these things," and then the pipeline exists. But it's tricky because it's hitting operations, it's hitting production. If you build and test and get your stuff into production, you don't want an agent to hit production without you being in the loop or being involved.

What I'm seeing is there's a ton of value in the system acting as an expert, like the gray old guy or girl that has tons of scar tissue and tells you, "yeah, I think you can optimize here and there," and explains why it's better, basically helping you to grow in your experience. So it's more like an expert system that's running alongside you. You can have self-healing pipelines or self-optimizing pipelines and maybe even deciding "these pipelines are less critical than this." That's definitely an area where we've heavily invested in, having the natural language interface plus the expert system running in the background.

Also the testing, like LLM evaluations, because you're not in deterministic territory anymore. They change every time. So you need to have much smarter ways of determining if the test is successful or not and you can proceed or not. There's a ton of data flow through our system that you can get a lot of value from. Like what tests are run based on historical data, why and why not, having explainability, audit logging there. So these areas are definitely stuff that we are diving into very deeply because it is our core. It's not a thing that we do on the side. It's the only thing that we do.

Julian Wood: Even when I think about it now, everybody wants to leverage AI and the industry is figuring out what that means. But also with your custom pipeline and your tests and everything, that's your protection mechanism to allow your internal people to go wild. Build whatever you want, try it out, do all these crazy things, and then have your CI/CD pipeline going, "okay, well, now seriously, we're going to check: that's wrong, that's wrong, that breaks policies, that's insecure," and all this kind of thing. The better you get your CI/CD, the more exploratory you can be in your AI because you've got this backstop, this wise old lady or man, at the top going, "good idea, but let's iterate on that." So I can see a sort of agentic workflow in collaboration with your CI/CD system to actually do the right thing.

Olaf Molenveld: That's definitely a very interesting area. I'm super curious to see where this all will go.

Julian Wood: Are there things today top of mind that we haven't spoken about yet that you think are interesting to bring up?

Olaf Molenveld: Not really. I think we covered a lot of very interesting things. The analogy came back a lot of times: everything we improved in software development, we're now starting to do with CI/CD pipelines too. I think the same things we're seeing with machine learning and AI in software development will therefore also happen with CI/CD.

Julian Wood: I think for what it's worth, the DevOps products and industry is in an exciting time. Getting stuff into production is going to become easier and people are figuring out things. There's going to be a lot of innovation happening in this deployment and the CI/CD space. We did those pipelines ages ago, but with everything that's going on at the moment, it's an exciting time.

Olaf Molenveld: There will be new ways, definitely.

Julian Wood: Thanks so much for joining us here on GOTO Unscripted. It's great to hear from people in the trenches who are helping customers and seeing what's going on in the industry. So thanks for spending time with us today.

Olaf Molenveld: Thank you very much for having me. I really enjoyed this talk.

Julian Wood: Excellent. Until the next episode of GOTO Unscripted, the GOTO website is fantastic. There are so many recordings and audio podcasts as well. There are GOTO conferences which are always fantastic to attend. So enjoy building software, enjoy getting software into production as well. Goodbye, everybody.