Building Secure Container Images with Wolfi

Updated on October 5, 2023
Adrian Mouat ( author )

Author of 'Using Docker'

Matt Turner ( author )

DevOps Leader and Software Engineer at Tetrate

As software developers create and deploy containerized applications, they need to be acutely aware of potential security vulnerabilities. Inadequate security measures can lead to data breaches, unauthorized access, and other cybersecurity threats. In this new GOTO Unscripted episode, Adrian Mouat and Matt Turner delve into the world of container image security and network trust. Matt shares his expertise on Chainguard tooling, emphasizing the practical benefits of image size reduction while Adrian explores the parallels between securing container images and implementing a zero-trust network strategy. They emphasize the importance of being explicit and concrete in both domains, highlighting the common thread of strong trust and identity-based authentication. This engaging conversation offers valuable insights for those navigating the complex landscape of containerization and network security.

Discussing Chainguard and Container Image Security

Adrian Mouat: Hi there, welcome to another episode of "GOTO Unscripted," and we're here at GOTO Amsterdam. I'm Adrian Mouat. I'm a technical community advocate at Chainguard, where we do stuff around securing the software supply chain, and I'm here with Matt Turner from Tetrate.

Matt Turner: Hi, I'm Matt Turner. I'm a software engineer at Tetrate. We help enterprises with service mesh, zero trust, and high compliance network security. So thanks for having me.

Adrian Mouat: Welcome. So I believe yesterday, you gave a talk on building images with the Chainguard tool.

Matt Turner: I did, ironically, given that you work there, and I don't.

Adrian Mouat: Could you give us a bit of an overview of what you talked about?

Matt Turner: I talked about how folks can use the new Chainguard tooling as an alternative to Dockerfile builds, essentially. I know the company angle on this is probably all to do with security, but I presented it quite practically: if you are using a Dockerfile, this is how you might change over. I did talk about the advantages: how you can get smaller images, and how you automatically get SBOMs (software bills of materials) and signing. I showed folks how to build their own application into an APK package, and then how to take that APK and a few others from the Wolfi distribution and turn them into a container image in a declarative, code-first kind of way.

Advantages of Chainguard and Container Image Size Reduction

Adrian Mouat: That's amazing. As you said, I worked with some of these tools in my early days, and so I find it amazing that you're talking about it before even I am. You talked about a few of the advantages there. Is there any in particular that, sort of, attracted you to it in the first place?

Matt Turner: It's funny just that I was talking about it, because I sent Preben at GOTO a list of things that I could talk about, and most of them had to do with my job, right, about service meshes and networking and stuff. But I had just moved some of our images at Tetrate over to the Chainguard tooling. So I thought, well, you know, I know this stuff backward at the moment. I guess I could talk about that. It was like the fifth bullet point in my list, and he was like, "That one sounds interesting." So what was the question? What attracted me?

Adrian Mouat: Yes.

Matt Turner: I was actually trying to reduce image sizes. Again, I know security is, like, the big topic, but I was actually trying to get some smaller images. We do a lot of Golang tooling, as you might imagine, and I was also running a Kubernetes operator written in Rust. I love Rust, and the Kubernetes crate in Rust is really ergonomic, really nice for writing controllers. So I'd written that, and both of those are, or like to be, statically linked languages, but Golang isn't always. Golang tries to compile statically; it doesn't always manage, and it can be a bit tricky. So I was having all the usual build and link time problems. The obvious answer is to just throw the kitchen sink at it and ship your Golang code in the Golang base image, which is meant to be a build image, or you end up messing around trying to get the correct version of libc in all the right places. So honestly, I just wanted to reduce that controller's container image size, because it was taking too long to load into all of our clusters.

Recommended talk: Building Images For The Secure Supply Chain • Adrian Mouat • GOTO 2023

Adrian Mouat: Okay. I'm guessing you've also played with the Google Distroless images and ko, then?

Matt Turner: Yes. I've used ko, sort of, once, and I've heard about Jib in the Java world. They're not bad tools, but I didn't like the approach. It felt like the wrong thing. It felt maybe like a bit of a fad, a bit of a reaction to the way that some of the other tools are getting quite clunky, and I felt like I'd have to retool fairly soon. And obviously those are single-language tools, and I was trying to use some Rust as well. The Distroless images, yeah, they're a nice idea. They were certainly a lot better than what we had. Scratch is ideal if you can persuade your system to do a perfectly static build, which I've actually written a blog post about for Go (shameless plug, I guess), because I didn't really understand that until I dived into it one day. But you can't just throw a binary into Scratch, because you miss timezone data, you miss CA certs, you miss all these little things. Even if you don't need a shell, even if you don't need a libc, you usually do need that kind of stuff hanging around to do anything.

Sorry for the long answer. I have used Distroless, but that was another thing I ran into putting the talk together. There's actually more than one Distroless. There are about eight: there's static, there's base, there's cc, and there's other stuff. And it's one of those things like the Kubernetes network policy, where default is not the default. The base is not the smallest Distroless. You might think it is, but base is actually built on static. So anyway, that was a little confusing, and it was something I had to get my head around for the talk. But again, day-to-day, it's the kind of thing I'd totally make a mistake with. So I really like the way that you just spell things out with apko.
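To illustrate Matt's point about what a bare Scratch image lacks, here is a minimal Python sketch (not something shown in the conversation). It checks an unpacked image root filesystem for CA certificates and timezone data. The two paths are common Linux locations chosen as assumptions for illustration, and the function name is hypothetical.

```python
from pathlib import Path

# Assumed locations for illustration only; distributions can differ,
# and this is not an exhaustive list of what a program may need.
REQUIRED_PATHS = {
    "CA certificates": "etc/ssl/certs/ca-certificates.crt",
    "timezone data": "usr/share/zoneinfo",
}


def missing_runtime_files(rootfs: str) -> list[str]:
    """Return labels of the required paths absent from an unpacked image root."""
    root = Path(rootfs)
    return [
        label
        for label, relative_path in REQUIRED_PATHS.items()
        if not (root / relative_path).exists()
    ]
```

Run against the unpacked filesystem of a Scratch-based image that contains only a static binary, a check like this would be expected to report both entries as missing.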

Adrian Mouat: Okay, that's great to hear. And is there anything particularly difficult or confusing you find in our images or tooling?

Matt Turner: Not particularly. The Wolfi distribution didn't have Arm64 images available until recently, but now it does. So that's great. I think it all makes sense. There are a few quality-of-life improvements I've, sort of, tweeted about. For example, you have to add any non-root users that you want, and you almost always want them, so all my files have the same copy-pasted seven-line stanza for that. But no, other than little quality-of-life things, I think it's all fairly good.

Adrian Mouat: I mean, from my point of view, we're always working on docs because, yeah, I would really like our docs to be super great for people getting started. So that's in the works. 

Comparing Container Image Security and Network Trust

Adrian Mouat: On a slightly different subject, you work at Tetrate, and I'm curious if you see any similarities between creating secure container images and securing networks.

Matt Turner: Right. So that's interesting, because I think it is a bit of a mindset shift. I'm going to say zero trust, and I maybe won't say it too often, because it's such a buzzword. But if you put that in the title of the video, you'll probably get some search traffic. I think it's a similar mindset, because what does zero trust mean? It doesn't actually mean don't trust anything, right? Because that would mean you'd never make a network call. You'd never be able to include any software dependencies. You've got to know what you're trusting, trust the right set of things and nothing more, and have strong trust in the things that you do trust.

If I'm building a piece of software, I need to know which packages are included in it. I need to be able to verify, by signature or whatever, that that information is correct. And then I need to know whether I can trust them. So maybe I build an image using the Chainguard tooling, and then it sits around in my registry for six months. When I try to run it, that's when someone can come along and say, "Right, I know what's in this, and I trust that manifest. I know which dependencies I'm being asked to trust. With the benefit of six months' worth of security research, let me now go check the CVE database and see if any of these are vulnerable. The build may have gone through because there were no known vulnerabilities at the time, but if that thing has gone a bit stale, and we've all got those in our registries, then six months later, okay, are there any CVEs in this?"
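The rescan Matt describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not tooling from the conversation: the package names, versions, and advisory ID are made up, and real vulnerability feeds describe affected version ranges instead of the exact-version matching used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    version: str


def vulnerable_packages(sbom, advisories):
    """Return (package, advisory ID) pairs for SBOM entries with a known advisory.

    `advisories` maps (name, version) to a list of advisory IDs.
    Exact-version matching is a simplification for this sketch.
    """
    findings = []
    for pkg in sbom:
        for advisory_id in advisories.get((pkg.name, pkg.version), []):
            findings.append((pkg, advisory_id))
    return findings


# Hypothetical data: the same SBOM checked against two snapshots of a feed.
sbom = [Package("libexample", "1.2.3"), Package("toolkit", "4.0.0")]
advisories_at_build = {}
advisories_six_months_later = {("libexample", "1.2.3"): ["EXAMPLE-2023-0001"]}

print(vulnerable_packages(sbom, advisories_at_build))
print(vulnerable_packages(sbom, advisories_six_months_later))
```

The SBOM is the same in both calls and only the advisory data changes, which is why a trusted package list recorded at build time is enough to re-evaluate a stale image later.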

So I know what I'm trusting, and I can have a high level of trust in it, because we've generated that SBOM with trusted tooling and we've signed it so we know it hasn't been tampered with. And I think you see the same thing on a network. A zero-trust network isn't "don't trust anything." It's just: don't trust things based on which subnet they're on or which IP they claim to come from. Use strong authentication, like a SPIFFE ID in an X.509 certificate, a JWT for the end user, or both. And then have an access control list: ideally block everything, then allow-list what you want.

So trust the minimal set of things with a minimal allow list. Know what it is that you're trusting through strong identities. I think they're actually very similar. I did do a talk once on kind of both. It was for a general software engineering conference. I was like, "Right, if you're in the cloud, you've just moved to Kubernetes, or ECS or something, and you want to start locking things down. Are you completely confused by all of the buzzwords and marketing, frankly, around zero trust? Okay, here's what you do on the compute side of things." I talked about your stuff and starting to get stronger trust in the supply chain. And then on the network side of things, I just did a sort of demystification of zero trust on the network.
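The "block everything, allow-list what you want" idea can be sketched as a default-deny lookup keyed on workload identity. This Python sketch is illustrative only: the service name, the SPIFFE ID, and the function are hypothetical, and it assumes the caller's identity has already been authenticated (for example from a verified certificate) before this check runs.

```python
# Hypothetical allow list: callee service -> SPIFFE IDs permitted to call it.
ALLOW_LIST = {
    "orders": {"spiffe://example.org/ns/shop/sa/basket"},
}


def is_allowed(caller_spiffe_id: str, callee: str) -> bool:
    """Default deny: permit a call only if the caller is listed for the callee."""
    return caller_spiffe_id in ALLOW_LIST.get(callee, set())
```

Note that the source subnet or IP address is not an input at all. The decision rests only on the authenticated identity, and any callee or caller missing from the list is refused.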

Recommended talk: Cloud Native Progressive Delivery • Matt Turner • GOTO 2022

Adrian Mouat: Okay, I think that's an excellent answer. I guess it comes down to being explicit and concrete, both in terms of what you trust in an image and what you trust in a network. Is that fair?

Matt Turner: Yes, I think so. So certainly on the network, when I talk to our users, I've got, like, five bullet points I use to answer: what actually is zero trust? Well, okay, so stepping back, zero trust is about identity-based authentication and shrinking trust boundaries and stuff. But how do I actually implement it? I think you've got five things you need. I think you need encryption on the wire, right, between services. The way you set that up is with mutual TLS, with a certificate exchange. So once you've done that, you've also got number two, which is workload authentication: do I know the ID of the machine I'm talking to? It's probably another pod, but it might be a VM or it might be a cloud-managed service. So can I authenticate the workload that I'm talking to? Then, number three, once I think I know who it is, can I authorize it, as in, should it be talking to me? Should I allow it to talk?

Once I've set up the encryption, I can do the authentication. Once I've done the authentication, I can make that AuthZ decision about whether it should talk to me.

And then the final two are the same for end users. So can I authenticate an end user? Because if I'm the microservice that fronts the orders database, it's no good just saying, "Well, yeah, the basket microservice is allowed to talk to me." Sure it is. It's not blocked completely. It has got some reason to do it. But as the order server, I shouldn't be giving you orders that aren't yours, right? So you need that end-user context. All the way through, you should be forwarding those headers, like you forward trace headers, so that you've always got that context in which to make an access control decision. So yeah, the fourth one is authentication, knowing who the end user is, and then you can do the obvious corollary, the fifth one, which is authorization of an end user. So for any request, even deep into a back-end graph of microservices, we're still saying, "Oh, who is the user? And should this Google Cloud bucket be giving the Gmail service any data? Yes, but it's Matt Turner that's logged in, so don't give him Adrian Mouat's emails," right?
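A minimal Python sketch of the orders example follows, combining the workload check with the end-user check. Everything here is an assumption for illustration: the field names, the `basket` workload name, and the ownership rule are hypothetical, and both identities are assumed to have been authenticated upstream before this decision is made.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    caller_workload: str     # authenticated workload identity
    end_user: Optional[str]  # authenticated end user, forwarded hop by hop
    order_owner: str         # owner of the order being requested


# Hypothetical policy: workloads allowed to call the orders service at all.
ALLOWED_WORKLOADS = {"basket"}


def authorize(req: Request) -> bool:
    """Allow only if the workload is permitted AND the end user owns the order."""
    if req.caller_workload not in ALLOWED_WORKLOADS:
        return False  # workload authorization failed
    if req.end_user is None:
        return False  # end-user context was not forwarded
    return req.end_user == req.order_owner
```

The second check is why the end-user context has to be forwarded through every hop: a permitted workload calling without it, or on behalf of a different user, is still refused.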

So I think, yeah, that's how we try to be concrete about what zero trust means on a network. But I think you are right: in more general terms, it's about being very explicit about what it is you trust. And if you look at, like, an apko file, it's really as simple as: here is the keyring for the public signatures I accept for my packages, and here is the list of packages I want. Nothing else should be in there. And because it uses APK, that's all declarative. Nobody can sneak some more files onto this with, like, a post-install hook script in a Debian package, because APK just doesn't support that.

Adrian Mouat: Okay. That is great. Well, thank you very much, Matt Turner. I think that's a wrap.
