Building Green Software
Anne Currie (author)
Co-Author of "Building Green Software" & Leadership Team at Green Software Foundation
Sarah Hsu (author)
Co-Author of "Building Green Software" & Site Reliability Engineer at Goldman Sachs
Delve deep into sustainability in software, from the likely evolution of national grids to the effect those changes will have on the day-to-day lives of developers. Catch co-authors Anne Currie, Sarah Hsu, and Sara Bergman giving readers a glimpse into every chapter as it gets released, and what the next one holds, in our monthly series.
How will software development and operations have to change to meet the sustainability and green needs of the planet? And what does that imply for development organizations? In this eye-opening book, sustainable software advocates Anne Currie, Sarah Hsu, and Sara Bergman provide a unique overview of this topic—discussing everything from the likely evolution of national grids to the effect those changes will have on the day-to-day lives of developers.
Ideal for everyone from new developers to CTOs, Building Green Software tackles the challenges involved and shows you how to build, host, and operate code in a way that's not only better for the planet, but also cheaper and relatively low-risk for your business. Most hyperscale public cloud providers have already committed to net-zero IT operations by 2030. This book shows you how to get on board.
Hi, my name is Anne Currie, and I'm one of the co-authors of the new O'Reilly book, "Building Green Software," which is all about what we in the software industry need to do to handle the energy transition. Today I'm gonna talk to you about one of the latest chapters to go live, "Operational Efficiency." Now, I'm gonna have to force myself not to talk about this for hours, just 10 to 15 minutes, because it is the most important chapter in the book, it's the longest chapter in the book, and it has the most in it.
The Significance of Operational Efficiency
It's controversial when I say it's the most important chapter, because most people think the most important chapter will be "Code Efficiency." And in some ways it is, but in terms of urgency, importance, and priority, it isn't.
Code efficiency, if you are good at making your code efficient... Most people have seen the charts somewhere that rank different languages in terms of how efficient they are. You have probably seen the one that says that C is 100 times more efficient than Python. Therefore you'll be thinking, "Well, hang on a minute. I should be rewriting my stuff in C, not Python." But today I'm going to argue that operational efficiency is, at this stage, much, much more effective, even though the best you're probably gonna get from it is a 10x improvement: cutting 90% of your carbon emissions by using good operational techniques. So you'd be thinking, "Well, that's literally 10 times less good than code efficiency; I'm never gonna get a payoff as big as rewriting in C."
The reason why operational efficiency is so effective is that we are much closer to being able to do it. Much more of operational efficiency has already been commoditized than is the case with code efficiency. Now, there's some really good stuff going on with the commoditization of code efficiency: if you start looking into things like Python being compiled down to C-level performance, or new languages like Mojo with similar aims, there is work to improve the efficiency of code whilst keeping developer productivity. But that's still at a very early stage; operational efficiency is much further down the line. It is much closer to being commoditized, and there's a lot more you can buy off the shelf. So that's one of the main reasons why operational efficiency is where we need to be looking next.
Operational Efficiency: A Fundamental Shift
But the other reason why we need to be looking at operational efficiency is that it is more fundamental than code efficiency. For example, say I had a monolith written in Python and I went to all the effort, and it would be a lot of effort, to completely rewrite and re-architect it for good carbon awareness, maybe put in some microservices, all that kind of stuff, to make it 100 times more efficient. Say I did that. It would take me ages, and it would make the result very hard to maintain, at least at the moment; things will improve, but right now it would be extremely custom work. If I went, "Hooray, I've done that," I've reduced the CPU, memory, bandwidth, and all the other resources it requires by 99%. But if I run it on the same machine, I don't get that benefit, or I get very little of it. If you reduce how much a machine is used by 99%, you don't save that much. You don't save any embodied carbon, and you don't save that much of the electricity used to power it either, because most of the energy goes into keeping the machine turned on rather than doing anything.
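To see why, here's the argument in rough numbers. The linear power model below is a common approximation for servers, but the wattage figures are illustrative assumptions, not measurements of any real machine; the shape of the result is the point: idle draw dominates, so shrinking the code without shrinking the machine saves surprisingly little.

```python
# Why a 99% code-efficiency win saves little on the same machine.
# Power model: P = P_idle + u * (P_max - P_idle), with u = utilization.
# The 100 W idle / 200 W max figures are illustrative assumptions.

def server_power_watts(utilization, p_idle=100.0, p_max=200.0):
    """Approximate power draw of a server at a given utilization (0.0 to 1.0)."""
    return p_idle + utilization * (p_max - p_idle)

before = server_power_watts(0.50)    # app using half the machine
after = server_power_watts(0.005)    # same machine after a 99% efficiency win
saving = 1 - after / before

print(f"before: {before:.0f} W, after: {after:.1f} W, saving: {saving:.0%}")
# A 99% cut in CPU work only saves about a third of the power here, because
# the idle draw dominates. Moving to a machine sized for the new load is
# what actually captures the win.
```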
So fundamentally, if you don't right-size your application, if you don't move it to a VM or a machine that is the same size as you've just shrunk it to, then you don't get much benefit from the enormous amount of work you did to shrink it. What I'm saying is that the operational move, moving the application to a machine or VM that's the right size for it, is more fundamental than tuning the application itself. You have to learn how to do that before you get any benefit from the enormous amount of work you'll put into code efficiency. The co-authors, the three of us who are writing the book, and the Green Software Foundation as well, what we want is for everybody to start moving on the bit that's already commoditized and more fundamental, which is the operational efficiency side, and get shit hot at that while we're putting pressure on the code efficiency folk, those platforms, to become green and make that easier for us.
So in terms of ordering, it's operational efficiency first, because if you did it the other way around, as I said, you don't get that much benefit. Fortunately, that's also the way commoditization works: we're more commoditized on operational efficiency than we are on code efficiency.
Operational Efficiency Breakdown
So, operational efficiency. Now, this is the longest chapter in the book, so I've got an awful lot of stuff and I'm gonna have to whizz through it here. I would say it falls into three areas.
First, you've got turning off machines that aren't used or are underused, and there's a new kind of ops that revolves entirely around that problem, which is not as easy to solve as you might think: LightSwitchOps. Then after that, you've got to be looking at things like increasing your multi-tenancy, using auto-scaling, right-sizing, and removing over-provisioning, which more or less falls into the remit of DevOps. And then finally, when you get really, fantastically good at this, you're looking at SRE, the peak of ops performance: Site Reliability Engineering, initially developed by Google.
Because what I'm telling you here about green and efficient ops is just efficient ops. There's nothing new, nothing magic about it being green; it's just really good ops. So once you are at the top of your game in ops, you will naturally be green, which is another good reason why ops is a good place to start: you can sell it for other reasons than being green. You can sell it for cost-saving reasons, security reasons, and, oddly enough, productivity reasons, because a lot of these techniques that allow you to deploy more quickly will improve the productivity of your developers. And you'll find that that is often the easiest sell to make. If you can go faster, deploy faster, deploy more securely, deploy more confidently, that's the kind of story your company will like to hear, because it means you'll be able to get features out faster and try them out. It's like the reverse of the old waterfall days, which were my days, and we don't wanna go back to them.
So in terms of operations, what are we looking at? LightSwitchOps. LightSwitchOps is an idea being championed by Holly Cummins, who is an IBM engineer. The idea is that we often don't turn off machines that we should be turning off, either because they're not very well used or because they're not used at all. There are lots of reasons why we keep machines around that aren't effectively used, and that's a waste of power and a real waste of servers that could be doing something more useful. But we don't necessarily know which machines they are, so some work is required to find that out. And we also fear turning them off in case we can't turn them back on again.
Recommended talk: 5 Tricks To Make Your Apps Greener, Cheaper & Nicer • Holly Cummins • GOTO 2023
Resolving that is probably your 101. It's the first thing we need to work on: making sure that we can safely turn off any machine in our systems. It's also a security hole if you've got machines and you no longer understand what they're doing; it's a sign that you're losing control of what's going on in your data centers. But it's incredibly common. The best example I heard recently of the level of savings you can get from this was VMware moving a data center in Singapore. They were moving the data center, pretty standard stuff, and they didn't want to move more than they had to, so they decided to do a full audit of what was running on all the machines they were moving. And what they discovered was that two-thirds of the machines in that data center were not doing anything that mattered to them anymore. They were old, had a couple of users, and were not worth moving. Two-thirds.
So, LightSwitchOps. But it can be very hard to work out which machines you can turn off. There are a couple of really good techniques for this. There's the scream test: just turn the machine off and find out if anybody screams. That works quite well, but only if you're not worried about being unable to turn the machine back on again. Another technique is to provision all resources for six months; if after six months no one says, "I want this again," then it's not popular enough to warrant being kept on. But again, you've gotta tie this in with the LightSwitchOps idea: you go through and take the pain of working out which machines to turn off, and you take the risk that you might turn some off and not be able to turn them back on again. But from then on, everything is automated so that you can turn machines off and on again automatically. You test that, and then you use it to turn off machines that are no longer in use, but also machines that are on at the moment but don't need to be. The obvious example is test systems at the weekend, or development environments at night and at the weekend.
LightSwitchOps is the idea that you can turn machines off and on as easily as a light switch. It all comes from the observation that at night you don't leave your lights on out of fear that they won't turn back on in the morning. If you were afraid they wouldn't come back on, you wouldn't turn them off at night. But you always do. Well, with LED light bulbs it doesn't necessarily save you that much, but in the olden days it used to save you a lot of money to turn your lights off when you weren't in the room or you were in bed. And it only works because you're quite confident you can turn them back on again. The aim is to be as confident with your systems as you are with your lights that they will come back on, so you can turn them off without fear. So that's LightSwitchOps.
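As a sketch of what that automation might look like, here's the decision logic for switching off non-production machines out of hours. The `env` tag convention and the working-hours window are assumptions for illustration; in practice a scheduled job would feed the result into your cloud provider's stop/start API.

```python
# Minimal LightSwitchOps sketch: decide which machines should be off right now.
# Dev/test machines go off at night and at weekends; production never does.
# Tag names and hours are illustrative assumptions, not a standard.

from datetime import datetime

def should_be_off(tags: dict, now: datetime) -> bool:
    """True if a machine with these tags should be switched off at 'now'."""
    if tags.get("env") not in ("dev", "test"):
        return False                        # never auto-stop production
    weekend = now.weekday() >= 5            # Saturday or Sunday
    after_hours = now.hour < 8 or now.hour >= 19
    return weekend or after_hours

fleet = [
    {"name": "ci-runner", "tags": {"env": "test"}},
    {"name": "webshop", "tags": {"env": "prod"}},
]
now = datetime(2024, 3, 16, 23, 0)          # a Saturday night
to_stop = [m["name"] for m in fleet if should_be_off(m["tags"], now)]
print(to_stop)  # ['ci-runner']
```

The other half of LightSwitchOps, of course, is the automation and testing that makes turning `ci-runner` back on again on Monday morning a non-event.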
DevOps and Right-Sizing
The next thing is DevOps. Once you've done your LightSwitchOps and turned off the stuff you don't need anymore, the next thing to look at is not over-provisioning. I can talk forever about why over-provisioning happens, and it's perfectly sensible, but in the long run we've got to cut back on it, and there are various ways you can do that. You can use auto-scaling, and you can use orchestrators to make sure that things are moved around from place to place and scaled up and down. But fundamentally, right-sizing is the next step in the process of operational efficiency, and it's the kind of thing that's covered by DevOps.
It's a very difficult thing to do. I'm saying this as if it's obvious that you do DevOps and make all this stuff work, but it's really not. It's hard and it requires an investment. If you look at companies that are doing well in operations these days, a lot of them are doing it so that they can release code on an hourly, 10-minutely, or even minutely basis, really, really fast, and this all tends to go hand-in-hand with that. Getting good at DevOps, getting good at how you control your systems, getting orchestrators in place, starting to wrap workloads in things like containers, which is part of being able to move workloads around, scale them up and down, and shift them from machine to machine depending on their current resource requirements: it's hand-in-glove with that whole CI/CD approach of moving faster in production and making sure that applications go live faster.
So it is aligned with the stuff you want; it's not only good for machine productivity. And this is all about machine utilization. Operational efficiency is all about machine utilization, and what you do with machine utilization cuts your costs, improves your security, and massively reduces your carbon effects. Again, this is why operational efficiency is the most important thing: it's aligned with all the other stuff we want to do. So we've got DevOps there, which covers auto-scaling and, if you're in the cloud, choosing the right instance types. This is a deceptively powerful concept. It doesn't apply if you're not in the cloud, but if you are, make sure you choose flexible instance types that are right for your workload.
So, for example, one of the most interesting types of auto-scaling out there, I would say, is the burstable instance type, which is available in all clouds. With a burstable instance, you pay for a low level, what you think will be your average level, of resource requirements on a machine, and it's not crazily expensive. The key reason we all over-provision is that you think, "Mostly I need that average, but occasionally I'm gonna need enormous amounts of resources, and if I don't get them, I'm gonna fall over and it's gonna be really bad. So to meet my SLAs, I'm gonna have to over-provision to the maximum, rather than run on the average level of resources I need and know that every couple of weeks I'm gonna fall over."
The idea of burstable instances is that you pay for a moderate level of resources, but occasionally, for a limited amount of time, when it's needed, your hosting provider will allow your machine to leap up to the large amount of resources required to handle the peak, and then drop down again. So burstable instances are a really interesting way of doing auto-scaling. I'm quite happy with burstable instances. So right-sizing is one way of doing it.
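To make the burstable idea concrete, here's a toy credit model, loosely inspired by the CPU-credit scheme cloud providers describe for burstable instance types: you earn credits while running below your baseline and spend them to burst above it. The baseline figure and the credit arithmetic are illustrative assumptions, not any provider's actual billing rules.

```python
# Toy model of a burstable instance: quiet minutes accrue credits,
# bursts spend them, and the machine is throttled back to baseline
# when the credits run out. Numbers are illustrative assumptions.

def run(minutes_demand, baseline=0.20, credits=0.0):
    """Return (CPU granted per minute, credits left) for a demand trace."""
    history = []
    for demand in minutes_demand:           # demand = CPU fraction wanted
        if demand <= baseline:
            credits += baseline - demand    # bank the unused baseline
            granted = demand
        else:
            burst = min(demand - baseline, credits)
            credits -= burst                # spend credits to burst
            granted = baseline + burst      # throttled once credits are gone
        history.append(round(granted, 2))
    return history, credits

# Five quiet minutes, then a two-minute 100%-CPU spike:
usage, left = run([0.1, 0.1, 0.1, 0.1, 0.1, 1.0, 1.0])
print(usage)  # [0.1, 0.1, 0.1, 0.1, 0.1, 0.7, 0.2]
```

Note what happens in the last minute: the banked credits are spent, so the instance is held at its 20% baseline. That's the trade you're making in exchange for not paying for peak capacity all the time.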
Another way is to steer your developers toward platforms that do an awful lot of this stuff for you. Serverless, for instance, does auto-scaling for you. Then spot instances: perfect. I love spot instances for many, many reasons. They're fantastically good for demand shifting and shaping, which you will hear about in another podcast. But spot instances, again, will jump in and do stuff for you, and you don't have to worry about them so much. I mean, you do have to architect for spot instances, and that is not easy, because they've got no SLAs and you'll have to redo everything. But operationally, if you can do that, you've won; that's the perfect green operational approach. And where you can use it, use it.
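Architecting for spot instances mostly means making work resumable, so losing an instance only costs you the chunk in flight. Here's a minimal sketch of that shape, with the checkpoint held in memory for simplicity; a real system would persist it to durable storage and react to the provider's termination notice rather than a simulated flag.

```python
# Sketch of spot-friendly work: process in small chunks, checkpoint after
# each, and resume from the checkpoint on a fresh instance. The in-memory
# checkpoint and the 'interrupt_after' flag are stand-ins for durable
# storage and a real preemption signal.

checkpoint = {"done": 0, "total": 0}

def process_chunks(chunks, interrupt_after=None):
    """Process remaining chunks; return False if 'preempted' part-way."""
    for i, chunk in enumerate(chunks[checkpoint["done"]:]):
        if interrupt_after is not None and i >= interrupt_after:
            return False                    # instance reclaimed mid-job
        checkpoint["total"] += sum(chunk)   # do the work for this chunk
        checkpoint["done"] += 1             # persist progress
    return True

work = [[1, 2], [3, 4], [5, 6]]
process_chunks(work, interrupt_after=2)     # preempted after two chunks
process_chunks(work)                        # a new instance picks up where we left off
print(checkpoint)  # {'done': 3, 'total': 21}
```

Only the third chunk had to be handled by the replacement instance; nothing was recomputed. That resumability is what makes no-SLA capacity usable.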
Site Reliability Engineering (SRE)
And then finally, we'll talk a little bit about the perfect solution here, which is really for you to fully take on SRE: Google's SRE principles of CI/CD, full automation, and massive monitoring and acting on what you see. Because all of this stuff is hard. It's not trivial to do these things, or we would all have done them already. It's best practice, and there are commodity tools available to help you, but it is hard. It is well worth looking at Google's SRE principles for how they moved in this direction, because they did it literally 20 years ago, and they've written up and talked about what they did and what they learned.
Unlike with code efficiency, where we're still in the very early days of aligning efficiency and developer productivity, which we need to do, with operational efficiency other people have already beaten this track. It's not new; you can find out what you need to do. Almost the best thing you can do about being green is to start improving your operational performance and operational skills, and the acme of operational skills at the moment is reading some of these Google SRE books. They're scary. You should be following people like Charity Majors on Twitter and seeing what they're doing, because they're not doing it for green reasons, but it's green. It is the foundation of how we will produce very, very efficient systems in the future. So it's absolutely worth looking at.
Recommended talk: Observability Engineering • Charity Majors, Liz Fong-Jones & George Miranda • GOTO 2022
Main Takeaways: LightSwitchOps, DevOps, SRE, and Operational Efficiency
So, I had so much to talk about today that I've only been able to cover the very basics. I've not gone into depth on any of these things, but remember, the book is available, and there's more detail in the book. This chapter is available, and it's a long chapter with a lot in it, but it's well worth reading through. Even that chapter only gives you an introduction to what you need to do, but it's a good introduction. So go and have a read of it. And, as O'Reilly keeps reminding me, you do not need an O'Reilly subscription to read these chapters, because you can go and get a free trial and read everything very quickly, just like the Netflix or Disney low-price trials we use to watch everything for a while. Fundamentally, we're all very used to using a free trial to blast our way through content. If you have anything to do with ops, use your O'Reilly free trial to read this chapter, because it's what we need to do next. It's utterly, utterly key. And it's the bit that's aligned with what the rest of your business needs to do, so it'll be the easiest sell of any part of the green story.
But anyway, I promised GOTO that I wouldn't overrun this one to the horrible extent that I did the last one on code efficiency. So, there we go, LightSwitchOps, DevOps, SRE, and Operational Efficiency is the most important thing. Get good at it and you will win. This is the next step for you to take in being green. So thank you very much for listening to me, and I look forward to speaking to you again, which will probably be about the "Networking" chapter.
Part 3: The Importance of Code Efficiency in Green Software
Hello, my name is Anne Currie, and I will be giving you your quick 10-minute introduction to my new book today. This is the third podcast in the series on the book "Building Green Software," which is being pre-published, early release, on the O'Reilly website. So, my name is Anne Currie. I have been in the tech industry for nearly 30 years now, which is quite germane to the topic we'll be talking about today. And I'm part of the leadership team of the Green Software Foundation, a Linux Foundation nonprofit for being greener in tech and managing the energy transition. But today I'm here to talk about the book that I'm currently in the process of writing, along with my two co-authors, Sara Bergman of Microsoft, who you will have heard in the last podcast, and Sarah Hsu of Goldman Sachs, who you will hopefully hear in one of our future podcasts.
But today I'm here to talk about the latest section we've released, which is Chapter 2 of the book, "Code Efficiency." Now, the question we need to be asking ourselves today is: is code efficiency the key to being green and having efficient, energy-transition-aware software? And the trouble is, it's quite a controversial question with a very controversial answer, I'll put it to you. Everybody's been really looking forward to the code efficiency chapter, and I know why: we tend to think code efficiency is absolutely fundamental to cutting down the amount of energy and electricity used by software as it runs.
The Evolution of Software Development Over the Past 30 Years
I got into green software about seven or eight years ago now, and it was entirely based on code efficiency. My background is that I've been in the industry for a long time, since the early '90s. Back then, code had to be really efficient because machines were about a thousand times less effective than they are today, about a thousand times less good in terms of CPU, storage, speed, bandwidth availability, and cost. But 30 years of exponential growth really pays off. Thirty years of Moore's law improvements in computing hardware have meant that we are about a thousand times better off now than we were then. And remember, these are all changes that happened within one career.
And it's not even the end of my career; I do not intend to retire tomorrow. During my career, without even completely spanning it, we've seen a thousand-fold increase in the ultimate thing we all run on, which is hardware. So, my interest in green software came from the fact that 30 years ago we had to be a lot more efficient in the way we coded, because we had so many fewer, and worse, resources to run on. And we used to do a lot of things to get that increased efficiency, things I'm extremely familiar with because I used to do them myself. We used to use very low-level languages: I was a C developer, a C server-side developer, in those days. So you're looking at compiled languages that run directly on the hardware without any kind of intermediation.
And what that means is that you've got a lot fewer CPU steps per operation. What you can run on a machine nowadays is unbelievable; we had nothing like it, and we had nothing like the bandwidth that's available now either. And my original thought on being green was: okay, we can do all of those things again. We can go back to using modern equivalents of those highly performant languages. Rust is more popular these days, and rightly so; it's a lot safer than C or C++. But it is an equivalent language: it's compiled, it runs directly on the hardware, and it's extremely efficient as well as being a lot safer than C was.
But fundamentally, it's the same concept. You've got a fast, very lightweight language, and you use it to write clever code that handles its own multi-threading. Effective multi-tenancy is one of the things that's very important if you want to write very efficient code. And back in those days, even then, we didn't really write monoliths. We wrote something that would be analogous these days to microservices, not as tiny as modern microservices, more like meatier microservices. So you had a distributed system that wasn't quite as fine-grained as microservices, but wasn't as monolithic as a monolith.
And the reason why is that you wanted to try to emulate multi-tenancy. These days, in data centers or in the cloud, you will be using VMs or even containers to get multi-tenancy: multiple different clients running multiple different systems on top of the same hardware. And the reason that is good is that you'll have some things that are currently quiet and other things that are busy; by sharing the same hardware and being careful about how you balance what's running on it, you can aim to get really, really good machine utilization. And back in those days, machine utilization was absolutely key. We had such rubbish machines that we needed to make sure they weren't ever sitting around waiting on operating system calls, for example.
We had a kind of hand-rolled multi-tenancy that came with multithreading, and it was great. It worked really, really well. I mean, it really was a thousand times better than what we get these days. And my original thought was: if we had a thousand times more efficiency in data centers these days, then the software industry would be a leader in the energy transition. We wouldn't be moving the dial when it came to using electricity. These days we're about the same as most other industries: data centers use about 2% to 3% of the electricity used on the planet across all sectors, but if you factor in hardware and end-user devices, we're more like 10% to 12% according to Greenpeace.
But if we just look at data centers, then if we could drop that a thousand-fold, we're out. We wouldn't need to worry about it anymore; we'd be completely done. So, my original thought, the reason I got into green, came from my background in code efficiency. But we are not in the world we were in 30 years ago. We are in a very, very different world. And I went in all guns blazing on green software, although we didn't call it green software back then. I called it sustainable software, or coding for the energy transition, which is not something I'm allowed to call the book because, frankly, it's not very catchy. But it's all the same idea: how do we code in a green, sustainable, energy-transition-aware way? I was thinking, great, what we'll do is we'll write everything in C.
Then I started to do some work with customers, businesses, and engineers about what people actually wanted to do and what was likely to land as a message. The trouble is that rewriting everything in C, or rewriting everything in a multi-threaded way somewhere between a microservice and a monolith, is really hard. It's really, really hard. Back in the '90s, it took us ages to do anything, and it was very hard to change and evolve software. And so most of the thousand-fold increase in machine productivity between then and now went not into making everything run faster on smaller and smaller machines. You know full well that's not what we did with it. What we did instead was improve developer productivity. We put in loads of isolation layers that made our lives easier as developers, and not just easier.
It made development faster, safer, and more secure, if you're using it all correctly. And that wasn't crazy, because the world has become different in the past 30 years. Expectations are very high that we move fast, that we iterate, and that we get product features out in minutes, not years as it used to be in the early '90s. Nobody, quite rightly, wants to go back to that world, and if you start selling code efficiency to your business, you're gonna run into that straightaway. A lot of those old approaches didn't really scale in the way we need these days; they couldn't be made secure in the way we need, and they couldn't react, improve, and evolve in the way software can now.
Balancing Code Efficiency with Developer Productivity
So, we need code efficiency, but we also need to keep all of the improvements we got in the past 30 years. If we don't get both, we are never gonna sell code efficiency. You are never gonna sell code efficiency to your business if the pitch is "but we'll massively slow down our development cycles," because you'd go out of business. So we've gotta find a way of squaring that and getting both at once. And I'm very sad, because I really went into this in the hope that I could just tell everybody to rewrite everything in C and we'd be done. A lot of people wanna hear that; there are loads of us out there who would really love to rewrite all our code in C. But the more sane amongst us don't wanna go back to that world. Much as I enjoyed it at the time, I was much younger then. That's not a world we can go back to. We need to be secure, we need to be fast, and yet code efficiency is still really important. So how do we do both? How do we get that thousand-fold increase in machine productivity without taking a thousand-fold decrease in developer productivity?
Aligning Code Efficiency with Modern Software Development
We have to align those two. And the more my co-authors and I have been looking into it, the more we've been thinking we need to align code efficiency with open source, the cloud, and modern services. We need to make this a buy, or select, choice rather than a build choice. If we can push our suppliers to create efficient software, we can just buy it, and they can keep improving it under our feet. Because although code efficiency is massively bad for developer productivity, for a supplier that doesn't matter so much: if you are selling code that's going to thousands and thousands of users, it's worth putting in the investment to get efficient code.
So, as customers, which we all more or less are... I think most of the folk listening to this podcast today will be consumers more than producers. By producers, I mean people whose software has thousands, or hopefully millions, of end users; it's that underlying software that we need to make efficient. If that's you, then it's really well worth your while putting in the time and energy to make that code efficient, and it does require enormous amounts of time and energy. And as consumers, which is what we are here, if we can put pressure on our suppliers to put that work in, then we get so much more bang for our buck. For example, for most of us it's not worth rewriting your code in Rust or Go, because even doing that is a hell of a lot of work.
These days, it's often quite unnecessary anyway. There are now compilers for Python available that will compile your Python down to C, or even to machine code. So, there are people at the platform level working on making it so you don't have to rewrite your code in C or Rust to make it efficient. You can keep using these high-level, relatively easy-to-use languages where you get great developer productivity, but still get the code efficiency improvements. But you need to pick platforms that are moving in that direction, where you can rely on the platform to deliver the code efficiency for you. So, for example, say you are somebody who does choose to write Go, which is really quite an efficient language.
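As a tiny illustration of that "lean on the platform" idea (my example, not from the chapter): in CPython, built-ins like sum() are implemented in C, so simply preferring them over hand-rolled loops gets you compiled-code efficiency without ever leaving a high-level language.

```python
import timeit

def hand_rolled_sum(values):
    """Pure-Python loop: every iteration runs in the interpreter."""
    total = 0
    for v in values:
        total += v
    return total

data = list(range(100_000))

# Same answer either way...
assert hand_rolled_sum(data) == sum(data)

# ...but the built-in sum() runs as compiled C inside CPython,
# so the platform does the efficient work for us.
loop_time = timeit.timeit(lambda: hand_rolled_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)
print(f"hand-rolled: {loop_time:.4f}s, built-in: {builtin_time:.4f}s")
```

The same principle scales up: Python compilers and JITs (Cython, Numba, and friends) move the optimization work into the platform so that your unchanged high-level code inherits it.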
It's not quite as good as C or Rust, but it's really quite an efficient language. Even if you choose Go, you're better off putting pressure on the engineers and the community behind it to make sure it stays efficient. You can write as efficient Go as you like, but if the standard libraries in Go are inefficient, then you've wasted your time. You are much better off putting pressure on the Go developers and the Go community to make sure the standard libraries are efficient, so that what you write on top inherits that efficiency. That is an action that moves the dial when it comes to being green. Just writing your code in Go, without putting pressure on the platform you're working on to be green under your feet, and to keep getting greener under your feet, is a much less effective action.
It's like going vegan: that's totally fine, but it's not going to change the world unless you are some kind of massive influencer who can persuade a million other people to go vegan too. There's no problem with doing it, but if you just do it on your own, it makes you feel better; it doesn't change the world, and it isn't going to get us where we need to be. What we need to be looking at are actions that affect thousands of people. And in the software industry, the actions that affect thousands of people are the ones where we persuade our suppliers to be greener, not, generally, where we are greener ourselves. Unless we are a supplier, in which case, yeah, definitely go green.
But if you are not a large-scale supplier of software, your pressure needs to go on your suppliers. And it's easy, so much easier than rewriting your code in C, I can assure you. Get on the phone to your AWS rep, or whoever you are getting your platforms from, and say, "I really care about this. What are you doing? How are you doing it? This is how I'm going to be making my platform decisions in the future. I want to see action, I want to see commitment."
Putting Pressure on Suppliers for Code Efficiency
That's how you move things. That's how you change things. I really wish it wasn't the case. I really wish I could be saying, "Gotta rewrite all your code," because I love that kind of thing.
And, you know, it gives me a flashback to my youth. But unfortunately, that is not really how we are going to change things. We really, really need to be putting pressure on as consumers, as users of software, rather than rebuilding all our code. I really enjoyed writing this chapter of "Building Green Software." Although it was heartbreaking for me, I enjoyed it. It forced me to rethink all my inbuilt desire to go back to those days of multithreading and really, really difficult-to-write languages that were close to the operating system. We used to debug assembler back in those days. Just don't do it. It takes ages.
It takes ages, and you will get fired. So, we need to find a way of working that is both more effective, good for your business, and less likely to get you fired, which is: don't rewrite all your code in C. It slows everything down massively; even Rust slows everything down massively. Unless you are actually writing open-source software that you expect to be used at massive scale. And even then, interestingly, two of the examples that came out when I was researching this chapter were from companies that had tons of money, were expecting massive scale, and still didn't do the optimization until really late in the day. I think that tells you quite a lot. One example: you'll have heard that earlier this year there was a lot of fuss when it came out that Amazon Prime Video used to run on serverless, but has now moved to something quite like what I was just describing.
It's not full microservices, it's not a monolith; it's somewhere in between. It's like what we used to do when you wrote your own code that essentially provided its own multi-tenancy, using multithreading and a moderate degree of microservices and distributed systems. And they'd done that, and they said, "Well, this is great. Now that we know Amazon Prime Video is massively successful and needs incredible scale, we are willing to put this investment in." And everybody went, "Oh, serverless is terrible," because they'd had to move away from serverless to get massive multi-tenancy and massive scale. But notice: they had huge amounts of money to throw at it, and they still didn't start there. They started with serverless, which is a pretty decent way of getting multi-tenancy without putting enormous amounts of your own engineering effort into it.
Amazon, with something they were absolutely betting on, and had been betting on for years, didn't start by being super efficient. They started by concentrating on developer productivity, which is what serverless gives you, and then said, "Well, that was good enough to get us to this stage; now we'll make the actual investment, similar to what we used to do in the '90s, to get to the next stage." And they didn't do it until this year, which is amazing, isn't it? The other example is Microsoft Teams. During the pandemic, they needed more machines, but they couldn't get their hands on more machines. So, they asked, "Well, how can we free up some of the wasted resources in our existing system?" And they started doing quite a lot of the stuff we used to do during the '90s.
So, for example, rather than storing all their data as text (we would never have used text in the '90s), they moved their databases over to binary encoding. And that was what we used to do: everything was binary. It was awful. It was almost impossible to debug anything, though we all got quite used to reading binary-encoded messages. Thank goodness those days are gone. But for Microsoft, those days are back: they have gone back to using binary encoding to store data, so they need much, much less hardware to store the data they're saving. And they were forced into it, because Teams was going bananas and they just didn't have the hardware.
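To make the text-versus-binary point concrete, here is a small sketch (my illustration, not Microsoft's actual change): a thousand floating-point readings serialized as JSON text versus packed as raw 64-bit doubles.

```python
import json
import struct

# A thousand high-precision readings to store.
readings = [i / 7 for i in range(1000)]

# Text encoding: human-readable JSON, roughly 17 characters per float.
as_text = json.dumps(readings).encode("utf-8")

# Binary encoding: exactly 8 bytes per 64-bit double, nothing else.
as_binary = struct.pack(f"{len(readings)}d", *readings)

print(f"text: {len(as_text)} bytes, binary: {len(as_binary)} bytes")

# The binary form round-trips to the identical values.
decoded = list(struct.unpack(f"{len(readings)}d", as_binary))
assert decoded == readings
```

Smaller payloads mean less hardware to store them and less energy moving bytes around, at the cost of exactly the debuggability Anne describes.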
It was like being back in the '90s, and apparently it was terrible, a really unpleasant experience. But they did achieve it; it can be done. Still, even Microsoft and Amazon, on products they'd been betting on for years at enormous scale, put off that investment until quite late on. It breaks my heart, but fundamentally that is the lesson we need to take away: how do we do green, how do we do code efficiency, without having to do those things ourselves? And the answer is, we need to buy it. We need to put the pressure on our suppliers to give it to us. If one of those suppliers is Amazon, it's very easy to put pressure on Amazon. Just talk to your Amazon rep.
Same with Azure, actually: talk to your Azure rep. For open-source communities, start raising issues. Start asking, "How do I measure this? Is this good enough? Are these standard libraries good enough?" Our last chapter was on measurement, but one of the easiest quick measurements of whether a standard library is good enough is straight-up performance. Is it fast or is it slow? If it's slow, raise a bug about it, because we absolutely need these underlying technologies, these platforms, to be code efficient. That is where, unfortunately, we need to be putting our attention and applying our pressure.
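A "straight-up performance" check doesn't need fancy tooling; Python's timeit from the standard library is enough to decide whether something is fast or slow for your workload before you raise that bug. A toy illustration (mine, not from the chapter), comparing membership tests on two standard containers holding the same data:

```python
import timeit

# The same 10,000 items in two standard containers.
as_list = list(range(10_000))
as_set = set(as_list)

probe = 9_999  # worst case for the list: it's at the very end

# list membership scans element by element; set membership hashes once.
list_time = timeit.timeit(lambda: probe in as_list, number=2_000)
set_time = timeit.timeit(lambda: probe in as_set, number=2_000)
print(f"list: {list_time:.4f}s, set: {set_time:.4f}s")
```

Ten minutes of measurement like this tells you whether a library or data structure is pulling its weight, and gives you hard numbers to attach to the issue you raise.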
Now, if you want actual things you can do yourself, the build part of buy versus build, that will be not in the next chapter but in the one after that, on operational efficiency. When it comes to operational efficiency, there's tons of stuff you need to do yourself to improve. But for code efficiency, really, unless you are a platform, you need to be putting pressure on platforms.
So, I know, I feel sad. I would really have liked to tell you how to, you know, tune your for-loops better. But really, the answer is: don't tune your for-loops. Put pressure on your suppliers to improve your compilers so that your code is automatically tuned under your feet, and then write your code in the way your platform expects, the way your compiler can best optimize, if you know what I mean. Good to talk to you, and I'm sure I'll be back to speak to you about operational efficiency in a couple of months' time.
Part 2: Exploring green software metrics and measuring energy efficiency
Hi, my name is Sara Bergman. I am one of the authors of the new O'Reilly book "Building Green Software", which is currently in early release, and today I'll be talking a little about one of the chapters we released in late June on the O'Reilly platform. This chapter is all about measurement, because some of the most commonly asked questions about green software, especially when you're just dipping your toes in, are: how can I measure it? How do I know my impact? Which part of my software has the biggest impact, and where do I start reducing? In this chapter, we dive deeper into these questions and offer some opinions on how you can answer them. The first part is all about the perfect: what would a perfect carbon score for software look like, how close can you get to perfect data right now, and what would the perfect method be in the future?
So, we talk about energy data, carbon intensity data, and embodied carbon data. For perfect energy data, you can actually instrument this yourself to a degree, especially if you own your own hardware, which is pretty cool. But doing real-time energy measurements in large-scale systems that span many abstraction layers can be a bit more complicated. It's entirely possible, but for many of us it might actually be more worthwhile to leave the development costs of energy instrumentation to someone else entirely.
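For a taste of that do-it-yourself instrumentation, here's a sketch under stated assumptions: a Linux machine with Intel RAPL support, and the usual sysfs counter path, which varies between machines. The kernel exposes a cumulative energy counter you can sample before and after a workload.

```python
from pathlib import Path

# Typical RAPL package-energy counter location on Linux; the exact
# path differs between machines, so treat this as an assumption.
RAPL_ENERGY = Path("/sys/class/powercap/intel-rapl:0/energy_uj")

def read_energy_uj(path: Path = RAPL_ENERGY) -> int:
    """Read the cumulative energy counter, in microjoules."""
    return int(path.read_text().strip())

def joules_between(before_uj: int, after_uj: int) -> float:
    """Energy used between two samples, in joules."""
    return (after_uj - before_uj) / 1_000_000

# Usage on a supported machine (commented out; needs the hardware):
# before = read_energy_uj()
# run_my_workload()
# print(f"{joules_between(before, read_energy_uj()):.1f} J")
```

Note the counter is package-wide, so on a shared machine you're measuring everything running on that socket, which is one reason per-service attribution across abstraction layers gets complicated fast.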
How to achieve precision in carbon intensity and embodied data measurement
Next is perfect carbon intensity data. Carbon intensity is a metric of how clean or dirty the grid you're running your hardware on is right now. This would be a bit trickier to instrument ourselves, as it requires deep knowledge of how the local grid functions, but we are in luck, because there are several open data providers that provide this data to you right now. Some have more static datasets, some have more of a live API, but that's pretty good: today we can already get very close to perfect carbon intensity data. This is also the section where we talk a little about market-based reductions, such as power purchase agreements and offsets. Market-based reductions are an economic tool widely used, not only by the software industry but by all industries, to reduce their reported carbon footprints.
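As one concrete example of a live feed, the UK's free, keyless National Grid carbon intensity API returns the current grid intensity in a few lines. The response shape below matches its public documentation at the time of writing, but do verify it before relying on it.

```python
import json
from urllib.request import urlopen

UK_INTENSITY_URL = "https://api.carbonintensity.org.uk/intensity"

def current_intensity(payload: dict) -> int:
    """Pull gCO2/kWh out of a Carbon Intensity API response."""
    intensity = payload["data"][0]["intensity"]
    # 'actual' is null until the half-hour settles; fall back to forecast.
    actual = intensity.get("actual")
    return actual if actual is not None else intensity["forecast"]

# Shape of a typical response:
sample = json.loads(
    '{"data":[{"from":"2023-06-01T11:30Z","to":"2023-06-01T12:00Z",'
    '"intensity":{"forecast":180,"actual":172,"index":"moderate"}}]}'
)
print(current_intensity(sample))  # 172

# Live version (network required):
# with urlopen(UK_INTENSITY_URL) as resp:
#     print(current_intensity(json.load(resp)))
```

Other providers cover other regions, some with richer commercial APIs; the point is that this data is already easy to consume, so there's little reason to model the grid yourself.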
They are pretty cool, and definitely worth talking about, but for a perfect carbon score, we think they should not be included, simply because they muddy the picture a bit. They make it harder to tell whether a reduction was due to an actual change you made, or because you got a better market-based reduction deal. The third and final part of the perfect score is perfect embodied carbon data: how much carbon was emitted when the hardware your software runs on was created? Here, we are once again left to rely on our providers to tell us how much carbon they spent making that server, or phone, or whatever your software runs on. Of course, we also have a vision for what the future of perfect monitoring would be, and we think that's real-time carbon monitoring with sufficient granularity in time, location, and software components.
If we had that out of the box, it would allow more software teams to shift their focus from measuring to actually reducing, which is the end goal, of course. So, that was a bit about the perfect.
Current methodologies and practical software assessment
Now let's talk about the good enough because for several reasons, the perfect is not here yet for all of these data points, and the perfect might not even be possible for all types of software. So, in the good enough section, we talk about proxies that you can use for things like cost, performance, energy, or hardware because we should not let perfect stand in the way of carbon reductions.
The next section is about current methodologies, because luckily we are not the first to ponder these questions. A lot of smart people have come before us and done a lot of thinking here, which is great. So, in this section we go over the existing methodologies: namely the Greenhouse Gas Protocol, the Green Software Foundation's Software Carbon Intensity specification (the SCI), and the ISO 14064 standard. The Greenhouse Gas Protocol is pretty well known; it's the one that divides your carbon footprint into scopes one, two, and three. You've probably heard of it. It allows both market- and location-based energy data, and it's a widely used standard across enterprises, not only in our sector but in all sectors. So, it's really good for a lot of things, but it wasn't built for software, which makes it not ideal for software.
We explain a bit more about what we mean by that in the chapter. The Green Software Foundation's SCI score, on the other hand, was built for software. It's not a total but a rate: carbon emissions per what we call a functional unit, where a functional unit describes how your software scales. That can be per user, per device, per API call, anything that makes sense for you. And the SCI score takes energy, location-based (not market-based) carbon intensity, and embodied carbon into account.
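The SCI formula itself is short. The Green Software Foundation defines it as SCI = ((E × I) + M) per R, where E is energy consumed (kWh), I is location-based carbon intensity (gCO2e/kWh), M is embodied emissions, and R is the functional unit. A minimal sketch with made-up numbers:

```python
def sci_score(energy_kwh: float, intensity_g_per_kwh: float,
              embodied_g: float, functional_units: float) -> float:
    """SCI = ((E * I) + M) per R, in grams CO2e per functional unit."""
    operational_g = energy_kwh * intensity_g_per_kwh  # E * I
    return (operational_g + embodied_g) / functional_units

# Hypothetical service: 50 kWh of energy at 400 gCO2e/kWh, plus
# 2,000 g of amortized embodied carbon, serving 10,000 API calls.
score = sci_score(energy_kwh=50, intensity_g_per_kwh=400,
                  embodied_g=2_000, functional_units=10_000)
print(f"{score} gCO2e per API call")  # 2.2
```

Because it's a rate rather than a total, the score doesn't automatically improve just because usage drops; it improves when the software itself gets cleaner per unit of work.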
Exploring the path to future sustainability and bridging metrics to action
Lastly, we have the ISO 14064-1 standard, which is actually built on the Greenhouse Gas Protocol, but divides emissions a little differently: where the Greenhouse Gas Protocol uses scopes one, two, and three, the ISO standard has direct and indirect emissions instead. What's cool about ISO standards, though, is that you can get certified by a third-party certification body, which can be great for auditability, if that's important to your organization.
And like I said, we are not the first people to think about this, and we're not the first to build something for it either, so we have a section in this chapter about tooling. All three major public cloud providers, AWS, Azure, and Google, have their own carbon emission tooling, so we give an overview of what those tools provide, as well as the open-source Cloud Carbon Footprint tool, which is sponsored by Thoughtworks. Of course, not everyone runs software in the cloud, so we also have a short overview of client-side tooling. The takeaway from this chapter is that it's quite heavy on information about the current state of the industry, in both tooling and standards. And the great thing is that our industry is in a much better place than it was just a few years ago.
A lot of these standards and tools did not exist just a few years ago. So, we're very excited to see where our industry continues to move, and hopefully we'll get closer and closer to that perfect real-time carbon monitoring. We also hope that you, our readers, come away from this chapter with a better understanding of what you can do to measure your emissions, and where you can get help doing it, because you're not alone in this. Whatever solution you end up using, we hope you approach your software measurements with curiosity and let them be a conversation starter for your team or your organization. But they shouldn't be the end goal. Let your measurements be the companion to carbon reductions, which is the actual end goal. So, thanks for listening, and I really hope you like this chapter.
Part 1: Collaborative efforts to cover various aspects of sustainability in software
Hello, my name is Anne Currie, and I am a veteran software engineer. I've been in the tech industry for nearly 30 years now, working on a variety of things, from really hardcore C server stuff in the '90s through to modern operations work today. I'm also part of the leadership team of the Green Software Foundation, a new Linux Foundation organization devoted to how we, on the tech side, can manage the energy transition: how we can contribute, and how we can work in that new environment. But today I'm here because I am one of the co-authors of the new O'Reilly book, "Building Green Software." There are three of us, all key folk in the Green Software Foundation, working on the book. Hopefully, it will be a nice, easy-to-read introduction to code efficiency, operational efficiency, carbon awareness, time shifting, and how everything fits together in the green software transition: the transition from fossil-fueled energy, which is massively on-demand (you can get it whenever you want), to renewable energy, which is not so readily available. So, the book is a very broad introduction; it won't be deep-diving. If you want a whole book on a particular subject, I'm sure O'Reilly will start to produce those, given that we're already getting a lot of positive feedback on the book we've done.
The book officially comes out early next year, in the first quarter. But what actually happens, and this is the O'Reilly way of doing things now, which I appreciate, is that as we write each chapter, they do a bit of minimal editing and then it goes straight up on the O'Reilly Safari site. So, if you have an O'Reilly Safari subscription, you can start reading the chapters now, and there's one coming every month. The one available already is the introduction, and there was quite a lot of method in our deciding to go live with the introduction first.
Powering the Future of Software with Renewable Energy Solutions
The introduction is basically a summary of everything we're going to say: a nice introduction to all the key concepts. What do you need to do, what do you need to be thinking about? Hopefully, it's a nice easy read. We're already getting quite a lot of reads on it, so that's good. It looks like a lot of people want to know how to accommodate this new world where power will not always be instantly available everywhere. Now, eventually, it will be; the whole point of an energy transition is that eventually you have transitioned.
At some point, we'll be back in the state we're in now, where power is very readily available and you just don't need to think about it. But for maybe the next couple of decades, that's not going to be the case. The thing about a gas-fired power station, or even a coal-fired power station, is that it's quite reliable. It provides great baseload power: you turn it on and, literally at the flick of a switch, you have energy available. Renewables are not like that, and we've got several problems with the way renewables work. Obviously, they're extremely cheap when they are running. When the sun shines and the wind's blowing, there's tons of power, everything we could possibly want and need. But when the wind isn't blowing, or the sun isn't shining, or it's night, for example, things get a little bit more hairy.
Long-term solutions for a sustainable energy infrastructure
For the next few years, we are going to have to cope with all of that, and we've got a couple of tools in our toolbox, which is what we talk about in the introduction. We've got to be more efficient with power, and we've got to be more carbon aware: aware of what the carbon intensity of the grid is at any moment, and shifting what we're doing so that we run more when the sun is shining and run less when it isn't.
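A minimal sketch of what that time shifting can look like in code (my illustration, not from the book): given a half-hourly forecast of grid carbon intensity, pick the cleanest window in which to run a deferrable batch job.

```python
def greenest_window(forecast, duration_slots):
    """Return (start_index, average_intensity) of the run of
    `duration_slots` consecutive slots with the lowest average
    forecast carbon intensity (gCO2/kWh)."""
    best_start, best_avg = 0, float("inf")
    for start in range(len(forecast) - duration_slots + 1):
        avg = sum(forecast[start:start + duration_slots]) / duration_slots
        if avg < best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg

# Half-hourly forecast: dirty morning and evening, clean sunny midday.
forecast = [320, 300, 250, 180, 120, 110, 130, 260, 310]
start, avg = greenest_window(forecast, duration_slots=3)
print(start, avg)  # starts at slot 4, averaging 120.0 gCO2/kWh
```

A real scheduler would pull the forecast from a carbon intensity API and combine it with deadlines and capacity, but the core idea is just this: move flexible work to where the grid is cleanest.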
In the long run, we will have solutions. We will have batteries, and we will have great grids that can transfer power from the other side of the world where the sun is shining. But we're quite a long way off that; it's going to take some time. So, we need to think about how things work in the meantime. At the moment, we're not necessarily paying much more for dirty power than for clean power, but I think we all know that situation isn't going to continue forever. So, there's a FinOps element to this as well: it will eventually cost, probably a punishing amount, to run an always-on, very energy-heavy application at times when the only thing available to power it is a coal-fired power station. We have an enormous ability to shift things around in the tech industry, and we're very clever, so we really should be leading the field on this. This book is about all of these concepts: all the different ways, ideas, and approaches that folk are coming up with.
Meet Anne’s co-authors – Sara Bergman and Sarah Hsu
As for my co-authors, there are three of us. There's me; there's Sara Bergman, who's Swedish, based in Norway, and works for Microsoft; and there's Sarah Hsu, who's Taiwanese, lives in London, and works as an SRE for Goldman Sachs. So, we're trying to approach this from all different angles and make sure we've got lots of different perspectives. We're going out and talking to people all over the place. Our foreword will be written by Adrian Cockcroft, lately of AWS sustainability. So, we're trying to make sure we get a lot of points of view.
And O'Reilly have come up with this clever scheme, which apparently they use all the time, though I hadn't really heard of it before, where you release chapters as you go. We're using that to get people to give us feedback on what they want to see, and what isn't in the book yet. So, we're very keen for folks to read those chapters. The introduction is the first chapter; the second will be "Measurement." The lead on that chapter is Sara Bergman, so I think next time we'll probably have her in to talk about it. We were under some time pressure with Sara, because she's heavily pregnant and due to have a baby in six weeks' time, so we had to get her chapter out, and get her talking about it, before she goes on maternity leave. After that, we'll get Sarah Hsu in to talk about her chapter; she's going to own the "Carbon Efficiency" chapter. So, I hope that gives you some idea of what's coming in the book, and of what we want from you, which is feedback. We want ideas; we want you to ask all the questions you want answered, so we can make sure we do answer them. Because we've got the time, we can incorporate what you say as we go along. It's a community project: we need your feedback and your questions. And hopefully, we'll get them.