Expert Talk: Are We Post-Serverless?

Updated on April 6, 2024
Julian Wood (interviewer)

Serverless Developer Advocate, Amazon Web Services

James Beswick (author)

Senior Manager, AWS Serverless Developer Advocacy


James Beswick and Julian Wood analyze the evolving landscape of serverless computing, from its current state to its future trajectory. They discuss the fusion of containers and serverless, highlighting the flexibility and efficiency gained from running Lambda functions from container images. Moreover, they emphasize the importance of asynchronous development and the role it plays in scaling applications, with Julian Wood noting its underappreciated potential for high performance. Throughout the dialogue, they touch on cost management, architectural decisions, and the collaborative relationship between AWS and its customers in shaping the future of serverless technologies. As they envision the next five years, they anticipate a continued integration of best practices, platform evolution, and groundbreaking innovations influenced by customer feedback and industry trends.

Beyond Serverless

James Beswick: Hi, everybody, and welcome to "GOTO Unscripted." My name's James Beswick and I lead the Serverless Developer Advocacy team here at AWS, and I'm joined by Julian Wood on my team. Julian, how are you doing?

Julian Wood: Hey, James. We've been doing this developer advocacy thing for a few years and playing with serverless beforehand. Super keen to chat all things serverless as always.

James Beswick: What, for you, is top of mind in serverless these days?

Julian Wood: Well, I think that's an interesting question because everything is serverless, yet there's still a long way to go. And so of course, the title of this video, you know, "Are we post-serverless?", is, of course, slightly click-baity. See, we need to do that kind of thing. We needed to put some GenAI goodness into that as well to make it even more buzz-worthy. But, you know, what do we mean by post-serverless?

Well, first thing I think of is like, serverless is actually a silly term. It means that there's a lack of servers, but what does it actually mean that's of a benefit? But serverless has caught on, as an industry term, as a really good way to build applications, to scale things out, to run things in a cloud-native way or native way in the cloud. And so the premise of this chat is to say we're in a post-serverless world because serverless is just the normal, and so many companies, big and small, are building serverless applications that in a way it's not a stage anymore or a point in time that we did something. It's just an accepted way of running applications with some great benefits.

James Beswick: I was wondering about this. It is an interesting headline, but I was trying to think about where it was five years ago when I started this job, or seven or eight years ago when I first saw it and really started using it in earnest in applications, and where we are today. What do you think is the big shift over that time period?

Julian Wood: I think it's a natural evolution. I think Lambda's actually 10 years old at the end of this year. I mean, that's crazy when you think of it. Sometimes you think cloud is happening so quickly, and then you think, well, you know, Lambda's already 10 years old. And Lambda, in a way, started the serverless movement, even though when it came out, there was actually no mention of serverless. It was just: run your code in the cloud and we'll look after you. And it was the initial mental model of running code without managing servers or infrastructure. And that was super attractive.

Before then, you had to set up EC2 instances or maybe container workloads, but still, that was the early days of that. And it was tough to scale and manage and patch and look after all these kind of things. And so Lambda really was groundbreaking in terms of, you know, throw your code up in the cloud and we will look after a whole bunch of stuff for you. And in this sort of 10 years since we've been doing this, it's now moved beyond just running code to much, much more. And now it's sort of a mental model of delivering value to customers without having to manage complex infrastructure capabilities. So it's not just code, but it can be things like databases or queues or messaging or application architectures.

And funnily enough, think of some of our big serverless services, S3 and SQS: those were two of the original services at AWS. So, you know, they were actually the original serverless services. And I like to say that the cloud was actually born serverless because it started in this way of just being able to connect to an endpoint over the internet, upload a file, do something on a queue. That's not AWS-specific either, you know, lots of the other clouds did this, and at that stage some of them were called PaaSes. That was the whole idea: behind an endpoint, send your code up, and off you go.

Now, that model of SQS and S3 is what it has evolved into. In the 10 years, we've got many more databases. We've got many more services, not just AWS ones; other companies are also building on top of AWS. Companies like Snowflake come to mind, and also companies like Confluent for Kafka. So it really is an evolution of not just the code now, but running any kind of service in a serverless way, which you can just access behind an API.

James Beswick: I think back to when Lambda was launched at re:Invent 10 years ago, and I think that's why Lambda is often equated with serverless, because it was sort of the big shift on the compute side. And I fast forward to now, and obviously serverless is just so much broader, but yeah, many of our customers don't realize just how big the service is. I mean, internally, Lambda's now being used for nearly 50% of all internal applications at amazon.com and there are trillions of invocations every month, and it's all over the world in all these different regions. But of course, one of the things we don't talk about much in our group actually is Fargate. And Fargate is something that's grown equally fast. It's also 10 years old, I believe.

Julian Wood: That's the sort of interesting way the industry sort of moves and comes back. And when Lambda came out, containers were, you know, very early days and they were, you know, difficult to use, difficult to understand. But as the industry has moved on, containers have become really, you know, simple to use, amazing tooling, and all these kind of things. But also there was a lot of infrastructure to deal with for containers, a lot of the compute infrastructure where, sure, you had a control plane, but all those servers you had to manage yourself.

And so Fargate is an AWS service that works with a control plane called ECS, Elastic Container Service, which does all the orchestration. But behind the scenes, all the spinning up of the containers on the actual servers is handled by Fargate, entirely serverlessly. You don't need to manage or think about any of the scaling for that. And yeah, really powerful benefits: sure, you've got Lambda for, you know, short-running functions, and Fargate you can use for long-running tasks and, you know, maybe some small stateful workloads.

Exploring the Boundaries: Serverless vs SaaS

James Beswick: What do you think is the line, though, between serverless, in this very vague way the industry's gone with it, and SaaS? You know, if I'm using a SaaS provider, is it serverless or is it just a service that I'm integrating?

Julian Wood: An interesting question. I think there are probably purists on both sides who will argue either point. For me, thinking from a consumption model, I think SaaS is more just a service you use over the internet. However, the SaaS provider itself is probably running in a more serverless kind of way where they can be multi-tenanted, they can have, you know, multiple customers doing various kinds of things, and they obviously want to manage their SaaS service in a more serverless way because it'll be more scalable and, you know, more secure and more resilient as well. I think with just the consumption model, if you're just pointing your browser or your API at a website, I don't even think you care whether it's serverless or not. And so you're just using a SaaS service. I mean, that's my personal take. There's nuance in that and I'm sure there are different ways people can think about it.

James Beswick: The thing that attracted me to serverless before I worked here was the idea that you could do a lot more with a lot less work. And so I've always been the sort of developer, when you're running teams of people trying to build things, that wants to move as fast as possible, just because there's so much work to do. And so in many ways, I'm less worried about whether I plug in a service like Adobe or Stripe or Zendesk or these sort of third-party applications, and more about how much time it saved my team and what sort of resiliency it gave the application.

Julian Wood: The tenets of serverless are still entirely true today because, you know, in a server-based world you've got lots of complexities of, you know, managing large fleets of various kinds of things. And there's storage that's gonna go in and out on a virtual capacity. And you've got, you know, network connectivity between various resources and the permission constructs and everything that sort of comes with it. And, you know, all of that takes expertise. You've gotta learn storage and networking.

I come from an infrastructure background. I know you come from an app dev background, and there's a whole bunch of stuff you needed to do. But that's not the job. That's not the job of your company. You know, you don't need to be networking or storage or, you know, packaging experts. You wanna deliver value to your customers. And so the serverless mindset is what I sometimes think of as smart delegation, where you smartly delegate things to cloud providers or, you know, other services to just handle things for you because that's not core to your business.

Now, it's smart delegation because you need to be clever about things. You need to understand it. It's not saying you just, you know, ignore the operational overhead. There is some stuff you need to do, but it certainly makes it a lot easier and you can use, you know, AWS and, you know, even other cloud providers' expertise to just run these things on your behalf. But it does take a change of mindset, so you have to evolve the sort of building blocks you're building with to a more cloud-first way.

Serverless Success Stories and Platform Engineering Evolution

James Beswick: Given your background, what do you think about the platform engineering side in terms of what you've seen, how serverless evolved? If we are post-serverless, what does that look like from the point of view of platform engineering?

Julian Wood: I think it's sort of moving back to my earlier comments about how you now compose new applications. In the old infrastructure world that I was in, it was networking, storage, firewall rules and everything. When you're building applications in the cloud, that evolves to different kinds of constructs. And instead of, you know, load balancers, storage, networking, as I mentioned, and instance types, now you're sort of looking at the application constructs. And these can be functions or databases, queues and workflows.

That distinction is actually where some people miss the full proposition of serverless when it's about less infrastructure to manage. Because a lot of developers think, well, I don't look after the infrastructure anyway. That is a platform team or an operations team responsibility. I just handle things at the application level. But your infrastructure team certainly cares. And part of the premise of serverless is that you don't need to just throw it over the wall to a whole platform team internally; there's more important stuff that you can do.

But when you're a cloud developer and you're building applications, there's a lot more that you then need to take on, because now if you are having to decide how you're gonna wire a queue or a function or a database all together, you know, there's a lot of cognitive overhead. A lot of those best practices for, you know, databases or some network connectivity are still there. So I think the whole platform engineering thing recognizes that, you know, you can't be a full stack 10x engineer doing everything. If you're a big enough company or even a medium-sized company, there are gonna be ways that you've learned how to build things that are gonna be secure and resilient and, you know, at scale and powerful. And you wanna be able to package those up so that your developers can use them. And so platform engineering can really help with that. But platform engineering, I think, is an evolution from the infrastructure days of having, you know, DBAs and network and storage people to people who can package up these application constructs to allow your developers to do more.

James Beswick: I think one of the best things in the job we have actually is just be able to see the huge number of customers and what they're doing and talking to them about how they're trying to solve various problems. I think over the years, we must have spoken to, you know, thousands of different people. But what are a couple of your favorite customer stories about how they've used serverless and what the impact has been? Any that really stand out for you?

Julian Wood: I think there's some of the obvious ones where it's just cost and availability. You know, Amazon is sponsoring Red Nose Day. If you're in the UK it is a, you know, massive charity drive. And, you know, they had to keep a whole bunch of servers running for a whole bunch of time. And their infrastructure and their operations, everything used to cost them a whole bunch. But if you think if you're running a big charity drive, you've gotta raise money in, you know, I think it's 24 hours or something, some sort of time thing. But you can't afford anything to go wrong because literally, if people can't, you know, give you money, you're just gonna lose money.

And so Red Nose Day is a really good example of...I mean, they slashed their bills dramatically. I can't think of the exact number, but it was sort of 80%, 90% I think it was. When Red Nose Day rolls on each following year, they have to do very little, you know, setting up infrastructure or doing things. The applications that they were running last year are gonna work this year. Yes, they may need to bump up some versions, but it's so much simpler to just not have to worry about all that kind of scale. So, that's my sort of use case on the, you know, spiky workloads, things that need to happen where you just don't have to do other kinds of things.

But, you know, the premise of this is other companies who are doing amazing things. I mean, the Nationwide Children's Hospital, which we've spoken about a number of times, who are literally running amazing machine learning workloads to help discover and do all kinds of things with cancer for kids. And you would think that would be a regulated environment: it's healthcare, this is kids, this is important genomic data that you've gotta wrestle with. And they've picked serverless because, you know, it was just way easier for them and way more productive.

James Beswick: I mean, they're a great example and certainly some of the nicest people you meet doing some of the most important work. And they've been fairly innovative in how they've used both Lambda and Fargate and used the orchestration tools to really accelerate, especially, as you say, in a regulated environment.

Julian Wood: I mean, just other examples, you know, even Lego talk about, you know, their backend infrastructure and how they're doing these kinds of things. So many companies are doing data processing, Kafka is huge, and so many companies are able to just process data from Kafka at high scale without having to worry about writing all your producers in Java and complexities with scaling, all of that. Yes, data processing and IoT workloads are huge. Just being able to get data that comes from many devices all across the world, you don't even know where it's coming from, and just to be able to manage that.

I know people can accuse me of slight bias because I work within the serverless team, but I actually very rarely hear customers who adopt serverless and, you know, do spend the time to learn it and do it successfully go, "No, this was a waste of time." Once people get the benefits and they go with it, it's just a sort of land and expand within their organization. And once they do the internal learning and training, off they go.

Synergies Between Serverless and Generative AI

James Beswick: I suppose the next thing I have to ask is really the big change in the industry that everybody's talking about is around generative AI. And so, obviously, in this very short space of time, GenAI has come in and really it's revolutionizing this entire space. And certainly internally we can see enormous growth across products like Bedrock and Q and just the industry moving at a huge pace. But don't you think there's a really interesting fit between GenAI workloads and what we've been doing with serverless?

Julian Wood: Completely. I mean, it's sort of hand in glove. In fact, you can't think of any other way really to do it. And that's from two perspectives. One is the actual just using these models, because what are they? They're just an API call over the internet, and whether you are using AWS models or even other kinds of models, it is just an API call over the internet. So in a way it is that sort of SaaS service that you can run. And, you know, the consumption model from that is purely serverless. And you can, you know, write code that runs in a Lambda function or on a Fargate instance or anywhere, and just the model of being able to interact with these models, it's just really simple. So that's on the one side of the consumption where serverless is just so easy to be able to query these models and build amazing capabilities for companies.

Even internally, I was working on something with a colleague recently where we had a whole bunch of feedback items that we needed to summarize and, you know, hundreds and hundreds of feedback items, all really good, to go through. I, you know, started initially going through them all, going, are we gonna categorize them? How are we gonna do all that? GenAI to the rescue: hello, here are all our feedback items. Can you give me the top 10 ones with the originals? And there it was. And it was just such a fantastic use case rather than crawling through the data manually. So that's on the consumption side.

And then on the production side, well, GenAI is, of course, great at creating code. Now, part of the whole serverless thing is we want you to write as little code as possible, which is a good thing. But the code that you do need to write, GenAI is really good at producing that. And if you are writing code in, you know, any number of languages, there are GenAI services like Amazon CodeWhisperer, which is in your IDE, so if you're using VS Code or other IDEs, and Amazon Q is another GenAI service. And I've been using that recently where you've just got a little box that comes up in your IDE and you say, oh, please write me a Python Lambda function that is going to consume messages from a queue and use an idempotency token based on this, and just, you know, 95% of all that boilerplate code is just written for you. And it's amazing.

I think it's really gonna accelerate developers to be able to write the code. And once they have the code, there's that whole operational simplicity of the service: they can take that code that's generated. Obviously, they've got to sanity check it. The GenAI can, you know, write the tests for it as well, which is even better. And they can just, you know, upload that to a serverless service like Lambda or build a container image for Fargate, and off they go. Reducing the whole cognitive load of writing the code and then also running the code in production makes your life so much easier.

James Beswick: I think about the things that we build in our team where we're trying to create realistic workloads for customers, like Serverlesspresso and Serverless Video. And when we built Serverless Video last re:Invent, when this had all just started, we were trying to think of interesting and clever ways to process video with GenAI. That was my awakening to this, because when you look at the process of managing these async requests, where you have to manage retries, error handling, potentially pulling back and amalgamating different data from different models, it was amazing to watch how we could use Step Functions to do a lot of this with really very little code and really how reliable it was. We built this application and took it to re:Invent. It's used by thousands of customers and it just works, you know, really with very little code from our side.

Julian Wood: Well, I think that's sort of the next revolution in serverless that people are about to jump onto. Some have already. The revolutions that people have already taken to heart are running code in the cloud, you know, a lot of data, databases, all this kind of thing. But the actual orchestration and workflows, I think people haven't quite grasped how powerful that is. Instead of running all your own code, you use a managed service. And this can be, as you mentioned, Step Functions, but there are other services out there like Apache Airflow where you can just write your workflows in Python.

Maybe that's attractive to you. It's not quite as serverless, but something like Step Functions is literally amazing because if you think of a lot of applications, what are they generally doing? They are writing some custom logic and, sure, it's gonna be some code that needs to do some analysis or pull in some data and do some transformation. But a lot of your code, you're right, is the boring stuff of retries and branching logic and SDK calls and all this kind of boilerplate stuff. So imagine if you had a service that could just orchestrate those retries with exponential backoff and jitter and all these sort of fancy distributed computing terms where you don't have to think about that, it's just done for you. You've got branching logic, so a response from an API is blue or green or orange or yellow, or different kinds of values, and you can just go down another part of a workflow. You need to write configuration code, not application code. So it never ages out, never needs to be patched, never needs to be tweaked.

Then the SDK part of it is fascinating because Step Functions has the AWS SDK built in. And that is, sort of, 11,000 or 12,000 SDK calls. It's ridiculous. So if you need to write to a database, read from a database, you need to call a machine learning model to do something, you wanna integrate with Bedrock to do some generative AI, you literally just fill in some configuration code as part of a Step Functions state, and it's gonna call this API and then respond to it. And you're not having to write and maintain that code. Once the response comes back, again, you can use the branching logic to do things with it. You can run parallel workflows. Yeah, it is exceptional. And again, this is one of the services that customers don't necessarily know super well, but once they start with it, it's actually addictive and you can't stop because you find so many ways that you can just run these services at scale in the cloud.
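For readers who have not met the terms, "exponential backoff and jitter" can be sketched in a few lines of Python. This shows the general technique that a workflow service takes off your hands, not how any particular service implements it. The function names and parameters are assumptions chosen for the example: each retry waits a random time between zero and a ceiling that grows by a fixed multiplier, up to a cap.

```python
import random


def backoff_delays(base_seconds, backoff_rate, max_retries, max_delay_seconds, rng=None):
    """Return one randomized delay per retry ("full jitter")."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        # The ceiling grows exponentially but never exceeds the cap.
        ceiling = min(max_delay_seconds, base_seconds * backoff_rate ** attempt)
        # Full jitter: pick uniformly between 0 and the ceiling so that
        # many clients retrying at once do not all hit at the same moment.
        delays.append(rng.uniform(0, ceiling))
    return delays


def call_with_retries(operation, delays, sleep):
    """Call operation, sleeping between failures; re-raise on the last one."""
    for delay in delays:
        try:
            return operation()
        except Exception:  # broad on purpose for the sketch
            sleep(delay)
    return operation()  # final attempt; any error propagates to the caller
```

In real code `sleep` would be `time.sleep`; passing it in as a parameter keeps the sketch easy to exercise without actually waiting.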
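As a rough illustration of "configuration code, not application code", the sketch below builds a small Amazon States Language definition as a Python dictionary: a Choice state branches on a color value, and a Task state writes to DynamoDB through the AWS SDK integration with a retry policy attached. The state names, the `ExampleTable` table, and the `$.id` and `$.color` input fields are assumptions for the example, and the `missing_targets` helper is a hypothetical local sanity check rather than part of any AWS tooling.

```python
import json

# Hypothetical workflow definition in Amazon States Language (ASL).
state_machine = {
    "Comment": "Branch on a color value, then save blue items",
    "StartAt": "CheckColor",
    "States": {
        "CheckColor": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.color", "StringEquals": "blue", "Next": "SaveItem"}
            ],
            "Default": "Skip",
        },
        "SaveItem": {
            "Type": "Task",
            # AWS SDK integration: the service makes the API call itself.
            "Resource": "arn:aws:states:::aws-sdk:dynamodb:putItem",
            "Parameters": {
                "TableName": "ExampleTable",  # placeholder table name
                "Item": {"id": {"S.$": "$.id"}, "color": {"S.$": "$.color"}},
            },
            # Retries with exponential backoff and jitter, as configuration.
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                    "JitterStrategy": "FULL",
                }
            ],
            "End": True,
        },
        "Skip": {"Type": "Succeed"},
    },
}


def missing_targets(definition):
    """List transition targets that are not defined as states."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    for state in states.values():
        for field in ("Next", "Default"):
            if field in state:
                targets.append(state[field])
        for choice in state.get("Choices", []):
            targets.append(choice["Next"])
    return sorted({t for t in targets if t not in states})


definition_json = json.dumps(state_machine, indent=2)
```

The resulting `definition_json` string is the kind of document you would hand to the service when creating a state machine; there is no handler code for the branching, the database write, or the retries.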

James Beswick: We've spent many years building these primitives that are really extraordinarily reliable and highly scalable. You've got, you know, queues with SQS, notifications with SNS, event buses with EventBridge, workflows with Step Functions. The list goes on and on. These are very well established services and primitives, but if this is post-serverless, where do we go from just the primitives that we've built to the next stage? Where do you see that going?

Julian Wood: I think that's a fascinating space at the moment because, you know, we often talk about infrastructure as code. That's always been a term. Infrastructure as code is where you define your infrastructure, instead of clicking around in a console or doing API calls, in a template which describes what your infrastructure's gonna be. And, you know, in the EC2 world, this will be EC2 instances and load balancers and all these kinds of things, but it's called infrastructure as code.

And actually, when you're building serverless applications, you've got far less infrastructure to worry about. But you're actually setting up, as I mentioned before, you know, databases or queues or events, or you're connecting to a database to get streaming CDC data. Those are actually application constructs, even though we call them infrastructure. And so a lot of developers are sort of scratching their heads going, is setting up a messaging queue as a buffer between two applications infrastructure, or is it application code? And the fact is it's actually a merging of both of them.

And so in a post-serverless world, we are actually moving away from assembling these infrastructure bits, and it's actually about composing different parts of an application using programmable cloud constructs. And those are gonna be done in various different ways. And there's obviously been the rise of more general-purpose languages, where you write your infrastructure or architecture code in the same language as your business logic. So you've got things like the CDK from AWS, the Cloud Development Kit, but also Pulumi, which is a sort of infrastructure as code, but, you know, written in Python or Node or Go or JavaScript and everything. So that's super attractive for developers to be able to build their applications in the same code.

But what's also happening is there's sort of even an elevation above that where you no longer even know or have to worry about what infrastructure is being built. And so there are companies like Winglang and Ampt, for example, where you actually just write your application code and it figures it out and goes, oh, well, in order to achieve this, we are gonna build you a Lambda function, or we are gonna build you a Step Functions workflow, although I'm not quite sure if that's quite ready there yet, or we are gonna build, say, a Lambda function and a Fargate task.

And then if you code in your application code that maybe you need some rate limiting, for example, it's gonna maybe, you know, getting into the weeds for Lambda, set some concurrency, or it's gonna say, well, actually it'll make sense to put a queue in front of that as well. That's all written as part of your application and these tools are gonna infer from your application and build these new composable constructs. And so in a way, we're way elevated even above the application constructs, let alone infrastructure constructs. Again, that's gonna be a super interesting space where developers are just gonna, you know, write the sort of pseudocode of what they need and then the business logic, and the cloud is just gonna figure out how to put it all together.

Lambda in 2024

James Beswick: Now, you are famous at this point for creating one of the most popular sessions at re:Invent, where you've essentially pulled together everything that's happened in this space over a year. I know you focus primarily on Lambda, but you do look at the broader picture in what you do. And you've built this presentation that, you know, I think it's 1000 slides at this point now, and I think it takes four years to put this together. But if you think about the last one, and go back to the Lambda side of it and the compute side of this, what are some of the things last year that stuck out to you as being, you know, really remarkable things that have changed that now give you some superpowers in what you build?

Julian Wood: I think it's this merging, as we were speaking about earlier, between containers and serverless. And I'm talking specifically about serverless functions and containers, because before there was this maybe unwritten rule, I was never part of it, that there had to be some war between containers and serverless, that they were mutually exclusive. And that never made sense to me because there are fantastic use cases for containers and fantastic use cases for functions, but there was a bit of a divide over how the two were gonna coexist.

And now, since Lambda has come out with being able to build Lambda functions from container images, wow, that's fantastic developer usefulness because you're used to building Dockerfiles for your other applications. Well, you can just use a Dockerfile to build your Lambda function. We still take on a lot of the operational overhead of actually building that and running it for you. But yeah, that's really cool that you can do that. And it's not just being able to run a Dockerfile; there's a massive amount of innovation that goes on behind the scenes to make that super fast.

People worry about cold starts. That's where, you know, functions need to start up. But actually when you are using container images for Lambda, we cache so much of that information within the actual service that cold starts are really negligible. And in many cases, cold starts can even be faster with bigger images than they would be with the previous way of doing Lambda. Yeah. So, you know, that's an amazing innovation where we do things for you, where we meet you where you are doing container imaging and then do a whole bunch of innovation behind the scenes to make that actually realistically work. But yeah, I mean, Lambda is continuing to evolve.

I mean, post-serverless doesn't mean anything stops with Lambda. You know, Lambda is becoming so broad that it handles so many use cases. If you think it runs for 15 minutes, 10 gigs of memory, 10 gigs of local storage, 6 virtual CPUs, you know, that's a huge amount you can do in that. And that's just one invocation; these scale out ridiculously. At re:Invent this past year we even turned the scaling model for Lambda on its head: we basically 12x'd Lambda scaling compared with before. So each function can scale up by, you know, 1000 concurrent executions every 10 seconds. So I mean, there are massive use cases and massive workloads that you could just do on Lambda.

And again, you know, there's that figure for the scaling, 1000 every 10 seconds, but you don't need to worry how that actually happens or if there's capacity or what goes on. So yeah, lots of innovation still going on in Lambda and we'll continue where we just chip away if there are use cases. And, you know, Lambda was founded on this: let's build best practices in the cloud. So we talk about well-architected, it's a term we use at AWS, which is just baking in the best practices from all our years of running distributed applications. And we're just gonna build more of that into Lambda so you don't have to. And so you can run these massive distributed applications without needing to know about distributed applications.
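The scaling figures quoted here lend themselves to a quick back-of-the-envelope calculation. The sketch below is a deliberately simplified model that takes the "1000 every 10 seconds" and "12x" numbers from the conversation at face value; it ignores account-level concurrency limits and anything else that affects real scaling behaviour, and the function names are made up for the example.

```python
import math

SCALE_STEP = 1000   # additional concurrent executions per window (quoted figure)
STEP_SECONDS = 10   # length of one scaling window in seconds (quoted figure)


def per_minute_rate(step=SCALE_STEP, step_seconds=STEP_SECONDS):
    """Concurrency that can be added per minute at the quoted rate."""
    return step * 60 // step_seconds


def windows_needed(target_concurrency, step=SCALE_STEP):
    """Number of scaling windows needed to reach a target concurrency."""
    return math.ceil(target_concurrency / step)


rate = per_minute_rate()                  # 6000 per minute
implied_previous_rate = rate // 12        # 500 per minute, from the "12x" remark
seconds_for_5000 = windows_needed(5000) * STEP_SECONDS  # 50 seconds in this model
```

Under these assumptions a function could add roughly 6,000 concurrent executions per minute, and a sudden jump to 5,000 concurrent requests would be absorbed in about five scaling windows.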

James Beswick: So even though this has really been around for a decade, we still see lots of customers who are very new in this space coming to Lambda and ECS Fargate, all of these tools. If you could distill down, like, what are the two or three things that are good pieces of advice you can give so they can get started quickly building in this environment?

Julian Wood: I think first of all, try not to get overwhelmed because serverless is everywhere and can be everywhere. I think find a small use case and then play and iterate. And a lot of our serverless services have huge amounts of functionality because they can do a lot. But I would start small and start simple. Now, think of a use case that you're gonna do. Maybe it's just running some code that's gonna respond to an API request. So you can connect up an API with Lambda really easily. It can either be native with something called function URLs or behind a more fully featured API service, like API Gateway. And you basically hit the API Gateway endpoint, and every time you do a GET, POST, DELETE or whatever, it's just gonna run a Lambda function. And that is cognitively very easy to set up. You iterate really quickly on it and you can understand that every time you hit that API endpoint, your code is going to return some response. That's a really easy way to just understand how that works.
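As a minimal sketch of that first use case, the handler below returns a JSON response for each HTTP request. It assumes the event shape that Lambda function URLs and API Gateway HTTP APIs (payload format 2.0) deliver, where the method sits under `requestContext.http.method`; the response message is purely illustrative.

```python
import json


def handler(event, context):
    """Return a small JSON response for each incoming HTTP request."""
    # Assumed event shape: payload format 2.0, as used by function URLs
    # and API Gateway HTTP APIs. Fall back to defaults if fields are absent.
    method = event.get("requestContext", {}).get("http", {}).get("method", "GET")
    path = event.get("rawPath", "/")

    body = {"message": f"Handled {method} {path}"}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```

Every request to the endpoint runs this one function, so you can change the response, redeploy, and see the effect on the next call.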

Another one is data processing as well because so many companies are using data processing. If you are using, you know, Kafka or Kinesis, which is specific to AWS, or queues like RabbitMQ or SQS or these kind of things, processing data asynchronously from these queues is just one of the wonders of Lambda. And Lambda runs a poller for you. It's even free. And so you can very simply just write some code that is gonna iterate over the messages in that and, you know, maybe persist them to a database, maybe run some GenAI, who knows.
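A rough sketch of the queue-processing case, assuming an SQS event source: Lambda's poller hands the function a batch under `Records`, and the code iterates over it. The `process_message` logic and the `order_id` field are hypothetical stand-ins. Returning `batchItemFailures` only has an effect when partial batch responses (`ReportBatchItemFailures`) are enabled on the event source mapping.

```python
import json


def process_message(payload):
    """Hypothetical business logic; here it only validates the payload."""
    if "order_id" not in payload:
        raise ValueError("missing order_id")
    return payload["order_id"]


def handler(event, context):
    """Process each SQS record and report which ones failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Report only this message as failed so the rest of the
            # batch is not retried along with it.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

A real handler would persist each message or call another service inside `process_message`; the loop and the failure reporting stay the same.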

Those are sort of two easy use cases to think of, but I'd also recommend that, you know, people look at a site we've got called serverlessland.com. If you go to the learn page we've got so many use cases there, and you've got serverlessland.com/lambda. We've got videos and learning guides and training material and everything. So yeah, try not to get overwhelmed because Lambda and other services can do a lot. Find a little use case you've got and there'll certainly be some example code out there that you can at least learn from and understand how it fits together.

Exploring Cost Management and Asynchronous Development in Serverless Architectures

James Beswick: Another topic I was thinking about, really for architects and for CTOs and people who are making these decisions, and yeah, increasingly developers just because there's more responsibility coming to that space, is around cost. I think it's something we haven't really talked that much about because we always, you know, look at Lambda as being relatively inexpensive: you get, you know, 1 million invocations for 20 cents and so forth, and a sort of very generous free tier.

When you think about production level scale, there's so many aspects of cost that come into this. And I think there's still some confusion about this because, yeah, there's a lot of sunk costs in on-prem IT, and, you know, there's lots of different ways of measuring things, but what are some ways you can think of where we can...a simple way to look at the cost of running workloads in this environment where you can make decisions that are, you know, is it a good fit for serverless or should you do something else?

Julian Wood: I think one of the benefits of serverless is that, because it is small pieces loosely joined, you actually have way more visibility, at least first of all, into your costs. Because you may be running different Lambda functions or you are using a message queue or using an orchestration service like Step Functions, the pricing for all of these is, you know, per function that runs, per state transition, so it's extremely granular. And that's super attractive to companies because previously they would have something on an EC2 instance or a whole bunch of stuff in a container, and that just runs, and they've got no idea of the actual costs of what's running in that.

And so with a serverless model, yes, it's gonna take some work to understand the different cost dimensions, but you're able to model your application in a far more granular way. And you're gonna be able to find out, for example, about my front-facing API that is, you know, returning items to buy in my shopping cart or on my website. Well, that's really, really critical. So you are gonna be willing to, you know, ensure that that is as performant as possible. And the cool thing with serverless is, the quicker it runs, the cheaper it's gonna be. There are a lot of optimizations you can do on all these kind of services to make things fast. But, you know, then that's where you can spend your time because that's making money for your business.
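To make "the quicker it runs, the cheaper it's gonna be" concrete, here is a back-of-the-envelope estimate. The two rates are assumptions for illustration (the 20 cents per million requests mentioned earlier, plus an assumed per-GB-second duration price); actual pricing varies by region and architecture, and the free tier is ignored.

```python
def monthly_lambda_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_million_requests=0.20,   # assumed rate
                        price_per_gb_second=0.0000166667):  # assumed rate
    """Estimate monthly cost in dollars: request charge plus duration charge."""
    request_cost = invocations / 1_000_000 * price_per_million_requests
    # Duration is billed on memory allocated multiplied by time run.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * price_per_gb_second


# 10 million invocations a month at 512 MB, before and after halving duration.
before = monthly_lambda_cost(10_000_000, avg_duration_ms=200, memory_mb=512)
after = monthly_lambda_cost(10_000_000, avg_duration_ms=100, memory_mb=512)
print(f"200 ms: ${before:.2f}   100 ms: ${after:.2f}")
```

Under these assumed rates the request charge stays fixed while the duration charge halves, which is why optimizing one hot function can show up directly on the bill.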

But if you've got some asynchronous task that is doing, you know, backups or putting something into a compliance database or all these kind of things, you can look at that and go, oh, I've actually spotted the Lambda function that's running that, it's costing me, you know, an arm and a leg, what's going on here? And you can look at your code just for that example and say, oh, wow, I didn't realize there were some optimizations we could do. Or, you know, splitting things up or using another service. And just being able to iterate quickly and optimize individual costs within your application can be super important rather than, you know, all of that code being in one big kind of thing where you just don't really know where to look.

James Beswick: I think that's one thing I wish we would, you know, in our group talk about more actually over time, asynchronous development, because, you know, a lot of developers are used to working on a server in a single memory space and just doing a job where you take some data from an API, put it into a database, run scripts and so forth. But as this evolves, really to me, asynchronous development is the magic behind the whole thing that gives you the scale and the control and significantly more visibility into what the application is doing. And if you look at, you know, Amazon's applications, like when we have Prime Day, this is all being powered by asynchronous processes, and that gives the company that enormous scale for these events. But definitely, when I talk to developers, it seems to be something that isn't that well understood in many circles.

Julian Wood: That is true. It's a shame and it's slightly odd because specifically for JavaScript developers, if you're using Node.js and you're familiar with the event loop, ultimately that is partly an asynchronous process where you just, you know, do a promise or that kind of thing, and eventually that code's gonna run somewhere and come back and tell you when it's finished. And that, sort of, asynchronicity can be expanded into the whole, sort of, world of cloud. And yeah, it's just super important and such a great building block for building applications because separating two different services in a smart way where there's less coupling and you're not overwhelming a downstream service or you're able to handle, you know, spikes and downtime and issues and that kind of thing is super effective. And with a lot of these asynchronous services, that's just built in.

I mean, if you talk about even Lambda or EventBridge or these kind of serverless services, Step Functions we mentioned before, they're multi-AZ by default. You're not deploying Lambda functions in specific availability zones or having to, you know, fail over EventBridge or some other kind of thing. And so, you know, putting a message onto an EventBridge event bus, it's gonna be there, it's gonna have retries and error logic built into it, so it's much more available and you're gonna be able to send those messages on to other kind of services. And yeah, I think you'd be surprised also by the performance characteristics of these.
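A sketch of what "putting a message onto an EventBridge event bus" can look like from code, assuming the boto3 `events` client and its `put_events` call. The bus name, source, detail type, and order fields are hypothetical; the client is passed in so the entry-building part can be exercised without AWS credentials.

```python
import json


def build_order_event(order_id, status, bus_name="orders-bus"):
    """Build one PutEvents entry; all names here are hypothetical."""
    return {
        "Source": "com.example.orders",
        "DetailType": "OrderStatusChanged",
        "Detail": json.dumps({"orderId": order_id, "status": status}),
        "EventBusName": bus_name,
    }


def publish(events_client, entries):
    """Send entries to EventBridge and return how many were rejected."""
    # events_client is expected to be boto3.client("events").
    response = events_client.put_events(Entries=entries)
    return response.get("FailedEntryCount", 0)
```

Usage would be along the lines of `publish(boto3.client("events"), [build_order_event("42", "READY")])`. The publisher does not know or care which rules and targets consume the event, which is the decoupling described above.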

Because a lot of people think, oh, well, I must use synchronous for my really fast workloads and then anything async, oh, well, you know, that's batch processing, or that's stuff that's gonna take a long time. I can wait for that in the order of, you know, minutes to hours. But asynchronous can be super fast. And it's surprising. Even when you were talking about Serverless Video and Serverlesspresso, those are a mix of sync and async applications, but all of the frontend notifications from the application are actually started asynchronously.

Recommended talk: Serverlesspresso: Building a Scalable, Event-Driven Application • Julian Wood • GOTO 2022

And there's a whole process, you know, something will happen in the backend, which triggers a notification, and that gets sent using IoT Core over to the frontend via an async Lambda function. And you're sitting on your mobile phone ordering your coffee or watching a video, and you have no idea that this is, you know, an async process that you would expect to be really slow. And, yeah, it's as fast as you can write a message. So, I think that's also the model that's gonna evolve, building these asynchronous processes but understanding that they can be as performant as synchronous ones.

James Beswick: I think Lambda functions, the name might be a little bit misleading because, it's not just a function really, a Lambda function can be an entire microservice.

Julian Wood: Mini app.

James Beswick: You can scale up to the Lambdalith, where in some cases it can be even an entire application. And so there's been this debate for years about how big your Lambda function should be. Luca Mezzalira had a really good article in the Compute Blog earlier this week about this, where we see customers start with the Lambdalith, where it's just basically lift and shift everything into one big function to do everything. Or they go and start with one function per purpose, which is where I started, where you have lots of tiny functions, all completely independent, and you tend to then find problems whichever way you started and go violently in the opposite direction, from Lambdalith to single function or vice versa. And he proposed a third way, which I thought was interesting, around how you can dissect this. But that, to me, is part of the evolution of the post-serverless idea about the architecture of applications.

Julian Wood: Definitely. And I think that sort of evolution is natural and healthy. And I think fantastic. And in fact, even the term Lambdalith is something I've sort of tried to avoid speaking about anymore because customers previously have been building applications on EC2 instances, and maybe they've even been using containers. And then that is a proper monolith, we're talking big applications over here and they've wisely decided maybe we're gonna use Lambda. And they then will take even a Flask application or an Express application or some sort of application that's got 7, 8, 9, 10 different code functions within their function and they've moved it to Lambda and it's working.

And then we've come up to them and said, well you know, the story is really, you should be using Lambda for, you know, a single-purpose workload and doing all these kind of things. If you're not doing that, you're building a Lambdalith. And they're like, hang on, hang on, hang on, I've just gone from a monolith over there. I've moved to Lambda. That's a great thing. Why are you still calling it a lith? You know, this is freaking me out. And so I think we maybe put some people off in terms of the terminology, using the term Lambdalith as if that's a bad thing. And as you say, there are two different approaches to it. You either get very granular, you've got super tight permissions, you've got super control over everything, but a lot of Lambda functions to run, and that's gonna be a hassle operationally.

Or you put everything in a single Lambda function and then you haven't got as much control over those kind of things. And so, as with all consultants, they say it depends and there's gonna be a sweet spot for you somewhere in the middle. And, you know, Luca's post had a really good idea of, well, separate your writes and your reads, for example. Your writes and your reads are gonna have different database permissions, and maybe you've got a relational database that you can optimize for where it's gonna write and you're gonna have read caches somewhere else for reading from it, or you're gonna have a separate cache service like ElastiCache and yeah. So you can, you know, group your functions, have a function for read and a function for write.

You've got all the permissions, the granularity and still high performance, and yet you're not managing so many Lambda functions that that becomes unwieldy. So yeah, that is the evolution. And I actually love how the architectural practices are evolving and we don't wanna be too dogmatic and say, you know, there are any wrong or right ways because, you know, customers do awesome things. And we want to sort of say, here are the tools, you can use them and we are gonna try and help you make the best choices. And hopefully, with the serverless way, actually try to reduce the mistakes you can make.
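One way to picture the read/write grouping described above is as a routing plan: every route lands in one of two functions instead of one function per route or one function for everything. The function names and routes below are hypothetical, and this is only a sketch of the grouping idea, not the approach from Luca's post itself.

```python
READ_METHODS = {"GET", "HEAD"}

# Hypothetical API surface for an orders service.
ROUTES = [
    ("GET", "/orders"),
    ("GET", "/orders/{id}"),
    ("POST", "/orders"),
    ("PUT", "/orders/{id}"),
    ("DELETE", "/orders/{id}"),
]


def group_routes(routes):
    """Assign each route to a read function or a write function."""
    groups = {"orders-read": [], "orders-write": []}
    for method, path in routes:
        name = "orders-read" if method.upper() in READ_METHODS else "orders-write"
        groups[name].append((method, path))
    return groups


plan = group_routes(ROUTES)
for function_name, routes in plan.items():
    print(function_name, routes)
```

With a split like this, the read function can be given read-only database permissions and pointed at a cache or replica, while the write function holds the write permissions, so there are two sets of permissions to manage rather than five.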

James Beswick: I think what's fascinating about this space, and one of the great things again about our job, is we've got this amazing community. So, you know, you think about the millions of people who use Lambda every day, and within this, there's this huge community of builders and heroes who are very vocal and having these debates. And for us, it's interesting I think, to stand back and watch what they're building and why they land on these certain decision points. But over the years, a lot of things oscillate between one direction or another. And I think about things like the runtime choice, you know, that right now there's a big push around what sort of runtimes you can choose for optimizing performance. But in many respects, I'm not sure it matters. Lambda works incredibly well with really any choice of runtime and, you know, your own expertise should probably be the driving choice. But at the same time, it's just fascinating to see how fast it can get when people use Rust or Go or some of these newer runtimes.

Julian Wood: Well I think that's sort of, you know, just showing how evolved and how mature the platform is, where yeah. I mean, some of these runtimes are, you know, low single-digit millisecond latency for a Lambda function to run. I mean, it's exceptional. And so you think of all of the power and all of the scale for so many millions of customers behind the scenes. And yet you can run a Lambda function with up to 10 gigs of RAM, as I said, six virtual CPUs in, I don't know how many regions across the world. I mean, it's the world's biggest, you know, distributed supercomputer and it's available with millisecond latency. So it's actually incredible.

James Beswick: When you get into the optimization side of things, it's really fascinating. Because obviously, you know, you can fine-tune Lambda functions to be incredibly fast with all these levers that you have available. But then when I look at some of the things we've built, actually, you know, you can sometimes just remove the Lambda function completely. And again, nothing's faster than no compute. So, you know, when I look at API Gateway integrations and how those play into these asynchronous designs, I'm starting to see more of that, it's more popular...it seems to me a really clever way of getting that scale and that throughput really without any sort of code maintenance at all.

Julian Wood: That's one of the big pushes of serverless. When you started this conversation, it was about how it started with running your code in the cloud. Well, you know, we wanna encourage you to run and write as little code as possible. And so there's Step Functions and its SDK integrations, and API Gateway, as you mentioned, with direct integrations with a whole bunch of services. Yeah, I mean, I'm certainly not the world's best coder. I don't wanna run my code in production. So if some other service can run my, you know, business functionality on my behalf, yeah, bring it on.

Predicting the Future of Serverless and AWS's Role in Shaping It

James Beswick: So, now you've been doing this...I mean, you've been here as long as I have, working on these things, and we think about post-serverless in the years to come. If you fast forward five years from now, what do you see customers doing in the serverless space as different, or maybe just the same, but, you know, how do you think it would evolve?

Julian Wood: I hope that it just becomes more of the new normal and some of the operational practices just become embedded within companies. I'd like to see more of the platform evolution, you know, a lot of companies are building platform engineering teams, and they're still taking on a lot of heavy lifting at the moment, specifically when they're running container workloads. A little bit less so with Lambda. But as those two worlds converge in the future, I'd love to see just best practices built in, platforms that companies can just use to run their service code with observability and all the metrics and all the cool cloud stuff built in.

It's gonna be interesting to see if there are gonna be any groundbreaking new approaches or new changes. We have been speaking about some of the inferring architecture from the code. That's exciting. Yeah, it's gonna be exciting to see what AWS and even our ecosystem partners are gonna be able to build in the future. But I think the post-serverless world is, sort of, here and is gonna continue, and there's just gonna be more and more iterations and more and more improvement and yeah, good for it.

James Beswick: I think if people are listening, what you might not realize is that you're responsible for the future. So the way things get built here is that you tell us what you want and we build it, and in AWS that's entirely how our roadmaps are put together, with the ideas that customers have. So the future is not set in terms of AWS's vision. It's very much influenced by what customers are building. So to me, it's gonna be amazing to see what's coming out in the next five years or so.

Julian Wood: We're just talking about serverless and functions here, but, you know, the whole web space is growing and going, you know, crazy with so much innovation over there. So yeah, watch this space. Let us know what we should build. I mean, we'd love to do it. That's what we are here for, to make your life easier. Let us know and we'll try and get on it.

James Beswick: You can always contact any of us in the DA team. If you go to Serverless Land, we have a number of repos, a number of contact sites there, or you can reach out to us on LinkedIn or email, but we're always interested in hearing what your ideas are.

Julian Wood: Some of those are product ideas, but also if there are things you don't understand or want explained in a different way, yeah, let us know. You know, that's what we love doing is to be able to, you know, bridge the gap between, you know, what people need to understand and what we can create.

James Beswick: Well, it's a really exciting world of experimentation that, even though it's 10 years old, still feels very new to me and it's really exciting to see what people are doing. So, Julian, it's always a pleasure to chat with you. You've always got some really interesting insights into where we've gone with serverless and where we're going, but I think we're at time now. So thank you very much. My name's James Beswick and this is Julian Wood, and we'll see you all again soon.

Julian Wood: Bye-bye.

Related

CONTENT

Optimizing EDA Workflows: Realtime Serverless Communication with Momento Topics
GOTO EDA Day Nashville 2023
Journey to EDA: Patterns, Best Practices, and Practical Tips
GOTO EDA Day Nashville 2023
Deep Learning for Developers
GOTO Amsterdam 2018
Microservices Without Servers
GOTO Amsterdam 2017