Building Better Software: Why Workflows Beat Code Every Time

Serverless Espresso served 100,000+ users. The secret? Workflows, smart architecture, and embracing failure. James Beswick & Ben Smith share hard-won lessons from building at scale.

About the experts

Ben Smith

Principal Developer Advocate for serverless at AWS

James Beswick

Senior Manager, AWS Serverless Developer Advocacy

James Beswick: Hi everybody, and welcome to GOTO Unscripted Live from Chicago. My name is James Beswick and I run Developer Relations at Stripe. Here I am with Ben.

Ben Smith: Hi, I'm Ben Smith, a staff developer advocate at Stripe.

Why Workflows Matter

James Beswick: We had a few things we wanted to chat about that we've actually been talking about for a while. Ben, you've been doing a lot of work over the years in workflows. I think one of the questions that people don't think about is, why do you need a workflow to begin with?

Ben Smith: It's almost counterintuitive for a developer to say code less and do workflows more. But there are so many good reasons why you would do this. The first is that if you're building things in workflows, there's literally less code to manage. I'm talking about workflows where it's not necessarily something that you built using an SDK—something where you have an interface that you can drag and drop and build out a state machine.

The second is that many of these workflow services are pay per use. They're serverless in nature. They scale automatically, and their cost grows with your business. Another good reason would be the way that you're able to decouple these workflows from other workflows to create microservices. What would you say is a really good reason to use workflows?

James Beswick: I'm thinking about some of the difficulties when you're in distributed systems of handling issues like idempotency. When you've got a problem where maybe due to a network issue, a message didn't arrive, how do you know on the receiving side you're not handling something twice? Workflow implementations are a really good way to handle this type of sticky problem.

Or, on the other side of that, handling something like an error. When something goes wrong in a custom flow, you need to back out and cancel things in a certain way. When I think about some of the things you built (you were famous at AWS for building Serverless Espresso, and at Stripe you're building similar things), these are the very sticky problems where workflows have really interesting use cases.

Ben Smith: I've done a bunch of talks where I said you should use Step Functions—which is the AWS tool for building workflows—always. In hindsight, I would roll that back a little bit and say you should probably start with deciding why you wouldn't want to use a workflow.

I think it's a really good solution for most microservices where you need to run a series of APIs or a series of discrete pieces of code, maybe in Lambda functions in response to certain events. Because you have this way of very visually being able to orchestrate, branch on if statements, create loops, you can create circuit breaker patterns, you can create catch and retry patterns. Pretty much everything that you would build in code you can visually represent as a workflow. You can put configuration into your application rather than adding extra code. You can catch errors, you can retry errors. You can send those errors downstream to other queues or services to handle them. And it's all very visual when these things go wrong, so you can inspect them easily.
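To make the catch and retry idea concrete, here is a minimal Python sketch of what a workflow service configures for you. It is not Step Functions code: the function name `run_with_retry_and_catch`, the `dead_letters` list standing in for a downstream queue, and the default values are all assumptions made for illustration.

```python
import time
from typing import Callable, List


def run_with_retry_and_catch(
    task: Callable[[], str],
    dead_letters: List[str],
    max_attempts: int = 3,
    base_delay_seconds: float = 0.0,
) -> str:
    """Run a task, retry on failure, then route the final error to a catch path."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as error:
            if attempt == max_attempts:
                # Catch: hand the failure to a downstream queue (hypothetical
                # stand-in: a plain list) instead of crashing the caller.
                dead_letters.append(f"failed after {attempt} attempts: {error}")
                return "sent_to_dead_letter_queue"
            # Retry: back off exponentially before the next attempt.
            time.sleep(base_delay_seconds * (2 ** (attempt - 1)))
    raise AssertionError("unreachable")


# Usage: a task that fails twice and then succeeds on the third attempt.
calls = {"count": 0}


def flaky_task() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary outage")
    return "ok"


dead_letters: List[str] = []
result = run_with_retry_and_catch(flaky_task, dead_letters)
```

In a workflow service, the attempt count, the backoff, and the catch destination would typically be configuration on a state rather than hand-written control flow like this, which is the point being made about putting configuration into the application instead of extra code.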

James Beswick: I'm going to ask a question that I suspect many people want to ask but everyone's a bit afraid. What is the circuit breaker pattern?

Ben Smith: The circuit breaker pattern is a way of preventing you from constantly calling a downstream service that you know to be faulty. If you imagine this in a workflow, you would have something where you check the overall status of your circuit or your application. If it's good, you go ahead and you do the things that your application needs to.

Now suppose something errors out in the workflow. You catch that error and update the status of the circuit to say, okay, this is now a bad circuit. This is broken. The next execution that comes in then sees that there's a problem, one that was caught from a previous error. We don't want to keep calling these downstream services, especially if it's something in the cloud where you're paying every time you call it.

With your circuit breaker pattern, you query the status at the beginning of the execution. If it's known to be broken, you do some kind of escalation path. If it's not broken, you can continue executing on the happy path. It's essentially stopping you from continually calling those services that you know to be bad.
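The circuit breaker flow described here can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular workflow service implements it: the `circuit` dictionary stands in for wherever the status is stored, and the names `run_execution`, `"OPEN"` (broken), and `"CLOSED"` (healthy) are chosen for the example.

```python
from typing import Callable, Dict


def run_execution(
    circuit: Dict[str, str],
    call_downstream: Callable[[], str],
) -> str:
    """One workflow execution guarded by a very simple circuit breaker."""
    # Step 1: query the circuit status at the beginning of the execution.
    if circuit["status"] == "OPEN":
        # Known to be broken: skip the downstream call and escalate instead.
        return "escalated"
    try:
        # Happy path: the circuit is healthy, so do the real work.
        return call_downstream()
    except Exception:
        # Step 2: catch the error and mark the circuit as broken so that
        # later executions stop calling the faulty service.
        circuit["status"] = "OPEN"
        return "failed"


# Usage: the first execution fails and opens the circuit; the second one
# is short-circuited and never touches the downstream service.
circuit = {"status": "CLOSED"}
downstream_calls = []


def broken_service() -> str:
    downstream_calls.append("called")
    raise ConnectionError("downstream service is down")


first = run_execution(circuit, broken_service)
second = run_execution(circuit, broken_service)
```

The sketch leaves out how the circuit gets reset once the downstream service recovers. Real implementations usually add a cooldown or a trial request for that.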

James Beswick: What's really interesting is these things are very hard to build by yourself. If you're trying to build that type of logic just off the bat, this is really hard to get right and to test. I think one of the interesting parts of workflows is when you look at real life enterprise development, it's full of these kinds of things that tend to be awkward but can be encapsulated very easily within workflow software.

Ben Smith: You can imagine a typical developer building lots of things all the time, context switching even hour to hour, let alone month to month. If you have something like a circuit breaker pattern where something has gone wrong or it's broken, you can open that up in a workflow months later and instantly understand where it went wrong. If it was written in code, I think it would take most people a lot longer to reason about what they built and where it went wrong.

Versioning and Evolution

James Beswick: When you and I have been building things together, what you don't often see in demos and Hello World examples in tutorials is the fact that real life is full of versioning. You might get your business case down and it works for today. Then six months from now, you suddenly find the process has changed and you have to update it. In software, if you build this yourself through your own code, you become very hesitant to change that code. It gets fragile because everybody's relying on it. Tell us a bit about how you would look at doing versioning with this type of approach.

Ben Smith: There are a few different ways. You can have canary deployment so that you check that the new thing you've built is working by sending a smaller amount of traffic to it at first. You can catch problems with that and then roll back effectively. You can tag things with versions and have your own form of version control.
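The canary idea, sending a small share of traffic to the new version first, can be illustrated with a short Python sketch. The names `route_request`, `stable_workflow`, and `canary_workflow`, and the 10% split, are assumptions for the example; managed services normally handle this routing for you.

```python
import random
from typing import Callable


def route_request(
    stable: Callable[[], str],
    canary: Callable[[], str],
    canary_fraction: float,
    rng: random.Random,
) -> str:
    """Send roughly `canary_fraction` of traffic to the new version."""
    if rng.random() < canary_fraction:
        return canary()
    return stable()


def stable_workflow() -> str:
    return "v1"


def canary_workflow() -> str:
    return "v2"


# Usage: route 10,000 simulated requests with about 10% going to the canary.
rng = random.Random(42)  # seeded so the simulation is repeatable
versions = [
    route_request(stable_workflow, canary_workflow, 0.10, rng)
    for _ in range(10_000)
]
canary_share = versions.count("v2") / len(versions)
```

Rolling back then amounts to setting `canary_fraction` to zero, and a full rollout to setting it to one.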

I think it's specific to what you're building, the team you're working with, and the tools available to you. When we were building Serverless Espresso, we would come into the room every morning and redraw the workflow and try to understand what's happening and how things are connected. Even if you're doing things as a workflow with the visual builder, these things are still difficult. They're still hard to get your head around because there are so many disparate pieces. It's an issue with building with microservices in general.

James Beswick: Serverless Espresso is a good example because it's an ordering process where various things have to happen in a sequence. It gets really interesting. You add something like payments because with many things you can make mistakes, but on payments, you cannot. If you accidentally do something you can charge a customer twice or not at all, or something else goes wrong and you burn that trust with the customer.

With payments, the fact that you charge someone or don't charge them is something you've got to get right every single time. It doesn't matter if you have millions of transactions, it still has to be right each time. When you think about idempotency, this is something we don't discuss enough. How can idempotency be bundled in these workflow services in a way that makes it easy?

Ben Smith: First of all, we should explain idempotency because it's one of these words that you only really hear in terms of computing. My interpretation of idempotency is that if you provide the same input, you should expect the same result every time.

There are a few things you can do. The first is that you can have a literal idempotency ID to make sure that each execution pertains to a certain result, so that you don't reproduce different results with the same idempotency ID. What else would you say would be a good way to handle that?

James Beswick: The use of IDs and the way you use those and track them is important because you can use something like DynamoDB to keep track of these keys—have you seen them before? Some services just build idempotency in. If you provide IDs, that's useful.
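As a rough illustration of tracking idempotency keys, the sketch below uses an in-memory dictionary as a stand-in for a durable table such as DynamoDB. The class name `IdempotentProcessor`, the key format, and the `charge_card` handler are hypothetical and exist only for this example.

```python
from typing import Callable, Dict


class IdempotentProcessor:
    """Run a handler at most once per idempotency key and replay the result."""

    def __init__(self, handler: Callable[[dict], str]) -> None:
        self._handler = handler
        # Assumption: a dict stands in for a durable key-value store.
        self._results: Dict[str, str] = {}

    def process(self, idempotency_key: str, payload: dict) -> str:
        # Seen this key before? Return the stored result without redoing work.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = self._handler(payload)
        self._results[idempotency_key] = result
        return result


# Usage: the same message is delivered twice, but the card is charged once.
charges = []


def charge_card(payload: dict) -> str:
    charges.append(payload["amount"])
    return f"charged {payload['amount']}"


processor = IdempotentProcessor(charge_card)
first = processor.process("order-2", {"amount": 450})
retry = processor.process("order-2", {"amount": 450})  # duplicate delivery
```

With a real database, the check and the write would need to be a single atomic operation (for example a conditional write), otherwise two concurrent deliveries could both pass the check.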

But I think there's also a state of mind: don't assume that the provider just does it for you, or that there won't be a problem. In testing with small volumes and small batches, these problems often don't come up. But when you look at millions of transactions, these long-tail problems are part of the reality, and you're going to start to see them arrive. As long as you think about them, plan them in, and use services in the way where they are supported, you can approach it the right way. But it's amazing how many customers we've dealt with and talked to where it's just assumed there won't be a problem.

Ben Smith: This is a challenge with being a developer advocate as well. You build applications that are often fairly small in scale. You try to build them to scale. But Serverless Espresso, for example, is probably the most used application that we would have built as a DA. That's had maybe 100,000 users go through it in total. This is not a massive thing. There are not hundreds of thousands of events happening every day. So you don't always come across these edge cases that customers are coming across.

James Beswick: When we tested the workflow for Serverless Espresso, we tested for a million orders a day to see what would happen and where the breaking points were. Of course, the application would never see a million orders of cups of coffee at a conference. But it's worth doing that kind of testing to see where the weird behaviors start to happen.

Ben Smith: That was a good plan actually, to use the scalability of the workflow service to test itself in a sense. That was probably the closest we could have come to simulating a real event.

James Beswick: With the workflow approach, it's not just Step Functions. There are other services like Temporal, which is a very common option. I think as a concept, it's underused. When I think about what people are building out there, there's still too much building it yourself when you could just pull in a service that handles a lot of this complexity for you.

Ben Smith: I think it's just the way developers are built as a persona. You're problem solvers. So you see a problem and you think, oh, I can just write something to do that. But it's a mind shift to allow yourself to use something built by a whole team that is dedicated to just that one service, whether that's payments or serverless or observability. Just pulling in the API that team is offering can sometimes save you so much time in the long run, and offer you so much more power than solving that on your own as a developer.

From Monoliths to Microservices

James Beswick: I've been doing software for 25 years. One of the big things that's changed is that back in the day when I was a C++ programmer writing things on the trading floor, they were all running on one machine in the same memory space using a hard disk. Nothing was really going over the network, which means there are a lot of problems you don't tend to see, like race conditions. Now with the adoption of cloud and all these services, these problems are in a way unavoidable. As you go towards distributed computing, you're suddenly dealing with networks where things go wrong and things don't always happen in the sequence that you expect. The planning is a lot harder when you're putting these types of apps together.

Ben Smith: The most successful customers that I've seen are those that pull in the best of breed. They'll have maybe one or two cloud providers using the best of both. They'll have something for payments or something else for their observability piece. Then the bit in the middle is their secret sauce. That's the value or the unique thing that their application does that no other can do. But pulling in these other APIs and these other services as microservices—what do you think when you look at the progression from monolith to microservice and potentially now back to monolith? We're starting to see a little bit of that. It seems to be at a weird point.

James Beswick: I think microservice design has been confusing for the community generally because there's always this question of how big a microservice should be. Obviously it depends on the organization and the team size and what you're building. All sorts of variables make it hard to come up with absolute rules. Often you see microservices that are just way too small or way too big.

When I come out the other side of that, I think there are some things you can say that you shouldn't build yourself. Auth is one of these things where you've got services like Auth0 that do this for you very well. Cloud providers have similar services. But the one thing you can say with certainty is you shouldn't roll it yourself. It's one of those dangerous things. If you get it wrong, you're going to get hacked, you'll lose data, you'll lose trust. It's the same with things like payments. Credit card numbers are effectively toxic. Why allow them in your application when you can easily farm out that work to a provider to do it for you?

The services in the middle, where you model your own business logic, are still a bit open to interpretation, including how big they should be. I think the move towards the monolith again is probably just microservices getting big once again.

Ben Smith: There's an interesting train of thought as well that a workflow in itself is a kind of monolith—a state machine is a monolith—because you have this enclosed area where the workflow only really knows about what it knows within that execution history. The more things you drag in and orchestrate, the more this workflow grows. Eventually you can very quickly accidentally build a monolith as a workflow. What are some of the things you've seen customers do to get around this problem of accidentally building a monolith?

James Beswick: It happens over and over in different ways. I think it's easy with microservices to over-orchestrate what goes on between them and you accidentally create this very fragile glue between everything.

Ben Smith: Exactly.

James Beswick: One of the reasons I really liked EDA as a concept, event-driven architecture, was that it forces that apart a little bit. But even then you can get to similar problems, where the choice between very thin events and very large events, or the flow of events itself, ends up recreating the monolith.

Ben Smith: What's an example of a thin event then?

James Beswick: If you look at an ecommerce app, a thin event would just say the order ID is number two. A fat event, the opposite, would say the order is two, and the order contains a cappuccino, it's for Ben, it's this type of milk, here is the currency. You really put everything you know about it into that. Somewhere in between the two is usually the right place to be.
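To show the difference between thin and fat events side by side, here is a small Python sketch based on the coffee order example. The event type name, the field names, and the specific milk and currency values are made up for illustration.

```python
import json

# Thin event: just enough to identify what happened.
thin_event = {
    "type": "order.created",
    "order_id": 2,
}

# Fat event: everything the producer knows about the order.
# The milk and currency values are hypothetical placeholders.
fat_event = {
    "type": "order.created",
    "order_id": 2,
    "drink": "cappuccino",
    "customer_name": "Ben",
    "milk": "oat",
    "currency": "USD",
}

# One possible middle ground: the ID plus the fields most consumers need.
balanced_event = {
    "type": "order.created",
    "order_id": 2,
    "drink": "cappuccino",
    "customer_name": "Ben",
}

# Compare the serialized payload sizes of the three shapes.
sizes = {
    name: len(json.dumps(event))
    for name, event in [
        ("thin", thin_event),
        ("balanced", balanced_event),
        ("fat", fat_event),
    ]
}
```

The tradeoff is that thin events push consumers towards a lookup call for the rest of the data, while fat events couple every consumer to the producer's full data model.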

Ben Smith: In the thin event version, presumably you would store the rest of the info in some quick lookup so you can access it.

James Beswick: That in itself is a problem because then each downstream service might well be pinging that service to find out what the overall state of order number two looks like. It can get overwhelmed by the number of requests, or you've just got a lot of traffic going backwards and forwards.

Ben Smith: Then each downstream service needs the authorization to access that resource to get the extra information. Where does that sit? Maybe you build that into the workflow piece and then you're back to your monolith again. I think you're right: events are one great solution to decouple these services, especially with workflows.

You use your workflow piece to orchestrate within some sort of microservice boundary, and you have to draw that boundary eventually at some point. I think it's normally within a certain domain. You can normally understand the domain—this domain is the thing where I'm ordering drinks, and I've got a database to keep track of the orders, and I've got a workflow to orchestrate how those orders change. Then I've got something else which is maybe the customer microservice and an API to update customer and a customer database. These might want to communicate with each other by consuming and producing events. In the middle of that you have some sort of notification service or event bus or something like this.
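The arrangement described here, two domains that only communicate by producing and consuming events through something in the middle, can be sketched with a tiny in-process event bus. This is a toy stand-in for a managed event bus or notification service; the `EventBus` class, the event name, and the handler are assumptions for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Minimal publish/subscribe bus sitting between two domains."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, event: dict) -> None:
        # The producer does not know who is listening, or whether anyone is.
        for handler in self._subscribers.get(event_type, []):
            handler(event)


bus = EventBus()

# Customer domain: keeps its own data and reacts to order events.
customer_order_counts: Dict[str, int] = {}


def on_order_completed(event: dict) -> None:
    name = event["customer_name"]
    customer_order_counts[name] = customer_order_counts.get(name, 0) + 1


bus.subscribe("order.completed", on_order_completed)

# Order domain: its workflow finishes and publishes an event.
bus.publish("order.completed", {"order_id": 2, "customer_name": "Ben"})
```

Neither domain imports the other. The only shared contract is the event name and its fields.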

James Beswick: With a lot of these design choices, however you build these things, the computers and the servers and the systems behind them will support it and it'll work just fine. Mostly. Generally speaking, the real design choice comes down to the cognitive load you want to put onto your team.

When I think back to jobs I've had in the past where we built things, the monolith tends to require developers to know way too much about what's going on. It makes onboarding new developers hard. It makes it hard to make changes to systems. Really, these types of designs are all about how can you reduce the amount of knowledge you need about each service. More importantly, when you come back six months from now and you don't remember what you built, how quickly can you onboard yourself to the design so you can make changes to it?

Ben Smith: This is the challenge with things like Lambda as well because there's a lot for you to learn upfront before you can even start dropping code into your Lambda function. You could say it's a similar problem with something like SAM, the AWS Serverless Application Model for deploying applications on AWS, or Serverless Framework itself. These are all things that carry a big cognitive load, with a lot to learn before you can start adopting them.

Whereas something like CDK, for example, the AWS CDK, is a more familiar framework for building and coding. These services that meet developers where they are (they're using the languages that developers are already using, and terminology that we're already familiar with) are the ones that seem to be a lot more successful.

Extensions and Extensibility

James Beswick: A lot of the work you did on extensions and extensibility is underused in the community as a concept because people are always trying to build the one application to rule them all, the thing that covers every use case. Every time a new feature comes in, you're just absorbing it into the code base. To me, this is not the way to go. You just increasingly create a blob that isn't maintainable and over time atrophies. Whereas extensions can be really powerful. Tell us a bit about extensions because that was something you built for the Serverless Video project.

Ben Smith: This was a concept where we would create an application that does something cool. Take an example of a live video streaming app. We would build the core functionality. But what we wanted was to allow people to extend that core functionality. The way we wanted them to do that was by consuming events. These events are fired off at important milestones along the post-production workflow of this live video stream.

The video streams, the video ends, and after that there's an automated post-production workflow that does things like translate the video, generate captions, or work out who was in the video. We wanted to create events that would occur along this workflow and publish these events so developers could say, "I'm interested in when the captions file has finished generating because I want to make an extension that would translate the captions into Spanish." They could hook into the event that says "captions ready," run their own microservice, and return a response.

We built it in a way that the workflow would wait up to two minutes for all the different plugins that have hooked into that given event to return their response. If they don't return within two minutes, it just moves on. If they return, their data goes into a key-value lookup of all the responses, and all of that will be available to the next event hook.
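The event hook behavior described here (fan out to plugins, wait up to a time limit, keep whatever came back) can be approximated with Python's `asyncio`. This is only a sketch of the idea, not the project's implementation: the function `run_event_hook`, the plugin names, and the short timeout used in the usage example are assumptions.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Plugin = Callable[[dict], Awaitable[str]]


async def run_event_hook(
    event: dict,
    plugins: Dict[str, Plugin],
    shared_data: Dict[str, str],
    timeout_seconds: float,
) -> Dict[str, str]:
    """Fan an event out to plugins and keep what returns before the timeout."""
    if not plugins:
        return shared_data
    # Each plugin sees the event plus a copy of the data gathered so far.
    payload = {**event, "shared_data": dict(shared_data)}
    tasks = {
        asyncio.ensure_future(plugin(payload)): name
        for name, plugin in plugins.items()
    }
    done, pending = await asyncio.wait(list(tasks), timeout=timeout_seconds)
    for task in pending:
        task.cancel()  # Too slow: the workflow moves on without this plugin.
    for task in done:
        # A plugin that raised is ignored so the core flow keeps working.
        if task.exception() is None:
            shared_data[tasks[task]] = task.result()
    return shared_data


async def translate_captions(event: dict) -> str:
    return "captions_es.vtt"


async def slow_plugin(event: dict) -> str:
    await asyncio.sleep(10)
    return "too late"


async def broken_plugin(event: dict) -> str:
    raise RuntimeError("extension crashed")


# Usage: the interview describes a two minute wait; it is shortened here
# so the example finishes quickly.
results = asyncio.run(
    run_event_hook(
        {"type": "captions ready"},
        {
            "spanish_captions": translate_captions,
            "slow": slow_plugin,
            "broken": broken_plugin,
        },
        shared_data={},
        timeout_seconds=0.05,
    )
)
```

Passing `results` as `shared_data` to the next hook gives later extensions access to what earlier ones produced, which mirrors the key-value lookup mentioned above.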

What you had was this application that was originally just for live video streaming. By the time 16 different developers had got their hands on it, it was using gen AI to summarize what was said in the video and create titles, sharing it on social, working out if there was any bad imagery in the videos—all these interesting plugins that we never would have thought of on our own.

James Beswick: Let me ask you then. Why didn't you just build a monolith? You could have avoided all of that and said, here's the 16 developers, here's my GitHub repo, go at it. Why didn't you take that approach?

Ben Smith: It's much easier to keep these things decoupled. We wanted to say here is the application, here's the bounds of the application. No matter what you build, it will always still work, because even once we pull what you build into the app, the core app is separate and decoupled from it. That was the most important thing. It couldn't break. It mustn't break even if the extensions broke for any reason.

That was the main reason why we didn't want developers crawling all over the code base and us having to look at different GitHub issues—what if that one merges with that one, what happens if they're talking to the same database at the same time? We'll just use events. We'll give you the event contract, you give us the response it's supposed to return, and we'll publish that to our event catalog. Then other extension developers can see that and use that as well. You have this nice decoupled way of growing your application.

James Beswick: I remember when you were building this. Often people think plugin architectures are for big projects—you're building a browser or a CMS or something where hundreds of thousands of users will be using it, and the plugin architecture is beyond them. But the actual amount of code you used was really tiny. It made me think plugin architectures are much more applicable to many enterprise pieces of software.

Ben Smith: I think absolutely. Stripe is a good example of this, where they have the ability to create something called Stripe Apps. Stripe is well known for having a great API and great documentation. The API can do loads of things, but many of those things might be something that the developers at Stripe might not know because they're not using the product every day to take payments and do all the things that it can do.

So they created Stripe Apps to allow other developers to build extensions that can even sit inside the Stripe dashboard and extend Stripe by using the Stripe API. Or maybe it's just rendering different objects within your Stripe account. It's again that same piece of plugin extensibility, allowing the developers out there to bring their own specialisms, their own skills, their own ideas, without it affecting the core application.

James Beswick: Let me speak to enterprise developers. Many are very competent at what they're doing. They've got knowledge of certain runtimes, certain pieces of software. But many of these concepts seem very new to them. What would be your advice in terms of what's the best way to learn some of this stuff? If I want to adopt extensions or plugins or any of the things you've been talking about, where should I start as an enterprise developer?

Ben Smith: Just start with something small, something that is maybe not essential to the core business. Something with a small blast radius if it goes wrong, if your implementation isn't good. Prove that, champion that, and you can start to grow it from there. If you build something that is useful that is based on extensions, people will quickly realize the value that it brings. Then you start to see customers grow their knowledge from there.

Building Effective Demos and Choosing Technology

James Beswick: You've been a developer advocate for five years, coming on six years now. I've worked with you that whole time. I think you've been very successful because you've managed to convey complex concepts to people, especially people who are new in the software business and learning all of these things. Given all the demos you've made, what do you think are the important parts of producing a demo that captures the imagination of people so they can understand the important concepts?

Ben Smith: The first thing to try is to build a demo where people are using it without realizing they're using a demo. They walk up to it and go, "Oh, this is a thing that I can use to achieve something that I want right now." Whether that's ordering a cup of coffee or buying something from a store—different versions of this where they're immediately into your demo without really knowing that they're doing a demo. So that gives you the interest.

After that, you want to build something that is using your product. It has to be obviously using the product, not something that is unrelated, not just some sort of grabby toy. It has to be something built with the product. Then there has to be a moment where they see something happen and they go, "Oh, okay, I sort of see how that's put together." You can see that spark ideas in people instantly. That's a demo that is helping people to think of different ways they can implement something.

James Beswick: When you're thinking about technology you can choose from, the great thing today is that this is like a candy store of different places—the cloud providers and service providers. What's in your mind when you're choosing between the options that oftentimes are very similar and how do you make a choice? Do I want to use this thing and not this thing?

Ben Smith: A lot of the stuff that I'm building is for fairly small use cases. One of the things I look for is a pay-per-use, cloud-based model: something that will scale and often doesn't cost a lot in the beginning. Something that's got great documentation and is widely adopted already. The other thing it has to have is an API so you can customize it and extend it and bend it to your will a little bit.

James Beswick: Let's talk about APIs then since that's a good point. Many people aren't sure what an API is. We talk about them all the time. What do you think defines a good API, something that jumps out at you like, I can absolutely use this thing?

Ben Smith: The first thing I think of is great documentation so you instantly understand how to achieve the thing you want to do. Anything that you're able to do in a SaaS dashboard, you should be able to do with the API. It's really frustrating if you do a thing in the console or the dashboard and you can't achieve that same thing with the API. For me, that's a little bit of a warning bell—what else can it not do that I might need further on?

It has to offer the authorization and authentication options that you need, and that could be different for different companies. You also want it to integrate well with the thing that you're building. That could mean the service you're adopting has an integration with your cloud provider that maybe doesn't need you to stand up a webhook. You need to look beyond the API and see how this service integrates with other things in the ecosystem that you're currently building with.

James Beswick: I was thinking about how, going from this model where you're building something all yourself on your own server or in your own environment, you're opening yourself up to all these other systems. I was thinking about testing and observability—two critical problems because suddenly now you're not quite sure what's going on. What are your thoughts around bringing in this API? How do I now test this API so that whatever I've built works with it properly and I can be sure that's what I'm expecting out of it?

Ben Smith: I can try and give you the correct answer or I can just tell you what I do, which is probably not what you're supposed to do. I test as I build. This is how I build things. I'm constantly clicking around different interfaces, different dashboards. It's always useful when you come across a service that has a single pane of glass that is showing all of the events you're generating, all of the API calls you're making, all the errors without you having to jump around different windows. That's a great thing to have, and it's surprising how many services don't start with that.

Local testing is also something that's really useful. I know that was a big issue for serverless for a long time. It still is, really. Although that's slightly different because that's more of a mindset, this concept of having a cloud account where every developer can test on their own cloud account and then migrate that into UAT. But it's sometimes not realistic.

James Beswick: To get to real scale in anything, you actually have to have a very good practice around testing because there are these long-tail issues that show up between services, and they don't show up until you hit a certain scale. It's really interesting because I've seen it both ways. In one, you don't build this. Then when you get to a certain level of traffic, problems start to appear and they become almost impossible to debug because you scroll through log files, you want to follow data across services, and it becomes fairly chaotic. Whereas when you do it the right way, it's much easier to set up an observability framework for how you want to approach things and note things down as they go wrong.

Ben Smith: This comes back to one of the reasons why workflows are a good way to build stuff. On most of these services, you have that observability piece built into the execution of each workflow. I would cheat a little bit by using a workflow service where it's sort of there for you in a nice package.

Observability and Debugging

James Beswick: With Step Functions that you were using for these services, and you were also using other AWS services and things outside AWS, how did you trace transactions when something went wrong? Say something came in from the API, ended up in the workflow, then maybe went out to another provider to be processed and came back. What was your approach when there was an error to working out exactly where that went wrong?

Ben Smith: You would start with the workflow execution itself and have a look visually. Every execution would visually show you the state at which it broke. Sometimes it's not the workflow that breaks, it's some other piece. It might have come from API Gateway and then gone on to some other piece. Maybe your Lambda function is returning a 200 success, but downstream it's broken somewhere.

Then you can add things like X-Ray, which is a tracing service on AWS. I found that quite difficult to use, if I'm honest, but I think it's probably the best way of doing that if you want to stay within the AWS ecosystem. Then there are services that are just specialists in these sorts of problems. As your application scale grows, you probably need to start using these other services where that's their expertise.

James Beswick: Think about things you've built in production in the last few jobs you've had. As they started to scale up, you've had problems you couldn't fix, often because of the scale or because you didn't build the right debugging. What was your approach to solving that? When you're in production and it's chaotic because things start to go wrong that way, how did you approach dismantling a problem and fixing it?

Ben Smith: I probably go straight to the logs and eventually you're in log hell, clicking around trying to find that moment where you can see some sort of error. It's very difficult on these systems that are disparate, spread over multiple microservices, running multiple cloud services, maybe calling out to other third-party APIs. It's really a problem. I think this is one of the reasons why you are starting to see a bit of a trend of people moving maybe a little bit away from microservices, or at least rebalancing the idea of using that for everything all the time. This is probably the main reason—when your application is so decoupled and so spread out, what's actually holding it together?

James Beswick: Over the last few years, the amount of data generated by these apps has just spiraled. With the first mobile apps I built, you hardly produced any data. You were collecting a few log files. Before you know it, people are uploading photos, uploading videos, and it starts to explode. Suddenly, with that, your cost goes up enormously. This is the tradeoff. It's great to have these cloud providers doing all these things for you. But when the bills start to arrive and you start to see how expensive it gets, it's often easier as a first cost-saving measure to bring things back a little bit.

Ben Smith: That's totally true. It's a reason why at these cloud conventions like re:Invent, some of the most-subscribed talks are about optimizing for cost. When your application is built on services, you're paying each time you use the service. You need to come up with all sorts of clever, interesting ways to reduce that cost and optimize for it.

James Beswick: This is one of the difficult tradeoffs as a developer, because in this toolbox, practically everything you could imagine using has already been built: this library, this service. But you can also build your own. The question is, when you build your own and you think about the developer hours and the testing and the maintenance, is it worth it versus buying something off the shelf and paying for use?

It really does vary. Where I came from before working at AWS, we were building for startups. One of the problems is that if you build for scale in a startup, that might not be the best use of your resources, because you don't know if the idea is going to work and whether people will want to use it. But of course, you only find out that people want to use it when they all show up. If you haven't built the scaling and your application falls over, they go somewhere else. There's this chicken-and-egg problem that buying the service off the shelf often helps with. That's why I got into serverless in the first place.

But then you get a bit locked in because you're with a vendor with one particular solution, and when things scale up, you need to back out and move things. I'm not sure there's a clean and elegant solution here. To me, as a developer, it means keeping your options open, and knowing a broad spectrum of technology can really help you.

Ben Smith: The best solution I've seen for this is when you have these businesses where they maybe run most of the infrastructure on one provider, but they still have a little bit running in another. They understand the terminology, they have the relationship with the various teams in the other provider. If they need to, they can shift and they're not so locked in as if they had to start from scratch.

James Beswick: It's one of the shames of the cloud, really, that you've got the big three, but there's so little commonality between them. If you look at something like Terraform, it does a reasonable job of trying to abstract that away. But in practice, if you want to even move buckets and data from one provider to another, the rules are all different. The limits are all different. The terminology is all different. This lack of consistency between the three makes it very hard for developers.

Ben Smith: I wonder if that's intentional or if it's just small groups working together in isolation, because a lot of these companies hire across each other. So I wonder what they do when you are an engineer that moves from one to the other and the terminology is slightly different.

James Beswick: It was simpler when the cloud was just relational databases, compute, and storage, various buckets of things. But now there are so many different things you have to deal with. Even with something like a queue, which in principle is very simple, the implementations across providers are very different. I do think it's something you have to think about as you build. It's better to keep your options open.

Ben Smith: I think there's also a problem you have now where there are so many services that are very similar. They have a lot of overlap and a little bit of something that's different. You're not really encouraged to be super opinionated on the way you build things when you work for the company. You can be opinionated, but you can't make another service look inferior in any situation. Whereas in some situations, one thing is better to use than another.

When you have a lack of opinionated guidance, it's really difficult for developers out there to know which thing to use out of all the multiple versions of moving an event or a message around the system. One of the things I like about having left a big cloud provider is now we can just talk about the best of breed. We can take the best from here and the best from here and put them together because that's probably what other developers and other customers are going to be doing as well.

James Beswick: I say to a lot of junior developers who are new in the industry—when you start with these things, often you think that what you know, the tools you know, the language you know, is the way to build something. But in reality, as you start to accumulate more languages and more tools, you realize that everything is a tradeoff. There isn't so much a right answer or a wrong answer. It's simply a decision you're making that will create debt in the future or work now.

Even something as simple as, do you write a script on your laptop that does the work, or do you do it properly and break it out and put it in the cloud somewhere? That's a choice between work now and work later, with various distinct tradeoffs. It isn't so much about right and wrong. That to me is the most fascinating part about building software, because in everything I've ever built, I've made bad decisions. Those decisions come back to you later.

The Art of Developer Advocacy

Ben Smith: A guy asked me at the event earlier today which skill sets are important for being successful as a developer advocate, because it's often a role that people are interested in. You've managed a couple of developer advocate teams now. What would you say are important skill sets to develop?

James Beswick: I think it's one of the most difficult roles in technology to hire for because it's an odd selection of skills. First of all, you have to be a decent developer. You have to know certain runtimes, certain environments well so you can talk about them with people.

You also need the ability to be empathetic with the users. That's important. It's not so much about right and wrong in the tech holy wars. You really listen to people's problems, see if you know something that can help them. Also know when you can't help because you don't know that much in the space.

I think you have to be a good communicator, whether it's something you can write or you're good on video, you're good on the stage, you're good at getting an idea across to different people. I think you also have to be fairly independent. You think about the teams we've been in—we've been all over the world with our groups. You're in London, I'm in Denver, we've got people in Japan on my current team. When you're by yourself, you have to know what you're doing and be willing to work on things.

You also have to be willing to put yourself out there and say, this is something I believe, but you're willing to be slightly wrong so the audience can engage. Often when I'm talking to people, they have some combination of these skills, but rarely all of them.

Ben Smith: It's back to that idea of being opinionated and being brave enough to not necessarily learn in public, but have your way of doing things and to be able to explain that in a way that is reasonable, even if it's not the best way. I think that's always useful for people to be able to follow something that you've built from end to end and understand why you've made certain decisions along the way.

James Beswick: Let me ask you then. You weren't a developer advocate before five or six years ago, and you've been doing this since then. What is it that you most like about being a developer advocate?

Ben Smith: It's talking to people and constantly working on different projects, solving different problems. I'm a problem solver. My wife hates me for it because I can't help but come up with a solution even when you should maybe just listen. You can do that in this job. You can constantly be hearing challenges and problems and thinking about ways to solve that.

Maybe sometimes that's taking the problem up to the product team and they need to solve it. Or maybe it's by explaining how you can use different things together and building a solution for it. But I think it's the personal aspect, the people aspect, the empathy that you need to be able to pinpoint a challenge and simplify a solution for that challenge. That's the interesting part to me. How about you?

James Beswick: For me, I was training developer teams on serverless before I became a DA because I had a company where we were building software. We had development teams in Colombia and Romania, and I picked up on serverless as being useful just because of the type of software we were building. Then I had to explain to these groups why it was useful, how to use it, and the best practices.

Then I got approached by AWS, asking whether I wanted to do this for a living. I found it was the best job in the world. I really, absolutely loved being a DA. As you know, I've been managing the DA team for a few years now. From the point of view of managing a team, one of the great things is that the personalities of DAs are all so distinct. It's just extraordinary, the sorts of personalities who come into this role. It's a true honor for me on the management team because they are such interesting people building different things.

Ben Smith: I think they're all the random nutters of the developer world. I really do. It's a mishmash of odd bods, but it's great.

James Beswick: My last question to you then. Obviously, as an industry there are always new people coming in. They come in different ways these days, not always through computer science degrees, but through online courses or through trying things themselves. What's the single piece of advice you would give to people who are new, to help them get started and really embrace the industry of software development?

Ben Smith: I would say the most useful thing you could do is to share your learning. Every time you've understood a new thing, write about it, make a video about it, share some code that you've written on GitHub, and keep creating content around the problems that you're solving. You'll find all sorts of interesting people reaching out to you to want to talk to you about that because they're all meeting those same problems too.

It's quite hard to do because you can look like a bit of an idiot sometimes. You've got to be honest about what you don't know and what you've just learned. But most people really enjoy that. What would you say?

James Beswick: I think it's similar. I would say don't be afraid to experiment and get things wrong. Even though software often seems very black and white, it's much more gray. You learn more by getting things wrong. I continuously still get things wrong when I build things, but every time I do that, I learn something insightful. It's why I think the industry is still fresh and interesting after doing it for a couple of decades.

Ben Smith: Absolutely. Constantly having to relearn every two or three years. But it's what keeps it interesting.

James Beswick: It is. Well, thanks, Ben. This is Ben Smith, a developer advocate at Stripe. My name is James Beswick, and I run the developer advocacy team at Stripe. This is GOTO Unscripted Live from Chicago. Thanks for listening.