Beyond the Cloud: The Local-First Software Revolution
Why rely on the cloud when your devices are already powerful? Brooklyn Zelenka explains how local-first computing is revolutionizing software development for collaboration without constant connectivity.
About the experts

Julian Wood (interviewer)
Serverless Developer Advocate at AWS

Brooklyn Zelenka (expert)
Author of numerous libraries, including Witchcraft, and founder of the Vancouver Functional Programming Meetup.
Introduction to Local-First Computing
Julian Wood: Welcome to GOTO Unscripted. We are here in the beautiful city of Chicago. Beautiful weather for GOTO Chicago in October. So happy to be here. Another episode of GOTO Unscripted, talking to other speakers who are here at GOTO. Brooklyn Zelenka, welcome to GOTO. Have you been to Chicago before?
Brooklyn Zelenka: Once briefly, in 2019.
Julian Wood: This is my first time, and I'm loving it. You're a distributed systems researcher. That can be a broad title. What does that mean for what you're researching?
Brooklyn Zelenka: I've worked in a bunch of different areas. My career has evolved over time into working primarily on access control, but also distributed computing: getting multiple machines to coordinate on a problem, or divide up a problem, across things that may or may not be co-located. The two main projects I've worked on over the past few years essentially span identity, data, and compute. Making sure that data, no matter where it lives and no matter how many replicas there are, can always be stitched back together. Everywhere it lives is access controlled, so it's encrypted or otherwise managed without a central server. And compute, so that no matter what you're doing, even if you're on a plane or disconnected from a data center, you can still run computation.
Julian Wood: So the distributed computing problem it's solving is one of scale, because you need to split the problem into pieces and work on them individually. Why wouldn't you just run it on one system, or can't a system be big enough to run all of that?
Brooklyn Zelenka: There are a few reasons. One is latency. Latency isn't the bottleneck if you're training an LLM, but if you have really small jobs, something more like functions as a service or Lambda, and you can save yourself a couple hundred milliseconds by computing completely locally or on a computer sitting right next to you rather than always going out to us-east, you've saved all of that round-trip time and can make it much faster.
Additionally, if you're working on a train and you go through a tunnel and you lose connection, it really sucks to have to restart that job or hope that it finishes. So a lot of my work centers around making sure that things that can be computed locally are computed locally. And when you absolutely need to go to some external service or send an email, that can either be put off until you're reconnected or, if it's actually required, the application developer doesn't have to juggle all of these concerns.
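As a rough sketch of the "put it off until you're reconnected" idea she describes, here is a minimal TypeScript outbox (all endpoint and function names are hypothetical, not from any library mentioned in the conversation): effects that need the network are queued locally and flushed once connectivity returns.

```typescript
// All names here are hypothetical - a local outbox for effects that need the network.
type PendingEffect = { kind: "sendEmail"; to: string; body: string };

const outbox: PendingEffect[] = [];

function queueEmail(to: string, body: string) {
  outbox.push({ kind: "sendEmail", to, body });
  if (navigator.onLine) void flushOutbox();          // try immediately if we're connected
}

async function flushOutbox() {
  while (outbox.length > 0 && navigator.onLine) {
    const effect = outbox[0];
    await fetch("https://api.example.com/email", {   // hypothetical mail-sending endpoint
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(effect),
    });
    outbox.shift();                                   // only drop it once it actually went out
  }
}

// Flush whenever the browser reports connectivity again
window.addEventListener("online", () => void flushOutbox());
```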
Recommended talk: The Jump to Hyperspace: Local-first Software • Brooklyn Zelenka • GOTO 2024
Rethinking Traditional Software Architecture
Julian Wood: So instead of writing something all locally, now you've got to go and use a cloud service or database remotely or some kind of thing, and you're not ready for it at that moment in time. And your local computation breaks down.
Brooklyn Zelenka: We've essentially been building software the same way for 30 years. It used to be that you had a box that lived on your desk, and you would dial into the internet over your phone line. If you wanted something like photos or email, you'd have to rent a server from somebody that was online all the time. Then you end up with concerns about scale, and if you're disconnected, they're still around for you, but you have to trust them and deal with latency.
In the intervening years, we've tried to extend that metaphor by containerizing things to scale them up and down and hide complexity. Now you have clear distinctions between front end engineers, back end engineers, DevOps people that specialize in things like Kubernetes.
Some of us have taken a step back and said, what if we re-architected software in a way where there isn't a required central server that has to deal with all the scale? All of our devices are quite powerful now. My phone is incredibly powerful. Why can't I do most of that computation locally and only go to the cloud when I absolutely have to? It'll feel much faster because I'm not paying for latency, and I don't have to rely on somebody else to run all that infrastructure.
Google famously kills applications all the time. There's a website that counts 295 apps killed in the past few years. If all of your software runs completely locally, nobody can shut it down. Nobody has to pay for that infrastructure cost. The economics become very different.
The dividing line is that sometimes you actually need coordination. Over the last ten years we've developed a good understanding of which things can be done without requiring a central server and which do require one. A lot of my work has come down to making that division hidden from developers so that it just happens automatically.
Julian Wood: How do you think about the interactions with other companies or microservices, where that coordination crosses a much bigger domain boundary, so you possibly can't do everything locally?
Brooklyn Zelenka: This approach does take a lot of things that today you'd have to go out to different services for, and brings them local. But absolutely, there are some things where you don't own the entire stack. If you want to send a text message, you're probably going to use a text messaging service.
These are the kinds of things where we have to know what can be computed locally and what is required to be done remotely. If you have a remote service that you rely on, like the Bluesky firehose, you probably don't have all of that locally. Other people are making edits, and it's at a scale that your local device probably won't handle. Going to a server that's indexing all of that for you is completely reasonable.
We can mark that as something that I'm relying on somebody else for. If I've made that request in the past couple of minutes and I'm offline now, maybe I can reuse the automatically cached version until that cache expires. We don't want individual developers having to think about all of these concerns. What if we can just automate that away for them?
If you're building a note app, step one is you build the application locally with some dummy data and database. Maybe you call out to some dummy external APIs. Traditionally, you'd then containerize it, find a host, and set it up in the cloud. What if we just ship our mock version of the application instead, since we already know where those boundaries are? If we can build a model inside the application - say a gRPC model - that knows when external calls are needed, it can register the responses as normal data and then treat that as something it owns locally.
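As a rough illustration of treating external responses as locally owned data, here is a small sketch (all names hypothetical): the wrapper only crosses the network boundary when it has to, and otherwise reuses the copy it already registered locally.

```typescript
// All names hypothetical - external responses are registered as ordinary local data.
type CachedResponse = { body: string; fetchedAt: number };

const localStore = new Map<string, CachedResponse>();

async function callExternal(url: string, maxAgeMs = 5 * 60_000): Promise<string> {
  const cached = localStore.get(url);
  const fresh = cached !== undefined && Date.now() - cached.fetchedAt < maxAgeMs;

  // Offline, or the local copy is still fresh: stay local
  if (fresh || !navigator.onLine) {
    if (cached) return cached.body;
    throw new Error("offline with no local copy yet");
  }

  // Only cross the boundary when we actually have to
  const res = await fetch(url);
  const body = await res.text();
  localStore.set(url, { body, fetchedAt: Date.now() });  // now we own this data locally
  return body;
}
```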
Julian Wood: You're talking about how software architecture has been the same for about 20 or 30 years. What fundamentally changes for this new approach? How does that actually work?
Brooklyn Zelenka: For the end user, assuming we do everything correctly, it should be completely invisible that anything special is happening.
Julian Wood: When you say end user, is that someone building an application and then the end user being somebody walking down the street on a mobile phone accessing the website?
Brooklyn Zelenka: Exactly. In that case we still need to serve them the website, but typically in this local-first model, that website is just static. We can cache that very well. It doesn't require any state or databases. It's often a single page application.
They load that up and it sets up everything it needs. It creates some key pairs locally, maybe using the Web Crypto API or passkeys, and creates a local database. And they just start using it. Maybe it's a text editing application with collaborative text editing, like a Google Docs clone.
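A minimal sketch of that first-run setup, using the standard Web Crypto and IndexedDB browser APIs (the database and store names are made up for illustration):

```typescript
// First-run setup sketch: a local signing key pair plus a local database.
async function firstRunSetup() {
  // Non-extractable key pair that never leaves this device
  const keyPair = await crypto.subtle.generateKey(
    { name: "ECDSA", namedCurve: "P-256" },
    false,                      // not extractable
    ["sign", "verify"]
  );

  // A local database for the user's documents ("notes-app" is a hypothetical name)
  const db = await new Promise<IDBDatabase>((resolve, reject) => {
    const req = indexedDB.open("notes-app", 1);
    req.onupgradeneeded = () => req.result.createObjectStore("docs");
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });

  return { keyPair, db };
}
```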
They're typing away and they invite their friends, and their friend starts typing as well. Then they get onto a plane and don't have access to the server anymore. They can connect over Bluetooth and continue editing because there isn't a single copy of a database somewhere that is the single source of truth.
We gain massive scale by simplifying each individual node. If you have access to the data, you have access to the data: you should be able to edit it. And if every replica, every device, is its own source of truth, and we don't expect everyone to have exactly the same data at all times, then you get on a plane, edit, and when you get off the plane we use fancy data structures that know how to automatically stitch it all back together. Think about it almost like Git without having to do manual merges.
Large scale databases use similar techniques to achieve high scale in a cluster. We're saying, what if we took that to the far edge, to every individual device? Of course, this doesn't work if you need really strong consistency.
There's a researcher, Mae Milano, who does fantastic work exploring the idea of flexible consistency. If the two of us are playing a game on a video game platform, the only ones who really care about the current state of that game are the two of us. We don't have to have everybody else in the world know about it. But if we're going to rank the top ten players in the world, we'd better all agree who the top ten players are. That probably requires some central coordination.
So there is this boundary that says if we're just doing text editing amongst friends, this is totally fine and we don't require a server because we can use these fancy data structures that stitch together to get everybody back in sync once they reconnect. Think of it more like Git rather than a traditional database with a single source of truth.
Recommended talk: Software Architecture for Tomorrow: Expert Talk • Sam Newman & Julian Wood • GOTO 2024
Local-First Implementation and Data Synchronization
Julian Wood: So the mobile phone would have a local instance of a database that's able to synchronize offline using these fancy data structures and do clever merging. In terms of developing the application, how does that work?
Brooklyn Zelenka: Typically you use some toolkits, just like how you use a web framework. The team I work with has a project called Automerge, which is one of these fancy data structures. It gives you what looks like a JSON API where you have a document which is just a JSON structure. You wrap it in a call to Automerge, and then you do all your normal things: "I want to change this field, so doc.myField = 'hello world'." At the end, that will create all of the bookkeeping under the hood required to make this automatically mergeable.
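For flavor, here is roughly what that looks like with the Automerge JavaScript package - a short sketch based on its documented API, so details may differ between versions:

```typescript
import * as Automerge from "@automerge/automerge";

// Start with an empty document and edit it like plain JSON; Automerge records
// the bookkeeping that later lets diverged replicas merge automatically.
let doc = Automerge.init<{ myField?: string }>();
doc = Automerge.change(doc, (d) => {
  d.myField = "hello world";
});

// Two replicas that diverged can be stitched back together without manual merges:
// const merged = Automerge.merge(docA, docB);
```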
You use toolkits like Automerge or my project Beehive which adds access control. They give you nice top-level APIs in the same way that Node or Rails give you - invite members, add documents, create different documents, point at a particular sync server if you want a sync server.
Even when trying to demo this with people live, it's often a little anticlimactic because it does all this cool stuff under the hood, and then you get there and you're like, "So I'm editing JSON." "Yes, you're editing JSON."
Julian Wood: What percentage of computer science is just editing JSON?
Brooklyn Zelenka: I know, right?
Julian Wood: So you're installing a framework that allows you to have this opinionated process of editing data. Where does it get hosted if it's just sitting on my laptop but I've got somebody walking down the street connecting to it? What's the communication mechanism that makes this collaboration possible?
Brooklyn Zelenka: The system doesn't care how the bytes move around as long as it can receive messages. You could literally load them onto a USB stick and move them across, and that would continue to work. In practice, people have tried two main approaches. One is peer-to-peer, which is hard - especially NAT traversal behind consumer Wi-Fi.
The one that always works is to put a server in the middle. We call these sync servers. Unlike deploying an application with a database that has to know your entire schema, sync servers are pretty dumb. All they know is "I received some bytes, those bytes belong to this label, and somebody else requests them. Should I give some to them? Yes. Here's those bytes." All of the interpretation happens on the client.
They can be deployed everywhere - inside serverless functions, as standalone servers. A single standalone server could run this for many applications without knowing anything about the data inside. You can scale them out or use several - they're very lightweight, really just a relay.
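To make "really just a relay" concrete, here is a toy sketch - not any real sync server's code: bytes arrive under a label (say, a document ID), and the server forwards them to everyone else interested in that label without ever interpreting the payload.

```typescript
// A toy relay, not any real sync server. It never interprets the payload bytes.
import { WebSocketServer, WebSocket } from "ws";

const subscribers = new Map<string, Set<WebSocket>>();
const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (socket) => {
  socket.on("message", (raw) => {
    // Messages carry a label and an opaque payload
    const { label, payload } = JSON.parse(raw.toString()) as { label: string; payload: string };

    let peers = subscribers.get(label);
    if (!peers) subscribers.set(label, (peers = new Set()));
    peers.add(socket);                                  // remember who cares about this label

    for (const peer of peers) {
      if (peer !== socket && peer.readyState === WebSocket.OPEN) {
        peer.send(JSON.stringify({ label, payload })); // just forward the bytes
      }
    }
  });
  socket.on("close", () => {
    for (const peers of subscribers.values()) peers.delete(socket);
  });
});
```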
Julian Wood: So the sync servers are just traffic hops that send data along routes. What about broadcasting? If many people need to communicate with many more people, does the complexity ramp up or is it just more data?
Brooklyn Zelenka: It's just more data. In this model, we can diverge for an unbounded amount of time. I can be disconnected and continue working. So even in this broadcast world, I don't require that everybody notify me they got the same message or that some quorum has received it. We send stuff and maybe they're online, maybe they're not.
The sync server can support having an open WebSocket channel where I can just push things out with subscriptions. If users are offline and come online later, they can just do a regular HTTP GET, which is also equivalent in the system.
From a distributed systems perspective, anytime you're talking through a sync server, you're at hundreds of milliseconds latency. In a practical sense, if you're online, it'll feel snappy at human scales. But there isn't really a distinction in the system between whether it took 100 milliseconds or three years - the underlying application doesn't really care.
Julian Wood: Your interaction with the data is always local for your local copy. Is this a sort of federated database? You don't have a central coordination database where you're doing reads and writes via an API. You have a data model that acts locally, knows where to send information, and these data structures merge information that flows around.
Brooklyn Zelenka: It's a perfect way of thinking about it.
Julian Wood: I know you've also done work that might relate to blockchains or that kind of technology. How does that play into this?
Brooklyn Zelenka: Many years ago I worked on the Ethereum protocol, mostly around the virtual machine and peer-to-peer aspects. We end up using a lot of similar techniques under the hood, like hash linking. But the primary difference is that in a blockchain, everyone in the world has to agree on the current state of the world, so we're doing all of that coordination all the time. The main similarity is that there's no single owner.
Julian Wood: That's computationally expensive at scale because everyone has to agree.
Brooklyn Zelenka: Exactly. It's not necessarily that the computation itself is expensive, it's all of the coordination of agreeing that this is the latest version. When I started on local-first, we asked what if we took all of those techniques and applied them to things where we didn't have to all agree, but still maintained the "no single owner" aspect?
The two approaches feel very different. If you're using a blockchain and get on an airplane, it's going to keep moving ahead without you, and you pay in performance and time for that consistency. Local-first doesn't have those problems. But there's nothing saying the two can't interact.
I've seen projects like Anytype that use both: local-first between checkpoints, with checkpoints on a blockchain mainly for access control. The two can be compatible, but whenever you have a system with weaker consistency like local-first and very strong consistency like a blockchain, you always have to drop down to the strongest version. Between checkpoints it can be very fast and fluid, but when doing a checkpoint, you may not be consistent with the previous checkpoint if you didn't get your merge in. So you give up a lot of the benefits of local-first as soon as you involve stronger consistency.
Recommended talk: Demystifying Blockchain: Infrastructures, Smart Contracts & Apps • Olivier Rikken • GOTO 2023
Security and Use Cases for Local-First Applications
Julian Wood: Local-first sounds good, but what about data protection? If I lose my phone or I'm disconnected and there's no central storage state, where's my stuff?
Brooklyn Zelenka: This is the main thing I'm working on right now at Ink & Switch. I worked on local-first access control at my previous company, Fission. Until quite recently, maybe the last year, people didn't really care about this problem because they were mostly working on stuff for themselves or friends. Now they're trying to do things for real customers, and it's becoming critical.
In traditional access control, you rely on a network boundary. There's a resource server somewhere with a guard saying what you're allowed to read. In this world where everybody has the data, that model breaks down.
There are two halves to the solution. One is read access, and the other is mutation access to control changes to documents. For read access, we encrypt everything at rest. Every change that everybody has a copy of is encrypted, and only those with the key can decrypt it. Since all cryptography is eventually breakable, we add another layer: the sync servers will only give you the encrypted bytes if you're a member of the group.
So it's a two-layer approach. If somebody breaks into the sync server, they can't read anything because it's all already encrypted. The sync server doesn't need to know how to read any of the documents.
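A small sketch of the "encrypted at rest" half, using the Web Crypto API with a symmetric group key (how the group key gets distributed to members is the hard part and is out of scope here):

```typescript
// Sketch: encrypt a serialized change with a symmetric group key before it
// ever reaches a sync server. Distributing groupKey to members is not shown.
async function encryptForSync(groupKey: CryptoKey, change: Uint8Array) {
  const iv = crypto.getRandomValues(new Uint8Array(12));   // fresh nonce per message
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    groupKey,
    change
  );
  // The sync server only ever stores and relays { iv, ciphertext } - opaque bytes under a label
  return { iv, ciphertext: new Uint8Array(ciphertext) };
}
```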
For mutations and changes to documents, we have a list of who's allowed to be in the group. Admins can change that list. Any change applied to the document is signed, and when you receive a change, you check: Are they on the list? Does the signature match? If not, you throw that out.
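And a sketch of the mutation check described here, with hypothetical data shapes rather than Beehive's actual API: a change is accepted only if its author is on the membership list and the signature over the change verifies.

```typescript
// Hypothetical shapes, not Beehive's API.
interface SignedChange {
  authorKey: CryptoKey;     // the author's public key
  payload: Uint8Array;      // the serialized change
  signature: Uint8Array;
}

async function acceptChange(change: SignedChange, members: Set<CryptoKey>): Promise<boolean> {
  // Membership check (by key reference here; a real system would compare key fingerprints)
  if (!members.has(change.authorKey)) return false;   // not on the list: throw it out
  // Signature check over the change itself
  return crypto.subtle.verify(
    { name: "ECDSA", hash: "SHA-256" },
    change.authorKey,
    change.signature,
    change.payload
  );
}
```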
Recommended talk: WebAssembly in Production: A Compiler in a Web Page • Brian Carroll • GOTO 2023
Getting Started and Future Directions
Julian Wood: If somebody is interested in this, what changes do they need to make? Is it a different programming model? What is the transition from the traditional way of building software into this local-first mode?
Brooklyn Zelenka: Local-first is much newer than containerization and Kubernetes, but the stack is also much smaller because it's a rethinking of what we actually want to do. I would say start with Automerge and get used to its APIs. It really just feels like you're editing JSON in the browser.
You can use a sync server - Ink & Switch runs one, and there are other teams running them as well. They don't have to know anything about your data, so they're essentially interoperable. There's work right now going into being able to host these on Dropbox or Google Drive without setting up a specialized server. The only part where you have to interact with another team is deciding how to move bytes around on the network - which sync server to use. Other than that, you just use Automerge or something similar like Serenity, DXOS, or Yjs.
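As a starting point, wiring Automerge to a sync server typically looks something like the following with the automerge-repo packages. The package names, adapter names, and sync server URL here are assumptions - check the Automerge documentation for the current API.

```typescript
import { Repo } from "@automerge/automerge-repo";
import { BrowserWebSocketClientAdapter } from "@automerge/automerge-repo-network-websocket";
import { IndexedDBStorageAdapter } from "@automerge/automerge-repo-storage-indexeddb";

// Local storage plus a sync server for moving bytes around; edits stay local
// and synchronization happens in the background.
const repo = new Repo({
  network: [new BrowserWebSocketClientAdapter("wss://sync.automerge.org")],
  storage: new IndexedDBStorageAdapter(),
});

const handle = repo.create<{ title?: string }>();
handle.change((doc) => {
  doc.title = "My shopping list";
});
```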
Julian Wood: What are the use cases or example applications? You've mentioned document sharing or collaborative editing. What would be good for people thinking of transitioning to this approach?
Brooklyn Zelenka: I'll tell you both what it's good for and what it's not good for. I like to think of the distinction as "big world" versus "small world." If you're building a shopping list for you and your friends, collaborative document editing, or a drawing application, this is fantastic. It works well for groups up to some thousands of users.
When you start to need things like indexing where you don't necessarily care about everybody else on that particular application but want discovery - say, building a social network with millions of people - you'll want indexes so you can search across the entire thing. That's "big world," and local-first isn't as good for ingesting something like a Twitter firehose.
It's great for what we sometimes call the "cozy web" - maybe inside a single organization or across a couple of organizations, as opposed to everyone in the world tweeting all the time.
Julian Wood: We shouldn't have social networking; we should have cozy networking!
Brooklyn Zelenka: There's some thinking in the space that because this is such a smaller stack to learn, it really lowers the barrier to entry and gets more people involved. You only need front-end skills to do this. There's a great article called "An app can be a home-cooked meal" - literally just for you and your friends. It doesn't have to be at massive scale where you're going to charge people subscriptions. It can be for 1-2 people or up to thousands. When you start getting to millions, then you probably want indexes.
Julian Wood: What else have we spoken about today that you think people should be knowing about?
Brooklyn Zelenka: I spend most of my time these days on those three layers I mentioned before - auth, data, and compute. Data is the most mature part of the stack. For auth, there are existing solutions like UCAN, another auth project I've worked on, and Beehive, my current research project at Ink & Switch. Then there are distributed compute networks like IPVM (Interplanetary Virtual Machine), which is the least mature part of that stack.
Julian Wood: Interplanetary Virtual Machine seems like the complete antithesis of a local-first thing. How are they connected?
Brooklyn Zelenka: It's WebAssembly under the hood. The basic idea is it should scale up as much as it needs automatically without you having to worry about it.
Julian Wood: So that virtual machine is the local experience, but can be scaled out wherever based on WebAssembly?
Brooklyn Zelenka: Exactly. Anything you can run locally can still run locally, but maybe you're on a low-powered phone and it's going to take minutes to process something. If you're connected to the internet, it should be able to take that WebAssembly and the data, move it somewhere else, compute on it, and send you the result back. But if you're offline, you can wait for it to finish too, like when rendering a 3D scene.
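A hand-wavy sketch of that decision - not IPVM's API; the remote endpoint and the local runner are hypothetical: the same Wasm module and input can run on-device or be shipped to a remote node, and the caller gets the result back either way.

```typescript
// Hypothetical local runner for a Wasm module; details depend on the module's ABI.
declare function executeWasm(wasmBytes: ArrayBuffer, input: Uint8Array): Promise<Uint8Array>;

async function runJob(wasmBytes: ArrayBuffer, input: Uint8Array): Promise<Uint8Array> {
  const offloadWorthIt = navigator.onLine && input.byteLength > 10_000_000;

  if (offloadWorthIt) {
    // Hypothetical remote executor: send module + input, receive the result
    const res = await fetch("https://compute.example.com/run", {
      method: "POST",
      body: new Blob([wasmBytes, input]),
    });
    return new Uint8Array(await res.arrayBuffer());
  }

  // Offline or cheap enough: run it right here, just slower on a small device
  return executeWasm(wasmBytes, input);
}
```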
Julian Wood: So you're solving the portability of applications, which we've done with containers. You're using WebAssembly as a binary format with your additional data and JSON structure. Because it's much smaller, you could move things interplanetary and do compute wherever it needs to be.
Brooklyn Zelenka: Exactly. And we can take your WebAssembly and push the compute to the data for large datasets, have it run, and send you the result back. Most of the challenges there are around verifiability of the result.
Julian Wood: This sounds super interesting. Where can people find more information and start exploring?
Brooklyn Zelenka: There's a podcast called localfirst.fm. There's the Automerge Discord at automerge.org, with links to all of these things. And there's also a Local First Discord as well.
Julian Wood: Thank you very much. Thanks so much for joining us here on GOTO Unscripted in lovely Chicago, where we're speaking to other speakers here at the GOTO conference and finding cool new things for distributed computing. I appreciate your time.
Brooklyn Zelenka: Thank you.