Kafka in Action
Kafka has been on developers’ radars for quite a while now. Viktor Gamov’s co-authored book “Kafka in Action” ensures that you have a list of recipes to dive into. Joined by Tim Berglund, VP DevRel at StarTree, he explores the fundamentals of Apache Kafka.
Learn what Kafka can help you achieve, what Viktor’s favorite MCU film is and what “Highway to Mars” by Beast In Black has to do with all of this.
Tim Berglund: Hello, and welcome to another episode of the "GOTO Book Club." I am your host for the day, Tim Berglund. I run developer relations at a company called StarTree. My guest today is Viktor Gamov. Viktor Gamov is the principal developer advocate at Kong, and he's also one of the co-authors of "Kafka in Action." And we're here today to talk about that book and kind of anything else we wanna talk about. Viktor, welcome to the show.
Viktor Gamov: Thank you, Tim. And it's great to be here at the show. It definitely feels like one of these shows that we ran in the past together. Hopefully, it's gonna be the same quality and the same success as we did with "Streaming Audio" podcasts and live streams.
Tim Berglund: This is not the first time Viktor and I have appeared on a podcast together, so, he may get that idea.
Viktor Gamov: Thank you so much for having me and thank you so much for GOTO organizing this book club and talking, as Tim mentioned, about the book that I participated in and helped to release. Yep.
Tim Berglund: A co-author. Just own it.
Viktor Gamov: Yes, co-author.
Why a book on Kafka?
Tim Berglund: At first, I looked at it and I thought it was "Kafka Inaction," like not doing anything with Kafka, and I was confused. Then I realized I just missed a space there, so it's "Kafka in Action." So, it's the exact opposite of what I thought. Anyway, tell me about how... All right. There'll be better jokes before we're done, so if you're listening, keep listening. Tell me about how you got involved. Why a book on Kafka?
Viktor Gamov: Why a book is a great question. And I think that this episode of Book Club very soon turns into a kind of actor-to-actor type of vibe, because Tim is also a published author of multiple books on different topics, including Kafka for Dummies, I guess.
Tim Berglund: No. Oh, no, stop with that. There was kind of a sponsored data integration "CDC for Dummies" book.
Viktor Gamov: Oh, "CDC for Dummies."
Tim Berglund: I was listed as a co-author, and in truth, I was mostly a table of contents writer and editor. But a couple of books on Gradle. I had a co-author with one and wrote another one myself. So, yes, author to author. This is your show, not mine.
Viktor Gamov: Yes. You know this answer for yourself then, so why a book?
Tim Berglund: But they may not.
Viktor Gamov: But many listeners might not. So, yes. It's a very good question. So, first of all, writing a book, if you're planning to write a book, it's a multi-month endeavor where you...
Tim Berglund: More years.
Viktor Gamov: ... should have the time from different places. Usually, this is not something your employer will be interested in paying for, because they're interested in paying for your services to do work, not for writing a book. So, usually, you will be writing your book during your free time, and you might not have enough free time. That's why you need to understand why. "Why" is a very important question that everyone needs to ask themselves from time to time. Writing a book about a topic, for me, was always a milestone, not of achievement, but at least of something that I learned personally. In the past, my first book was about enterprise web development, where I shared some experience from my team in the consultancy when we were building enterprise web applications for our clients.
We thought at the time it would be a great idea to summarize some of the things that we learned on the project. Same thing with "Kafka in Action," where this book gives a lot of practical things. Being in professional services during my time at Confluent, and also being a developer advocate at Confluent, I gained some experience of how people use Kafka and how people should not use Kafka, and I wanted to create something that would collect all this knowledge and create this milestone for me. Writing a book is very similar to writing a thesis, right? You're working on something. You do research. You do peer reviews. You speak with the people in the industry, and after that, you publish your thesis.
It's very similar in the amount of effort and amount of work that's involved, including research and especially peer reviews. But here's the difference between being peer-reviewed in academia versus the real world. Academia is less toxic than the audience, because with a book you cannot make mistakes: all these mistakes are now on paper, and people will point them out to you every time they have a chance. "Hey, Viktor, in your book you wrote blah, and this is incorrect." And now you need to own it, and it's going to be with you all the time. That's important. That's the kind of downside of writing a book. And the upside of writing a book is you can always say, "Hey, I'm a published author. You can find my book on Amazon. Or even just Google me." And it always works on your resume. Works like a charm.
Tim Berglund: Absolutely. I have some friends in tenured positions in academia who might disagree with you...
Viktor Gamov: Disagree about toxicity.
Tim Berglund: ...on which community is less toxic. That would be a fun debate to have. So, you want to share knowledge and, I mean, you want recognition. So, I think what both of us as published authors would say to our listeners, "If you want to write a book," and to my friends who are acquisition editors at Manning and O'Reilly, and so forth, I'm sorry, "first talk to your therapist. Why are you looking for validation here?" Because it's gonna be a lot of work.
Viktor Gamov: Yes, that's it...
Tim Berglund: It's a hard place to get it.
Viktor Gamov: That is true.
Tim Berglund: Yes.
Viktor Gamov: It is true, Tim.
What can you learn from Kafka in Action?
Tim Berglund: But I'm glad that you were because this is a good book. What was your approach? It's a book on Kafka, and I guess, I want to be straightforward about that. That's part of what we're talking about today. What were you trying to get across to people? And I don't mean like specific stories from the trenches. That's not really how the book is written, but from your experience in helping people actually apply Kafka in professional services and then explain it as a developer advocate and get people started with it. How did you want to make the book work given your background?
Viktor Gamov: So, a couple of things before I answer this question. I was an early reader of this book for a very long time. I'm a huge fan of the Manning Early Access Program.
Tim Berglund: Because it was in Early Access and you were not originally a co-author, right?
Viktor Gamov: I was not originally a co-author. I was just there, reading things. And it turns out that my reputation helped me to get to this point of... At some point the original author, Dylan Scott, reached out and said, "Hey, so there's something that I'm working on. There are a few areas that I'm not 100% expert in, like, in streams, in some Java aspects." I am a long-time Java user and now I'm a Java Champion, which means that I have some recognition in the Java community as an advocate for Java technology. So, I know a few things about Java. I love Java and writing my stuff in it. And I was brought in as an external expert to validate some of the ideas in the book, plus contribute to some of the parts where the original author was struggling.
Recommended talk: Real-world Reactive Programming in Java: The Definitive Guide • Erwin de Gier • GOTO 2018
Tim Berglund: Yes.
Viktor Gamov: One of the things that we...
Tim Berglund: It's like you said before, writing a book is a grind, it relies probably on free time, it takes place over many months, even years, and your marginal free time could change over that period. So, there are all kinds of reasons why you might bring in a co-author, right?
Viktor Gamov: Precisely. Yes. I came into the book, not in the very ideal time of the world. It was during the COVID times and it was...
Tim Berglund: Early too, so it was the particularly sad COVID times.
Viktor Gamov: It was something that you need to consider if you want to do something like this. So, one of the things that we knew when we worked at Confluent as a developer relations team is that people struggle with some of the practicalities. There are the things that people love to do: "Oh, let's deploy a distributed system, and let's talk about how the distributed consensus will work. Let's talk about the partitioning, consistent hashing algorithms." All this type of stuff that data engineers and data operations people love to talk about: failover, backups, and all this kind of stuff. But the normal people, like developers who actually will be using this infrastructure, don't care about this. They want to know...
Tim Berglund: You don't need to care.
Viktor Gamov: ...how the things work to write apps. That's the thing that we wanted to do. And with this book, we didn't want to go into the world of... There's a book, "Apache Kafka: The Definitive Guide" by Gwen Shapira, Neha Narkhede, and Todd Palino. Great book for all the people who like to grind into the world of distributed systems, consensus, and how things are working internally.
Tim Berglund: Written by authorities on planet Earth.
Viktor Gamov: Right. Exactly.
Tim Berglund: Yes.
Viktor Gamov: Exactly. We didn't want to go that route because why would we?
Tim Berglund: Why am I going to read Viktor Gamov's book when I could read Gwen Shapira's book on Kafka internals?
Viktor Gamov: Exactly.
Tim Berglund: Viktor Gamov is a good guy, but...
Viktor Gamov: So, we took a more practical approach. So, we didn't... We give enough information about the insides of Kafka for you to start doing... This is why it's called "in Action." So, it's not in-depth or...
Tim Berglund: And this is the...
Viktor Gamov: ...not the internals. We wanted to get up and running.
Tim Berglund: This is part one, like, "Here's a mental model of Kafka."
Viktor Gamov: Yes. Explain the event streams the way... Explain a bit about how Kafka stores the messages so people will understand that it's not just a messaging system: messages are stored there, and you can reprocess them. You can enable security, all this type of jazz. Here's how we can do this. There are practical recipes. Just go and do it. And there's a stream processing side of things with KSQL, a little bit about connectors. So, a little bit about everything.
If you want to start exploring things, we have a lot of references to other books where we think that people can learn more, specifically some of the books that are already named. So, it was a more practical thing for just... You basically can open any place in the book and refresh your knowledge about topics and partitions or how to use Schema Registry to put structure on your data, these types of things. So, that was the idea.
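Editor's sketch: the point that Kafka stores messages and lets you reprocess them can be illustrated with a toy, in-memory append-only log. This is not Kafka and not the Kafka client API. The names `TopicLog` and `Reader` are hypothetical and exist only for this illustration, assuming a single partition and no persistence.

```python
from dataclasses import dataclass, field


@dataclass
class TopicLog:
    """Toy stand-in for one topic partition: an append-only list (hypothetical)."""
    records: list = field(default_factory=list)

    def append(self, value):
        """Store a record and return its offset (its position in the log)."""
        self.records.append(value)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        """Return records starting at offset; reading never deletes anything."""
        return self.records[offset:offset + max_records]


@dataclass
class Reader:
    """Toy consumer that tracks its own position in the log (hypothetical)."""
    log: TopicLog
    offset: int = 0

    def poll(self):
        """Fetch everything after the current position and advance past it."""
        batch = self.log.read(self.offset)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Move the position, for example back to 0 to reprocess from the start."""
        self.offset = offset


log = TopicLog()
for order in ["order-1", "order-2", "order-3"]:
    log.append(order)

billing = Reader(log)
first_pass = billing.poll()   # reads all three records
drained = billing.poll()      # nothing new has been appended
billing.seek(0)
replayed = billing.poll()     # same records again; nobody had to resend them

analytics = Reader(log)       # a second, independent reader of the same data
analytics_view = analytics.poll()
```

Each reader keeps its own offset, so a new use case for the same data only needs a new reader. That is the contrast with a queue that forgets a message once it has been delivered.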
What you need to know to build with Kafka
Tim Berglund: So just the top-level ideas and then really how-to, the bulk of the book. I mean, I'm cheating. I'm looking at the table of contents right here. There's like 40 pages of that "let's get your feet wet," kind of 30 pages of additional resources, and then the big chunk in the middle, a good 140 pages.
Viktor Gamov: One of the things that we did was developer.confluent.io, the website where we provide samples for when you want to solve a particular problem. That's the most important thing when you're trying to teach someone from the very beginning. You have a problem. If you can relate to this problem, this is the solution. If your problem is slightly different, should you use this solution? I don't know. But usually, in the world of Kafka, people will have particular problems that we researched thoroughly through search engines, social sites like Stack Overflow, Twitter, and things like that. We put these ideas into this website. We wanted to do the book very similarly, but without repeating the already existing sources, like how to count the number of messages in Kafka. So, the same mental model. I always have the same mental model when I'm writing for this book. What is something that I would like to know if I would know...if I didn't know this, like, a couple of years back? What are the absolute minimal things that you need to know to do stuff with Kafka?
Event streaming & event-driven architecture
Tim Berglund: And that seems to be the spirit of the book, like, what do I need to know to build stuff? Now, in the model... This is a model that I've heard salespeople talk about. This is not a sales podcast. You and I are not people who, as they say, carry a bag. We're developer relations, but this is an interesting kind of three-part model. They have this thing: Why change? Why Kafka, or whatever the thing is that I'm selling? Why now? Right? So why learn a new thing? Why Kafka? Why now? And that's my question. Why is Kafka happening? What's going on with event streaming and event-driven architecture? What's your perspective on that? Why is it happening? Is it a big deal? Your thoughts.
Recommended talk: The Many Meanings of Event-Driven Architecture • Martin Fowler • GOTO 2017
Viktor Gamov: So, this is a very good question and this is a question that I get asked a few times by many people who are trying to start understanding what this fuss is about. I think it is a slight paradigm change. I'm sorry. It's not a sales podcast, and it's not...
Tim Berglund: We have to say paradigm sometimes. I feel bad too. I just... I feel dirty.
Viktor Gamov: I have to say this. But it has changed the way we think about our data.
Tim Berglund: Is it disruptive?
Viktor Gamov: Traditionally... I'm sorry.
Tim Berglund: Is it disruptive?
Viktor Gamov: Disruptive. Maybe we need to find some synergy with the...
Tim Berglund: There we go.
Viktor Gamov: ...existing emerging technologies.
Tim Berglund: Thank you. Thank God. That's good.
Viktor Gamov: So, we are very proficient in the corporate...
Tim Berglund: We are the corporate Gs.
Viktor Gamov: ...the corporate language. So, we know how to talk about this. But...
Tim Berglund: So, there's a paradigm.
Viktor Gamov: Traditionally, the way we deal with data is that the data is somewhere out there, someone put this data there for us, and whenever we need it, we go and query it. No matter the size, you need to find different solutions for size, for web-scale, for... I'm doing this again. I'm sorry. With Kafka, we're changing the way data is delivered to applications without sacrificing, you know, the size and the scale of the data we're storing. So, it's not your typical messaging system, for example, where you just receive the message and after that, you know, off you go. Kafka remembers. So, whenever you... Again, you might have a different use case for the same data. You don't need to ask someone else to resend this data back to you.
And by data, I mean anything from simple notifications, any type of message that carries some piece of information, what we call a notification type of message, to data that carries a state change. So, something is changing with your data. Your data is changing, and you want to react to something when the data is changing. Traditional tools were not providing these APIs or these ways to retrieve this data. Usually, those things are hidden inside the implementation. So, every database always had this kind of transaction log, and all the changes that happened in your database were hidden from you, because you were only exposed to a particular data model that you needed to deal with. With Kafka, those tools become available, so you can listen not only for business messages, like I said, like notifications or someone calling your API, but also for the data changes. And the way you start thinking about dealing with this data has also changed: you want to have real-time results. You cannot wait until your MapReduce cluster is done with a particular job to calculate some sort of running average for the report.
Tim Berglund: No, you cannot.
Viktor Gamov: So imagine you wanted to get the result of the transaction being declined or there was some fraudulent activity, like, a week after. Do you want this?
Tim Berglund: I mean, nice to know, I guess, but a little late.
Viktor Gamov: Yes, but it's a little late.
Tim Berglund: The money is gone.
Viktor Gamov: So, it's better to be at least as close to real-time as you can. That's why the streaming technologies, the technologies around streaming data and processing events as they arrive, are happening now. Everyone wants to do this, like, right now. For the last couple of years, we've seen how people transform the way they look at their data. They're not looking at data as silos anymore. They want to look at the data as an asset. And something is happening with these assets. I said asset, not acid.
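Editor's sketch: the contrast between waiting for a batch job and reacting as events arrive can be shown with a running average that is updated per event. This is an illustration only, not code from the book. The names `RunningAverage` and `flag_suspicious` and the threshold `factor=3.0` are assumptions made up for the example.

```python
class RunningAverage:
    """Incrementally maintained mean: no need to re-read past events."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, amount):
        """Fold one new value into the mean and return the updated mean."""
        self.count += 1
        self.mean += (amount - self.mean) / self.count
        return self.mean


def flag_suspicious(amounts, factor=3.0):
    """Yield (amount, flagged) for each payment as it arrives.

    A payment is flagged when it exceeds `factor` times the average of the
    payments seen before it. The rule is a made-up stand-in for fraud detection.
    """
    avg = RunningAverage()
    for amount in amounts:
        flagged = avg.count > 0 and amount > factor * avg.mean
        yield amount, flagged
        avg.update(amount)


payments = [20.0, 30.0, 25.0, 400.0, 22.0]
results = list(flag_suspicious(payments))
flags = [flagged for _, flagged in results]
```

The decision for the fourth payment is available the moment that payment is seen, rather than a week later when a batch report finishes.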
Streaming vs batching
Tim Berglund: Now, you and I have a lot of shared context on this. This is a story that we spend time telling together, so if you'll let me kind of summarize that. You made an interesting point about how databases have always had this event log in them, but you never see the event log because who wants to see an event log? I mean, you usually don't want to. And if you do, you want to do filtering, viewing, and things. Something's kind of gone wrong if you're looking at a log. In a database...
Viktor Gamov: Exactly.
Tim Berglund: ...you want tables, you want something in a query language, and like you're a human, something a human can use. But it's kinda like we're building our applications that way now where you don't want to look at the log of what's happened in the business. That's useless. You want to place orders. You want to edit your account. You want to... I want to add Viktor Gamov as a friend in TripIt or whatever so he can see my trips. Those are the things you want to do, and so there's this event log down at the bottom and these services built up on top of it that expose an interface just like databases did. Now we're doing that thing that just used to be inside MySQL, Oracle, or SQL Server back in the day, and now that's our whole system and that's a new architectural paradigm that Kafka is a big part of. Is that fair? I'm putting words in your mouth.
Recommended talk: The Database Unbundled: Commit Logs in an Age of Microservices • Tim Berglund • GOTO 2019
Viktor Gamov: You mentioned this interesting point that we were limited by the tools of our generation and we forced ourselves to...
Tim Berglund: Aren't we old?
Viktor Gamov: ...fit some of the business requirements and the system designs into structures that were not intended to be used by normal people. I'm talking about relational systems. To decouple typical processes inside an online shop, or not even an online shop, in a retail store, you need to turn this model of the real world into something that will fit into the relational model, tables, relations, and all these types of things, which we tend to learn to love to hate or hate to love. But with an event system, everything fits, because everything in life happens with events. Life doesn't happen in batch mode. Life happens in kind of real-time mode. And everything that happens is an event.
You have a conversation with your friends, you receive a notification, or you receive an update of some sort of, like, a state. And it's up to the receiving party to react to it. You cannot change events in real life when you said something that you were not supposed to say to your significant other. The only thing that you can do is send another event to try to amend the things that you said. And it's up to the receiving side how they will...
Tim Berglund: React.
Viktor Gamov: ...operate with this data. React to this data.
Tim Berglund: And to be fair, it's up to the producer to create and send that compensating action also.
Viktor Gamov: It's not the fight that we need to have... So, the streaming technologies versus batch technologies. It's more like, like I said, quoting Howard Stark: we were limited by the technologies of our time. We maybe didn't have enough resources to have these distributed event logs during the times when we started thinking about our data. Even though the log as a technology, as a paradigm, as kind of a pattern of designing data systems, was there...
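Editor's sketch: the idea that events cannot be changed, only followed by a compensating event, can be illustrated with an immutable event type and a balance derived by replaying the log. The event names (`deposited`, `deposit_reversed`), the `Event` class, and the `balance` function are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable fact: once recorded, its fields cannot be reassigned."""
    kind: str      # "deposited" or "deposit_reversed" (made-up event names)
    account: str
    amount: int    # in cents, to avoid floating-point rounding


def balance(events, account):
    """Derive the current state by replaying every event for the account."""
    total = 0
    for event in events:
        if event.account != account:
            continue
        if event.kind == "deposited":
            total += event.amount
        elif event.kind == "deposit_reversed":
            total -= event.amount
    return total


log = []
log.append(Event("deposited", "acct-1", 5000))
log.append(Event("deposited", "acct-1", 50000))         # mistake: meant 500
# The mistaken event stays in the log. The producer appends a compensating
# event and then the corrected one, instead of editing history.
log.append(Event("deposit_reversed", "acct-1", 50000))
log.append(Event("deposited", "acct-1", 500))

current = balance(log, "acct-1")
```

The log keeps all four events, including the mistake and its correction, so any consumer that replays it arrives at the same final balance and can also see what went wrong.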
Tim Berglund: It's hardly new.
Viktor Gamov: ...in the transaction log, in the database. So, it was there. It's kind of, like, how people like to say, "the millennials discovered the transaction log." That's basically what happened, and this is how Kafka came to be. So, that's another spiral in the evolution of data systems.
Tim Berglund: So, getting close to wrapping up here. What are you up to next? What are you doing now? What's in your near future?
Viktor Gamov: I am preparing my presentation for the upcoming Current event. It's going to be our big reunion with some of the people we worked with in the past. And it's going to be bigger than Kafka Summit because this year Confluent is bringing in different technologies and showing how they work together with Kafka, maybe complementing each other, and things like that. So, that's gonna be a big milestone. I will be talking a little bit about the infrastructure side of things for the people who like to do infrastructure. We'll be talking about the Testcontainers technology that allows you to test bigger and more complex distributed systems without implementing some sort of mocks and stubs, but using real applications that run inside a container as a part of your integration testing. That's something that I am excited about personally, plus container technology has pushed the accessibility of any system very far. So, to run a very complex system, you don't need to install everything on your laptop. Everything is stored inside the containers, and you have docker-compose to kind of orchestrate this thing so you can run the complex example on your computer and use a similar but more programmatic approach to testing your system.
Tim Berglund: I guess, I should say we're recording this. The day we're recording this happens to be September 21st. Not sure when this is going to air, so Current may already be in your rearview mirror by the time you hear this. But Viktor's talk will be available as a video, I'm sure, on that website, so poke around there. And we can either put a link in the show notes to that conference website or you can just Google "Current." Don't Google "Current." That's not gonna work very well. Maybe like "Confluent Kafka Conference."
Viktor Gamov: That's a very interesting choice. It's Kafka Summit.
Tim Berglund: It's cool. I get it. Now it is happening in this real-time and all that stuff. It makes sense. And you know who's talking trash right now, the individual who's talking trash? Is the one who named his event-driven architecture Kafka podcast "Streaming Audio," which is equally clever and cool and trash to Google.
Viktor Gamov: Difficult to Google it.
Tim Berglund: Absolutely. SEO fail. But cool name with it right there. Anyway, check that out. I can shamelessly plug just because I'm the host. I'll also have a talk at Current about Apache Pinot, which is a real-time analytics database. That's a great complement to Kafka; it kind of plugs into the ecosystem in an interesting way. Viktor Gamov and I will both be there. And it's kind of fun. We're looking forward to it. Two of our other former colleagues, Ricardo Ferreira and Robin Moffatt, we worked together, really, the four developer advocates at Confluent for a while. We're back together. Austin, look out. It's gonna be a great time. Or maybe by the time you hear this, it was a great time. We're looking forward to it.
Recommended talk: Observability for Data Pipelines: Monitoring, Alerting & Tracing Lineage • Jiaqi Liu • GOTO 2020
Viktor Gamov: And you learned something from local news about how things went.
Tim Berglund: Yes. Certainly, certainly. I'm sure that'll show up. It'll trend on Twitter. Viktor, this is unplanned, but will you do a lightning round with me as we close out?
Viktor Gamov: I'm sorry. What?
Tim Berglund: Lightning round of... Will you do a lightning round of questions? Can I just ask you a few questions, and you give me short answers?
Viktor Gamov: Please do. By all means. That's something that I like to do.
Tim Berglund: Okay. Favorite MCU film.
Viktor Gamov: "Iron Man."
Tim Berglund: Favorite film director.
Viktor Gamov: Christopher Nolan.
Tim Berglund: The most recent song you've listened to.
Viktor Gamov: Most recent song I've listened to is "Highway to Mars" by Finnish heavy metal band, Beast in Black.
Tim Berglund: Didn't see that one coming. The other one's sort of [crosstalk 00:28:04]
Viktor Gamov: You did not, but...
Tim Berglund: I did see the other ones coming. Give me two or three important books that you've read that you think other people should read.
Viktor Gamov: If you're in a developer relations role, please find yourself time to buy a book called "People Powered." It's about building communities. And the book called "The Business Value of Developer Relations." The first one is by Jono Bacon, and the second one is by Mary Thengvall. Very inspirational books. If you're in the data world and you're trying to understand all this architecture, "Designing Data-Intensive Applications" by Martin Kleppmann. Probably the best book that you can read to get started in this data world and try to understand where everything fits together and how everything works together and why there are so many different data systems and distributed systems in general. So, it's a brilliant book. I highly recommend it to everyone. And you can read this from the first page to the last page, but you can also read it by chapter if you're interested in a particular topic. There are a bunch of things that are eye-openers. And Martin Kleppmann has fantastic language to explain complex things.
Recommended talk: Conflict Resolution for Eventual Consistency • Martin Kleppmann • GOTO 2016
Tim Berglund: Yes. And I think if you drew a graph of all of the books, and blog posts, and talks, and ideas in event-driven architecture and had, you know, directed edges of this idea owes something to this one, you know, just kind of built up that graph, the Martin Kleppmann node would be a heavily connected... The degree of that node would be very high. I think he's kind of the source of a lot of the key ideas.
Viktor Gamov: He was involved in some of the...
Tim Berglund: Jay Kreps might...
Viktor Gamov: ...in some open source projects including Kafka, including Samza, including some other things, Avro.
Tim Berglund: Some might argue Jay Kreps needs more edges on his node there than I'm painting in that picture and that's likely true, but a lot of these ideas come from Kleppmann. He's an important thinker. And final question. Current top recommended Instagram follow.
Viktor Gamov: I quit Instagram for a very long time, but you can follow tlberglund, a very good account.
Tim Berglund: timl. It's unfortunately timlberglund. It's not tlberglund. tl was taken.
Viktor Gamov: Okay, so it's not anymore?
Tim Berglund: I don't know who that guy is. It's timlberglund, which is what it always was.
Viktor Gamov: timlberglund. A very good account. I highly recommend following.
Tim Berglund: This message was brought to you by Tim Berglund. No, this wasn't planned. Thanks, man. That's nice of you.
Viktor Gamov: Yes. I realized that I spent too much time on social media, so I'm trying to kind of, like...
Tim Berglund: And I realized the other day...
Viktor Gamov: I'm trying to be...
Tim Berglund: ...I sent you a reel and I realized I haven't gotten a reel from Viktor in a long time. I wonder if he's mad at me. But that's because you haven't been on Instagram.
Viktor Gamov: And I'm one of those people, some people always threaten to leave social media. I never did this, but I just silently quit.
Tim Berglund: You just do it. You don't have a big post about, "Well, people, I'm leaving." Yeah, that's not your style. You could just do rather than talk. All right.
Viktor Gamov: Yes.
Tim Berglund: Hey, Viktor, it's been great to talk to you.
Viktor Gamov: Yes. So, check out kafkainaction.org. You can use the code kafkaa35 to get the 35% discount for any edition. You can get the electronic edition, you can get the printed edition. It works only on the Manning website. If you got this book, write a review on Amazon. It will help a lot.
Tim Berglund: Five stars?
Viktor Gamov: You can be honest, but a good review helps to promote the book. And we tried. We tried hard to make this book.
Tim Berglund: Yes.
Viktor Gamov: I'm proud of this work.
Tim Berglund: The change...
Viktor Gamov: I'm not saying it was like a...
Tim Berglund: Go ahead.
Viktor Gamov: Yeah. It was not a half-ass job. It was, like, 100%, like, complete dedication to this one.
Tim Berglund: Absolutely.
Viktor Gamov: Trust me when I say that it was not easy, but we made it, and I hope you like it.
Tim Berglund: The change in software architecture paradigm is happening and you don't have a choice about that, but you do have a choice about whether you're going to know what's going on. Kafka is a key part of that, so check the book out. Viktor is always a delight. Have a good one, man.
Viktor Gamov: My name is Viktor Gamov, and as always, have a nice day.