Reading Code Effectively: An Overlooked Developer Skill

About the experts

Marit van Dijk (interviewer)

Developer Advocate, Open Source contributor

Hannes Lowette (expert)

Head of learning & development at Axxes and a passionate .NET developer

The Undervalued Skill: Reading Code

Hannes Lowette: Hey. I'm Hannes. I'm a principal consultant for a company called Axxes in Belgium. And I'm joined here today by Marit van Dijk. Now we're going to talk about reading code. But first I'm going to let Marit introduce herself to you.

Marit van Dijk: Thanks, Hannes. Hi I'm Marit van Dijk. I'm a Developer Advocate at JetBrains. Java Champion, and I've worked in IT for over 20 years.

Hannes Lowette: We're talking about reading code because in university, what they train us to do is write code. Then later in our careers, we spend way more time reading code than we spend writing it. Yet we don't really train for that.

Marit van Dijk: Exactly. So when we learn programming, I learned programming in university. Maybe you're learning it now in high school or whatever. We learned to write lots of code. I remember also from my programming classes, we didn't really then look at the solutions on what we could have done better. It was just, does it work? Can you explain how it works?

Okay. Next assignment. We don't spend time understanding existing code. And then when you're working on an existing code base, obviously you need to understand that code base or at least a piece of the code base that you're working on. Research has shown that developers spend as much as 60% of their time trying to understand existing source code.

That's actually more than we spend on writing code. There's a Dutch professor, Felienne Hermans. She's a professor of computer science education at Vrije Universiteit in Amsterdam. And her research domain is teaching programming, and she did a talk at Strange Loop in 2019 where she talked about how to teach programming. And one of the things that she mentioned in this talk is research from 1992, which is before I learned programming.

Hannes Lowette: Before I learned programming.

Marit van Dijk: So before we even learned programming, they did research where they had a classroom who had to learn programming concepts, and they split the classroom into three groups. The first group would write lots of code. I don't even know if I can remember all of the groups. But anyway, some groups would write code. Some groups would read code, and some groups would read code with explanations.

So with explanations of why certain choices were made or certain implementations were chosen. And what they found was actually that the groups that read the code with annotations or with explanations about that code were better able to reproduce the programming concepts than the other groups.

Hannes Lowette: Better even than the group that actually did the programming.

Marit van Dijk: Even the group that didn't write any code, but just read code with explanations, did better at reproducing the programming concepts. That was really interesting because, again, we learn coding by writing lots of code.

I'm a huge fan of Felienne's work. We're going to fangirl about Felienne a little bit. After that talk, someone came up to her and said, so why don't we practice reading code explicitly? As an experiment, they started a code reading club. I read or heard about that somewhere and I was like, I want to try that.

That sounds super interesting. I got really lucky because, you know, I have a list of these ideas that I never get to. I got lucky because a friend reached out and said, we're actually starting a code reading club. Do you want to join? I'm like, yeah, I love it when the universe aligns, right? So we had a code reading club.

We've been doing it for about two years, but it has kind of petered out by now. Our club is made up of people with different roles (developers, developer advocates, testers, quality coaches) who live in different countries, work in different companies, and use different programming languages. We've had that club once a month.

Not everybody is able to join every time. I travel also for work, so sometimes I'm not there.

Hannes Lowette: I know you're not at home right now.

Marit van Dijk: Exactly. It's been really interesting. What I really, really love about this code reading club is, first of all, like practicing the skill of reading code and feeling like getting better at that and feeling more confident in that. But also, I really enjoyed learning what other people notice about a piece of code. 

I just love how other people's brains work. I'm looking at this piece of code and all the ways you scan the code. You don't read code from start to finish. You don't run it from start to finish either. So you kind of scan the code and find the part that you're interested in, and also skip other parts that you think are not relevant to understanding what the code is about.

Then someone else in the group would get totally fixated on the part that I skipped. That is so interesting. Why do you need to know what that means? Because I don't think it's relevant. And why do I think that it's not relevant? So it gets kind of meta about how different people understand the code.

Recommended talk: How to Read Complex Code • Felienne Hermans • YOW! 2021

Facing the Challenges of Complex Code

Hannes Lowette: I think part of that is probably because we were never taught to read code. It fits into a bigger beef that I've had with our education for a while now: we learn a bunch of different skills, but we don't learn how to deal with a normal life-size project, a code base that has been going for a few years, where a lot of the code wasn't written by you, that has developed problems, that has parts that aren't really documented. It's a skill we never learn.

I honestly don't understand why, because the first project, the first team you get thrown into after college, you will be faced with all those things.

Marit van Dijk: Exactly.

Hannes Lowette: I think that's what life is. As a software consultant, that's even more my job than anything else. I need to be up to speed in a new team really quickly, meaning I need to understand what you guys have been doing for the last couple of years. Usually there is help from people who can explain things. But you're also left to your own devices quite a lot, staring at the actual code.

Marit van Dijk: You develop different kinds of strategies, especially if you do this right. And I think that the code reading exercises can help if you're confronted by a piece of code that is just so complicated, like sometimes you get overwhelmed, like, I don't even know where to start. In that case, you can take some of the exercises, either from the Code Reading Club or from Felienne’s book.

Felienne has written a book called The Programmer's Brain: What Every Programmer Needs to Know About Cognition. I told you we'd be fangirling about Felienne. She has code reading exercises in that book, too, that you could apply to your code base. Obviously, working for a tooling company, we build IDEs. There are lots and lots of features in the IDE that can also help you to understand the code.

Recommended talk: Navigating Through Programming's Greatest Mistakes • Mark Rendle & Hannes Lowette • GOTO 2023

Strategies for Understanding New Codebases

Hannes Lowette: I definitely want to zoom in on that. But first, before we go there, like from the Code Reading Club. What different strategies have you noticed because you said you were interested in other people's approaches? What kind of strategies have you seen people apply to understanding new code bases?

Marit van Dijk: So there's a certain set of exercises that we do. Usually we start with an exercise called first glance, which is literally what it is. You take one minute, you look at the code, and you find the first thing that you notice. Then you ask yourself why that is the first thing you notice. That already gives you a place to start. It can be: oh, hey, I recognize this term.

That gives me an idea of what this might be about. Or: I have absolutely no idea what this syntax means, and I need to understand that in order to make sense of the code. Even if I'm trying to read JavaScript, which I don't understand, and my brain immediately freezes and is pretty sure I won't be able to figure it out, these code reading exercises can actually help me get past that, to the point where I can somewhat make sense of JavaScript.

Hannes Lowette: I've been involved in an audit or two for a code base that wasn't in my native programming language. I'm a native C# speaker. It's...

Marit van Dijk: Almost like Java.

Hannes Lowette: It's like Microsoft's Java. I mean, we can disagree on the terminology here, but we can't deny that Java and C# have been stealing features from one another for two decades. It's the same thing, but different.

Marit van Dijk: But even Java currently is looking at other languages as well, like Kotlin and Go and other things.

Hannes Lowette: I've done audits that involved a bunch of PHP as well, and that's a vastly different language. But if you are proficient at reading code and figuring out where the pain points are, then even in a language like PHP, I can figure out some of the things that might be wrong with it and some of the things that they've done well. I mean, not to the same degree of finesse that I could in C#.

Marit van Dijk: You might not know all of the gotchas for a particular language.

Recommended talk: "Looks Good to Me" Constructive Code Reviews • Adrienne Braganza Tacke & Paul Slaughter • GOTO 2024

The Human Element: Empathy and Collaboration

Hannes Lowette: You're not familiar with the pitfalls, indeed. But you can still figure out: hey, this part looks really great because you've nicely separated out these concerns. And that part looks like a mess. Can we talk about that? As a conversation starter, definitely.

Marit van Dijk: One of my first questions is: are there any tests? Otherwise I have some questions. What I also really like about the Code Reading Club is the different perspectives and the fact that different people notice different things. It also teaches you empathy, in that different people will be working on a code base.

What seems clear to you might not be clear to someone else, which you might also know from code reviews.

Hannes Lowette: I think especially in audit situations, like they bring you in as a consultant to look at basically what's wrong.

Basically it means: we're stuck, how do we get unstuck? That's the question that they ask you. You walk in, you look at their code, and they're expecting you to basically tear it apart. If you do that, you're going to get nowhere. You're not going to get any long-term results out of the exercise.

As a consultant, I think empathy is very important. So the first thing I always do is point out: hey, this part is really well done, and I like this part of your architecture, how you tackled this problem. Because if you've written code for a while, you can also spot those things really nicely.

It's very easy to gloss over that, focus on the thing that isn't done well, and point that out. Basically what you're doing then is destroying their life's work, right? They've worked really hard on that.

That is not my job because my job is to help them see where they can improve, but in a way that they will want to improve it.

Marit van Dijk: And telling them everything that is wrong with the code base is maybe not the best way to get that to work.

Hannes Lowette: And it's the same skill set that you need when you're pairing with junior devs or when you're doing code reviews for people in your team. Only focusing on the things that need improving will wear people out in no time.

Marit van Dijk: So we still need to be able to talk to people as developers, is what you're saying?

Hannes Lowette: We need communication skills.

Marit van Dijk: I know. I also do a talk on reading code where I discuss the Code Reading Club and how your IDE can help you make sense of code. One of the things I always say there is: you look at code that was written by someone else and immediately you hate it, because you didn't write it.

It's not the way that you would write it or even if you did write it, it's not how you would write it today. You have learned new things. I've actually looked at code that I had written six months ago, and, you know, learned a lot in those six months.

Hannes Lowette: And do a better job.

Marit van Dijk: Because now I can do better. We switched from Java to Kotlin. And Kotlin is fairly easy to get started with if you're coming from Java, because it's very similar, but making it actually idiomatic might take a while. So that was basically the six months: I learned a lot of idiomatic Kotlin, and now I can make this more idiomatic. If you're learning a foreign language, you will put the words in the wrong order and you'll basically be speaking your own language in that other language. The more fluent you become, the more idiomatic it becomes as well.

Hannes Lowette: I would write Java the way that I write C#.

Marit van Dijk: Probably at least for a start. 

Hannes Lowette: We just have some just like certain ones.

Marit van Dijk: If we're confronted with someone else's code, even code by past you.

Hannes Lowette: But that's proven, like, psychologically. Right? Okay.

Marit van Dijk: That I don't know about. I don't have a psychology degree, but we see code and we don't like it. But also then I ask if I'm doing my talk, who here in the audience sets out to write bad code? Sometimes some hands go up. I have to admit, my hand goes up because as a developer advocate, sometimes you have to write bad code so that you can “fix” it.

That's why I write bad code. No other reason. Just saying. Anyway, nobody sets out to write bad code. But you learn new things, and it might be about the programming language. It might be about architectural patterns. It might be about the domain that you're working on, because we're not all domain experts on all the things that we're working on.

Hannes Lowette: It might be because of the constraints that were imposed by the code that was already there, that it was simply not possible to do a clean job on it.

Marit van Dijk: Or time constraints or, you know, getting stuff out the door quick.

Hannes Lowette: Somebody once told me this, and I've taken it to heart when I'm doing code reviews, for instance: I always assume that the code was written by that person in the time that they had available to them, to the best of their ability. I never assume malicious intent.

Maybe they didn't have the right skills at the time, and that might have been a skill issue. It might have been that, like it was delivered under time pressure or I mean, there's so many factors that could have.

Marit van Dijk: Stuff going on in their personal lives that you don't know about, you know, it might be anything.

Hannes Lowette: Might have even not been that bad of a bit of code. It might just be that because you didn't write it and you don't have the mental model that they were handling at the time that they wrote the code, that that is not vibing with you. And it feels like it's bad code.

Marit van Dijk: Exactly. It's like, this is not how I would write it, and therefore it is bad. I think we need to get rid of that "and therefore this is bad" mindset.

Hannes Lowette: It helps if the person is still there and you can talk to them.

Marit van Dijk: If not, then there's a lot that you can do. I like to do a little bit of git archeology sometimes, to figure out: hey, this is not the way that I remember it, what happened? And to see what was changed as part of which commits. Maybe get some information from the git tool window about the commits.

Hopefully people write clear commit messages about why something was changed, not what was changed, because we can see that in the diff.

Hannes Lowette: That's a good one. Commit messages should always be about the why.

Marit van Dijk: Exactly. There are also a lot of people who love to talk about writing readable code, and lots of people who have opinions on that. Coincidentally, Felienne's book that I already mentioned, I'm not getting a commission, but you should totally read it: The Programmer's Brain by Felienne Hermans.

Hannes Lowette: And you should watch some of her talks for sure.

Marit van Dijk: In her book, because she's a professor, she also uses a lot of research on what makes a good variable name or not, and why. I rather like that over the many opinions that people have on what makes code readable. I mean, I do believe in naming being as clear as it can possibly be.

Of course, you know, naming things is hard. There are two hard problems in computer science: naming things, cache invalidation and off-by-one errors. But sometimes there are things that you cannot capture in the code. There might be a business requirement that you can't capture in the naming of your variables or classes or methods or whatever.

You might be able to put that in the commit message. You might need to put it somewhere else. You might be able to express it in the names of your tests, for example, but sometimes you can't.

Recommended talk: Reading Code • Marit van Dijk • GOTO 2023

Writing Code That's Easy to Read and Understand

Hannes Lowette: That's where I wanted to circle back to something that you mentioned earlier, where the group that was given context with the code that they were reading did the best. Right? And as developers, we are really good at providing context with the code that we write, because we write lots of comments and documentation and tests.

Marit van Dijk: One of my most popular social media posts is actually about Schrödinger's documentation. If there is documentation, people don't read it. And if there isn't any documentation, people complain.

Hannes Lowette: We're terrible at it. Even if we do it with the best of intentions, we still suck at it because we don't maintain the documentation. It gets lost.

It might refer to a ticket number of a ticketing system you're no longer using. I mean, I've seen all of this happen, right?

That brings me back to the thing you mentioned earlier. You said that IDEs can also help us understand code.

Marit van Dijk: In her book, Felienne mentions three reasons why code might be confusing. One is a lack of knowledge, one is a lack of information, and one is a lack of processing power in the brain. Obviously, a lack of knowledge is something the IDE might not be able to help you with, although you can, for example, navigate to the source code of the library that you're using, read that, and come back.

Although that is not necessarily a lack of knowledge. Now, lack of knowledge you might need to solve by doing some research yourself, possibly outside the IDE. Lack of information. You can navigate to code, but there are also ways that you can pull up information like quick documentation or quick definition to see what the documentation or actual code for a class is without navigating there.

Because if you keep navigating in a code base, you might end up getting lost, although there are ways that your IDE can help you with that too, if you are familiar with the shortcuts for navigation or if you're using IntelliJ IDEA...

Hannes Lowette: Rider.

Marit van Dijk: Or Rider or any of the JetBrains IDEs.

Hannes Lowette: Or even Visual Studio proper.

Marit van Dijk: That works too. I'm more familiar with IntelliJ IDEA, as a Java, or mostly JVM, programmer. The brackets in .NET give me eye twitches.

Hannes Lowette: No, no, we have the right way around. No, no, you can't be wrong.

Marit van Dijk: But we can. We can fight about it. So you can use, like, a little crosshair symbol to figure out, okay, where actually in my project is this class, so that can be helpful. So before we started recording, we discussed it a little bit; JetBrains has an AI assistant that is integrated into the IDE.

Obviously there is a way to chat with an AI assistant, ask questions. You can ask it questions about your code base. But there are also lots of ways that it's integrated throughout the IDE. So you can use it, for example, to generate commit messages. So no more excuses for bad commit messages. 

Hannes Lowette: Problem is that I want to know why.

Marit van Dijk: Sometimes it kind of does, or kind of guesses at it. And you can still edit the generated commit message. It can also help you get over your writer's block, if you're like, how would I summarize it?

It helps. One of my favorite features is actually having it explain a commit to me. If you want to figure out what happened in a commit, obviously you can look at the diff. But if the diff is very big, it can kind of summarize what happened there. I find that helpful as a start: okay, what happened in this commit? Then if I need to, I can still look at the diff. But it will give me sort of a summary.

Hannes Lowette: It gives you context before you dive in. That's a helpful way to apply it. So that's a good thing that we have that.

Marit van Dijk: Now my favorite features are actually more features like "explain this to me" than "write this code for me". I can write the code myself. That's the part I enjoy. It's figuring out things that can be hard.

Hannes Lowette: I think that we need to focus on that readability in our code base, whichever way we can, whether it is with the help of an IDE, or by getting better at writing readable code, or by getting better at reading code.

Marit van Dijk: All of the above.

Hannes Lowette: They're all part of the same problem.

Marit van Dijk: Yes. There's research on, you know, naming your variables this way over that way is better.

Hannes Lowette: Which is not good.

Marit van Dijk: It's in the eye of the beholder. That's one of the things that I really enjoyed about the Code Reading Club: the different perspectives. Something might be clear to you with the background, experience, and knowledge that you have.

Hannes Lowette: And with my frame of reference for a particular code base.

Marit van Dijk: If you know how to write a parser, you might be able to read code for a parser. If you don't know how to write a parser, you might not. So there's also the knowledge component. So yeah, to circle back to the three things that make code complicated.

Another thing that might make it complicated is a lack of processing power in the brain. If there is too much going on because you have nested loops and lots of variables and things to keep track of, that's where a debugger, for example, can really help, because it can help you visualize the state of your code and the state of your objects and variables, etc.

I really like running stuff through the debugger. Also my own code. I make sure that what I think happens is actually what happens. Or it can help you figure out bugs. I especially like to write tests to reproduce bugs, so that I know that I understood the bugs before I fix them.

Also then I have a test to verify that I have fixed it. That it stays dead. 
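A minimal sketch of that workflow, in Python for brevity. The function `average` and its empty-list bug are hypothetical, chosen only for illustration: the first test is written to reproduce the reported failure, and it stays in the suite after the fix so the bug "stays dead".

```python
def average(values):
    """Return the arithmetic mean of values, or 0.0 for an empty list."""
    # Hypothetical fix: an earlier version divided by len(values)
    # unconditionally and raised ZeroDivisionError on an empty list.
    if not values:
        return 0.0
    return sum(values) / len(values)


def test_average_of_empty_list_is_zero():
    # Written before the fix to reproduce the bug. It failed with
    # ZeroDivisionError until the guard above was added.
    assert average([]) == 0.0


def test_average_of_non_empty_list():
    # Confirms the fix did not change the normal behavior.
    assert average([2, 4, 6]) == 4.0


test_average_of_empty_list_is_zero()
test_average_of_non_empty_list()
```

Writing the failing test first confirms that the bug was understood before the code was touched.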

Hannes Lowette: I try to do that, not for all of the code that I write, but for the domain code. I love expressing certain functionalities as tests first.

Marit van Dijk: Yes.

Hannes Lowette: Hey, I'm going to expect this kind of behavior from my code. Exactly. And it gives me two things. It gives me documentation about the code that is inevitably going to be there. It also gives me a better understanding of what the code should be doing. And I find that I write cleaner code because of it.

Marit van Dijk: If you think about what the code is supposed to do, and you think of examples of what it should do, and you know, if this happens, this should happen. If this doesn't happen, okay, does nothing happen? Does something else happen? Figuring that out beforehand will help you have a clearer picture of what it is that you need to write.

So definitely a fan of any type of TDD.

Hannes Lowette: It's also why I gravitate towards event-driven systems, because I love expressing the functionality inside the domain code as a series of events. If this and this and this has happened, and then I try to do this, the outcome should be this and this. It reduces the complexity in your tests because you're always talking about the same thing.

It's going to be a series of events. Then I queue a command and I'm expecting these events out of it.
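The shape described here (past events in, a command, new events out) can be sketched as follows, in Python for brevity. The bank-account domain, the event and command classes, and the `handle` function are all hypothetical illustrations, not taken from any system mentioned in the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:  # event
    amount: int


@dataclass(frozen=True)
class Withdrawn:  # event
    amount: int


@dataclass(frozen=True)
class WithdrawalRejected:  # event
    amount: int


@dataclass(frozen=True)
class Withdraw:  # command
    amount: int


def handle(history, command):
    """Replay past events to rebuild the balance, then decide on new events."""
    balance = 0
    for event in history:
        if isinstance(event, Deposited):
            balance += event.amount
        elif isinstance(event, Withdrawn):
            balance -= event.amount
    if command.amount <= balance:
        return [Withdrawn(command.amount)]
    return [WithdrawalRejected(command.amount)]


def test_withdrawal_within_balance_is_accepted():
    given = [Deposited(100), Withdrawn(30)]  # events that already happened
    when = Withdraw(50)                      # the command under test
    then = [Withdrawn(50)]                   # events expected to come out
    assert handle(given, when) == then


def test_withdrawal_exceeding_balance_is_rejected():
    given = [Deposited(100), Withdrawn(30)]
    when = Withdraw(80)
    then = [WithdrawalRejected(80)]
    assert handle(given, when) == then


test_withdrawal_within_balance_is_accepted()
test_withdrawal_exceeding_balance_is_rejected()
```

Every test has the same three parts, so a reader only needs to learn the structure once.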

Marit van Dijk: But any type of test will have that same structure. It helps to think about it as: given a certain situation, when something happens, then this should be the result. We have BDD, or behavior-driven development, with given / when / then, which really is the same as the arrange / act / assert from traditional unit testing.

Hannes Lowette: Always the same thing. I always say mocking it is way harder than just queuing events and then a command and then seeing events come out once you have that first time infrastructure ready. But it might be a matter of opinion here.

Marit van Dijk: I'm not sure I have a lot of experience in event-driven systems, so I can't really say. But for me, it's about making it explicit what state you expect the system to be in before you interact with it, how you're going to interact with it, and then what your expected result is. However you call that or structure that, I find it extremely helpful because if you have, for example, a bunch of flaky tests, maybe it's because your assumption about the state that the system will be in is incorrect.

That can be the case, for example, if you have a really large test suite that is all operating on the same environment, possibly on the same entities in that environment, changing stuff back and forth and interfering with each other.

Hannes Lowette: Running integration tests and then running into race conditions and then.

Marit van Dijk: Or end-to-end tests in a test environment where other people are also testing all of these things. So then you want to be in control of those environments and of that data. Maybe your tests set it up themselves, so that each test is in control of its own data and you know that your given state is, in fact, what you expect, which will save you a lot.
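As a rough illustration of tests that own their data, here is a small Python sketch. The `InMemoryUserStore` class and the helper `make_store_with_alice` are hypothetical stand-ins for a real environment: each test builds its own "given" state, so the result does not depend on which test ran first.

```python
class InMemoryUserStore:
    """Hypothetical stand-in for a shared test environment."""

    def __init__(self):
        self._users = {}

    def add(self, user_id, name):
        self._users[user_id] = name

    def rename(self, user_id, new_name):
        self._users[user_id] = new_name

    def get(self, user_id):
        return self._users[user_id]


def make_store_with_alice():
    # Each test calls this to get a fresh store, so no other test
    # can change the state it starts from.
    store = InMemoryUserStore()
    store.add(1, "Alice")
    return store


def test_rename_changes_the_stored_name():
    store = make_store_with_alice()        # given
    store.rename(1, "Alicia")              # when
    assert store.get(1) == "Alicia"        # then


def test_lookup_returns_the_original_name():
    store = make_store_with_alice()        # given: unaffected by the rename test
    assert store.get(1) == "Alice"         # then


# Running the tests in either order gives the same result.
test_rename_changes_the_stored_name()
test_lookup_returns_the_original_name()
```

Had both tests shared one store, the lookup test would pass or fail depending on execution order, which is one common source of flakiness.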

Hannes Lowette: So are we now advocating for tests as documentation?

Marit van Dijk: Obviously. Okay. Is anyone not advocating for tests?

I wrote an article for 97 Things Every Java Programmer Should Know on using tests to write better software, and Kevlin Henney actually quotes that article in one of his talks. Basically, make sure that your tests explain the intended behavior of your software, because that will help you clarify what you actually expect, and that will make it so that you write better code, as we've just discussed.

But also that test is there basically, like, a safety belt, you know, in case you go too fast in the future, it's going to save you because it's going to fail if you break it. And then, you know, however far in the future that test breaks, you might not remember the exact context or what exactly it was supposed to do.

So then you have to analyze that failure and figure out: did I break something that should still work? Or did I forget an area of the application where this new change also leads to changes? If the name of your test clearly expresses the intended behavior, figuring that out is going to be much, much faster. I have a few rants on tests because I've seen a lot of tests.

I've literally worked on a test code base where the tests were named test scenario one, test scenario two. And oh, when those fail, it's not really clear what needs to happen. So then you have to go read the tests and figure out what they meant. And that adds up.
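To make the contrast concrete, here is a small Python sketch. The `shipping_cost` function and its free-shipping rule are hypothetical; the point is only the difference between the test names.

```python
def shipping_cost(order_total):
    """Hypothetical rule: orders of 50 or more ship for free, others pay 5."""
    return 0 if order_total >= 50 else 5


# Hard to act on when it fails: the name says nothing about intent.
def test_scenario_1():
    assert shipping_cost(50) == 0


# The same check, but a failure report now states the expected behavior.
def test_order_of_exactly_50_ships_for_free():
    assert shipping_cost(50) == 0


def test_order_below_50_pays_flat_shipping_fee():
    assert shipping_cost(49.99) == 5


test_scenario_1()
test_order_of_exactly_50_ships_for_free()
test_order_below_50_pays_flat_shipping_fee()
```

When the second test fails in a report, the name alone says which rule was broken, without opening the test body.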

Recommended talk: Structure and Interpretation of Test Cases • Kevlin Henney • GOTO 2022

Hannes Lowette: But that's reading code again.

Marit van Dijk: I would have preferred there to be a name that actually means something, because that makes the code more readable, or more easily understandable.

Hannes Lowette: Yes. I think whichever way we look at it, reading code is a skill that we're all going to have to improve. And we can have IDEs help us, especially with the navigation part and with trying to build that mental model of someone else's code, so we don't run into the brain's processing power limits. AI can help a little bit as well.

Recommended talk: 97 Things Every Java Programmer Should Know • Trisha Gee & Kevlin Henney • GOTO 2020

Tips for Improving Your Code Reading Skills

Hannes Lowette: But what tips would you give to somebody who is writing code on how to get better at reading code?

Marit van Dijk: My favorite Sarah Andersen comic is about practice. How did you get so good at drawing? Practice! How did you get so good at coding? Practice! How did you get so good at reading code? Practice! So I definitely think that experience helps there, but there are ways that you can deliberately practice reading code, the same way that we practice writing code with contests or, you know.

Hannes Lowette: Or Advent of Code.

Marit van Dijk: Exactly, things like that. So you can deliberately practice reading code. You can go to codereading.club, I think is the URL, but we can figure that out for the show notes.

They occasionally have online events, although they haven't for a while. But they also have resources that you can use. They have a GitHub repo, which we'll put in the show notes too, with materials for at least three sessions: a code sample, exercises, and notes for a facilitator, so that you can try this out with your team, your colleagues, or your software developer friends.

If you can't find people to do a session with you, you can use Felienne’s book and apply the exercises from that to your own code base, or your work code base, or whatever.

And obviously, familiarize yourself with the features of your IDE that can help you: searching and navigating, writing and running tests, running them through the debugger, and ways to pull up information like quick documentation, quick reference, and type information. There are also lots of hints that your IDE can give you, like inlay hints on the types of things.

All of these things can be helpful.

Hannes Lowette: We have so many tools available to us.

Marit van Dijk: And not to mention AI assistants and other tools like that. 

Hannes Lowette: Great. I already feel like I want to try those tips from the Code Reading Club with some of my team. So. But thank you so much.

Marit van Dijk: Thank you. Thanks for having me.