Expert Talk: Zig Programming Language and Linters

Updated on April 28, 2023

This conversation between Jeroen Engels, a software engineer at CrowdStrike, and Andrew Kelley, the president and lead software developer of the Zig Software Foundation, discusses the use of linters in programming languages. They talk about the challenges of refactoring code with custom macros and the need for improved refactoring tools and integration with compilers for programming languages. The conversation also covers the importance of error codes versus warning codes in linters, handling potentially null values, and the tradeoffs of having linting errors. Although the Zig compiler does not have a separate linter, they agree that a separate linter step from the compilation step is a viable option. The conversation highlighted the importance of enforcing linting in the continuous integration (CI) process and the need for programmers to cooperate to make functions work without side effects.

Intro 

Jeroen Engels: This is "GOTO Unscripted." We're at GOTO Copenhagen. My name is Jeroen Engels. I'm joined by Andrew Kelley. So, I'm a software engineer at CrowdStrike. I work primarily with Elm. In my spare time, I like to work on the Elm linter, which is called elm-review, which I presented at the conference. Andrew?

Andrew Kelley: Hello. As Jeroen mentioned, my name is Andrew Kelley. I am the president and lead software developer of Zig Software Foundation.

Jeroen Engels: That sounds so much fancier than what I had.

Linter overview and programming languages application  

Andrew Kelley: Well, I heard that you have a thirst for linters.

Jeroen Engels: I do, I do. And I heard that you don't use a linter. Static analysis tools are just something that I find very enjoyable because, like, I've been a JavaScript developer before, and I was always kind of frustrated with all the problems that popped up with the code.

I worked with ESLint on trying to figure out, like, what rules can we enable to make sure that these problems don't end up in our production codebase? At some point, I started looking at Elm, which is basically a very good, fresh, new perspective, where all the problems that I had with ESLint, or with JavaScript, didn't appear anymore. But I still figured, it would still make sense to have a linter for Elm, even though, like, almost none of the same problems apply. 

Andrew Kelley: Does Elm offer any powerful refactoring tools?

Jeroen Engels: You mean, for instance, in IDEs, or?

Andrew Kelley: For instance, Java developers enjoy very high-level abstraction refactoring tools, such as they can highlight a block of code and say, "Extract into method," or they can even take a function and just reorder the parameters, and it will update every call site at once. Do you have anything like this for Elm?

Jeroen Engels: We have, to some extent, but definitely not to the same level. But it's just a matter of someone who needs to be passionate about it and tackle those issues, because we have so much more knowledge about what the code is doing in Elm, compared to Java or other languages, in my opinion, that it's all doable. So it's just a matter of someone needs to do it. We have some refactorings, like we can extract variables, can rename things, but that's about it for now. Well, and more.

Andrew Kelley: Oh, that's pretty nice.

Jeroen Engels: What do you have for Zig? Do you have a good integration with, I think, VS Code, which is the one that supports Zig the best?

Andrew Kelley: The best we have for now is a third-party language server protocol implementation, but it's kind of a best-effort implementation, and, you know, it's third-party. It doesn't come with the compiler. It can break separately.

Jeroen Engels: Is it best effort because the focus hasn't been there yet, or?

Andrew Kelley: I mean, I don't work on it. You know, someone from the community works on it, and they do a great job. Shout-out to August for working on that. But there's only so much you can do without it being integrated with the actual type information that the compiler has.

But I will say that the investment that we've done for this for the future is designing the language with conditional compilation being a first-class part of the language, rather than being through, you know, a textual pre-processor.

The Challenges of Refactoring Code with Custom Macros

Jeroen Engels: What do you mean? What does that change?

Andrew Kelley: You know, if I have an IDE for C or C++ code, and part of my C library API is that one of the functions is just a macro that gets textually replaced, then, you know, the refactoring tools, they don't know how to deal with this because it's not one language, it's two languages, and one of them is, you know, text-based concatenation, and it doesn't know how to...

Jeroen Engels: If your language, if it was somewhat standard that you had macros that were being replaced all over the place, well, then the tool could analyze it, "Well, okay, well, we know that there's this macro in the codebase, therefore it will be replaced at this location, this location, this location."

But if those macros get too custom, then it's really hard to analyze, right? And therefore you lose all the guarantees about, "Oh, well, I can't see any goto instruction here, therefore I know that it's not doing anything weird." But if you have macros that change the code, then you lose that kind of guarantee.

Andrew Kelley: You lose that kind of guarantee, or you have to execute the pre-processor and then assume that, you know, one set of #defines is true. But maybe, you know, if your build system changes the option, then this other #ifdef defines it the other way, and so then you try to do a refactoring tool, but then it's wrong for all the places where the other definition would be activated. You know, you can't solve the problem.

Jeroen Engels: And also you have to point the error at some point, at some location in the codebase, but if that codebase doesn't exist because that's the result of applying the macro, then what are you pointing at?
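
Andrew's point about one set of definitions being assumed true can be illustrated with a small sketch. The toy `preprocess` function below handles only a hypothetical subset of C-style `#ifdef`/`#else`/`#endif`, and the names `USE_FAST_PATH`, `fast_sum`, and `slow_sum` are made up for the example. It is a simulation of the idea, not how any real C preprocessor or refactoring tool is implemented.

```python
from __future__ import annotations


def preprocess(source: str, defines: set[str]) -> list[str]:
    """Expand a tiny, hypothetical subset of #ifdef/#else/#endif."""
    output = []
    stack = []  # one bool per open #ifdef: is its current branch active?
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#ifdef "):
            stack.append(stripped.split()[1] in defines)
        elif stripped == "#else":
            stack[-1] = not stack[-1]
        elif stripped == "#endif":
            stack.pop()
        elif all(stack):  # keep the line only if every enclosing branch is active
            output.append(line)
    return output


SOURCE = """\
#ifdef USE_FAST_PATH
result = fast_sum(values)
#else
result = slow_sum(values)
#endif
"""


def references(name: str, defines: set[str]) -> list[str]:
    """Lines mentioning `name` that survive preprocessing under `defines`."""
    return [line for line in preprocess(SOURCE, defines) if name in line]


# A tool that only analyzes the configuration without USE_FAST_PATH
# never sees the call to fast_sum, so a rename of fast_sum would miss it.
print(references("fast_sum", set()))
print(references("fast_sum", {"USE_FAST_PATH"}))
```

Each configuration yields a different program, so a tool that expands the source once can only be right for the configuration it happened to pick.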

Linters: Errors vs. Warnings

Andrew Kelley: That's a good point. Do you wanna talk about errors versus warnings?

Jeroen Engels: Sure. So, you mean what I covered in my talk, right?

During my talk, I talked about severity levels. Some linters allow you to specify for each rule how you want them to influence the exit code of the linter. So, if you have an error that is marked...if you have a rule that is set to be an error, then whatever it reports will cause the linter to exit with an error code, meaning that it will cause your test to fail, and you will be notified, and you will have to fix it.

And then you also have warnings, which do not cause your linter to exit with an error code. And as I said during the talk, that doesn't really make much sense, because you're trying to enforce a rule without trying to enforce it, because you enable a rule, and you say, "Well, I want this rule to be enforced, but I also don't want it to cause the test to fail, so, therefore, it's not enforced." And that just doesn't make sense.
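
The severity mechanism Jeroen describes can be sketched in a few lines. The `Report` type and the rule names below are hypothetical and do not come from ESLint or elm-review; the sketch only shows why a rule set to "warning" is not actually enforced when the exit code is what fails the test run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    rule: str      # hypothetical rule name
    message: str
    severity: str  # "error" or "warning"


def exit_code(reports: list) -> int:
    """Return 1 if any report has error severity, otherwise 0."""
    return 1 if any(report.severity == "error" for report in reports) else 0


only_warnings = [Report("no-unused-vars", "x is never used", "warning")]
with_error = only_warnings + [Report("no-debug", "debug call left in", "error")]

print(exit_code(only_warnings))  # 0: the test run passes, nothing is enforced
print(exit_code(with_error))     # 1: the test run fails, the rule is enforced
```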

Andrew Kelley: I completely agree.

Jeroen Engels: Yes.

Andrew Kelley: Okay, but let's try to explore this idea. So, you talked about false positives, and you gave the example of your favorite linter rule, which I also happen to have as a favorite, which is dead code. Unused variables, unused functions. Get rid of 'em. I love that.

That one is not one that has false positives. But what about a linter rule that is useful, but, fundamentally, must have false positives in it. Can you think of any? Or, perhaps, do you think that there should never be this kind of linter rule?

Jeroen Engels: So, code smells are usually...

Andrew Kelley: Code smells.

Jeroen Engels: ...those kinds, right? We always say it's code smell because it's probably a sign that there's something bad about it, but we don't know for sure. Sometimes it's good, just like cheese. It smells bad, but in some cases, it's good. So, code smells, whatever that might be for your language or your ecosystem.

Andrew Kelley: If it smells like a stinky foot, could be a bug, could be bleu cheese.

Jeroen Engels: Or a stinky foot.

Andrew Kelley: Or a stinky foot.

Jeroen Engels: And a stinky foot is better than no foot.

Andrew Kelley: Wow. Oh, I see. I see your point. Your point is that maybe it is smelly code, but there's no other way around it. This problem is hairy, and this is, on the stinky foot, has hair on it.

Jeroen Engels: Potentially.

Andrew Kelley: I see. This analogy has gone quite far. But do you have an example of a code smell lint? I'm going somewhere with this. I mean, I'm gonna ask about disabling comments, but I wanna come up with an example first that we can examine.

Jeroen Engels: The thing is, I don't have too many examples because when we have too many counterexamples when we are trying to think of a rule, we tend to not implement that rule.

Andrew Kelley: Right. Right, right.

Jeroen Engels: It's a bit tricky for me because I just didn't think about those for a while.

Andrew Kelley: For Elm, right?

Jeroen Engels: Yes.

Andrew Kelley: But what if you're stuck with a more legacy language? You know, C, C++, JavaScript? So, maybe the language is not as nice, and so we might have more smells?

Jeroen Engels: In those cases, for instance, you could say, well, you should never access anything on null, right?

Andrew Kelley: Okay.

Jeroen Engels: And imagine we're targeting JavaScript and not TypeScript, so we don't have any information about whether something is nullable or not. Well, then you pretty much have to report everything, right? Unless you say, "Oh, if this parameter is not null, then you can do this. If it's null, then you do something else." But if you don't have those checks, then you're gonna have to report every usage of this. Do you see what I mean?

Andrew Kelley: What is the lint? The lint is...?

Jeroen Engels: Let's imagine the lint is, we want to report any field access on a potentially null value. Like, if you do a.b, then if we haven't checked that a is null, or not null...

Andrew Kelley: Well, what if a comes from the function parameter, and we're expecting it to never be null, on...

Jeroen Engels: How would you tell it that it shouldn't be null if you don't have types?

Andrew Kelley: Oh, I see. So, you would need to assert that it's not null and that assertion would make the linter error go away?

Jeroen Engels: Yes.

Andrew Kelley: Okay. This sounds okay. Sounds kinda nice, actually.

Jeroen Engels: But you would have a lot of false positives because, you know, oh, well, this function is never called with a null value. We know it because we have asserted it before. But because the linter doesn't know that, it has to force you to reassert that it's not null.

Andrew Kelley: Well, that answers my question because the next question I was going to ask is why not... I mean, you mentioned that you think that there's never a reason to have a disable comment for a linter.

Jeroen Engels: I wouldn't say never.

Andrew Kelley: Oh, not never, okay.

Jeroen Engels: But it should be very rare.

Andrew Kelley: Very rare. And, well, you already showed that in this case, it could be disabled, not with a comment, but with an assert, and that's better. And that doesn't count as a disable comment, right?

Jeroen Engels: No. That is you pushing towards better code, or code that reads more like what you want, right?
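
The hypothetical lint discussed above (report `a.b` unless `a` has been asserted non-null) can be sketched with Python's `ast` module, using `None` in place of JavaScript's `null`. This is a deliberately naive illustration, not a rule from any real linter: it only understands `assert param is not None`, and it compares line numbers instead of tracking control flow.

```python
from __future__ import annotations

import ast


def _is_not_none_check(test: ast.expr) -> bool:
    """True for an expression of the exact shape `name is not None`."""
    return (
        isinstance(test, ast.Compare)
        and isinstance(test.left, ast.Name)
        and len(test.ops) == 1
        and isinstance(test.ops[0], ast.IsNot)
        and isinstance(test.comparators[0], ast.Constant)
        and test.comparators[0].value is None
    )


def unchecked_attribute_uses(source: str) -> list[tuple[str, int]]:
    """Report (parameter, line) for `param.attr` not preceded by an assert."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        params = {arg.arg for arg in func.args.args}
        asserted = {}  # parameter name -> line of its earliest assert
        for node in ast.walk(func):
            if isinstance(node, ast.Assert) and _is_not_none_check(node.test):
                name = node.test.left.id
                asserted[name] = min(asserted.get(name, node.lineno), node.lineno)
        for node in ast.walk(func):
            if (
                isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id in params
                # Naive: "checked" means an assert on an earlier line.
                and node.lineno <= asserted.get(node.value.id, float("inf"))
            ):
                findings.append((node.value.id, node.lineno))
    return sorted(findings)


EXAMPLE = """\
def unchecked(a):
    return a.b

def checked(a):
    assert a is not None
    return a.b
"""

print(unchecked_attribute_uses(EXAMPLE))  # [('a', 2)]
```

Only the access in `unchecked` is reported. Adding the assert silences the report by changing the code, which is the alternative to a disable comment that the conversation describes.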

Andrew Kelley: That's very nice. I have to admit, during your talk, I was thinking to myself, there have to be lints where you need to disable them, but now that we're trying to think of any, I'm coming up dry.

Jeroen Engels: In some cases, we'll be like, we don't have the information that we need, but people can always change their code in a way that the linter can understand that, "Hey, here, there's no problem," because we added an assert, or we added an if condition where we say, "Is this value null?" Things like that.

So, whenever you get a linter report, you always have to change your code, be it through a disable comment or through changing the code.

Recommended talk: Static Code Analysis - A Behind-the-scenes Look • Arno Haase • GOTO 2022

The Benefits of Prompts in Linter Auto-Fix

Andrew Kelley: That makes sense. Do you want to talk about auto-fix, or prompts, fix prompts?

Jeroen Engels: What I talked about during my presentation was that linters tend to have this feature where they automatically fix some of the issues, which is a very hard thing to do. Like, I dunno if you've written any linter rules in your career, but writing a linter rule that always does the right thing is very hard. It takes a lot of gathering context, gathering information, and doing some logic to figure out whether there is a problem or not.

And writing a fix for it is a lot harder, because you need to gather a lot more information to make sure that you don't change the code to something that will not compile, that will break, maybe even that doesn't look weird code style-wise. Like, the indentation still needs to be all right.

Andrew Kelley: Right, right. And you might not have type information.

Jeroen Engels: We might not have type information. Fixes are super useful because they simplify developers' lives. They save a lot of time that developers could spend on other things. But they can be done in a trustworthy or untrustworthy way.

The example that I took was ESLint, where I said if you run eslint --fix, it'll fix all the issues that it can fix automatically. And the problem is that if you do that on a new project, or you just enabled a very large, new rule that changes a lot of things, then you have a very big diff. And that diff can be very hard to analyze.

And the problem is that the linter doesn't tell you which errors were reported, and it doesn't tell you how it tried to fix each individual issue, and therefore you have a lot of trouble figuring out whether the change was safe, and whether you can push this to production.

So what I do with elm-review is, when you run it with the fix flag, it prompts you for every error with a fix. Like, it tells you all the details, like, "This is why I'm reporting this issue. This is what you did wrong, but I think I can fix this. Would you like to accept this change? Yes or no?"

And by doing this process of prompting for every error, we can get to trust the tool, because we see that it's doing the correct thing. We see, "Well, it reported this problem. It suggested this fix. That looks pretty good to me." And if I see that it does that a hundred times in a row, I start to trust it. And only when you trust the tool, then we have an elm-review --fix-all feature to fix all the issues in one go, and then prompt you.
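
The per-fix prompting workflow can be sketched as a loop that asks for confirmation before applying each proposed change. The `Fix` type, the rule names, and the string-replacement strategy below are assumptions made for the illustration and say nothing about how elm-review is actually implemented. The `confirm` callback stands in for the interactive prompt so that the loop can also run without a terminal.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Fix:
    rule: str    # hypothetical rule name
    reason: str  # why the issue is reported
    old: str     # text to replace
    new: str     # replacement text


def apply_fixes(
    source: str, fixes: list[Fix], confirm: Callable[[Fix], bool]
) -> tuple[str, list[Fix]]:
    """Apply each fix only if it still matches and `confirm` accepts it."""
    applied = []
    for fix in fixes:
        if fix.old in source and confirm(fix):
            source = source.replace(fix.old, fix.new, 1)
            applied.append(fix)
    return source, applied


def ask_on_terminal(fix: Fix) -> bool:
    """Interactive confirmation: show the reason and the proposed change."""
    print(f"{fix.rule}: {fix.reason}")
    print(f"- {fix.old!r}\n+ {fix.new!r}")
    return input("Accept this fix? [y/n] ").strip().lower() == "y"


SOURCE = "import os\nx = 1\nprint('hi')\n"
FIXES = [
    Fix("no-unused-import", "os is never used", "import os\n", ""),
    Fix("no-unused-variable", "x is never read", "x = 1\n", ""),
]

# Accepting everything behaves like a "fix all" mode.
fixed, applied = apply_fixes(SOURCE, FIXES, confirm=lambda fix: True)
print(fixed)  # print('hi')
```

Passing `ask_on_terminal` as `confirm` gives the one-prompt-per-fix experience, and the returned `applied` list records exactly which fixes went in, which is the information a bare diff does not give you.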

Andrew Kelley: That's nice. I have to admit though, I found myself thinking while watching your talk on this that you showed a 600-line diff, and the alternative is a command line prompt that shows you a small diff, I don't know, a hundred times, that adds up to 600-line diff.

And for me personally, I actually would rather pick the 600-line diff, because, on one hand, it's nice that the smaller prompt will give you the context, but it's going to be the same issue over and over again, right? You know, it did the same fix, the same fix, the same fix.

Jeroen Engels: If it's the same rule, yes. If you have 200 rules that each do different things, and fix the issue, or fix their issue in different ways, then those compound. You have one transformation, then another transformation, then another transformation, and the beginning code and the end result code, they look very different. So, you don't know, like, how many errors were reported, how many fixes were applied.

Andrew Kelley: Because it might be multiple fixes along the same lines.

Jeroen Engels: Yes.

Andrew Kelley: I understand.

Jeroen Engels: In that case, it gets complex. If you only have a single rule that reports all these issues, go use --fix-all. If you think that looking at the git diff is good enough in this case, go for it. That's why you have elm-review --fix-all.

Andrew Kelley: Maybe just, you know, if scrolling through the diff is fastest, and the changes are simple enough, then perfect, but it's nice to have that advanced option for when it's a little more tricky to understand what just happened. At least we have this tool that can break it down, so you're never just trying to trust... You don't have to trust the tool. You can have the tool explain to you why is it doing what it's doing.

Trusting vs Submitting to Linters

Jeroen Engels: You don't have to "trust the tool."

Andrew Kelley: Right, I see your point.

Jeroen Engels: You don't trust it. You submit to it, is what I like to say.

Andrew Kelley: Yes.

Jeroen Engels: Because imagine you're a junior developer. You just started using JavaScript, you just started using ESLint because someone told you it was good, and you run eslint --fix, and then it changes the code in very different ways, and you have no clue. Like, "I barely knew what the code was doing before. Now I don't know what it's doing at all."

Andrew Kelley: Well, yes. If I'm a junior developer, I'm going to assume that the tool knows better than me, and accept it blindly, right?

Jeroen Engels: Yes.

Andrew Kelley: I mean, I would just read the diff, but if I was a junior developer, I would just assume that someone else knows better than me, and just say, "Yes, yes, yes, yes."

Jeroen Engels: Yes. But that's not always correct, right? Because the fix is just a suggestion of a fix.

Andrew Kelley: Right, right.

Jeroen Engels: For instance, if you have an unused variable, the fix is to remove it, right? But, potentially, it's the code that I just wrote, and the correct solution to that is to start using it somewhere. So that's also a reason why I like to push towards prompting for every fix, yes, to notice, "Oh, there's something that I did wrong, and that the tool won't help me with." So, the tool's not doing something wrong, but there are sometimes better solutions to the problem.

Recommended talk: Why You Don't Trust Your Linter • Jeroen Engels • GOTO 2022

The workflow of using linters

Andrew Kelley: Okay. So, here's a question. So, you've written elm-review, and you've put a lot of thought into, well, the workflow of using linters.

Okay. So, I've created the Zig compiler. And the Zig compiler is, it has more features than most compilers. It's not a bare-bones compiler. I mean, it has the formatter built into it. I mean, it has unused variable errors. And in a branch, I haven't merged it yet, but I have this --fix feature in the compiler directly, not a separate linting tool. So, the topic I wanted to bring up for you is, can we talk about the tradeoffs of having linting errors, so stuff like, you know, removing unused variables, maybe other things like that too...

Jeroen Engels: And I'm sure that the C compiler has a lot of warnings that you don't have...

Andrew Kelley: We do not have warnings. Only errors. So...

Jeroen Engels: Okay. Sounds good to me.

Andrew Kelley: Sounds good, right? Okay, but, but we also don't have a linter. And so people do find it annoying that when they're trying to iterate quickly, they are not allowed to have unused variables. And so, I mean, one obvious choice is just to separate the linter step from the compilation step, and that's the workflow that you've described.

Jeroen Engels: Yes. I think that's the way to go.

Andrew Kelley: Okay, then here's the downside though. If I'm looking at someone's code, then maybe they didn't run the linter step, and I'm looking at it, and I'm seeing this function, and I'm trying to understand it, and it's annoying because it doesn't make sense. Why is it doing this? Why is it doing this? And then 30 minutes later, I realize it's never called, and that explains it, right?

Jeroen Engels: Yes.

Andrew Kelley: It would've been nice if that linter guarantee was there, but they just didn't run the linter yet, because it's not Tuesday, I don't know. Do you see what I'm saying?

Jeroen Engels: In which context are you looking at it? Because that changes how you're thinking about it as well. For instance, if you have a pull request, and the tests are green, and you look at the code, then you will still have the guarantee, well, all the code that is there is used...

Andrew Kelley: I see.

Jeroen Engels: ...because the linter has run. If you're looking at code that is still being written, like, you're pairing with someone, or someone says, "Hey, I have a bug. Can you help me fix it?" Then sure, the code might be unused, but for a good reason because they're still working on the function, and they may want to clean it up later.

So it really depends on the context of when you're looking at the code, I think. You could also say, "Well, if I want to look at any Zig code on the internet, just, like, in a GitHub Gist or something, then I want to know whether all the declared things are used or not." But someone might paste some non-compiling Zig code as well, so...

Andrew Kelley: That's true. Yes.

Jeroen Engels: I think if you don't have a CI running next to the code that you're looking at, or you just run the test, then you don't have any guarantees anyway.

Andrew Kelley: I think that's an interesting point. One takeaway, at least for me, from this conversation is that linting is fundamentally related to the idea of continuous integration.

Jeroen Engels: Yes. Like, if you have a linter, but you don't enforce it in your CI, or in your test suite, that's no use. It's just like having warnings for everything.

Andrew Kelley: Right. So, we have this phase of the development cycle. There's the development phase, and then there's the integration phase. And even if you have continuous integration, that's still a separate phase that happens when you make the patch set to send.

Jeroen Engels: Yes.

Andrew Kelley: That's interesting. I'm just brainstorming here. Are there any projects that justifiably do not have a separate integration phase?

Jeroen Engels: When you say integration phase, you mean specifically what?

Andrew Kelley: Well, I'm calling the part where you run the CI tests, that you, maybe you make a pull request and then the tests run automatically, you know, before you merge it. That's the integration phase, right? The development phase would be on the local developer's computer before they submit the patch.

Jeroen Engels: So, are there any projects where an integration phase doesn't make sense?

Andrew Kelley: Well, where the team justifiably does not have an integration phase. So, I'm not trying to make a point. I'm actually just musing out loud. Like, I don't know, maybe video game companies don't have an integration phase. Or maybe their integration phase is you play-test the code. I don't know.

Jeroen Engels: I think there might be two use cases that I can think of. One is when the product or the project is very early in its development phase, and people don't care about the code quality. Like, we've seen a talk from Henrik about the fact that code quality should be done after the prototyping, because you want to iterate, you want to explore ideas, and afterward, then you can think about code quality.

Andrew Kelley: But then you could just not do the linter step in that case as well.

Jeroen Engels: Potentially. It's an approach that I haven't tried out, so I'm curious to know how it would work out. Maybe you would only enable some of the linter rules for the code that, you know, should be in the code quality phase, then you enable more rules for that specific part of the codebase, maybe.

I had another...what was the other use case I thought about? No, can't figure it out. I can't remember it.

Guarantees vs. Power: A Comparison of Nim and Zig

Andrew Kelley: That was pretty interesting. Did you have any other topics that you want to examine?

Jeroen Engels: From the little I've seen from Zig, it cares a lot about guarantees. Like, you told me, like, I see some Zig code, I want to know, I want to have the guarantee that this function here is used, that this function compiles, that it will not crash, stuff like that. Those things are enforced by a compiler, potentially by a linter, or a code formatter.

And I feel like the same happens with Elm. Like, we care a lot about guarantees, about adding constraints that give us a lot of things in return. And I feel like that's something that is quite recent-ish. I've worked with JavaScript before, where you get almost no guarantees. You've worked with C, where you have...a lot of things can go wrong.

And I feel like the languages that popped up recently, especially the ones that come with the functional programming paradigm, they care a lot about giving guarantees about the code, about how it will execute. And I feel like, is there a trend to add more guarantees in languages? Is that something that we now all care about? What do you think?

Andrew Kelley: I actually do not think that that's the case, because...

Jeroen Engels: Okay.

Andrew Kelley: ...I do see a lot of contemporary new languages, which they don't seem to focus on it too much.

Jeroen Engels: Do you have any examples?

Andrew Kelley: You're gonna make me burn another project, huh? I will give an example. So, the example I'll give will be Nim. So, Nim's emphasis is on flexibility and power. And so you can do some really impressive things with Nim macros.

I think that they pride themselves on having a lot of the core syntax, such as plus, minus, and division and things like this, defined in the standard library. You can also implement async/await in userland in Nim, which is pretty cool. Right? It's pretty powerful that you can do that. That's a fundamental transformation of the control flow of a function, to make it async/await.

And they do that with, like, the powerful metaprogramming tools that the language exposes. You see where I'm going with this, though. You know, if you're reading a function, does it use a powerful metaprogramming technique to fundamentally change what that function does? Maybe. You don't have that guarantee.

Jeroen Engels: It doesn't say it explicitly. Does it implicitly?

Andrew Kelley: I don't know enough. But, I mean, you could just be scrolled down and you're not looking at the top of the function or something like this. Do you see what I'm saying?

Jeroen Engels: Yes.

Andrew Kelley: Whereas with Zig, it's a tradeoff. We don't have some of these powers. Like, you can't implement async/await in userland. It's part of the language syntax.

But if you're in the middle of a function, just looking at a piece of code, you have a lot of guarantees that if you see this variable, and you see the definition of that variable somewhere else in an outer scope, it's the same thing. It's not shadowed or redefined or something like this. I'm not trying to burn Nim, but there's an example language for you.

Jeroen Engels: Why do you care about having so many guarantees?

Andrew Kelley: Just preference. I mean, it's just, my subjective opinion is that I like to make reading code the easiest thing to do with the language. We see this, sometimes people use Advent of Code to learn Zig, and it doesn't really go well for them, because Advent of Code is write-only code. You're never gonna come back and read it again. You don't care about it. It's a small, you know, 20, 30-line program, and it's just, it's write-only.

But, I mean, Zig code is almost read-only, you know. It's, obviously, it's not read-only because you have to edit it, but, you know, Zig code is meant to be maintained. It's meant to be refactored, moved around. It's meant to be a large codebase that you're trying to manage the complexity of, and...

Jeroen Engels: And secure and safe.

Andrew Kelley: All that good stuff.

Jeroen Engels: Refactor it, and it will still work.

Recommended talk: Intro to the Zig Programming Language • Andrew Kelley • GOTO 2022

Linter errors and functional programming

Andrew Kelley: I have to tell you, so, I have... A lot of our "linter errors," I guess you can call them, like unused variables, there are categories of errors. Some require the type checker, but some can operate directly on a file. It doesn't matter what flags you pass, it doesn't matter what target you pass. Like, we know if you have an unused variable just based on the file alone, nothing else.

I have this on save. When I save a file in Zig, it runs the formatter, and it will give me, like, an error list for unused variables, for use of undeclared variables, like, a certain class of errors that are detected on, like, just a file level. Those are all reported, just instantly.

And I love it so much for refactoring, because all I have to do is just grab a block of code. I can just cut and paste code. I don't even read the code. I just cut it, I paste it, I put it somewhere else, or if I want a piece of logic from this function, I just move it.

And then I get errors for... It's almost like I reached into a robot and just, like, grabbed their arm, and then I just put it on another robot. And then I just get an error for every wire I just need to, like, reattach to the, you know, that's exposed, and then it works. Because of all these guarantees. I love that, just that ability to move large pieces of code around.
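
The class of checks Andrew describes, which need nothing but the file itself, can be illustrated with a minimal unused-local check. The sketch below uses Python's `ast` module on Python source as a stand-in, since no type checker, build flags, or target information is involved. It is not how the Zig compiler implements the check, and it ignores many cases a real tool would handle.

```python
from __future__ import annotations

import ast


def unused_locals(source: str) -> list[tuple[str, int]]:
    """Report (name, line) for names assigned in a function but never read."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        stored = {}     # name -> earliest line where it is assigned
        loaded = set()  # names that are read somewhere in the function
        for node in ast.walk(func):
            if not isinstance(node, ast.Name):
                continue
            if isinstance(node.ctx, ast.Store):
                stored[node.id] = min(stored.get(node.id, node.lineno), node.lineno)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
        findings.extend(
            (name, line) for name, line in stored.items() if name not in loaded
        )
    return sorted(findings, key=lambda item: item[1])


EXAMPLE = """\
def area(width, height):
    scale = 2
    result = width * height
    return result
"""

print(unused_locals(EXAMPLE))  # [('scale', 2)]
```

Because the answer depends only on the text of the file, a check like this can run on every save and report instantly, which is what makes the cut-and-paste refactoring style described above workable.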

Jeroen Engels: Do you also call it, "If it compiles, it works?" Because we do that in Elm all the time.

Andrew Kelley: I mean, it's subjective, and I'm biased, but I definitely feel that way.

Jeroen Engels: I'm guessing there's also the bias, like, "Well, I'm a senior engineer. I have experience, so, of course, it's gonna work because I did the right things." Does it also work for a junior? Well, maybe.

Andrew Kelley: Maybe, there are lessons to be learned in order to get to that level, but the power is there.

Jeroen Engels: I absolutely understand the need and the longing for that kind of safety. We have it in Elm as well, and it's just amazing. It's so hard to imagine coding without it, because I know that I'm gonna make a lot of mistakes, and I just want some tool to help me figure out that I messed up and when I'm gonna mess up.

Andrew Kelley: Okay. Here's an interesting topic. So, I'm used to doing imperative programming, where the goal is that I write code, and at the end of the day, it's machine code, you know, and that's the transformation, targeting a virtual machine or actual machine.

You're used to doing functional programming in Elm, and you're used to having certain kinds of guarantees. We both understand the guarantees, but let's try to find a bonus guarantee that you have in Elm, that I don't have in Zig, because of imperative versus functional. But you have to help me because you are the expert on what is available to you.

Jeroen Engels: I have referential transparency.

Andrew Kelley: Okay. Can you explain that?

Jeroen Engels: So, basically, when you do an operation, if you take the same code and you put it somewhere else, it will give you the same results. So, doing the same operation will give you the same results. For the same inputs, you get the same outputs.

That is only true if you don't have any side effects or side causes, like accessing global variables, mutating them, making HTTP calls, and things like that. In Elm, you don't have side effects. Everything is just pure computation, based on the inputs and on constants, so you're always going to get the same result for the same input. And that can make some simplifications a lot easier. For instance, say you write: if F of zero is equal to F of zero, then do something.

Well, in a functional language, or at least a purely functional language, we don't care what F is. We don't care about the implementation. We know that it's one function with the argument zero, and we compare it to the same function with the same argument. So we know those are always gonna be equal. So we can simplify that if expression to...

Andrew Kelley: Oh, I see. At the call site, you can simplify this.

Jeroen Engels: Yes. For instance.

Andrew Kelley: Ah, I see.
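
The simplification Jeroen describes can be sketched in Python. This is only an illustration under stated assumptions: Python does not enforce purity, so the sketch contrasts a hypothetical pure function `f` with a hypothetical impure function `g`. Neither name comes from the conversation.

```python
import itertools


def f(x: int) -> int:
    """Pure (hypothetical): the result depends only on the argument."""
    return x * 2 + 1


# Hidden state that g reads and advances on every call.
_counter = itertools.count()


def g(x: int) -> int:
    """Impure (hypothetical): the result also depends on how often g was called."""
    return x + next(_counter)


# For a pure function the comparison is always True, so a tool could rewrite
# `if f(0) == f(0): body` to just `body` without looking inside f.
pure_always_equal = f(0) == f(0)

# For an impure function the same rewrite would change behavior:
# the first call returns 0 + 0, the second returns 0 + 1.
impure_equal = g(0) == g(0)
```

In a purely functional language every function behaves like `f`, which is why the call-site rewrite needs no knowledge of the implementation.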

Jeroen Engels: Those kinds of simplifications, we can do. We can also move code around without caring about, well, did this function depend on this other function to be called first?

So we can move it very easily. A linter can do that for us. And you can do that with imperative languages by either getting false positives or by doing a lot of static analysis to figure out whether this is okay or not to do.
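
A small Python sketch of why moving code is safe only without hidden dependencies. All function names and the `state` variable are hypothetical, chosen for illustration; the impure half deliberately uses a module-level variable to stand in for the shared state discussed here.

```python
# Pure version: each value depends only on its inputs,
# so the two calls can be swapped freely.
def double(x: int) -> int:
    return x * 2


def increment(x: int) -> int:
    return x + 1


def pure_original() -> tuple[int, int]:
    a = double(5)
    b = increment(5)
    return a, b


def pure_reordered() -> tuple[int, int]:
    b = increment(5)
    a = double(5)
    return a, b


# Impure version: both functions read and write the same global,
# so each call silently depends on what ran before it.
state = 5


def double_state() -> int:
    global state
    state = state * 2
    return state


def increment_state() -> int:
    global state
    state = state + 1
    return state


def impure_original() -> tuple[int, int]:
    global state
    state = 5  # reset so the example is repeatable
    a = double_state()
    b = increment_state()
    return a, b


def impure_reordered() -> tuple[int, int]:
    global state
    state = 5  # reset so the example is repeatable
    b = increment_state()
    a = double_state()
    return a, b
```

The pure pair returns the same tuple in either order, while the impure pair does not. A tool that reorders calls in an imperative language would have to prove the absence of such shared state first.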

Andrew Kelley: But then it would only work if the programmer cooperated and wrote functions that did not have side effects, right?

Jeroen Engels: Or you would have false positives. But you could also have the linter be very smart about it, do a lot of extensive research. Does this function have any side effects? Does it access global variables that the other function also does?

And that is very, very tricky to do, I think. I haven't tried it, but I think, in some cases, you will reach some missing information. For instance, it's using a function from a dependency. If you don't know what the code in the dependency is doing, then you don't know.

Can you call that function twice in a row without having any weird effect? We don't know. Therefore we have some missing information. And when we have missing information, you either have false positives or false negatives.

Andrew Kelley: I think that the bulk of imperative code, I'm trying to think when it might apply to this scenario or not, I think the bulk of imperative code would have a, not necessarily a global variable, but all of these functions, you know, let's say A, B, C, D, E, F, or whatever, they'd be methods.

So they would all take as the first parameter a mutable pointer to some shared state, which effectively acts as a global variable, but it's not global. But, you know, the sequence of function calls is basically these methods operating on an object instance and mutating it. And those are kind of the only mutations. So, if we were able to...

Jeroen Engels: In functional programming? Or in...?

Andrew Kelley: In imperative.

Jeroen Engels: Okay.

Andrew Kelley: I think that if we were able to model these as... If these mutations were able to be modeled, then we could have these kinds of abstractions of, you know... I don't know, what's the purpose? For optimization, or for linter warnings, or something?

Jeroen Engels: The use I'm thinking of is simplifying code...

Andrew Kelley: You gave the example with Map, right?

Jeroen Engels: Yes. If you do List.map on a list, and then you take the result of that and you call List.concat, which is, like, a flat map, concatMap, then you can just use List.concatMap instead, in Elm. And you have multiple of these similar transformations that you can do. But if the order of operations matters, then this can be a potentially breaking change, in the sense that it will break your code.
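
The rewrite Jeroen mentions can be sketched in Python. The helpers `concat` and `concat_map` below are hypothetical stand-ins written for this example to mirror Elm's `List.concat` and `List.concatMap`; they are not library functions, and `neighbors` is an arbitrary pure function chosen for illustration.

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def concat(lists: Iterable[list[B]]) -> list[B]:
    """Flatten one level of nesting (stand-in for Elm's List.concat)."""
    return [item for sub in lists for item in sub]


def concat_map(f: Callable[[A], list[B]], xs: list[A]) -> list[B]:
    """Map and flatten in one pass (stand-in for Elm's List.concatMap)."""
    return [item for x in xs for item in f(x)]


def neighbors(n: int) -> list[int]:
    """Pure helper: the numbers directly below and above n."""
    return [n - 1, n + 1]


numbers = [1, 5, 10]

# Two steps: map, then flatten.
two_steps = concat(map(neighbors, numbers))

# One step: the simplified form a linter could suggest.
one_step = concat_map(neighbors, numbers)
```

Both forms produce the same list here. With a pure function that holds no matter how the two helpers are implemented internally, which is what makes the rewrite safe to automate.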

Andrew Kelley: Right, right.

Jeroen Engels: Like, I know that OCaml is a functional language, but it doesn't have purity. So, for instance, whenever it does List.map, it will always try to keep the order of the individual function calls the same. So, we call List.map with a function F. Well, it calls F on the first element, then F on the second element, and so on, and it has to keep that order. Otherwise, the code might change.

Andrew Kelley: Right, because the code...

Jeroen Engels: The behavior might change.

Andrew Kelley: ...is allowed to rely on that property. Understood.
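
A Python sketch of why call order inside a map matters once functions can have side effects. The two map variants and the `make_labeler` helper are hypothetical, written only for this illustration; they are not meant to describe how any particular language implements its map.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_forward(f: Callable[[A], B], xs: list[A]) -> list[B]:
    """Call f on the first element first."""
    return [f(x) for x in xs]


def map_backward(f: Callable[[A], B], xs: list[A]) -> list[B]:
    """Call f on the last element first, but return results in list order."""
    results = [f(x) for x in reversed(xs)]
    results.reverse()
    return results


def square(x: int) -> int:
    """Pure: call order cannot be observed."""
    return x * x


def make_labeler() -> Callable[[str], str]:
    """Build an impure function that numbers its calls."""
    calls = 0

    def label(x: str) -> str:
        nonlocal calls
        calls += 1
        return f"{calls}:{x}"

    return label


pure_forward = map_forward(square, [1, 2, 3])
pure_backward = map_backward(square, [1, 2, 3])

impure_forward = map_forward(make_labeler(), ["a", "b", "c"])
impure_backward = map_backward(make_labeler(), ["a", "b", "c"])
```

With `square` the two strategies are indistinguishable. With the labeler they produce different lists, so an impure language has to commit to one order, while a pure one is free to pick either.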

Jeroen Engels: And if you have a pure functional language, then you can move it around however you want.

Andrew Kelley: This is called referential transparency?

Jeroen Engels: I think so. I'm not entirely sure, but I think that it's, given the same inputs, you get the same outputs.

Andrew Kelley: Well, I mean, that's the essence of a purely functional language, right?

Jeroen Engels: Yes.

Andrew Kelley: Well, I enjoyed exploring some of these ideas with you. I'm definitely walking away from here, I don't know, rethinking some of my conclusions about the role of linting in the Zig compiler, so I appreciate that.

Jeroen Engels: I really enjoyed this talk as well. And maybe you will make Zig a functional language soon.

Andrew Kelley: Thanks for your time.

Jeroen Engels: Thank you too.


