Anna Rose (00:05): Welcome to Zero Knowledge. I'm your host, Anna Rose. In this podcast, we will be exploring the latest in zero knowledge research and the decentralized web, as well as new paradigms that promise to change the way we interact and transact online. (00:26): This week, I speak with Alex Ozdemir, a PhD student at Stanford researching formal methods, cryptography and distributed systems. In this conversation, he shares an overview of ZK languages, what they do and where they live in the stack. It's a bit of a bird's eye view that might be helpful to devs who want to jump in. We also talk about his recent work called CirC, dealing with practical implementations of generalized computation within ZK systems. But before I start in, I want to remind everyone that there's currently a Gitcoin Grant CLR matching round on. And if you want to support the Zero Knowledge podcast, this would be a great time. Be sure to pop over and donate. All donations are matched, and so your donation can go a lot further if you donate now. I've added the link to our grant in the show notes. Now, I also want to thank this week's sponsor, Mina protocol. (01:17): Mina is the world's lightest blockchain, creating a private gateway between the real world and crypto. The layer 1 protocol replaces the traditional blockchain with a zero knowledge proof, ensuring a super light and constant-sized chain, which allows participants to quickly sync and verify the network. The entire chain is, and always will be, about 22 kilobytes, even as it scales, and SNARK-powered dApps, called Snapps, allow access to verified real-world data from any website for on-chain use. The ecosystem is growing fast ahead of Mina's upcoming mainnet launch, with validators and community members in more than 120 countries. But it's still early and there are still opportunities to get involved at the ground level. To do so, visit minaprotocol.com and find out more. You should also check out Mina's first virtual Illuminate Summit, which is happening on March 28th. Visit illuminate.minaprotocol.com to secure your spot. I've also added the link to this in the show notes. So thank you once again, Mina protocol, for supporting this podcast. Now, here is my conversation with Alex Ozdemir all about ZK languages. (02:28): So today I'm sitting with Alex Ozdemir, a PhD student at Stanford researching formal methods, cryptography and distributed systems. His recent work deals with practical implementations of generalized computation within zero knowledge systems, but that's actually something we're going to come back to later. The majority of this episode is going to be about mapping the different ZK languages out there. So I want to say a big welcome to Alex. Welcome to the show. Alex Ozdemir (02:55): Thank you, Anna. I'm happy to be here. And as you said, I've been spending a lot of time lately thinking about these systems that we build to help map our ideas into zero knowledge proofs and kind of the process of thinking about that. I've been exploring all these different languages that people have cooked up and I'm excited to talk about them. Anna Rose (03:14): Very cool. And I think you're a great person to have on to talk about this because of that experience. I think for the majority of episodes that we've done on languages, we focused in on one particular project's kind of interpretation of that. And so I think throughout this episode, we will be mentioning languages from projects that we have had on in the past or languages that we've just mentioned.
Though the reason I want to do this also on a personal level is, I don't always understand where they fit. I don't understand which languages are actually competing with other languages, where they're compatible, if they're focused on different parts of the ZK stack. And so this is what I'm hoping we can better understand through this interview. Or even, I don't know if I should call it an interview, it might be a little bit more of a collaborative description. Alex Ozdemir (04:07): Something like that. Yeah. I think the instinct here, to want to look at all of them at once, is totally right. Many colleges offer these programming languages classes when you're studying computer science. And the whole point of that class is to look at all these languages at once with the sense that you can only really understand them in comparison to one another. And that's true for languages broadly construed as well. I mean, I guess it's even true if you're studying French and English or something, oftentimes you learn through contrast and the same thing is true for these SNARK languages. Anna Rose (04:39): Cool. Before we kick off this dive into the ZK language map, I want to hear a bit more about you and your background. What you were doing maybe before getting into this. What led you to this? Alex Ozdemir (04:54): I think that actually sort of the beginning of maybe me being here today was in college, when I wasn't even aware that cryptography existed. And the thing that really fascinated me was the systems, the compilers, the runtime systems that people use to bring programming languages to life. And I was hunting for an opportunity to get to learn more about that, and I kept on striking out. Nobody around me was working on that. And so I got kind of diverted onto a theoretical path that ultimately led me to cryptography and then serendipitously, it seems like cryptography is at a point where really there are some substantial language challenges. So I guess in some sense, I ended up where I was going all along. Yeah. Anna Rose (05:35): What led you to the ZK space specifically though? Because the cryptography space is wider and I'm sure cryptography engineering is wider. So why zero knowledge? Alex Ozdemir (05:46): I think the answer here is kind of on the theory side. So I showed up at Stanford as a theory student and I was hanging out with other theory students and there's this really fascinating theoretical result called the PCP theorem. And it's the theorem about proofs. So that's something that people from zero knowledge can appreciate. I mean, zero knowledge says that you can write these proofs that don't reveal secrets, and what the PCP theorem says is that you can write down proofs where the verifier only has to check three bits of the whole proof. So anyways, it was this really interesting theoretical result. And when I first heard it, I thought it was false. It sounded like, no, that's absurd, right? Like checking three bits of the whole proof, there's no way that you're actually going to be... It's going to be a valid proof. Then I started bumping into some of Dan's students and they were telling me that actually, not only is this true, but there are actually practical systems that are built around this kind of theorem. And those systems are of course all the systems that underlie zero knowledge.
And so I kind of came from the hard theory angle, I guess, just being really excited that there was an application of this seemingly impossible and seemingly wild theoretical result. Anna Rose (06:54): So you're also a member of Dan Boneh's applied cryptography group. We've actually had a few other members of that group on the show already. When did you join that group? Alex Ozdemir (07:06): I guess we're coming up on like a year and a half now. I started working with Dan the summer after I began my PhD. I don't think you've had Riad here... Anna Rose (07:14): No, I don't think so. Alex Ozdemir (07:15): But I was working very closely with Riad, and still do. Anna Rose (07:18): Cool. And maybe another guest for future episodes if he's interested. Alex Ozdemir (07:22): Yeah, I would encourage it. Yeah. Anna Rose (07:24): Cool. Okay. So then let's start to scratch the surface on this topic that we want to cover. Maybe as a starting point though, it might be good to actually explain what these languages are even for. Like going from maybe the developer, even of an application, what are these different languages actually supposed to do, the ZK languages? Alex Ozdemir (07:48): Yeah. That's a good question. And I actually like to think of the question even a little bit more broadly than that, which is that let's not start at the developer, let's start at the person who has the cool idea, even if they don't know how to program. Of like, hey, this is something that, it's like a statement. It's like something that you could imagine proving, and there's some notion of secrecy. And I think that the core problem for zero knowledge is that it's super powerful, it gives you these great properties, but it gives you these great properties in a very arithmetic world, a very mathematical world. It doesn't tell you about how to show that you know your password, it tells you about how you can show that you know the solutions to some polynomial equations. Anna Rose (08:27): There's a disconnect there. Alex Ozdemir (08:29): Like if I told my mom that, she would have to be like, what are you talking about, right? And so, I guess the idea is that we need some compromise between what the cryptography gives us here, which is these statements about equations, and between what humans can understand. And I think that this is where languages come in. So the idea is like okay, we're going to come up with a language, and it's going to be better than equations. But it's still going to be formal. And the idea is that you're going to take your thoughts and you're going to formalize them into the language, and that's the work that you're going to do. This is something that a developer is going to do, that's going to be writing some code. And then the language's compiler or the language's implementation, whatever it is, it's going to get you from that language down to a zero knowledge proof. And perhaps it's going to do that by turning the language into a bunch of equations over a finite field or something else. So it's a big pipeline and we're at the top side of it. We're asking how do you take the human ideas and encode them in a formal way so that they can still be translated to field equations. Anna Rose (09:32): I'm actually curious as you say this, what's underneath that part? Is that where you get into the circuit programming languages, or would you also include circuit programming languages in what you just described? Is it more like you're dealing with the actual computer itself, like the hardware?
I'm kind of curious where you stop as you go down that path. Alex Ozdemir (09:54): Where do we say, yeah it's not a language anymore, this is now a different problem? Anna Rose (09:58): Yeah. Alex Ozdemir (09:58): Yeah. I guess as somebody who likes languages, I'm sort of a language maximalist. So I try to find languages anywhere you can imagine them. I would say these circuit programming systems, I would call those languages too. But certainly I think by the time that you have a bunch of field equations and you're trying to prove that you know a solution to them, this is no longer a language problem. This is where the cryptography comes to bear. And actually I work on that sometimes too, and I enjoy working on that, but that's where I would draw the cut-off point. Anna Rose (10:33): Got it. Alex Ozdemir (10:33): And there's lots of great work that goes on under the hood there that we're not going to talk about today. And this is like, I don't know if you've ever had Ale Chiesa on the podcast, but he's probably the world expert in this. Anna Rose (10:46): We did, but we actually had him on for StarkWare not for languages. Alex Ozdemir (10:51): Interesting. Anna Rose (10:52): So, I don't know if we talked about STARKs. Alex Ozdemir (10:55): Yeah, that's a great example. So STARKs are one of those systems for saying, okay, I've got something like a set of equations over a finite field, I want to prove that I have a solution to them. STARKs, they're one of these lower components that are going to allow you to do that. In the same way, Groth16 or Sonic or Plonk, all of these systems that we talk about, those are at the bottom of the stack. They're proving that you know the solution to some set of equations. Anna Rose (11:22): Got it. Now, given the dynamic of this, we're going to be talking more at this top level, from the user developer to the actual... I guess, can we call that circuit language, or how would you group that, Groth, STARKs? What is that called? Alex Ozdemir (11:40): We call them all circuits. Anna Rose (11:42): Circuits. Okay. Alex Ozdemir (11:44): Yeah, but when we say that, we're not imagining the kinds of digital circuits that are in your computer that evolve over time. They are circuits in the sense that they take a bunch of values and combine them and collapse them down into a single answer, yes or no. Anna Rose (11:59): Got it. So we're working between the user and the sort of circuit level, but what was there in the past? What were people using to do that, originally? I don't think that there were bespoke languages right off the bat. So what would somebody have used kind of early on? Alex Ozdemir (12:18): I guess this is the beginning of our language story, sort of what came first, and in the very beginning, people, in some sense, didn't care. They were just trying to show that they knew the solutions to these field equations. But then once they started using it, so for example, the first... The big original push for Zcash, they needed tools that were going to help them construct equations that checked, in this case, the Zcash properties. So the fact that you know the identity of some secret coin. And probably the first notable cut at this is something like gadgetlib, which is folded up inside the libsnark codebase. So one of the earlier, not the first, but one of the earlier implementations, and really that was just a C++ library that allowed you to construct field equations. That's what it was.
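To make "constructing field equations" concrete: the statements these libraries build are just sets of equations over a prime field, where all you can do is add and multiply. The boolean gadget Alex describes next comes down to a few standard encodings (these are the textbook equations, not gadgetlib's actual code):

```
b * (1 - b) = 0       forces b to be 0 or 1: a "boolean" living in the field
n = 1 - b             n is NOT b
c = a * b             c is AND(a, b), for boolean a and b
x = a + b - 2*a*b     x is XOR(a, b)
```

Everything else, integers, hashes, signature checks, gets built up from equations of this kind.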
And so I don't know if you want to call that a language or not, I think maybe it's fair to say at the beginning it wasn't, you were just putting together these equations, but it's inside C++. And so you can start to write C++ structures on top of it. Imagine you write like, oh, I'm going to define a C++ class that represents a boolean represented in the field. So it's like a C++ class that captures a single field element, that's zero or one, and I'm going to define NOT over it, I'm going to define AND over it, all the Boolean operations, and now you start to get something that looks a little bit more like a language. So that's what gadgetlib was: building these gadgets that would allow you to capture kind of C++ concepts inside the circuit. Anna Rose (13:46): You just mentioned libsnark. I think we did talk about that, possibly when we had Howard Wu on, a long time ago. He did an intro to zk-SNARKs, and I think we potentially would have mentioned it back then. But libsnark, the lib, I guess, stands for library? Alex Ozdemir (14:04): Yeah. Anna Rose (14:05): Is that a language? Is that a collection of libraries? I was never entirely clear on how that was used. Alex Ozdemir (14:15): Generally, when I talk about libsnark, I'm talking about the implementation of a proof system. So code that implements that thing that goes from a circuit to a proof. But I think also gadgetlib, this library for constructing circuits, I think that was also in the libsnark codebase. Certainly, those two things were developed by the same people at the same time. Anna Rose (14:36): Okay. So they were solving kind of in the same direction, I guess. Alex Ozdemir (14:42): Exactly, yeah. I think of it as being useful to say that libsnark is the proof system implementation and gadgetlib is the library for creating circuits. Anna Rose (14:50): Okay. So I think that really nicely sets the scene for what we're going to talk about going forward, which is a lot of the new languages and the new implementations that we've seen emerge. I kind of want to hand this to you a little bit. Do you want to take the lead and start to share with us some of the findings of your research, maybe help us navigate and map what's out there? Alex Ozdemir (15:15): Yeah, totally. And I think this is actually kind of an exciting story to tell, because it really is a story. So we started with this thing, like gadgetlib, which was just a C++ library for building circuits and people looked at it and they thought, I can do better. And they thought that 10 times and we have 10 languages that came out of it. Probably more than 10, to be honest, but we'll talk about what we can. So that first thing that people thought when they looked at gadgetlib is they thought, this is great, but I don't like C++. Which I can relate to. I write a lot of C++ in my job and it's maybe not my favorite thing in the world to do. Anna Rose (15:53): Why is that? What's wrong with it? What's annoying about C++ in this context? Alex Ozdemir (15:58): I think that most of the challenges with using C++ are general challenges and they're sort of amplified by our context. So let me be a little bit more specific there. I think C++, it was designed now 30, 40 years ago, and the idea was that it didn't want to get between you and the machine. It wanted to let you tell the machine to do anything that the machine could do. And in order to enable that, it doesn't have very nice abstractions.
It's very easy to make a lot of mistakes in that language. In the years since, people have come up with ways to still write very efficient programming languages that have better abstractions. And so in some sense, I don't want to pick on the language, it was great. And it's incredible that it survived so long and it's still so widely used. But nowadays people just have other things that they like more and frankly are better. They're just newer and are still seeing their adoption get more widespread. This actually is a good segue. So normally, the language that people use as an alternative to C++ is Rust, which is widely used in the blockchain community. And it has the same goal, it doesn't want to get between you and the machine, but it just has a lot nicer abstractions and this is important because in blockchain, security, accuracy, these are things that are so important. So having a language that's going to have clean abstractions and help you avoid mistakes is really useful in trying to make reliable software. So one of the first replacements for gadgetlib was Bellman, which was essentially the same idea. We want to make a good library for helping you construct circuits, but we're going to do it in Rust. And we're going to be more confident that the library is not going to mess up, because we're using a language that has better abstractions. And so this is Zcash Sapling: to my knowledge, Bellman was developed in the course of building Zcash Sapling, and has since been used more broadly. But that was one of the priorities of the team, when they were building Sapling. Anna Rose (17:55): Do you actually know who wrote Bellman? Is it the Zcash team, is it them, or is it a more collaborative effort? Alex Ozdemir (18:04): To my knowledge, it started out with them, and it has become more of a community project as time has gone on. So they still contribute to it, and the folks over at Matter Labs contribute to it quite a bit as well. And there could be others, if I'm missing your contributions, it's ignorance, not malice. My apologies. Anna Rose (18:24): Cool. Alex Ozdemir (18:25): So, yeah, there were these people who thought, okay, gadgetlib would be better if it was in Rust. And also, I want to say at the same time, they did some other stuff on the proof system side, not on the language side. So they also reimplemented some proof systems, very interesting work, very, very great, just not what we're talking about right now. But not everyone loves Rust. So some people... This is a simple idea, I don't like C++, I want gadgetlib in my favorite language. You can repeat that for all kinds of languages. And so another project that I would place in this vein is Snarky. So Snarky had the same idea of, okay, basically we're writing a library to construct circuits. Now we want to do it in OCaml, which is a language that's very common in the programming language community and is often taught in schools. It's a similar project; I think they had slightly more ambitious goals in terms of how OCaml-y they wanted it to be. They wanted to try to get a little bit further from the circuit world, but on the whole the same idea. So I think that there are a few other projects in this vein, but maybe we'll call it there for now and say that that's the end of the era of "I just want a library to help me build circuits".
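To give a flavor of what this library style looks like in practice, here is a small circuit in the spirit of Bellman's API, proving knowledge of two secret bits whose AND equals a public value. The trait and method names follow recent versions of the bellman crate, but treat the details as approximate, a sketch rather than authoritative usage:

```rust
// A sketch in the spirit of Bellman's circuit-building API; names and
// trait bounds are approximate and vary between versions of the crate.
use bellman::{Circuit, ConstraintSystem, SynthesisError};
use ff::PrimeField;

// Prove knowledge of secret bits a and b such that a AND b = c, with c public.
struct AndDemo {
    a: Option<bool>,
    b: Option<bool>,
}

impl<Scalar: PrimeField> Circuit<Scalar> for AndDemo {
    fn synthesize<CS: ConstraintSystem<Scalar>>(self, cs: &mut CS) -> Result<(), SynthesisError> {
        // Turn an optional bit into a field element (0 or 1), or fail if absent.
        let f = |v: Option<bool>| {
            v.map(|bit| if bit { Scalar::ONE } else { Scalar::ZERO })
                .ok_or(SynthesisError::AssignmentMissing)
        };

        // Allocate the two secret wires.
        let a = cs.alloc(|| "a", || f(self.a))?;
        let b = cs.alloc(|| "b", || f(self.b))?;

        // The boolean gadget: enforce x * (1 - x) = 0 for each input.
        cs.enforce(|| "a is 0 or 1", |lc| lc + a, |lc| lc + CS::one() - a, |lc| lc);
        cs.enforce(|| "b is 0 or 1", |lc| lc + b, |lc| lc + CS::one() - b, |lc| lc);

        // Allocate the public output and enforce a * b = c.
        let c = cs.alloc_input(|| "c", || Ok(f(self.a)? * f(self.b)?))?;
        cs.enforce(|| "a AND b = c", |lc| lc + a, |lc| lc + b, |lc| lc + c);
        Ok(())
    }
}
```

The point to notice is that you never write a program that runs: you call library functions that append equations to a constraint system, which is exactly the gadget style described above.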
And then people started to ask, hey, what about languages for building circuits? Is there some good language abstraction here? And one of the first things that comes to mind when you think about languages for circuits, if you have an electrical engineering background, is you think about the languages that people use to build the circuits in your computer. Anna Rose (19:59): What are those, actually? Like are you talking about the XOR and NAND gates? I mean, I don't know, I did that very basic intro to computer science where you build a computer from scratch, but I'm curious at what level is that? Alex Ozdemir (20:16): Exactly that level. At the lowest level, the stuff in your machine is a bunch of NAND gates, typically, some kind of binary gate and some chunks of memory, they call them registers, and clocks. But yeah, I know... So at the bottom level, your computer is like a clock, something that just goes up and down, and registers, things that save values until the next clock cycle, and these gates. And somehow that's got to be organized into a central processing unit and a memory, into the floating point unit, into everything. And there are languages to help people do this. So typically the industry standard here is a language called Verilog or SystemVerilog. And these languages, they're like the Wild West, people don't even really call them programming languages. Sometimes they're called hardware description languages. And the reason that they're called that is they're not really for programming, they're for building these circuits in hardware. And they have a different mental model for how they work. (21:16): But in some sense, it's similar to what we're doing here. So they're building circuits, they're building these circuits that change over time, but some parts of them don't change over time. Some of them just do math and in some sense, that's what we're trying to do. We're trying to build circuits that just do math. We're not doing it on booleans, we're doing it on fields, but like, oh, whatever, similar enough. And so this is something that a guy named Jordi Baylina was thinking about. And so he asked, hey, could we build a Verilog-like language, but for circuits over fields, not for circuits over bits? And the answer is, of course you can. That makes total sense. Like you have gates that add and multiply. And so the language that came out of that line of thinking is called Circom, which I think is supposed to stand for circuit compiler. And it's totally like Verilog or SystemVerilog, except you're handling field elements. Anna Rose (22:06): We've actually had Jordi on the show, and I think we did talk quite a lot about Circom. So I'll add the link to that in the show notes, if people want to hear a bit more about it. Alex Ozdemir (22:14): Yeah. He's a great guy to talk about it with. I guess he understands the language better than anybody else by a good margin. So where are we? Okay. We're thinking about Circom. So Circom is this hardware description language take on what it means to build circuits for proofs. And this was different because previous things had just been sort of libraries for building the circuits. So it's like you went from a library for circuits to a language for circuits. And in some sense, it seems like maybe this is the end of the story. Like, great, we have a language. But it's not. And the reason it's not is because, for those of the listeners who might have a background in electrical engineering, I had said earlier that hardware description languages and programming languages aren't really the same, like they kind of work in different ways.
I think anybody who's worked with both can tell you which one they like more. They like programming languages. So in programming languages, we have these notions of variables that get modified, we're very used to using them as developers. It's a pretty clean mental model. And Verilog, it's crazy. It's really hard to reason about. And so it's not really enough to have a hardware description language for circuits. Really we want to be able to compile more advanced programming languages to circuits. And so even after Circom, people were still thinking, can we do better? Anna Rose (23:36): In this case though, would they then build on top of Circom, or would they rethink that role? Alex Ozdemir (23:42): That's a great question. Yeah. Is the idea here that you're going to take C and compile it to Circom, and take Circom and compile it to a circuit? It's something that you could do. I think most people said, hey, if we're going to do all this work of building a compiler to circuits, that's a pretty challenging task. We're totally happy to use a library for building the circuit. Circom isn't going to help us that much if we're already doing this much work. So I think it's more common for these systems to build on Bellman or on gadgetlib, or on systems that are very like those. Hardware description languages are not the end of the road for languages for SNARKs. What we really want is we want to be able to take a language like Python, or like Rust, some language that we know and love, and we understand the semantics of, and compile that to a circuit. And maybe we can compile C or Rust to a circuit, that would be awesome. Or maybe we can compile something like C or Rust to a circuit, that would also be good. (24:44): At this point, there's a whole host of projects. And I'm going to talk about them in a particular order, but it's not quite necessarily the order that things happened, and a lot of this work was really concurrent. And some of it actually was even concurrent with what we've been talking about already. So probably the first language we'll talk about here is ZoKrates. ZoKrates has this cute name inspired by the Greek philosopher, also inspired by zero knowledge. And the idea for ZoKrates was to be a compiler from a Rust-like language to a circuit. And when you start thinking about building a compiler like this, there's a few things that immediately become challenging. One thing that's challenging is that in a language like Rust, you have the idea of a variable that you can put a value in. You can think of it as a box holding values: you can put a value in, and then later you can take that value out, put another value in, and so on. And in circuits you don't really have this. So that's a challenge that ZoKrates had to bridge. Like, what does it mean, when I say X equals 5, and later on, I say, actually I want X to be Y? How does that get turned into a circuit? But the people who were building ZoKrates figured out how to handle that. So that gave them the notion of a variable. Alex Ozdemir (25:58): They also wanted to capture the notion of computing over things that were not finite fields. So computing over booleans, computing over fixed-width integers. These were things that previously had been tackled in the context of gadgets. Like I have a boolean gadget. And largely ZoKrates used the same techniques, but now to capture a language in which there was a primitive boolean type.
So you interacted with the boolean just like it was a boolean, and you didn't interact with it as if it was some kind of specialized gadget for booleans. So I think that there's a really big usability improvement here. I go from having to use this arcane Rust or C++ library that gives a gadget for booleans to, oh, I can write bool A equals whatever. And I think that even though ZoKrates wasn't doing a lot of technical innovation in terms of the algorithms and how they turn things into circuits, the usability win here is huge. And so I think that they deserve a lot of credit for that. And similarly, other people who are working on these high-level languages are pursuing that usability gain. Anna Rose (27:01): Are there any other differences between the ZoKrates model and what you described as this Verilog-like construction? Alex Ozdemir (27:09): That's a good question. In ZoKrates' case specifically, there is an important technical difference between the Verilog-like model and ZoKrates, which is that in the Verilog model, you're building the circuit. In some sense, if the machine is a circuit, you are on the machine, the language is not getting in your way, any circuit that exists you can describe. But what they did when they started with ZoKrates is they decided that they were going to start to restrict you. And they were going to do that in the service of coming up with a sort of mental model of the language that was easier to understand. When you're making things easier to understand, often you sort of lie, you cut things out. This is good teaching. Anna Rose (27:51): I guess you make things a little bit less flexible? Alex Ozdemir (27:54): Yeah, exactly, exactly. Because you're saying... Anna Rose (27:55): If you're making it, you have to fall into this pattern of thinking. It's almost like, is it less customizable? I don't know if that's the right term, but... Alex Ozdemir (28:05): I think exactly that, it's less customizable. There might be some idea that you have, it's kind of a weird idea, but it performs very well on a circuit, and that's why you care about it, but ZoKrates isn't going to let you say it. And so the critical thing here is this idea of introducing prover-provided values. So the whole point of zero knowledge proofs is that the prover has some secret in its mind, and it's trying to prove to you that it knows that secret. And in some sense, how the prover came up with this secret is not in the circuit. And so in Circom, in the middle of your circuit, you can introduce some new prover-provided value that comes from nowhere, and then you maybe check that it's legitimate. But in ZoKrates that doesn't happen. The prover provides inputs and everything inside the circuit is computed from those inputs. It's a little technical, but I think it's actually important because it means that something like Zcash, you wouldn't be able to implement it nearly as efficiently in ZoKrates, because the Zcash developers really spent a lot of time figuring out how to use all those prover-provided values very strategically. Anna Rose (29:07): Does it make it so that certain kinds of optimizations are harder to do? Or is it itself optimized enough that you don't need that? Alex Ozdemir (29:15): I think it's something like both of those things are true. It does bar certain kinds of optimizations, like a lot of the stuff that the Zcash developers hooked up.
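A standard illustration of why this matters (our example, not from the episode): suppose the circuit needs a field inverse y = 1/x. Computing it in-circuit, say by the exponentiation y = x^(p-2), costs hundreds of multiplication constraints. With a prover-provided value it costs one:

```
compute in-circuit:  y = x^(p-2)                          hundreds of constraints
hint and check:      prover supplies y; check x * y = 1   one constraint
```

Both circuits accept exactly the same statements, but the hinted one is far smaller. A language that can't say "the prover just supplies this value here" forces you onto the expensive path.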
So the question, like, whenever you're optimizing, what's going on is you are writing code in a language, and then below that language there's a compiler that's turning it into some kind of an assembly. In our case, it's turning it into a circuit. And the compiler is trying to optimize, like whatever code you give it, it wants to make the best circuit possible, but then also sometimes you're thinking, hey, I'm smarter than the compiler, I know something special about my problem, and you're also trying to optimize. And so in some sense, what ZoKrates is doing is taking away some of your freedom to optimize by restricting its language. But I also want to say that the ZoKrates compiler itself is a highly optimizing compiler and, subject to its constraints, it does a pretty good job. Maybe we'll talk about this a little bit more at the very end of the podcast. But on the whole, it produces small circuits for what it supports. And it's a big project. I think that they're getting up on 30,000 lines of Rust and a lot of that is optimizing machinery. So they've really done a good job there. Anna Rose (30:24): So you mentioned before that Circom is a hardware description language, but ZoKrates, I'm guessing, isn't. What kind of language and category then would it fall under? Alex Ozdemir (30:42): Yeah. The way I'd explain it to somebody who's new to computer science is, I would say, it's a programming language. But more formally, it's a language in what we call the RAM-register model. And that means it's a language that you could imagine executing on your computer. And so this means that it executes in a way that refers to things like registers or variables, and also has some access to memory. In the case of ZoKrates, that's arrays, which can be accessed at data-dependent locations. And then in addition to being able to make variables and access memory, it can perform computations, just like a circuit can. So it's like a Circuit++. The RAM register machine has circuits in it, but also has RAM and also has registers. And so ZoKrates is a move from the circuit model, which is what hardware description languages capture, to the RAM register model. But maybe now is a good time to say it's not a total transition. So it starts to capture some things. It captures the idea of memory in the form of arrays, it captures the idea of registers in the form of variables. But there are some things it doesn't capture. (31:47): And so one of those things that it doesn't capture is this idea of branching, of an "if" statement, where I look at some value and depending on what I find, I either execute code A, or I execute code B. So inside the RAM register model, this is a conditional jump in assembly. And in some sense, ZoKrates doesn't have that fundamental thing. It does have conditionals, like if something is true, then A, otherwise B, but it always evaluates both A and B. And even when you just say that, if you have some intuition for programming, you'd think how slow would my program be if I always explored all possibilities. You can tell intuitively that there are some costs to not having that. And ZoKrates does not help you with that. But there are other people who did build high level languages, RAM register languages, that wanted to capture that as well. And actually you can almost tell from how technical this description is, this starts to sound like an academic problem. And the first projects that tried to capture this kind of control flow were academic projects.
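For a sense of what this lowering looks like, here is the standard textbook sketch of the two tricks just discussed (a generic illustration, not ZoKrates' literal output). Assignments get fresh names, and an if-else on a secret bit c evaluates both sides and selects arithmetically:

```
Source program:              What the compiler emits (c constrained to 0 or 1):
x = 5                        x0 = 5
if c { x = x + 1 }           x1 = x0 + 1                   then-branch, always evaluated
else { x = 2 * x }           x2 = 2 * x0                   else-branch, always evaluated
                             x3 = c * x1 + (1 - c) * x2    a multiplexer picks the result
```

The constraint system has no control flow, so selection has to happen arithmetically, and that is exactly why both branches always pay their cost.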
So I think that the original one is this compiler called Pequin, which is kind of an interesting name. It turns out, it's the name of some kind of spicy pepper. Anna Rose (33:09): Okay. Random. Alex Ozdemir (33:12): Yeah. Well, but it gets better because the whole project, they had other compilers and other languages too, and they're named after other peppers. Anna Rose (33:20): Oh, okay. There was this set. Alex Ozdemir (33:24): Yeah. But Pequin is the most modern one. And so the idea in Pequin was they wanted to compile from C to a circuit. And in doing that, they wanted to be able to capture things like branching and also arrays, as ZoKrates did. It's worth noting now that we're really violating the timeline, because Pequin actually came several years before ZoKrates. But it was a research project, I would not recommend using it as an actual compiler. But they had this more ambitious goal of trying to capture branching. Anna Rose (34:02): Now what about something like Zinc, the Matter Labs language, which, I don't know how finished it is, but I know that it exists. I believe that... I think Alex came on the show. Oh, by the way, just in general to our audience, I'm going to put the links to anyone that we've mentioned. I realized that if I say it every time, it's just going to be me saying it over and over again, but check the show notes, 'cause I'll put those episodes there. But I think we had Alex talking about Zinc. So where would you place that? Does that fit with Circom? Does it fit somewhere further up the stack? Alex Ozdemir (34:35): Yeah, Zinc, it's quite a bit further up the stack. It's definitely in the realm of ZoKrates and Pequin. Right now it seems like the project is a bit in flux. They don't currently support producing proofs, and I've heard that they're trying to move to a new kind of proof system called Plonk. So the exact details of what they'll support, what they won't, will they support arrays, data-dependent locations, will they support branching? We'll have to see. But they're definitely going to be in this space of Pequin and of ZoKrates. And they're not the only one in this space, either. So another prominent project in this space is xJsnark, which is also an academic project, although it's a little bit more polished. They're trying to handle branching, just like Pequin is, and just like Zinc might be. But they sank a lot of time into trying to take specific cryptographic functionality, like Pedersen commitments or like hashes, and put them into the language as primitives that have very efficient implementations. And this is something that you can do with any language, but it's a good idea. And I bet we'll see more of that in more languages, as time goes on. Anna Rose (35:47): When you say that though, is that something that other languages can then borrow? Would they be able to use that, or because they've built themselves in a certain way, are they unable to make use of that kind of efficient implementation? Alex Ozdemir (36:01): That's a good question. I think that almost any language could play that trick. It's pretty easy for any language to say, okay, I want to add a new built-in function or some kind of new built-in abstraction. And even the techniques that you use to implement that abstraction efficiently, like the ones xJsnark uses, a lot of those are older techniques that were figured out when Zcash was being built.
So it's not even like the techniques are super arcane, but the important thing to do here is to just integrate them into a language so that people can write most of their code in the language, and then just have sort of the cryptographic primitives defer to this optimized implementation. I think also one interesting shout-out for xJsnark, something that's unique about it, is that it has an implementation of large integers. So this is a cool thing that comes up when you're interacting with SNARKs. They are equations over a finite field, which means that you can do a lot of arithmetic very easily, but once your numbers get too big to fit in the field, you can't do that anymore. And you have to really break out a lot of challenging techniques in order to handle it. (37:09): And currently xJsnark is the best implementation of 2048-bit integers that you can actually use as part of a language. And I think that this is something that other people should look at doing, because if you want to capture certain kinds of cryptography, like RSA-based systems, you need this big integer arithmetic. And it's really not something you want to make users do inside the language, because it's really tricky. And because it has a clean abstraction, you can bring it into the language and still have this nice abstraction. Okay. So very recently there's been some work on a language called Leo. This is developed by Aleo. And, at a higher level, this is for targeting this ZEXE-based system, but from the perspective of a language, this is targeting circuits, specifically rank-1 constraints, just like everything that we've already talked about. And, in some sense, I think it represents... It's still in progress, but it represents the best of many of these. So it implements data-dependent control flow, it implements some notion of arrays. I don't think it's quite as feature-complete as some of the academic projects in this space, but it's a lot more polished than them. And I would definitely consider it a part of the new generation of languages. Anna Rose (38:19): Leo, is that one of these RAM register languages, or is that more of the HDL model? Alex Ozdemir (38:26): It's totally RAM register. And I would say actually its strongest selling point is that it implements this idea of data-dependent control flow and branching. And it does it with a higher level of polish than the research projects, which were the only previous things to do this. It's not complete, so Leo still has a few things to fill in, for sure. Noir is another upcoming project. I don't know that much about it, but I'm excited. It's being developed by the folks at Aztec. And we'll see. Maybe... Hopefully, it'll have the same kind of quality that Leo has, or that perhaps new iterations of Zinc will have. Anna Rose (39:01): You've talked through a lot of the languages that we have mentioned on the show. But there's a few things that haven't been talked about yet. One is STARKs. So early on you mentioned that STARKs are working at a different level. But there is Cairo, which is the language that StarkWare developed in order to interact with STARKs. Where does that fit into this entire map? Alex Ozdemir (39:25): So I think before... So far we've been painting the stack as just kind of like two layers. You got the language, you got the proof system. But the reality is that not all the proof systems have the same interface.
So many of the proof systems that were developed early on have this interface of rank-1 constraints, which is a particular way of writing finite field equations and has particular costs associated with it. And STARKs are proof systems that are built for a slightly different way of writing finite field equations. And often the people who developed STARKs and talk about them describe their model as sort of RAM register-like, in that you imagine applying the same equations again and again in sequence. So rather than having a big set of equations, you have a small set that gets repeated. And so this gives them a slightly different interface, and it also gives them a need for sort of new and different languages. And so that's what Cairo is. So Cairo is a language that is targeting the STARK model. And if you look at it, it starts to look like assembly languages that you might've seen in the past. So there's the notion of memory that's pretty cheap to access, there's notions of jumps and conditional jumps that are baked in, it's not a huge compilation problem to support them. I think Cairo is almost like the gadgetlib or the Bellman of the STARK world. Anna Rose (40:54): Wow. Like the starting point in a way. Alex Ozdemir (40:56): Exactly. And the people who advocate for STARKs think that they'll have an easier time getting higher level faster, because of this different model. Maybe it's true, maybe it's not. I think time will have to tell. But it is exciting, because it seems like we're seeing the beginning of the story for a new lineage of proof systems. Anna Rose (41:19): Cool. Some of these languages, as far as I understand, have different models for storing memory. And I guess this relates back to that RAM register versus HDL circuit distinction, it has something to do with that, but we didn't really talk about it in the context of memory, or at least I didn't understand it in that context. Is there some difference? Is there some way to help us understand that? Alex Ozdemir (41:43): Yeah, that's a great question. It seems like these languages sometimes adopt a different position with respect to memory. What is that difference? And actually it relates to what we're going to talk about next, which is the proof systems that underlie these languages as well. So the first difference is that some languages, almost all the ones that we've discussed, are targeting the single finite field equation model, the R1CS model, and that model has no memory. So for them memory is a compilation problem to be solved. And the languages, some of them solve it well, some of them solve it poorly. Or I shouldn't say poorly, some of them solve it and some of them don't. Some of them give you memory and compile it away, and some of them don't. Now, if you move to the STARK model, to what they call AIR, that's their circuit format, it has a notion of memory built in. And so for them, there's still some compilation questions related to memory, like how exactly do you represent that to the programmer, but it's not a compilation problem to be solved in the same kind of way. It's fundamentally much easier to support, because the proof system is helping them.
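To pin down the contrast (our rough summary, with a textbook example): an R1CS is one flat list of equations over a single vector of wires, while an AIR is a small set of equations applied at every step of an execution trace.

```
R1CS:  a fixed list of constraints over one wire vector z,
       each of the form   <a, z> * <b, z> = <c, z>

AIR:   a few transition constraints repeated at every step of a trace,
       e.g. a Fibonacci trace with rows (s_i, t_i) satisfying
            s_{i+1} = t_i   and   t_{i+1} = s_i + t_i   for all i
```

That repeated-step structure is what makes machine-like features, memory and jumps, cheaper to express on the STARK side.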
Anna Rose (42:50): Okay. Another question I have is, last week's episode was all about Arkworks. We had Pratyush on, we talked about that stack. And we didn't mention it in the list of languages just now, though we did talk about it on that show. I also tried to ask him to place it in a map, but I am curious, where do we place it in this context? Where is Arkworks? Where is that work? Alex Ozdemir (43:16): Arkworks here, from our perspective, is a proof system. So it's what you do after you have a circuit. And it's actually telling that inside Arkworks you have its own mini stack. So even inside proof systems, stacks within stacks within stacks. And so Arkworks, in some sense, it's like a competitor to libsnark and to the Bellman implementation of the Groth16 proof system and to Spartan and to all the new proof systems that people are coming out with. In some sense, though, its aim is to be a unified implementation of all of these different ideas. So it's trying to be like an umbrella, perhaps I would say. Anna Rose (43:56): And that's what you mean by a stack within a stack? Where it might have libsnark within itself or Bellman within itself. Alex Ozdemir (44:04): Exactly, yeah. Anna Rose (44:05): Or what's another thing that could be in it? Would Marlin fall in that? Would Fractal, Plonk, RedShift? Those are, as far as I know, those are all proof system names, I guess. Alex Ozdemir (44:20): Yeah, all of those. In some sense, to be clear, they haven't implemented all of these, but their observation is that a lot of these proof systems themselves, like the distinct proof systems, share common machinery. And it seems, from the outside perspective, that they're trying to collect all that machinery in one place. And so I know they've implemented a proof system that they called GM17, and that they've implemented Groth16. I don't think they've implemented Fractal yet, and I don't think they have implemented Plonk either. But they are doing a great job of trying to unify as much infrastructure as possible in the proof system part of the zero knowledge stack. Anna Rose (44:58): I have one more question on another project that we've heard a fair bit about, which is Halo. And I'm wondering, does Halo also fall into R1CS world, or is it its own kind of thing on the side? Because I know that it does some pretty innovative things. Alex Ozdemir (45:16): This is a little outside of my area of expertise, but to my knowledge, Halo is one example of trying to build recursion into SNARKs, and this changes how you can use them from the outside, because you can write one proof and then write another proof based on that and roll up their verification. And, in some sense, it's like a verification win in an asymptotic sense, because you accumulate all the proofs. But programming for it, to my knowledge, is a similar challenge. In each Halo proof you prove some single step and that step needs to be expressed as a circuit. Anna Rose (45:54): Okay. Maybe we'll leave this aside or throw it out to the folks building Halo to help us place this after the fact. But so far, I guess we've mostly been talking about, except for maybe Cairo, we've been talking about these R1CS proof systems. We're going to put Arkworks in there, even though within Arkworks you can have some of the other ones. But what are other proof systems that these languages may eventually be interacting with? Alex Ozdemir (46:19): Oh, yeah, that's a great question. So I mean, I guess, pick your favorite R1CS proof system, like Sonic, Supersonic, whatever. All of those are immediately in scope. I think one interesting thing is that you could imagine taking these languages and getting them to produce circuits for slightly different proof systems as well.
And so there's a lot of work right now trying to take full advantage of Plonk. In some sense, you can use Plonk just like R1CS, but that's not the best way that it can be used. It supports other things as well. And I think that that's something that we're going to see some development on over the course of the next few years. Another thing that you might imagine doing is trying to heal this divide between the stack for R1CS and R1CS-like systems, and the stacks which are more like this RAM register model. I bet that you could get a lot of mileage out of trying to build unified compilers or compiler infrastructure that captures both of these models. And there's some challenges there, but I would also expect us to see some development on that front, over the course of perhaps the slightly longer timeframe, maybe three to five years. Anna Rose (47:27): Now, a topic that has come up a number of times in the show, but also we did a study club series on this: polynomial commitments. Where does that concept fit into this language stack? Is it just being used throughout it? Is it something separate from it? Where do polynomial commitments fit in and what are they? Alex Ozdemir (47:49): So maybe just starting with what they are, polynomial commitments are their own cryptographic primitive, kind of like SNARKs are a cryptographic primitive. What they do is they allow someone to commit to a polynomial and then later on reveal its evaluation at various locations. And as for where they fit into this stack, you can tell from their description, which is very mathematical: they belong in the proof system part of the stack. So they are a primitive that is used inside of many proof systems. So for example, if you open up Arkworks, you'll find an interface for polynomial commitments and you'll even find multiple implementations of polynomial commitments. That's one example of the common infrastructure that that project is trying to centralize. So there's a number of proof systems, Plonk, Spartan, I guess, I think Sonic as well, that are all sort of almost entirely generic over their polynomial commitment, and you can get different performance or different properties from them by substituting one polynomial commitment with another.
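Roughly, the interface Alex describes can be written down like this (our shorthand; the details of any particular scheme are omitted):

```
Commit(p)              -> c            a short commitment to the polynomial p
Open(p, x)             -> (y, proof)   the evaluation y = p(x), plus a proof
Verify(c, x, y, proof) -> yes/no       checks, against c alone, that p(x) = y
```

A proof system written generically over this interface can swap one scheme for another and trade, say, proof size against the need for a trusted setup.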
Anna Rose (48:49): So like you said, this is the very kind of mathy side of it, but I'm wondering, these proof systems, are they connecting the mathiness to the computer hardware or are they connecting the mathiness to the higher level languages? Alex Ozdemir (49:04): So in some sense, their entry point is a set of equations over a field and the statement that I know a solution to them. And they're trying to connect that kind of statement with the hardware on your computer, so that your computer can check that kind of statement for you. And also in doing that, they're using a lot of very deep cryptography. So in some sense, they're connecting that statement to your hardware, and they're also connecting it to complexity theory, because ultimately your hardware is going to run some verifier and that verifier is sound because of what we believe about the hardness of certain problems. I think it's exciting in that way. In some sense, the proof systems, they touch the highest levels of abstraction, and they also touch the lowest levels of abstraction, at the same time. Anna Rose (49:50): That's wild. And it's so cool that this is all within this kind of specific cryptographic field. I wonder, would you find the equivalent kind of top-to-bottom stacks for other types of cryptography? Always? Alex Ozdemir (50:08): That's a great question. I think the answer is increasingly yes. And I believe the reason that the answer is yes is that we're building more and more cryptography that operates over some kind of computation. So we've been talking about zero knowledge proofs. Here the computation is checking that some statement is true. But multi-party computation is something that even zero knowledge people interact with in the context of these ceremonies. That's also... A multi-party computation is a piece of cryptography that's operating over a computation. And they have really a lot of the same problems that we have here. Similarly, if you look at witness encryption, this is a new branch of cryptography that's trying to build encryption systems around computation. And they're very theoretical right now, but I think if that work ever becomes practical, there are going to be similar kinds of concerns there. Anna Rose (51:03): So I feel like we've done a pretty good job of mapping out a lot of these languages and concepts in some framework. And as a disclaimer, in case your language was not mentioned, or your proving system wasn't... apologies, we did our best and we'll always be adding to this list. So do get in touch, so that we remember it for the next time we do a mapping like this. But I do have a question for people who potentially want to get into working with this. And that is, how do you actually evaluate all of these languages? How do you choose what actually works for your particular application or your particular idea? Alex Ozdemir (51:45): This is a great question. And I think that there are two layers of questions you have to ask yourself. The first question is, where am I on the generality-performance trade-off? Am I Zcash? Am I going to write a single circuit that everyone is going to use and really needs to do well? Or am I ZEXE, where I imagine lots of people are going to write their own circuits? And so the first question is, which one of those am I? And if I'm Zcash, I want a low-level language that's going to allow me to say exactly what I want and really optimize. And if I'm ZEXE or I'm building on ZEXE, then I want a high-level language, that's going to enable people who are non-experts to write circuits that capture what they want. I think most people are like ZEXE. I think that most people are not like Zcash. So then within the ZEXE world, within the model of I'm trying to enable lots of people to write circuits, there are a few things that you should ask yourself about a language. (52:47): I think generally you want languages that are in the RAM register model, not in the hardware description language model or the circuit model, because more people understand the RAM register model. And then within the RAM register model, there's a number of test cases that you can use for seeing how well a language is capturing that model. And these test cases, they seem kind of arcane, but they actually end up being very important. They are: can you mutate variables? Can you access arrays at data-dependent locations? And can you do data-dependent conditionals, where only one side executes? And those three things, variables, conditionals, and arrays, basically capture the hardest part of trying to move from the RAM register model to the circuit model. And I think those are the strongest differentiators among languages.
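Of those three, data-dependent array access is maybe the least obvious, so here is the standard encoding (a generic textbook sketch, not any particular compiler's output). A circuit can't jump to slot i, so reading a[i] with a secret index i compiles into a selection over every slot:

```
v = e_0 * a[0] + e_1 * a[1] + ... + e_(n-1) * a[n-1]

where the selector values e_j are constrained by
    e_j * (i - j) = 0   for every j     e_j can be nonzero only where j = i
    e_0 + e_1 + ... + e_(n-1) = 1       exactly one slot is selected
```

Each secret-indexed read costs work linear in the array size, which is why whether, and how efficiently, a language supports this is such a strong differentiator.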
So ideally you want to find the language that's going to give you all those things. Anna Rose (53:45): And this is maybe a way for people to evaluate, because we just mapped the languages that we know of today. There's potentially going to be a lot of new ones coming online. And so this is maybe a bit of a framework to help them navigate that. With the languages that we mentioned, this is a side note and actually something that I should throw over to the Circom team, but I wonder, given that they are more of the circuit model, is it possible for them to also create something that would interact with the RAM register model? Could they create something on top of themselves? Alex Ozdemir (54:16): I think they could for sure. From my perspective as a language designer, I don't necessarily see the case for going through Circom instead of just going directly to a circuit from this higher level language. But I could be wrong about that. And regardless, the team has a lot of knowledge about implementing languages, and I'm sure that if they did a RAM register language, it would be a good one. Anna Rose (54:39): Cool. Maybe that's a little bit of a wishlist, Jordi, if you're listening. Cool. I'm trying to think, are there any other differentiators that we should touch on for people to think about when they're analyzing these languages for their own projects? Alex Ozdemir (54:56): I think there are some other things too. And these are things you would ask whether you are programming SNARKs or whether you're programming your computer, which is how good is the compiler, like do I have good error messages? Do I have good tooling? In the SNARK world that looks like, can I produce Ethereum smart contracts that valid... Like, is it easy for me to go from a program in this language to an Ethereum smart contract that validates proofs for the circuit? So I think these questions around tooling are also really important and they're the reason that things like Pequin or xJsnark, the academic research projects, it's not like they're bad projects, but they fall a little bit short on that frontier, yeah. Anna Rose (55:48): Got it. Is that sort of like you need to have an ecosystem? And does it have something to do with the number of contributors, not only the number of tools? Because obviously I'm assuming the more contributors, the more just activity generally. Alex Ozdemir (55:49): I think so. Yeah. And this is all about, can you get people to invest and to build those tools? And those problems are really challenging, actually. I think that on the global scale, these questions around ecosystem are actually what determine the rise and fall of languages. Really, you want the compiler to be kind of a detail that you don't have to worry about too much. So actually maybe this takes me into something that I wanted to bring up. Which is, we mentioned already that there are these hard compilation problems that these languages have to solve, and... data-dependent array accesses, conditionals, variables. And it's kind of a shame that in the current state of affairs, languages all try to tackle these on their own. And ultimately they choose to tackle some of them and not to tackle others, and you end up with a patchwork landscape where you have to ask that question, like, does this language support branching? Because it might just not, because they have finite time and it's a hard thing to do.
And so one thing that I've been doing recently with some friends of mine is making the case for an LLVM-esque general infrastructure for supporting these languages, that can handle these hard compilation cases automatically. So we call this project CirC, and the observation is that variables are hard, branching is hard, memory is hard, there are other hard things, but these things are hard in a way that doesn't depend on the syntax of your language, it doesn't depend on exactly what types you support. It doesn't depend on pretty much anything else. And so it's possible to implement that once. In the same way that Arkworks tries to centralize infrastructure for proof systems, it's possible to centralize infrastructure for compilation as well. And that's the goal of this project. Anna Rose (57:35): We actually got a chance to do an entire ZK Study Club on CirC, which I'll add a link to, for people to explore. But given that you're here on the show, is there anything more that you want to share with us about that? Maybe like the stage of the project, what you hope it becomes? Alex Ozdemir (57:53): Yeah, I think so. The state of CirC right now is that we sort of did a prototype of it and we are in the process of porting the project to Rust and sort of cleaning it up as we do that. Now we have a much better understanding of what it's supposed to do. And that's actually coming along really well. So our first checkpoint is going to be supporting ZoKrates through our compilation infrastructure. And we're getting pretty close to that. I think that what that project could mean to the community is it could be a tool for all these developers that are creating new languages and new compilers. So if you're listening to this and you've been tasked by your firm, or just as a hobby project, you're trying to construct some kind of language, totally come talk to us. It seems like we have something that can make that job a lot easier. And also we're in the process, like I said, of porting it, which means that there's a chance to chip in here and be one of the founding people to get this infrastructure into a super productive and effective state. Anna Rose (58:54): So I want to say a big thank you, Alex, for coming on the show and helping us navigate all of these amazing, exciting new languages, old languages, how they interact with one another. I honestly feel like what we talked about today, it's something that I've been unofficially trying to do myself for a while, like to understand and place where these things live. I think this has been a really great run-through to help better formalize that in my mind. So yeah, big thank you. Alex Ozdemir (59:27): Thank you for having me. I think I also really appreciate just how rich this landscape is, and it's a lot of fun to try to map it out. Anna Rose (59:35): I hope you get to come back sometime and share maybe some updates as a lot of these things come online. Alex Ozdemir (59:40): Yeah. That'd be a lot of fun. Anna Rose (59:41): Cool. So to our listeners, thanks for listening.