Anna Rose:
Welcome to Zero Knowledge. I'm your host, Anna Rose. In this podcast, we will be exploring the latest in Zero Knowledge research and the decentralized web, as well as new paradigms that promise to change the way we interact and transact online.
Anna Rose:
This week, I chat with Alessandro Chiesa, a professor working on cryptography, complexity and security. He's also a co-founder at Zcash and StarkWare. Ale has taught many of our previous guests. And in this episode, we explore some of the work that has come out of his lab. We also touch on the challenge of creating educational resources around SNARKs, especially since it is moving and changing so quickly.
Anna Rose:
We also look at which parts of SNARK understanding are somewhat stable and which parts are still very much open problems to be studied further. This also happens to be the Zero Knowledge Podcast's 200th episode. And to be honest, it kind of snuck up on me. Still I'm so excited that for this episode, we have a deep ZK topic and an expert guest. Seems very fitting. I hope you enjoy. But before we start in, I also want to let you know about ZK Hack. ZK Hack is a multi-round online event with workshops and puzzle solving competitions. It's put together by this podcast and the ZK Validator. It's supported by a slew of fantastic sponsors. Many of them ZK projects from the space. It will kick off a weekly cadence starting October 26th, 2021. Think hackathon, meets CTF, meets Dark Forest round-based competition. There will be a leaderboard and there will be prizes, as well as deep dive learning sessions with the best teams in the space. Do head over to the website now to join. If you happen to be in Lisbon at the end of this month for the Lisbon Blockchain Week over there, be sure to check out the ZK Hack party. Together with Aleo and Anoma, we're throwing this in-person event, a little bit it's like a pre-party to the ZK Hack event itself. So yeah, do check it out. We'll be sharing links on our Twitter. Telegram will probably put some links in here once we have them as well. Hope to see you at the ZK Hack and possibly at the ZK Hack party.
Anna Rose:
I would also like to thank this week's sponsor, Centrifuge. Centrifuge puts real world assets on the blockchain, allowing issuers to get liquidity on their assets and investors to make safe stable yields in the volatile crypto world. It's built on substrate, but they also bridge between Ethereum and Polkadot. Centrifuge are currently hiring for a number of positions, including protocol engineer, product manager, full stack engineer, security and DevOps engineer. Do check out the link in the show notes to find out about all of their jobs. And I want to say big, thank you again, Centrifuge, for sponsoring the show. Now here is the 200th episode of the Zero Knowledge Podcast with our guest, Alessandro talking all about SNARK research and education.
Anna Rose:
Today I'm here with Alessandro Chiesa, a professor working on cryptography, complexity and security. He's also the co-founder of Zcash and StarkWare, and actually, Ale, you've been on the show before, so welcome back to the Zero Knowledge Podcast.
Alessandro Chiesa:
Thank you for an introduction. It's wonderful to be back.
Anna Rose:
That last time that we had you on the show, you were mainly here speaking with Eli about StarkWare and the research that you had done leading up to the project. What we didn't really get to cover in that episode was your other academic work. You've been running a lab, the Laboratory for Computational Security. It's produced a lot of research. A lot of your students have come on this show to describe some of the work that they're doing, and actually, any subscriber to the zkMesh Newsletter may be familiar with your name because you've authored or co-authored a lot of the research that ends up in that newsletter. Quite excited to dig into all of this with you. But to start off, I'd love to learn more about the lab and explore what you and your students have been developing and exploring in the last little while.
Alessandro Chiesa:
Sure. I started working as a professor six years ago. And so it's now a good chunk of time and over this period, I've been working on ramping up and running a research program that touches on many aspects of cryptographic proofs, primarily SNARKs. What do I mean by many aspects is that the research frontiers in this topic concern basic mathematical questions in low degree testing, linearity testing. These are a bread and butter of where the succinctness of SNARKs come from. To other topics in cryptography. How would you take probabilistic proofs? Like PCPs, interactive proofs and interactive Oracle proofs. Then you convert these to arguments systems. There's the theory and foundations of these transformations, and then there's the applied aspect that moves further from cryptography into applied cryptography and system security, which is studying the practical aspects of these constructions and how to use them inside protocols to achieve interesting goals like private peer to peer payments like in Zcash. As part of running this group, I've been trying to get a strong students interested with a variety of backgrounds, maybe more on the theory side or more on the applied side and to learn as much as possible the ins and outs of probabilistic proofs, SNARKs, so we can push the frontiers across this whole spectrum in a way that takes advantage of the fact that not only we understand the theory but we also understand the practice, and so we can in a coherent way push the boundary of SNARKs.
Anna Rose:
Cool. How would you break it down in the lab? How much is theory, how much has actually applied? Would you do 50/50 or do you think it's more focused on the theory side?
Alessandro Chiesa:
It's a great question. What I've found is that it takes several theory papers in order to get a smaller number of good applied papers. And out of those, there's a smaller number of useful protocols in the real world. The way I've been operating that I find is useful is more biased towards the theory because there that's where we want to push frontiers of difficult questions and we don't want to be restrained by whether it's immediately practical or not. That might be a concern that will come later. It may never be, but at least we're making an attempt to push the front here.
Alessandro Chiesa:
Then usually, solving a hard question, in the theory world will inevitably force you to come up with new tools. Then what has happened time and again, is that some of these tools, maybe not necessarily as invented for theory, but morally, they're those same tools and they're being useful in applied contexts. A great example here is, actually my first paper I've wrote in 2009 was about recursive proof composition of SNARKs actually.
Anna Rose:
Wow.
Alessandro Chiesa:
And at a time it looked very much only theoretical. And in that paper, we formulated this notion of proof-carrying data, which was capturing what can be done with SNARKs in a recursive way. It took maybe till the end of my PhD, to have a first implementation that remained practical, so a running implementation of a recursing SNARKs, and it took another few years for these to become practical enough to be useful in the real world. But in the meantime, there was a lot of theory that was done to enable a little bit of applied cryptography to enable a few systems in the real world to take advantage of the recursive SNARKs. There's several examples of these that I can see over the past to now. I've been working in this rocket for 12 years. Six years as a before a professor and six years as a professor. There's many examples of these where I see that's it a lot of effort on theory. If you tackle the hard questions, then you will be able to later to pick up new techniques and then leverage them for advances in applied cryptography.
Anna Rose:
Cool. Nowadays, recursive SNARKs, you can actually see them very much in the wild too. I believe Filecoin is using recursive SNARKs, Mina is built around this concept of recursive SNARKs. Are those the same kinds of versions of what you had been working on or did you see people working on this in parallel? Or is this really like your work would have been the foundation for some of this stuff?
Alessandro Chiesa:
Well, the basic definitions such as what is proof-carrying data mostly remained the same. Even articulating what is the goal of, what you get once you recursive SNARK, it's actually not so simple. Once you see it, then that make sense. That's proof carrying data. The techniques to achieve proof-carrying data have evolved,-as it should be the case. It's not just in practice. Even let's say from a theoretical cryptography standpoint, what primitives sufficed imply proof-carrying data has become weaker and weaker? For example, maybe back in the day, in order to build proof carrying data, you would need what we would call a fully succinct SNARK. A SNARK that to when proving the correct computation of a machine running time t, the verifier will run in time polylog t, with no pre-processing or nothing, just like that. That was the first installation of proof-carrying data. But with time, we've been able to relax some of these requirements by having a more sophisticated and careful construction. For example, the first relaxation was from polylogarithmic verifier to merely sub linear, and that was not clear prior they would be enough. Another relaxation was saying, well, maybe we're not going to have a fully succinct SNARK. We're going to rely on pre-processing of circuits. That's something that wasn't considered in the beginning, but later, because in practice, the first SNARKs to be practical were the ones with pre-processing, we had to understand how to the recursion with reprocessing, so there was something that was understood first in theory, and then in practice.
Alessandro Chiesa:
Perhaps most recently, there was this line of work starting with Halo and then accumulation schemes, where even the fact that the verifier needs to be sub linear was removed and now we can do recursion, even with a linear time verifier, but there is no compression. But somehow there is so-called accumulation that you're never really compressing anything. You're just batching multiple statement into one new statement and then in the end you will have some verification check that is not smaller than any given node, as the size of each of the nodes you had along the way. But that's fine because you have it at the end and at least you're not paying for the length of the chain, for instance.
Anna Rose:
That Halo case, this is actually something I wanted to talk to you about, this interaction between industry and academia. So Halo came out of the Zcash engineering group. I think they were trying to solve a very problem right in front of them. They used engineering techniques. And then I know that there was a paper published that you're a co-author on this Proof Carrying Data without Succinct Arguments. I know that this came out, I think we published in the zkMesh back in February and I know that when that came out, it was the formalization of what Halo had done. This was this interesting moment where you saw a company or an industry propose something and then academia formalize it. Do you see that happening a lot or do you feel like most of the time it's the other way?
Alessandro Chiesa:
It's a ping-pong.
Anna Rose:
Okay.
Alessandro Chiesa:
It goes through cycles. Perhaps another, maybe perhaps even more serious interaction that impacted a lot of my research program was when we were collaborating with the Electric Coin Company team to deploy Zcash, became clear that this trusted setups in the real world, especially the ones old-style that were circuit specific and not updatable, were really, really difficult to deploy. This is not at all clear, with my say, classical cryptographic training as a graduate student initially, having a common reference string or a structural reference string how it's called today, was a normal thing that nobody would worry about. It was a normal setting. But once you were confronted with realizing this in a peer-to-peer setting, it became, this was a major issue. To me, it made me and my colleagues very interested in trying to address this, so this was a for example, a direct motivation for a paper that was written for a tailored multi-party computation protocol to sample the reference string for Zcash. This was the paper that was then the basis for the protocol that was actually run, that was further developed and designed a thing by Sean and Dara and then run in the real world. But more than that, I rekindled my interest into SNARGs based on almost nothing. Let's say random functions that have this bare bones setup as just a strong cryptographic hash function. In some sense, the birth of interactive Oracle proofs was a, I view it in my mind as some side effect of the struggles of dealing with reference strings. I was personally motivated and I believe some of my colleagues were as well, to go back to these, what were unfortunately very expensive constructions and what can we do about it? And somehow we found this way of saying, wait, what if instead of building PCPs, we build PCPs across multiple rounds? And this led to the idea of an interactive Oracle proof. Which ultimately, in a span of five years from formulating the notion of an IOP and then lots of research on IOPs has led to what are now quite efficient. SNARKs that not really rely on any structured different stream. All you need is a hash function. To me that was, it would not have happened or at least I don't believe I would have worked as hard on the direction of IOPs if I hadn't experienced and seen how ECC had to overcome this huge barrier of reference trend. We are much of a motivator.
Anna Rose:
Can you give me a little bit of an example of an interactive oracle proof version. Do you still need a trusted setup but they're just way, way more efficient or is this the replacement for a trusted setup when you say that?
Alessandro Chiesa:
An interactive oracle proof is the probabilistic proof that is used inside certain types of SNARKs. All SNARKs must have a trusted setup. There do not exist SNARKs without setups. We can only talk about the colors of setups. Some colors are nicer than others. The best you can hope for is just a little bit of randomness in the sky. Just say, 100 random bits or 500 random bits that we can all agree upon were random. For example, you could pick the first so many digits of PI and call that randomness. This is very easy to agree upon it and deploy it. And such types of SNARKs arise from using the random oracle model to compile either PCPs or IOPs which are transition of that. This is the most desirable type of setup because there is no ceremony to run. You could run a coin tossing protocol, but nobody's going to do that. What you're really going to do instead is say, I don't know, SHA of, I don't know, Alessandro Chiesa that we're going to give you some outputs and that's random looking enough and nobody's going to think that Alessandro Chiesa as a byte string holds a trapdoor for SHA. So that it creates some a weird looking random string that then the SNARK will be broken under.
Anna Rose:
I see. What you're saying is, there is always a trusted setup, but the trusted setup using IOPs is almost like it can actually be more safely run by an org. It doesn't necessarily have to do the multi-party computation with all these different participants.
Alessandro Chiesa:
Right. IOP-based SNARKs. Because like a PCP is just some idealized information theoretic model, you cannot really use it. I wish we could but we can't, because the verifier needs to do things that it cannot do in the real world. Then use cryptography to implement what the verifier should have had to do. You implemented using the random oracle.
Anna Rose:
Got it.
Alessandro Chiesa:
And in the real world, a random oracle is just something like SHA-256. As long as we agree that we're all going to use SHA-256, that is the setup. It's much easier to agree on the hash function than it is to agree on, I don't know, a bunch of powers of some group element that have some structure.
Anna Rose:
Can you give me some examples of protocols that maybe have been proposed that our IOP-based SNARKs? Would Marlin go in under this or Fractal.
Alessandro Chiesa:
One of them yes, and the other one no.
Anna Rose:
Fractal, yes. Merlin no, I think.
Alessandro Chiesa:
Marlin uses something that is called the algebraic or polynomial IOP, is essentially, it is an IOP where we give a little bit of extra power to the verifier that rather than accessing, let's say strings by querying locations, it accesses polynomials by querying their values. Unfortunately, we don't know how to compile, let's say in a practically interesting way, polynomial IOPs into SNARKs by just using hash functions, okay? We know how to do it using say, groups of one in an order, but I'm not going to get into the difficulties of practical aspects of them. Instead what we do is we use things like pairing or groups, primordial groups that the discrete logarithm is hard to compile them. And so that was Marlin is. Fractal is actually based on an IOP. It is a pre-processing SNARK R1CS that is obtained by taking an IOP for R1CS and compiling it with hash functions. It's not used in practice. I think it's a very nice protocol. It does what it's supposed to do, which is a pre-processing SNARK for R1CS. But perhaps more familiar to the audience will be systems deployed by StarkWare, so cryptographic proofs are SNARKs that are obtained from some type of IOP. Rather the IOP being for R1CS, it is for another language called AIR, algebraic intermediate representation, which is, you can think about it some sort of algebraic machine. There's no pre-processing there. There is some uniform model of computation that looks like some algebraic machine. These things are practical. They're running. They've been running on Ethereum mainnet for now maybe a couple of years. That's one of the nice things in that commercialization effort. There wasn't an issue about how you deploy the setup. Well, we just have to decide what the hash function is and that's it.
Anna Rose:
We just mentioned a number of works from student of yours. These are the kinds of works and research papers that I mentioned earlier in the show. But can we talk a little bit about who some of those researchers are and the type of projects, and even full-fledged protocols that have come out of the work that you've been doing? Because I think what's interesting also to our audience is we featured a lot of people who have gone through your lab on this show, and so it might be cool to make those links.
Alessandro Chiesa:
I have some projects that I very much enjoyed working on. One of them for sure is a Aurora. This is a project with several of my students. First one is a longtime collaborator of mine, Eli Ben-Sasson, which should be familiar to everyone in the audience, as well as Michael Riabzev, was a student of Eli Ben-Sasson, also co-founder of StarkWare. Then Nick Spooner, a former student of mine, Madars Virza collaborator of mine from my MIT days, and a Nicholas Ward, also another former student of mine. What was very cool about this project is that when we formulated the notion of IOPs, at the time we didn't really have very interesting IOP examples. At some point we were able to design some interesting IOPs. Primarily, they were for a machine computations. The problem with machine computations is at the time, we didn't really have very good ways to encode interesting problems that people would care about such as, at the time, maybe the example was the Zcash circuit. As a machine computation, just doesn't look like that. There's really, it was more of a circuit computation. And we did not have very good techniques for coming up with practically relevant IOPs for circuit like computations. At least in my mind, the Aurora paper was the first one where we're able to say, okay, here's a instance of R1CS and we have a direct protocol for it that we're able to implement and get some reasonable efficiency for. One of the things that we came up with in the protocol was this univariate sumcheck, which was solving the problem of how to convince a verifier that a polynomial sums over a big domain to a certain claim sum, even though the polynomial has one variable, unlike the multi-variate sumcheck well-known from the '80s. This univariate sumcheck was the, let's say the main engine of practicality inside Aurora, and it was later used in many other protocols thereafter. Both Fractal and Marlin ended up using it. It taught us a lot, even though Aurora itself, I view Aurora as an example or a theoretical exploration that actually did have some practical implementation but it was not quite practical enough for usage. For example, the verifier was not sublingual, had a linear time verifier. So it was not a pre-processing SNARG. But the techniques that we had to come up with to designer an IOP for R1CS were later very influential in mainly of the other works that we did in the research group. I was at the time, very, very excited. I felt like we finally were putting our hands on techniques that were going to be useful later on.
Anna Rose:
Recently we did a Study Club on some work you did with, sumchecks, was that in any way coming out of this original research? Was that the formalization of some part of it?
Alessandro Chiesa:
Actually it's unrelated, but I can talk about that because maybe it is interesting.
Anna Rose:
Pattern matching here.
Alessandro Chiesa:
Sumcheck-like techniques play unreasonably large role in the design of SNARKs, primarily because SNARKs contain probabilistic proofs, there are ways to formalize the statement. And sumcheck protocols play an important role in probabilistic groups, so that's why they show up. And broadly speaking, what you do as a sumcheck, how do you design a sumcheck protocol depends on whether you're tackling, you want to do the sum check for a multi-variate polynomial or a univariate polynomial. For the multi-variate polynomial, we have an answer already from the 80's and for univariate polynomial, we had this univariate sumcheck. We were interested in univariate techniques because primarily of FRI. FRI is a low degree testing protocol for univariate polynomials is very practical. That's why in Aurora, we were focusing on that. Right now you were mentioning a recent work that I'm actually very excited about. It came about from a quite different motivation. It has to do with a work that I was not involved in, which is the design of Bulletproofs. Bulletproofs is a logarithmic communication argument system with a linear time verifier, also linear time improver, based on discrete logarithms, so very influential work. It's a range proofs based on Bulletproofs for example, using Monero.
Alessandro Chiesa:
I was personally quite interested in the protocol because unlike most other SNARK constructions out there, it does not prominently display an inner probabilistic proof compiled into a succinct argument. There are some ways to maybe force it out, but it feels a natural. At least I think that one of the difficulties of coming up with a maybe post-quantum versions of Bulletproofs were directly related to the fact that maybe we didn't quite fully understand Bulletproofs. In a recent work with two post-docs, a former postdoc of mine, Jonathan Bootle, who is also co-author of Bulletproofs, and Katerina Sotiraki, who is currently a postdoc at UC Berkeley, formerly a student at MIT, with Vinod Vaikuntanathan. We were very interested in designing a post quantum analog of Bulletproofs, and the way we went about trying to tackle that rather than trying to directly design some lattice based protocol, we went back and tried to understand what was really going on in bulletproof and we made some, what I think is an exciting finding. Which actually ... One can view in retrospect, Bulletproofs, literally the sumcheck run on a cryptographically hard group with some consistency check at the end. You can actually phrase it like that. It was just a sumcheck in disguise that interestingly was not run on the usual domains like finite fields but in basically rings. But understanding the protocol at the higher level of abstraction, we were able to basically push it at a high enough abstraction layer that then we could then instantiate it with completely different domains, polynomial rings where there are hardness assumptions on latices, for example, the short integer solution problem is hard. I think all of us on the paper were very happy about this discovery or invention, because it somehow helped us explain in a coherent way, a long line of works that was working on Bulletproof style protocols, making all kinds of observations that were also shared by sumcheck-based protocols. They were talking about space efficiency or round by round soundness and all these things that Bulletproof style protocols were having, also sumcheck protocols have. But there was no real reason, no clear reason why they both have these properties. Now there is one.
Anna Rose:
Now you know, that's so cool. Actually, yeah. So we did a study club on that.
Anna Rose:
I'll add the link in the show notes so that anyone can dig in deeper to this particular problem. I do want to go back to maybe some other projects that came out of the lab that you would like to highlight. Because I know you had a couple in mind.
Alessandro Chiesa:
I'd like to maybe spend a little bit of time to talk about what I consider to be the simplest SNARGs. Those are SNARGs built just from random oracles.
Anna Rose:
A little question because you're saying the word SNARG and not SNARK. I wanted to make a quick distinction there, because I know in a lot of your work, you are using this SNARG. So it's S-N-A-R-G.
Alessandro Chiesa:
Well first, I'm not being careful in this discussion. When I'm being careful, the G simply means that I'm not highlighting the proof knowledge property and the K means that for whatever reason, I'm trying to highlight the proof knowledge property. But please don't pay attention to the difference in this high level discussion and I maybe saying SNARG or SNARK-
Anna Rose:
Okay. It's interchangeable then.
Alessandro Chiesa:
... for this discussion. Yes, from a technical perspective, it specifically means whether you require proof of knowledge or not.
Anna Rose:
Got it.
Alessandro Chiesa:
This goes back to SNARGs with say the simplest types of setups, just the hash function. I find these fascinating for multiple reasons. First, we will always have hash functions. Maybe 100 years from now, 500 years from now, 1,000 years from now, regardless of what happened with supercomputers or quantum computers or whatever computers, it is likely we will always have hash functions that are secure enough for doing a hashing of things so you get random outputs. It's not clear that the same will be true about the hardness of discrete logarithms, the hardness of factoring, problems about lattices, hard problems about Isogenies. We don't know. There's structure in this problems due to geometry or number theory that can be exploited by adversaries. But hash function, it is reasonable to believe we'll always have them. In some sense, progress in SNARGs based on hash functions is, I view it as most fundamental or at least in the sense that it will stick with us for the longest amount of time. The holy grail in some sense is understanding, what is the best possible SNARG based on hash functions that we can obtain? And if that SNARG based on the hash function is good enough for practice, then we can actually make it part of digital infrastructure without worrying about whether we'll have to swap it out in the future or not. Okay. And because you can actually study hash functions in this idealized model of a random function that is random oracle, we can actually make very precise questions about what is possible and not possible for such SNARGs. There is a line of work, for example, with a co-author, Eylon Yogev who by the time this AIRs might have already given a study club-
Anna Rose:
A study club.
Alessandro Chiesa:
... where we basically study the capabilities and limitations of these primitives. For example, we prove that any SNARG in the random oracle model must have a verifier that makes more than a constant number of queries to the random oracle. This in some sense is a barrier to recursion. Because whenever you recurse a SNARG, you will have to express the verifier in terms of elementary gates to prove things about it, right? Which means that if you have many calls to a random oracle, then you will pay more in the circuit of the verifier. That's like a barrier that says there are limitations to how efficient these things can be for recursion. A major open question that is I think one of my favorite questions right now, is whether do there exist SNARGs in the random oracle model with linear size? What do you mean linear size, linear in the security parameter?
Alessandro Chiesa:
We do know that the answer is yes in other models, for example, based on discrete logarithms or based on pairings, we essentially know that the answer is yes. We have linear or nearly linear proofs. That's also why in the real world, they actually have smaller size, because sympathetically, we actually see that.
Alessandro Chiesa:
For example, graph 16 has a constant number of group elements. Those group elements have size that is a linear insecurity parameter. You need to pick it embedding degree of the curve to be large enough, but once we do that, then it is linear. Asymptotically instead, the best SNARGs that we know today in the random oracle model are at least quadratic. In fact, the practical ones are closer to cubic. That's why things based on random oracles have proof sizes that are more tens of kilobytes rather than a few kilobytes. That is exactly why.
Anna Rose:
Is an example of that STARKs then? They're quadratic or even cubic.
Alessandro Chiesa:
Exactly. They're at least quadratic and a little bit more than that because there's other reasons. That's why you see this, maybe 30 kilobytes or 40 kilobytes rather than two kilobytes. Once you have ironed out your completely efficient constructions, they are synthetics speak and you will feel them in the real world. There's a fundamental question. Is it possible say to have maybe a long time from now, a one kilobyte STARK, right?
Anna Rose:
Wow.
Alessandro Chiesa:
That question is basically in some sense, are there SNARGs in the random oracle model with linear size? The common belief is that because you have to ask many queries to the IOP and each query needs to be certified by a path in the Merkle Tree used to commit to the oracles. There is an inherent quadratic barrier because you have to ask security parameter number of queries, and each query is certified by a digest that is at least size of the security parameter, you will get nuggets security parameters squared. The recent work with Eylon Yogev, the work that he will talk about or will have talked about in the ZK Study Club, we're able to construct a SNARK that has a sub quadratic size. Slightly sub quadratic. It's not linear. We just don't know what the right answer is. We broke into quadratic barrier, so some of the common belief that quadratic is inherent is not the right answer. We have something slightly self quadratic and the jury is out about whether this strange expression that we obtained is actually the right one or one can do even better and obtain linear. I think it's a beautiful open question and somehow any progress in this direction will be in some sense eternal. We will always have hash functions and any further understanding we have would benefit us forever.
Anna Rose:
What is the name of this work, just for reference point for anyone listening?
Alessandro Chiesa:
Sure. This work appears in crypto this year and the title is 'Subquadratic SNARGs in the Random Oracle Model'.
Anna Rose:
Perfect. Cool.
Alessandro Chiesa:
We remain very interested in this question and we'll keep thinking about ways to either prove that there is some inherent superliner lower bound on the size of arguments in the random oracle model, or we can improve the upper bound with better constructions. Right now we don't know.
Anna Rose:
This goes to the question of how you're picking what you're doing. But do you look at a protocol and then there's pieces of it and you start to optimize or think about specific pieces? Is it more like you're starting a clean slate and you're trying to build something brand new? I'm just curious how you approach some of these problems.
Alessandro Chiesa:
Definitely not a clean slate. That would be-
Anna Rose:
That would be wild.
Alessandro Chiesa:
... suicidal. I think a common theme that I use for myself and I try to... I think it's a powerful theme and I try to teach to younger co-authors, is modularity and deep understanding. Just the fact that you have a protocol you can prove secure and is doing something interesting does not mean you're done. If you just try to digest it and understand that, where does it come from, why does it exist, you will understand that actually it is not just some isolate or some cool hack from a prior protocol. If you're actually doing something interesting, there are going to be deeper reasons for why that exists. Time and again, in research that I've done with colleagues, it has always, always been the case, that when you try to understand something better, you are actually making a-
Anna Rose:
Making more problems for yourself or?
Alessandro Chiesa:
No, no. You're actually simplifying. It gives you a better platform to then make further understanding later. And also sometimes you thought you were solving a problem and you did, but by understanding it better, you actually solve the problem more. Maybe you understand that actually you achieve more than what you thought, or maybe that there is some beautiful connection with another line of work and by understanding that connection, you are able to import more techniques, maybe not in the same paper necessarily but in following one. This keeping yourself to a bar of not just saying, do I have a theorem that I can prove, but do I understand where the theorem comes from, why does it exist? All these sanity checks and the introspection of, why is the world like that has been enormously helpful into making sure that not only we're making progress but progress on the hardest and most fundamental questions as opposed to, I don't know, just random optimizations which are useful. They can be useful for all kinds of things but are not necessarily going to propel you forward into tackling the hard questions. The hard questions will be, we will win over them only through deep understanding.
Anna Rose:
Cool.
Alessandro Chiesa:
I'll mention another beautiful, open problem about SNARGs in the random oracle model, and that has to do with constructing SNARGs that are so called complexity preserving. Let me explain that term. Complexity preserving is a notion that I actually introduced with old collaborator of mine Nir Bitansky many years ago. And it says, suppose that you want to prove some computation that runs in time t, and space s. Ideally you would like the prover that is proving this computation to run in time and space that are close to that. What typically people do in SNARKs is to say, no, no, no, we want the time to be close and then we certify space. For example, anytime you use an FFT, fast forward transform, your space efficiency may go out the window because you will have to write everything down and then compute it. And so if your algorithm is running in a small space, the reputation you're proving was running a small space, then you've lost that. How can you do that? About 10 years ago, we showed the first construction that satisfies this property, not practical at all. It was using fully homomorphic encryption and multi-prover interactive proofs. It was even private coin, not even publicly verifiable. But since then, there's been a few more works that have made progress on that, reducing the hardness assumptions. So we know how to construct this under nice cryptographic assumptions, improving on the fact that it was only private coin. Now we know how to do it with public coin. But in some sense, the hardest setting of all remains to be where you have nothing but a random function and the other problem remains open. We don't know whether we have good SNARGs, there's time and space efficient or complexity preserving in the random oracle model. The reason is that we don't have IOPs with that property. If we did, we could have them. That's a fundamental question about IOPs that remains open. It's a beautiful question.
Anna Rose:
Is this a trade off that you're describing though, this idea of time versus space? If it's longer time, would the space actually be smaller or is it always quick time, big space?
Alessandro Chiesa:
Ideally you would like to preserve both and say that whatever was the time and space of the original computation, you want the prover proving it to run in time that is very close to that. For example, if the time was t and the space was s, then you want the prover of that computation to run in time, say order of t and space order of s, roughly that. Now, one could consider of course, relaxations of this ideal goal and say, all right, we don't know how to do that. Let's maybe place a in trade offs and you might take a hit in your running time in order to preserve space. In fact, that's roughly what is true today. We have constructions that are either linear time but large space, or slightly more than linear time and good space, and that's the trade off that we have today. But we don't know how to achieve it together even outside of the random oracle model.
Anna Rose:
But the work that you just recently did, does that address this question or is that more like, this is a question that arose because of the work that you did, and you realize that this isn't something that's quite solved yet?
Alessandro Chiesa:
This is moving on a different, before I was focusing on the size of the argument in the random oracle model, whereas here, we're specifically asking, well, get me some decent proof size. Maybe it's going to be, I don't know, cubic in the security parameter, something succinct. But the emphasis is on the prover resources. And why? Because time and space, they both matter. You cannot scale a SNARK to large computations if you have large memory constraints because you get stuck. Conversely, if you have good memory space complexity preservation but your prove time sucks, then you're going to have to wait too long. Both of these are limitations in the real world world. In one way or the other, they come up and they impose issues on real world systems. You can try to mitigate them with systems techniques, for example, one work that I can mention is DIZK from quite a few years back. This was with Howard, now at Aleo, Wenting [Zheng] - she's now a professor at CMU - Raluca [Ada Popa], and Ion [Stoica]. There we were exploring how to scale up SNARKs by spreading the prover computation on a cluster. But the fact that it can be done doesn't mean it's easy and convenient. The reason why we had to do it is because if you want to scale up a SNARK, you just run into memory constraints. And there's just no computer that has so much memory. And you have to put together a bunch of computers that collectively had enough memory but now you have to manage the distribution computation, guiding the results to produce the final SNARK. Ideally, if your original computation did not have such large memory footprint, you wouldn't want the prover to have such a large memory footprint either.
Anna Rose:
Do you think all of these things are possible, or do you think there will be a point where you just get the closest approximation to what you want? There's almost like, can you hit that equals too, or is that just like, that's the goal we're going to try to like optimize until we get close enough.
Alessandro Chiesa:
I don't know. I've definitely had wrong intuitions before. There was a period a few years ago where I was just not convinced that there exists linear time IOPs. It just looked like FFTs are there to stay. They were in the way of everything. Just wasn't clear what to do if you're not running an FFT on a computation. And it just looked like, all right, well, I guess we don't have linear time IOPs. But things changed. There was very nice work in Asiacrypt 2017 by several people, including a former postdoc of mine, Jonathan Bootle, that showed a linear time IOP with square root time verifier. Not really fully succinct but showing that you could do something that is towards that direction. Then in a paper with Jonathan and Jens Groth from a couple years ago, we were able to bring this down from square root to any root. So cube root or fourth root. And more recently were able to get poly arithmetic. Now we have a linear time IOP with poly arithmetic verification, at least for computations that are algebraic over large fields. The question for Boolean computations is still open. This was something that I did not have the intuition was possible, but somehow in the end we do have a good understanding why they're possible. These things have happened several times in my career and at this point, mostly I suspend my judgment. I just try to follow what seemed to be the most interesting leads for a positive or negative result. Sometimes it's a negative result, like with the paper with Eylon proving lower bounds on the complexity of the verifier in SNARGs in the random oracle model. That was bad news. Okay. But gathering intuition from, is it the negative result or is it a positive result, is actually useful back and forth. Because if you get stuck proving a negative result, that may give you a possible progress or attack or approach on the positive side, and vice versa. If you can really get stuck on a positive result, then you might be able to somehow formulate barrier that says, as long as you will use these techniques or this approach, you cannot do better than that. It's a very useful research trick or strategy.
Anna Rose:
It's almost like you're framing known possibilities but they don't work together yet, but this is a space that could be developed. We know we can do this in this context over here, can we bring that into the context that we're trying to solve in?
Alessandro Chiesa:
Yeah I'll finally mention maybe two works that are not about constructing new protocols, rather they're about proving existing protocols, let's say more secure and specifically having to do with quantum attackers, quantum adversaries. Here, SNARGs based on hash functions have been widely believed to be post quantum in some way. Because if the hash function is post quantum, why shouldn't the whole protocol be post quantum given that is the only cryptography that comes into the protocol? I will not talk much about these but I'll just mention that over the past few years, there has been nice progress into formally understanding the post quantum security of these protocols into two settings. One is a paper from two years ago, it's called 'Succinct Arguments in the Quantum Random Oracle Model'. This was with two students, Peter Manohar and now at CMU and Nick Spooner, where we able to prove that constructions in the random oracle model based on PCPs and IOPs for example, the ones used in StarkWare, are formally secure in the quantum random oracle model, which is the model where attackers can query the random function in superposition. Now we know, that gives us much stronger evidence that yes, the intuition of post quantum security of this protocols really is there. And more recently, there's a paper with Fermi Ma who is now a Postdoc at Simons Institute in UC Berkeley. Again, Nick Spooner and Mark Zhandry, where we studied the... So this is not in an idealized model, this is in a what cryptographers called the plain model. It's interactive succinct argument. This is called Killian's protocol, where you compile a PCP using a collision recent hash function into a full message public coin succinct argument. The classical proof of security involves rewinding the adversary but rewinding is difficult with quantum adversaries, because when you start going back and forth and measuring, then adversary notices that he's leaving in a security reduction rather than wherever he was leaving and then he stops working. Okay. In this work with lots of hard work by the two students Fermi and Nick, they really did a fantastic job. They developed new techniques for rewinding quantum algorithms and now we do understand that also in this interactive setting from plain collision resistant hash functions or in quantum, these are things are called collapsed binding hash functions, we do understand post quantum security, and this is a longstanding open problem. This is basically with time, we are obtaining better and better foundations, not just for now, but for the longterm security of these SNARKs, which are more and more part of digital infrastructure and we need them to not just be, it looks secure, or, I prove it was secure classically, but it should be Post-quantum security. We really need to go through and put all the bricks of all the foundations because if they're here to stay for a long time, especially the ones based on hash function, they will be with us for all I know, for hundreds of years. We need them to be really understood. I will not say more. If you're interested in post-quantum security and those are two works that you can look into.
Anna Rose:
Cool. This actually leads us a little bit into a topic that I wanted to talk about with you, and that is teaching all of this stuff, teaching SNARKs teaching zero knowledge techniques and basically breaking down these systems to be learned and understood. You just mentioned, there's certain things like hash functions, you can assume will be with us for a long time. But then there's also other techniques that are like, they come, they might not be used. They might be found to be less efficient. How does that impact the way you teach? How can you create a syllabus or maybe an online course that you would last the test of time, if this is changing so quickly?
Alessandro Chiesa:
These are complex problems and-
Anna Rose:
This is the problem.
Alessandro Chiesa:
... there are various things to say here. First, SNARKs are rather technical area in the sense that they involve knowledge that touches upon computational complexity, cryptography and possibly also apply cryptography. An understanding of how algorithms such as multiscale and multiplication or fast forward transforms or other things perform in the real world on. Because it touches on so many areas, it's a bit of a challenge to train students and bring in talent who is in principal interested to understand this area, but that needs to be somewhere initiated. In my own experience as a professor, I'm also an educator, both of my own students in the group and more broadly in classes, I've been confronted with this problem very much directly. In particular when I started as a professor, it was this tailored training where I would talk about what I know to students in my office, and then if a new student would come, I will have to say that again. And it would be a very time consuming process for students and for me to communicate this knowledge. Over time it has gotten better. I've had several efforts in this direction. One effort that has think come a long way is to basically take the part of SNARKs that only has to do with computational complexity, so that is designing probabilistic proofs. I view it in my opinion as the technically most complex part, and designing a course around it. This is a course that I've taught now three times in semester courses and a fourth time in a summer school. I've been trying to digest and reorganize a mix of a foundational techniques and modern protocols by trying to select for the ideas that seem to be most useful, most productive, most flexible, and leaving out maybe ideas that they were maybe useful and led to nice papers but did not seem as impactful. What gets selected and how does it get organized, it's mostly a question of experience and having an understanding of what's more foundational and what's less foundational. It's challenging but it will come a long way. Now for example, if you're interested in understanding probabilistic proofs, there is almost a course that is on autopilot that you can run. I guess Anna will put in the description, a link to this summer school from this past July-
Anna Rose:
For sure.
Alessandro Chiesa:
... where there are 20 lectures with 20 problem sets that will teach you about interactive proofs, PCPs and IOPs, or at least what I consider to be the most foundational techniques and ideas for having a precise and deep understanding of this area. Now, there's probably another 10 or 20 lectures on top of those that one could put with other very useful things, but not much more. This really does cover a lot. We had 75 students this past summer. The process of training and taking into this area new talent, not just from cryptography but also from complexity or mathematics, has become a lot more at attractable and I think we're going to see an influx, not just thanks to this but also thanks to this. Of talent who is able to design probabilistic proofs that are high quality with tackling difficult questions. And it would really would not have been possible otherwise because there's only so many experts, they have so much advising bandwidth and it doesn't scale. But I think that having precise, deep and technical courses that are not trying to cover everything, like this course does not talk about all of the whole SNARK landscape, but specifically a coherent set of materials that does not require too much background. Probabilistic proofs does not require any cryptographic background, will help with this periodical problem.
Anna Rose:
For sure. But you mentioned here, you're tackling one part of a SNARK, because that's the part that you feel could be described as foundational. What about all of the other stuff? The things that are moving quickly, how does one get into that? We've had that question come up a number of times in our channels, people just being like, they are technical, they want to jump in, they don't know exactly where to start. It sounds like your course would be a great place to start, but once they do that, how do they then understand the other parts that are changing?
Alessandro Chiesa:
Well, the rest is work in progress in a sense that it is work in progress by researchers because they are changing, but it doesn't mean that there will be always changing. More and more territories of the SNARG landscape will become well understood over time. That's one aspect. The other work in progress is that more pedagogical resources will become available. For example, Justin Thaler has a wonderful monograph that covers both SNARG compilers and a little bit of probabilistic proofs. I don't cover anything in my course about cryptography. I just touch upon Killian's protocol and using hash functions but I don't talk about Bulletproofs, I don't talk about how you compile linear PCPs into pre-processing SNARGs. Because that would require in the course bringing in or teaching about cryptographic background. But Justin has done a wonderful job assembling an extensive monograph that will help people to understand that the other parts that are not covered. And over time, this work in progress will mean that experts might be finding opportunities to systematize, digest and represent other pieces necessary for full understanding of this area. And it will get better and better now. But I would say there's no easy silver bullet here in the sense that the research is still very active, so it's not like people are saying, well, we got all the low hanging and medium hanging fruits. Let's just go and write books and tutorials because there's nothing better to do. In my research group, we have plenty of exciting things we're always working on and it seems that just they keep coming. And so the resources towards pedagogy are more an active effort but they're not the majority of one's time because that's how it is.
Anna Rose:
I feel like this thinking though, intersects with the idea of standards, like ZK standards, standardization like decisions that, this is what is going to be used for now. Maybe that is more on the implementation side of things, but I know that there's a group like ZK Proofs, they do workshops and they actually focus on standards. I did wonder from where you are sitting, what do you make of standardization bodies? Do you feel like those are good for research, bad for research? Do you want standards? Does that even factor in?
Alessandro Chiesa:
I think standards are wonderful but they're more for industry. I wouldn't say that from a research perspective they carry a large impact. You can see this also, forget SNARGs, standards for encryption, standards for digital signature. They're not so much about making the life of researchers easier. Rather it is more about coming up with references the practitioners can look up and not have to directly consult with experts of the area. And similar for pedagogy, sometimes getting certain ideas across is not about what is the most efficient protocol but rather what is the simplest protocol that captures the use of that idea to achieve certain goal, and that helps you understand what is the power of that idea. Okay. And so then you feel it and you're able to use it in other places. For example, you may be able to effectively use it to design practically efficient protocols. So in pedagogical resources, there is more of a push towards examples that illustrate challenges and how you overcome them with specific ideas.
Anna Rose:
Okay.
Alessandro Chiesa:
But then those have to be presented in the cleanest, most unconcerned place. You shouldn't have to worry about other issues, like how you going to implement it, is it going to be efficient? Because you're trying to get across the power of a certain tool or what is a certain definition about and how to understand it. For researchers and even practitioners before they get implementing, they won't understand what they're operating with, what they're working with, that's what you want. So that's pedagogy is about digesting, readjusting, readjusting, readjusting until it kind of explains itself. So for example, things that are topics in under guarded textbooks for standard courses are there because they have been digested a million times and there are very effective ways to understand and motivate them and explain them to the point that people in their second and third year university can absorb them. It does not mean they were always like that. They were potentially much more complicated than better in not clear settings, maybe with additional bells and whistles that turned out they were not so important in the end, but it was a core idea that was much simpler. So something like that has to happen for SNARKs to some sort of syllabus that will be widely accessible. What I've been trying to do with this course in probabilistic proofs is doing that thing, not for SNARKs, but specifically for probabilistic proofs, where SNARKs are one of the major applications. And the course has gone through various iterations and at least I've learned a lot about what works and what doesn't work, what is core material and what is less core material. It's a bit different than standards.
Anna Rose:
One kind of last question that I have for you is, what is your motivation in doing this? What is the thing about teaching maybe specifically this that excites you?
Alessandro Chiesa:
I would say impact on students. I think that these topics around the succinct arguments and the probabilistic proofs are beautiful, they're powerful, they're challenging. And there's lots of talent are out in the world. And if I can improve the mechanisms and materials that bring in more talent, more people to think about these questions and catalyze faster progress in this area to maybe answer open question earlier than later to then cause more technology transfer of these ideas, and maybe even far beyond block chains. So this impact, I find it exciting and I cannot do it alone, I need brilliant collaborators but the brilliant collaborators have to be interested in this and know about the topic. And so through the teaching and education, I can affect change onto others with respect to these topics.
Anna Rose:
Is it partly that, there are things that you just want to know? Are you looking at like, these are these unknowns that you just want to eventually know the answer to or see something in a direction? Or is there any other, maybe outside of math reason that you want to see more development in this space?
Alessandro Chiesa:
Well, so on the research side. Yes. There's lots of questions that I would like to know the answer of very visually because they're very exciting. But taking a step back these topics, I find them exciting because they're just the goals that are achieved are magical. So many people are attracted cryptography because it's in some sense magic. You're able to do all this crazy sounding goals like zero knowledge or exponentially fast verification, secure computation, obfuscation, all of these capture the imagination of people, they're evocative. And you want to know how to do them. In my particular case, I find these exponential speed up in verification or computation to be, it's beautiful and-
Anna Rose:
Awesome.
Alessandro Chiesa:
It's awesome and it's also powerful. You can actually do, you don't have to come up with convoluted fairy tales about how it's useful in real world, it's just useful. In fact, it's not just useful, it's the direly needed. Okay. And what's exciting here is that somehow making progress on this topic brings you back to basic mathematical questions in computational complexity, property testing and cryptography. And now you have this leverage, this powerful leverage that theoretical ideas can push boundaries, not of knowledgeable of technology. So it is something that you sometimes read about and you see in movies maybe but you never really feel, but when you feel it on your skin and you see power that foundational progress brings to technology later than this is definitely integrating and so yeah, to me that's a strong motivation. When I talk to students, especially young ones, I tell them there's an infinitude of well posed mathematical questions that one can work on it. It does not mean that one should work on any one of them because some of them are silly or just, I don't know. They're well posed but just why are you wasting your time on it? Given the limited amount of time that we all have, I would rather spend it on questions that I find interesting and I think they are impactful.
Anna Rose:
Do you ever have any concerns about some of the like cryptography? Would you ever be worried about something you built being used for bad? Does that ever come into your mind from the research side?
Alessandro Chiesa:
Yeah. It's not... Like an analogy, so people always say, oh, a knife. You can use it to cut vegetables or people. But I think the analogy here is more about the raw materials for protocols. So maybe the wood to the mix up the knife or the steel that makes up the knife. So SNARKs are so foundational that they are currently being used in certain protocols but I think eventually it'll be used in all kinds of settings. And there's such a-
Anna Rose:
Totally.
Alessandro Chiesa:
... basic building block that at least the way I feel it in my mind is just so foundational that I don't really come up with considerations of sort. Very different is to say, if you start designing protocols that are achieving specific things for example, there are legitimate questions to be raised about say private peer to peer payments, like a Zerocash protocol. And so then the way to address them is to say, well, maybe we should have some programability that introduces some accountability into the system. And for such types of directions, then it becomes I think a bit more subtle, still proponent of privacy and the fact that you start with privacy and then you build in accountability rather than you start to-
Anna Rose:
Transparent.
Alessandro Chiesa:
... with surveillance. I'll take always the trade off more on that side. But when you're not really building protocols for specific security goals of real world protocols, but basic cryptographic primitives then yeah, no, it's not something I really think about much.
Anna Rose:
Do you think there are real opportunities for zero knowledge proofs to live? This show and a lot of the community and everything, we tend to focus on the blockchain applications of zero knowledge. It's always in that intersection. Almost always. I think I only have like one or two examples outside of it. But are you also looking towards or working towards non-blockchain, zero knowledge? And maybe it's so foundational that you can't say, it could be used for blockchain or not blockchain. I don't know.
Alessandro Chiesa:
I think it's too early to tell. So look in principle, I am convinced that zero knowledge proofs and succinct arguments will find plenty of applications beyond blockchain. Now, what are those? I don't know yet. I guess if I knew maybe I would grab a few friends and start a company direction. But also technology moves slow, which whatever we think of it, it usually does move slow especially when it's existing industries. So still SNARGs are not at old. I'm not old and I've been working in SNARGs for-
Anna Rose:
They've been there.
Alessandro Chiesa:
... a long, they've been worked on. So it will take more time. There are companies that tried to explore the direction but go slow. And then when you have opportunity cost of saying you could invest as a startup energy in the blockchain space and have a much shorter time to market and cash flow-
Anna Rose:
And these interesting funding mechanisms.
Alessandro Chiesa:
Right. As opposed to something more traditional longer term. Maybe it even was going to work out but simply the fact that you had a choice means that the development of a technology in the more traditional world will be delayed by the fact that there is a greater opportunity potentially in the blockchain space right now. But over time, I think this will kind of, I don't know if it'll correct itself but just through the statistics and the serendipity more applications will come up.
Anna Rose:
Very cool.
Alessandro Chiesa:
So I am absolutely positive that they will, I don't know exactly when and in what form.
Anna Rose:
Nice. Ale, thank you so much for coming back on the show.
Alessandro Chiesa:
Yeah. No, thank you so much. It was lovely to be back. It's always a lot of fun to chit chat about SNARKs with you. And there's plenty of things that we could have talked and we didn't have time, but for the audience if they're interested to follow more of the research coming out of my research group, then you can Google my name and find my webpage. It's usually up to date in terms of recent papers. You can also look up the summer school and the videos and problem sets there. And I continue to the research and some of my students have graduated. And as always, I'm looking for strong and brilliant students in post-docs who may be also interested in probabilistic proofs and succinct arguments. So if you're interested to talk about opportunities to work with me, I can hire you if there is a good match. So write me an email and you can find the email on my web page.
Anna Rose:
Very cool. So I want to say thank you the podcast editor Henrik, the podcast producer Tanya and to our listeners. Thanks for listening.