Anna Rose (00:05): Welcome to Zero Knowledge. I'm your host, Anna Rose. In this podcast, we'll be exploring the latest in zero knowledge research and the decentralized web, as well as new paradigms that promise to change the way we interact and transact online. (00:27): This week, Tarun and I chat with Yi Sun, co-founder of Axiom, and Daniel Kang, Assistant Professor of Computer Science at UIUC. We talk about their academic background and what led them to get interested in ZK topics and ZK ML. We then dig into 2 recent works of theirs, entitled Trustless Verification of Machine Learning and ZK-IMG: Fighting Deep Fakes with Zero Knowledge Proofs. This episode is all about ZK ML, which is a lot of fun to dive into. It's a new field for me, and I really like looking at how it's starting to intersect with the zero knowledge tech that we talk about on the show. So, yeah, let me know if you like it and if you want to hear more eps like this. Now, before we kick off, I just want to let you know about an upcoming event we're doing. This is the zkHack Lisbon event. It's an in-person hackathon happening in Lisbon the weekend before zkSummit9. That's March 31st to April 2nd. If you're already coming to the summit, this might be interesting for you. This is an event for developers and builders who've already built something in ZK or are really looking to jump in. If this sounds like you, do send off an application. I'll add that to the show notes, and we hope to see you there. Now, Tanya will share a little bit about this week's sponsor. Tanya (01:42): Aleo is a new layer 1 blockchain that achieves the programmability of Ethereum, the privacy of Zcash, and the scalability of a rollup. If you're interested in building private applications, then check out Aleo's programming language called Leo. Leo enables non-cryptographers to harness the power of ZKPs to deploy decentralized exchanges, hidden information games, regulated stablecoins, and more. Visit developer.aleo.org to learn more. 
You can also participate in Aleo's incentivized testnet3 by downloading and running a snarkOS node. No signup is necessary to participate. For questions, join their Discord at aleo.org/discord. So thanks again, Aleo. And now here's our episode. Anna Rose (02:26): Today, Tarun and I are here with Yi Sun, co-founder of Axiom, and Daniel Kang, Assistant Professor of Computer Science at UIUC. Welcome to the show. Yi Sun (02:35): Thanks for having us. Daniel Kang (02:36): Thank you for having us. Anna Rose (02:36): So yeah, I think as a starting point, it would be really great to get to know both of you. Yi, why don't we start with you. Tell me a little bit about your research and the work that has led you to ZK and now your work on Axiom. Yi Sun (02:49): Yeah, so I recently took a leave from a statistics professorship at the University of Chicago, where I was doing work on probability and machine learning, to start Axiom, which is a new ZK project. Its goal is to bring the power of ZK to smart contract developers. I got interested in crypto actually quite a few years back, around 2017. I had just finished a PhD in math from MIT, where I spent a year doing high frequency trading in the middle, and so crypto to me just seemed like the perfect combination of markets and cryptography. Back then I quickly got pretty interested in the consensus protocols and what can enable crypto to really become something scalable. And ZK always seemed like the holy grail for that. But at that time it did seem pretty much like moon math. So the procedure for starting anything in zero knowledge would be: number 1, you roll your own crypto and invent your own zero knowledge protocol. Anna Rose (03:49): Yeah. Yi Sun (03:49): Then maybe that takes you a year. Number 2, you actually implement that in software. Maybe that takes another year, and 2 years in, you get to really start working on your actual application or infrastructure tool. 
Anna Rose (04:01): And at some point you also have to build your own ZK DSL probably along the way. Yi Sun (04:06): Yeah. Step 3, complete the trifecta. Anna Rose (04:09): Perfect. Yi Sun (04:10): So I kept following the space, helping out a couple early stage projects like Gauntlet and Scroll. And in late 2021, I realized that the ZK space had really matured and in particular the tooling was ready to start building applications straight away. So I got pretty interested and sucked back in, and I spent early 2022 working on a bunch of open source work to write various primitives in ZK. So I worked on some libraries for elliptic curve cryptography and also reading from Ethereum data structures in ZK. So over the summer I realized that that work could actually be really scaled up by porting it to a more performant proof system called Halo2, and that's when I realized that something like Axiom was even possible to build and decided to work more seriously on it. So we just launched last week and we're excited to see what we can do. Anna Rose (05:12): Nice. So are you saying though, that like your first true foray into the blockchain space was actually ZK? Like you had been kind of observing it before that, but yeah, was that your first step in to actually build? Yi Sun (05:25): Yeah, definitely. I've been sort of poking around the space for quite a while, as Tarun can probably tell you, but ZK was the first thing where I felt like my technical skills could be a good fit for what's needed in the space. Anna Rose (05:38): Cool. Yi Sun (05:39): And I think ZK is kind of the perfect mashup of, you know, you need to know some math, but you also have to make a performant production system. Anna Rose (05:46): What kind of professor were you or what department were you in before? Was it computer science or was it math? Yi Sun (05:51): Yeah, I was actually in the statistics department. Anna Rose (05:53): Oh, yeah. 
Yi Sun (05:54): And people always ask this, but my previous research has very little to do with cryptocurrency or even ZK. So I was working on these theoretical problems in probability that originate in statistical mechanics and also trying to make some connections between those problems and modern machine learning. Anna Rose (06:11): Cool. Yi Sun (06:11): So working on some theoretical deep learning. Anna Rose (06:13): Nice. All right. I know we're going to come back to the topic of machine learning and more on the ZK front, but Daniel, why don't you tell us a little bit about what your research has been all about and what got you excited about ZK stuff? Daniel Kang (06:25): Yeah, that's a great question. Actually, my foray into crypto started way back in, I think, almost 2011 or 2012. It's almost a failure story. Back then I knew this eastern European hacker whose name I don't actually know, and he told me that I should work on this thing called Bitcoin mining and it would change my life, and it sounded like a humongous scam. At this point in time, I was in high school and I was really into assembly coding. He told me I should optimize assembly miners. Obviously I did not do this, since I clearly make bad decisions in my life. I learned later that the group that he referred me to was mining so profitably on AWS that they crashed AWS, I think in around 2013. (07:11): But since then I've decided that I should keep an open mind, and if people tell me that there are interesting things, I should at least pay attention. And interestingly enough, Yi had been telling me for years that there's this thing called ZK, and when I first heard about it, it literally sounded like magic. It literally sounded like something that shouldn't exist. So I was a bit skeptical, but I kept an open mind. But at that point in time, this was, I think, maybe 20- I don't remember the exact year, 2018 to 2020 or so. 
I was doing my PhD at Stanford, where I focused on deploying machine learning in a variety of different applications, but primarily for analytics. I started my PhD around 2017, but when I first got involved with machine learning and statistics, (07:56): it was around 2013, 2014, and one of the trends I noticed is that it really went from people being able to deploy their own models to literally all the models that you hear about nowadays being gated behind APIs. So for example, OpenAI's ChatGPT, models Google has announced as well, and a variety of other models like this. And one of the things that I've been wondering about for a long time is, as an academic and especially as these models are proliferating, what are some ways to actually influence what's going on? And one really basic question is: if a model is behind an API, how do you actually know what's running? Anna Rose (08:36): Yeah. Daniel Kang (08:37): And to me, this problem literally sounded impossible. So I thought about this and I was like, all right, well, I guess this isn't happening, but I would really like to know what I'm being served. And then after several months of cajoling, Yi convinced me to start working in this space, and I noticed the same thing, that the tooling had matured and it was actually feasible to start to deploy applications today. I got really excited about that and I dived right into it, and I've been working in this space, I guess, for about half a year to a year at this point, and it's been a lot of fun. So that's how I got involved with ZK ML. Anna Rose (09:12): Cool. You've produced papers together, right? You've done research together. How does that collaboration work? Is it sort of like someone has an idea and then you just want to dive into it? Or is this like officially within your working purview? Yeah. How does that work? Yi Sun (09:29): Yeah. We've actually been working together for a pretty long time. 
I first met Daniel actually through one of our mutual friends who was my roommate in grad school. His name is Tatsunori Hashimoto, and he's now a Professor of Computer Science at Stanford, and so it's actually through that context that we became friends and started working on machine learning together, a long time ago now, I think maybe 2017. So we started working on adversarial examples in deep learning. It was really one of the first exposures that, honestly, both of us had had to machine learning and especially empirical machine learning. So we wrote a paper together about trying to defend against different forms of these exotic adversarial attacks on vision models. So what that means is you can tweak the input to one of your machine learning classifiers, and it's a non-trivial result that it's possible to fool almost all deep learning models today, and one of the biggest puzzles in the space is whether you can design a model to prevent this from happening. Unfortunately, our work did not make that much progress on that. We essentially showed that this is a very hard problem and several ways of doing it don't really work. Anna Rose (10:43): Okay. Tarun (10:43): Yeah. And for listeners who've listened to a lot of the show, episode 246 was about this topic, with Florian, who does a lot of in-the-wild attacks against systems people claim are adversarially resistant, but then turn out not to be. Yi Sun (10:59): Yeah. We definitely read a lot of Florian's papers. Tarun (11:02): Yeah. Anna Rose (11:03): So I want to talk about one paper that the two of you were authors on, and I know I've seen the blog post, which is entitled Trustless Verification of Machine Learning. I want to explore a little bit how that work came to be. I think it ties into what you just said, Daniel, about kind of the problem space, and yeah, what were you trying to solve or figure out with that work? Maybe you can tell us a little bit about it. 
Yi Sun (11:26): Yeah, I can maybe jump in with a bit of the backstory. So at the beginning of that work, I guess I had been working on ZK for a while and thinking about how to really scale up ZK proofs for things outside of machine learning: very basic data structures appearing in Ethereum, and elliptic curve operations. And when I thought about what really went into that work, it's a lot of low level optimization as well as understanding from a ground level the computation you're trying to do. But there were some people in this space thinking about applying this to machine learning, and I immediately realized that Daniel would almost be the perfect person to work on this. So that's why I thought of him and even tried to really cajole him to get into the space, and the goal of that work was to really scale up ZK machine learning models for the first time. And I thought Daniel's the perfect person to make that happen. Daniel Kang (12:18): That's very flattering. And as you were saying, one of the things that actually turns out to be really important in machine learning, in particular for ZK, is how you represent computations, and going back to my story about assembly hacking in high school, it turns out a lot of the skills almost directly ported over to ZK ML. Anna Rose (12:37): What's the actual problem though? Let's even take a step back into, like, ZK ML. What problem is it trying to solve? Daniel Kang (12:48): Yeah. So ZK ML tries to solve an abstract problem of which there are many specific concrete instantiations, but the abstract problem is to produce a zero knowledge proof that a machine learning model ran on some input, and this inherits all the nice properties of zero knowledge proofs. In particular we focus on zk-SNARKs, so things like succinctness, zero knowledge obviously, and the completeness and soundness properties of zero knowledge proofs. 
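To make Daniel's point about representing computations concrete: SNARK circuits operate over finite-field integers, so floating-point model weights are typically quantized to fixed-point integers before proving. The following is only a minimal illustrative sketch, not code from their papers; the scale factor and the tiny dot-product "layer" are assumptions chosen for clarity.

```python
# Minimal fixed-point sketch: ZK proof systems arithmetize over integers,
# so floating-point weights get mapped to scaled integers first.
# SCALE and the example layer are illustrative assumptions.

SCALE = 2**16  # fixed-point scaling factor (assumption)

def quantize(xs, scale=SCALE):
    """Map floats to integers that a circuit can handle natively."""
    return [round(x * scale) for x in xs]

def fixed_point_dot(w_q, x_q, scale=SCALE):
    """Integer dot product; a single rescale at the end keeps values bounded."""
    acc = sum(wi * xi for wi, xi in zip(w_q, x_q))  # exact integer arithmetic
    return acc // scale  # rescale back to the fixed-point domain

weights = [0.25, -1.5, 0.75]
inputs = [1.0, 2.0, -4.0]

w_q, x_q = quantize(weights), quantize(inputs)
print(fixed_point_dot(w_q, x_q) / SCALE)  # → -5.75, matching the float dot product
```

Real ZK ML systems layer much more on top of this (field modular reduction, lookup arguments for non-linearities), but the representation step is the same in spirit.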
Anna Rose (13:13): Is this sort of like going back to what you were saying about a lot of these models being created, or a lot of the activities happening, behind an API? Is this an attempt to sort of prove, even if it's behind an API, that it's being done correctly? Daniel Kang (13:28): Yeah, that's exactly right. That's one of the concrete instantiations of the ZK ML problem. So for example, an ML provider might have some model weights that they want to keep hidden, and so when you send them, let's say, an image or a piece of text, you can then use the techniques from ZK ML, or the API provider can use the techniques from ZK ML, to prove that they ran the model that they said they ran, honestly. Anna Rose (13:55): Why would they not? Like what benefit is there in not doing that, right? Like is there something malicious or something to be gained? Daniel Kang (14:03): Yeah, so I think it depends on who you talk to. Okay, so I guess maybe the most realistic application of this, I think, is that the cloud provider or API provider might have been hacked. Anna Rose (14:14): Okay. Daniel Kang (14:15): So for example, if you're serving medical predictions, there might be some malicious actors who want to, say, serve incorrect medical predictions for very specific reasons, say state actors. And in this circumstance, if they send you the proof, then you know that the model was run correctly. But beyond that, there might be bugs in the serving system. 
So you might trust the model provider to try to do what they say they want to do, but they might have, say, a version mismatch, and they might send you predictions from the wrong version of the model. And finally, if you're dealing with a trustless setting, the model provider might just be lazy. It's quite expensive to run ML models, and if they can just avoid running them, say by running a smaller model, they might do that. Anna Rose (14:59): Like just give you lazy data, like still output something, but not on sort of the sophisticated level that they had promised or something. Daniel Kang (15:08): Yeah. So concretely, if you think about OpenAI, they have a model called GPT-3 and a variety of different forms of it, and they have smaller versions of that model, for example GPT-2, and so they might run GPT-2 on your input if they are, for whatever reason, feeling lazy. And you know, you might trust OpenAI in particular, but if you say you want to outsource your computation to, say, a startup that might not be as proven, then you still might want to have these kinds of guarantees that you get from zero knowledge proofs. Yi Sun (15:38): I think it's particularly challenging in areas where, as a human, you actually can't evaluate the model output. For example, if the model is just classifying images, saying whether something is a cat or a dog, that's fine for humans to post-process and check. Anna Rose (15:55): Yeah. Yi Sun (15:55): But if you're relying on machine learning to do something a bit superhuman, maybe make some prediction, well, the whole point is that you as a human shouldn't really be overriding the model, and so there, you really want the gold standard model, not a cheaper approximation. Anna Rose (16:11): It's funny, Tarun, I'm just thinking back to when we had Florian on and we talked about, I was confused. I think I listened to this recently, but I was confused about what the model means. 
ML in general for me is a really new space, so I'm still getting familiar with the language. But yeah, in this case, this is actually making me understand a little bit better what's happening when you're querying these models. Like, the model is, it's running something new. It's not just like the whole thing is finished behind an API and it's just outputting what you ask for, right? Is it creating a model as you ask that question somehow? Yi Sun (16:49): It kind of depends on what sort of systems you're using. Some models really are just, you can think of it as a function with some parameters. And so when we say that OpenAI trained a model, it just means they ran some, you know, very fancy, computationally intense procedure to determine those parameters, and when you run the model on your input, you give it, let's say, an image and it simply evaluates that function. So from the point of view of zero knowledge, you're just proving the correct execution, with maybe some hidden information, of that function. Anna Rose (17:23): Okay. Tarun (17:24): Yeah. It might be actually worth trying to talk a little bit about some of the problems in the deep fake paper, if we kind of zoom out a little bit. So, you know, there've been a bunch of papers on identifying deep fakes, or proving that an image is authentic versus, like, one that was generated. But you know, one of the problems, I guess, in some of the earlier papers, like, if I remember correctly, the Boneh paper and others, was that they required the image to have public data to generate the proof. And so a lot of what you did was getting around that. 
So maybe let's, maybe we could walk through, like, how you dealt with this idea of, hey, we want to generate the proof without revealing much of the initial input data, and, you know, the idea of image transformations being able to be performed in ZK, because I think that part is the most interesting part from the technical side: people kind of assumed you would always have to be providing the input data, and yeah. Daniel Kang (18:20): So one problem for both attested image edits and also for machine learning is that you might want to hide some parts of the input. So for example, let's say I take a photo of my desk, or let's say I take a photo of the Situation Room from when Obama was overseeing the operations for the Osama bin Laden situation. Then you might want to hide or redact some of that information, and you also might want to edit the photo for clarity, but because some of this information is private, you don't want to reveal the original photo, but you still want to be able to say that these edits happened honestly from this original photo. One way to go about doing this, which is what we introduce in our paper, is to compute a commitment, in our particular case a hash, of the original image. (19:08): If you assume that you have an attested sensor, you can verify that hash, the signed hash from the attested sensor, and then only reveal the hash and the edited image at the end, and so this way you can actually preserve the privacy of the original image. And this general technique can also be used for machine learning as well. So all the work in the ZK ML space previously didn't actually commit to the weights, so they can prove that they ran some computation, but the weights, these things we call weights, are the parameters of the function and are actually critical to what the output is. 
We don't actually get any guarantees that they ran the model that they said they ran, whereas in our work, we also introduce this for the ZK ML space as well, where you can compute a commitment, in our case a hash, of the weights and reveal that, and because the commitment is binding, it ties the API provider to those weights, so you can be assured the correct model was run. Anna Rose (20:07): You just mentioned something, an attested sensor. What is that actually? Daniel Kang (20:12): Yeah. So an attested sensor, the most common form of this is an attested camera, is a hardware device that signs the sensor data, for example the pixels of an image, immediately upon capture. Anna Rose (20:23): Okay. Daniel Kang (20:24): And it signs it with a tamper-proof hardware device. So in particular, the private key is kept hidden on the tamper-proof hardware device, and you destroy the private key immediately after the camera is produced. Anna Rose (20:37): After the picture is produced, you mean? Daniel Kang (20:39): So you destroy the private key immediately after the camera is made. Anna Rose (20:45): Oh wow. Daniel Kang (20:45): So the private key is only ever on this hardware device. Tarun (20:49): Sorry? You mean you destroy it on, like, the external computer; the hardware device still has it to sign with, right? I still need Daniel Kang (20:55): Yes. Tarun (20:56): Yeah. Okay. Daniel Kang (20:56): Yes, that's right. Tarun (20:57): That's like, we still got to be able to, you know, perform that group operation, right? Daniel Kang (21:01): Yeah. Yeah. 
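The commit-and-reveal flow described above can be sketched in a few lines. This is a hypothetical illustration, not code from ZK-IMG: it shows only the hash commitments to the hidden image and weights and a simple redaction edit. The attested camera's signature over the digest, and the zero knowledge proof tying the edit to the committed preimage, are elided.

```python
import hashlib

def commit(data: bytes) -> str:
    # A binding commitment, as in the discussion: a hash of the hidden data.
    # A real attested camera would also sign this digest; signing is elided.
    return hashlib.sha256(data).hexdigest()

# Hypothetical private inputs: original pixels and serialized model weights.
original_image = bytes([10, 200, 33, 47])
model_weights = b"\x01\x02\x03\x04"

image_commitment = commit(original_image)
weight_commitment = commit(model_weights)

# A redaction edit: zero out the first two pixels, then reveal only the
# commitments and the edited image, never the original.
edited_image = bytes([0, 0]) + original_image[2:]
revealed = (image_commitment, weight_commitment, edited_image)

# Binding means the provider cannot later claim different hidden weights:
# any other preimage produces a different digest.
assert commit(model_weights) == weight_commitment
assert commit(b"other weights") != weight_commitment
```

In the actual system, a zero knowledge proof would additionally show that the edit (or the model inference) was correctly applied to the committed preimages, without revealing them.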
So the way this works is that a hardware vendor, like, say, a camera vendor, will produce the private key and put it on the hardware device and then delete the private key from their computer, and so this private key only exists on this signing device. And furthermore, the signing device is protected: essentially, if you try to change the voltage on it, it'll basically destroy the private key, and so that's how you ensure security. Anna Rose (21:28): Interesting. So an attested sensor, on a camera it can be something else, I guess anything that's taking, like, real world input and turning it into something digital. And I guess that's always because you have to be able to, like, you couldn't, you know, use this on any sort of analog photo. It has to somehow be in digital form to start running these mathematical things on it. That's interesting though. I mean, you used the example of a photo, a camera, but are there other sensors that you've considered this for? Daniel Kang (22:00): Yeah, so we chose a camera specifically because you can purchase an attested camera today. Anna Rose (22:04): Okay. Daniel Kang (22:06): I think with the rise of generative AI methods, so for example AI methods like Stable Diffusion and the recently announced audio and video models from Google, it would be great if we move to a world where basically every sensor was attested, so your mic, your camera, your webcam, and even your keyboard. That's the world that I want to move towards, but currently I believe that attested cameras are the most common form of attested sensor. Anna Rose (22:38): So like most devices would not have this feature, like most current digital recording devices, all of them aren't attested. Like, is there no way to somehow timestamp it or somehow prove that it came from that device at that time and that it's accurate? Daniel Kang (22:53): Yeah, that's a great question. And it depends quite a bit on basically the level of security that you want. 
So attested sensors are in some sense the most secure version of this, because you need to have the actual hardware device to produce a signature. There are other forms of technology, for example enclaves, which can help with this problem, but there are some issues with getting basically the sensor data into the enclave securely. And depending on how much you trust the person who's taking the photo, and also basically the software systems of, say, your phone, then you can still do some form of this using enclaves and signing. Anna Rose (23:34): And this work that we're talking about, just to, for the listener, it's the ZK-IMG, right? This is the fighting deep fakes. This is another work that the two of you collaborated on. Were there people also involved in that work, or was it just the two of you? Daniel Kang (23:48): So actually the friend that you mentioned, Tatsunori Hashimoto, was involved as well. Anna Rose (23:52): Okay. Daniel Kang (23:53): And my postdoctoral advisor, Ion Stoica, at Berkeley. Anna Rose (23:56): Cool. Daniel Kang (23:56): The reason all these attested sensors and cameras are especially relevant with the rise of generative AI is, I think, that whether an image or a voice is authentic is now changing as a problem. For example, before, maybe you could classify with a machine learning model whether an image is real, like from an authentic camera, or generated by Stable Diffusion, but as these generative models get better and better, the way they actually work is by being trained to fool a classifier. Anna Rose (24:32): Oh, wow. 
Daniel Kang (24:32): And so it's almost turning from an objective problem to a somewhat subjective problem that we need to use cryptography to encode, you know, somewhat human-level judgments about, and that's why, you know, to get a source of authentic input data for ZK ML, it's kind of important to have these attested inputs. And maybe just to bring us back to the adversarial examples discussion from a while back: the challenge is basically, if you're using a hidden input in any machine learning model where you generate a proof, well, that hidden input might be chosen adversarially, and then you can make the output of your model essentially be whatever you want. (25:17): But because the input is hidden, the downstream user can't tell that you actually ran an adversarial attack, and so that's a bit of a fundamental constraint for any application of ZK ML. Anna Rose (25:29): In this particular example though, you are kind of creating a way to keep the initial source private. Why does it need to be private? Like, with the attested sensor, I mean, you can always use the ZKP to prove provenance or something like that, or non-provenance. Yeah. Why does it matter if the underlying one is private? Daniel Kang (25:49): I think it depends quite a bit on the application. So for the applications that we've been talking about so far, it doesn't really matter that much, but I think with the rise of generative AI, one of the things that's going to become really important is basically authenticated biometrics. Anna Rose (26:06): Ooh. So scary. Daniel Kang (26:10): So one of the ways that you can authenticate yourself today is you can basically send a photo of your face to some service, and then they'll classify you as real or not, and you might have actually used this on several applications. 
So I think that some social media applications ask this for verification, but if you do this, you're basically trusting the service to handle your information correctly. So maybe I don't want to send a photo of my face to Twitter if I want to stay anonymous, and having the pipeline from attested sensor to the ML model could potentially allow privacy preserving biometric identification. Anna Rose (26:50): Hmm. So you could almost use your, like, anon PFP as your KYC face or something. Daniel Kang (26:59): Yeah. That's a future that I want to move towards. I want to be able to preserve privacy while keeping all the great parts of biometric authentication. And that's one of the reasons that we worked on these two papers, which seem quite separate, but they're actually quite related. Yi Sun (27:13): Yeah. Maybe you could prove that your anon anime character was the result of a generative algorithm run on your actual photo. Anna Rose (27:21): Ah, but no one knows what you really look like. Yi Sun (27:24): Yeah. You got to hide your identity. Anna Rose (27:25): Yeah. Tarun (27:26): I mean, it depends on how much you use a stable diffusion thing that keeps a lot of the original data, but then it's your choice, right? You've chosen the particular generative output that you're willing to use. Yi Sun (27:40): And that's why we have to attest to the input. Otherwise you could just put, you know, a celebrity's image as the input. Anna Rose (27:46): Hmm. That's wild. It's funny, like in planning for this, I didn't actually think about the attested sensor, and now I'm sort of fascinated with the fact that that doesn't exist normally and that we have to create the thing that makes it, like, tamper-proof: this is a true documentation of this moment in time by a sensor. Yeah. Tarun (28:07): I mean, you have to understand, embedded systems historically, like cameras and things, are small devices. 
They generally, because they're battery powered, try to avoid any excess hardware surface area, especially for things like TEEs, which are horribly expensive power-wise. So I think on, like, an iPhone, if you remove the TEE you can, like, 2x your battery life, depending on how often you're signing. It's actually, like, comparable to WiFi and Bluetooth. Anna Rose (28:38): Whoa. Tarun (28:38): So a lot of these devices were built, you know, by people whose entire objective, hardware-wise, is not how secure is it. It's like, hey, we need the battery life to be 20 hours, or we need to be able to take 5 million pictures at this resolution. It's not about privacy. You have to actually, like, somehow get people of very different backgrounds to agree on that being, like, a necessary usage of resources, which is, you know, until you had so much stuff like deep fakes or generative pictures and stuff like that, people didn't really care to add that into devices. Anna Rose (29:15): And I guess with just doing it with a private key though, that's not as energy intensive, right? Or is it still, to create this attested sensor? Tarun (29:23): You have to sign on every subset of pixels. Anna Rose (29:25): Oh okay. Tarun (29:25): Yeah. But the point is, like, you have to deem that a value you want in your device and then add it in. Yeah. Anna Rose (29:33): The actual private key creation and all of that, like, that is not novel cryptography, is it? Is this old cryptography that they've just started to add to these devices? Daniel Kang (29:43): Yeah, it's old cryptography. Anna Rose (29:45): Okay. Daniel Kang (29:45): Very, very tried and true. Anna Rose (29:47): Cool. Cool. Tarun (29:47): Yeah. Anna Rose (29:48): Daniel, maybe we can talk a little bit about the merging of these two works. You sort of just hinted at it, that there is a connection point. Daniel Kang (29:56): Yeah. 
Happy to talk about that. So I think that this entire pipeline of going from attested inputs from various sensors to feeding them in as inputs to machine learning models and algorithms is actually going to be really powerful in the future, and I think there's a lot of different kinds of applications. So we talked about one already, which is biometric identification, and one of the things that I've been thinking about quite a bit is this problem of adversarial inputs when it comes to biometric identification. If you take a picture of your face and you, like, draw a box on it, for example, with something that will fool a machine learning model, then as an auditor, the service can just take a look at this picture later and say, like, okay, well, they're clearly trying to fool us, so we should do something about this. (30:41): But if you never reveal the image, then they can't do this. Anna Rose (30:44): Ah. Daniel Kang (30:44): I think one of the things that'll become really interesting is what happens when you combine different kinds of sensors. So we mentioned cameras, we mentioned microphones. One of the other kinds of sensors I want to mention are lidar devices, or depth cameras. So if you imagine your Face ID on, say, an iPhone, it uses a depth camera to do the identification, and if you can combine that with, say, an image and also a clip of your voice, maybe that'll be much more secure and much more difficult to fool than it would be with just a plain image. So the service providers that want to authenticate you will probably have much more trust if they can verify from a variety of different sensors. I think this is one application that's going to be really exciting, but it's also going to be a lot of technical work for this to be enabled. 
Anna Rose (31:33): So this is the idea of kind of creating that trust that what's happening behind the API is still correct through ZKPs, but also combining that with the fact that you can keep that initial image in one of these kind of provenance-proving, I don't know if you really call it provenance, 'cause it's like you're trying to prove that something's a deep fake, or like you're trying to prove that it's the same. Would you call that provenance, actually? Yeah, I guess so. Daniel Kang (32:00): Yeah. Provenance. Anna Rose (32:00): Okay. So, but keeping that original source private, that's the combination that you're talking about here. So one is proving that what's happening under the hood is accurate, and the other is keeping that data private. Daniel Kang (32:11): That's right. Anna Rose (32:12): Interesting. I actually wonder, how do those two things combine? Because one is like behind the API. How do you, are you sending, like, what data would you have to send? Like, say it was actually that, like there's some big model, like there's an API in front of it, you are on the other side as a user, you have the biometric data. How are you getting it in there without revealing it? Daniel Kang (32:36): Oh, yeah, that's right. So I should have actually mentioned we're working in a different setup for biometric identification. Anna Rose (32:42): Okay. Okay. Daniel Kang (32:43): We're actually in a setup where, let's say I want to authenticate myself, I as a user will actually run the machine learning model on my side. Anna Rose (32:50): Oh, I see. Daniel Kang (32:51): And send the result. Anna Rose (32:53): Okay. Daniel Kang (32:53): But then here, let's say the social media website that wants to verify that I'm a real human being wants to know that I ran the model honestly. Anna Rose (33:02): Oh, okay. Okay. Daniel Kang (33:02): But also do it in a privacy preserving way. So it's actually flipped in this situation.
The social media website doesn't trust the person the biometric identification is coming from, so ZKPs are kind of interesting because they can enable both settings, where basically a consumer might not trust an API provider, and on the flip side, a social media website might not trust some random entity that claims they're an actual human being. Tarun (33:29): Yeah. The funny part about this is you say social media, but the types of applications that have been the largest consumer of these very weird liveness KYC tests are actually dating apps. I think they actually force people to do them now, like verify whatever. And there exist startups that literally just pay people, scale.ai style, in the Philippines to, like, you get on a call and you move your face around enough for them to believe that it's you. So this is an alternative version of that. Anna Rose (34:03): Oh Tarun (34:04): I guess it's like anti-catfishing protection Daniel Kang (34:09): I have heard that there's been a lot of scams on dating apps recently. So hopefully in the future you won't have to move your head around to prove that you're a real human being. You can use these attested sensors and zero knowledge proofs. Yi Sun (34:22): I mean, nothing like zero knowledge proofs to set the romantic mood Tarun (34:27): This Valentine's Day, buy your loved one an attested sensor Yi Sun (34:34): An attested PFP, of course (34:37): I do think the fact that sometimes you have to do these proofs on your phone really points to the need for efficiency in the proof generation, because obviously your phone has all these, you know, battery requirements, and probably the memory and CPU are not running as fast as some sort of cloud server, and so with this, and also a lot of the cloud applications, we found performance to be a really big limiting factor in what's viable versus what's not.
Anna Rose (35:06): In that example that you walked through, Daniel, where you'd have the model on your own side, does that already happen? Is that something that can already happen, or is that still in a theoretical stage? Daniel Kang (35:19): I guess one way to say it is that it can hypothetically happen, as in, you could actually run this today. You just need an extremely powerful computer that you can hook up your mobile phone to. Anna Rose (35:30): Okay. So doable, but unlikely to be done. Daniel Kang (35:35): Yeah, and a big part of my upcoming research is to basically bring down these costs. There's a bunch of different techniques, coming from computer systems to cryptography to ML, where I think we can actually bring down the cost by maybe three to five orders of magnitude, and it can actually be executable on your mobile phone. Anna Rose (35:54): Is there not another kind of ZK ML combination, which is using the properties of like compression to make that, maybe it's not ZK specifically, but like, is that one of the techniques? Daniel Kang (36:08): Yeah, yeah. That is one of the techniques we're actively exploring. Anna Rose (36:11): Wow. That's cool. Let's see. How would that work then? You'd be using the ZKP to sort of make the computational need of the model smaller or more compact, and then also using a ZKP to prove its accuracy or something like that? Daniel Kang (36:27): Yeah. If you want to go into the technical details, one of the ways that you can go about doing this is to basically split the computation of, for example, the image edit and the ML model into separate proofs, and then combine them using recursion or proof aggregation, or whatever you want to call it. Anna Rose (36:44): Oh, cool.
Yi Sun (36:45): This ties into a pretty generic technique in ZK. Suppose you want to do a big computation and part of it involves hidden information, but maybe doing the zero knowledge proof on the user side for the whole computation is computationally infeasible. What you can try to do is isolate the part of the computation that has hidden information, do it on the user's laptop, browser, or maybe their phone, and then feed the rest into a big cloud server where you don't really care as much about computational load, and so getting that interface right is pretty challenging, and I think it's a pretty active area of both academic and empirical work. Tarun (37:27): Yeah. Actually, one sort of question I had from your paper is, the generic technique is like, you commit to a hash and you can validate that the hash is correct and that the hash was produced by the same thing that some sequence of transforms were put on, as an alternative to actually having to be fully homomorphic or have stronger properties. One question I have is, how much does that impact the sets of transformations you can do? So in your paper, you did benchmark a bunch of different transformations, at least for images. But let's say I took a more generic view of this. Let's say I just said, you can do any linear algebra operation and compose it. Is there some sort of limitation performance-wise in terms of how much you can compose and generate these kinds of aggregations, versus this kind of commit hash plus sequence of transformations? Does that kind of make sense? It does seem like there's some trade-off surface here, and I'm just trying to understand what it is. Yi Sun (38:29): Yeah, it's definitely pretty generic. When you do the computation in ZK, it's almost divided into exactly the two pieces you mentioned.
So you have some hidden information that's committed to, and the first step is always to decommit that information. For example, if it's a hash, you prove that you know the original image, which was the pre-image of that hash. Then you apply a bunch of operations in ZK, which actually has full knowledge of what the image is. And then at the end you can hash the output to get a commitment to the output. So what happens in the middle is actually very decoupled from the original image commitment, which is used to preserve privacy, and one thing that might not be obvious is that in ZK, the performance hit from hashing is significant. So if you're running something on your laptop, you don't really consider hashing to have any performance impact, whereas in ZK, even the custom-tailored hashes for ZK take a substantial amount of the proving time. Daniel Kang (39:31): Yeah. Concretely, for images that can be like 20x more expensive than the actual transformation itself. Tarun (39:36): Okay. Cool. Yeah. And I guess the question I have then, maybe, is, let's suppose I take a language model, which has some notion of loops and recurrences in the architecture. Does unrolling those change things? If I were to do the same kind of transformation of hashing the initial input and then giving you the sequence of transformations, what's the difference performance-wise, and how should someone think about it? Like, if you were to try to do the same thing for GPT-3, what are the obstacles, or could it actually just work directly? Daniel Kang (40:10): One of the problems with large machine learning models is, well, there's two problems. One is that the weights are very large. So for example, the weights of GPT-3, there's 175 billion of them, and computing a commitment to that is very, very expensive.
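To make the decommit, transform, recommit pipeline Yi describes concrete, here is a minimal Python sketch. It is only an illustration of the data flow, not a proof: the SHA-256 hash stands in for the circuit-friendly commitments (e.g. Poseidon) that real systems use, and the grayscale edit is a stand-in for whatever image transformation is being proven.

```python
import hashlib

def commit(data: bytes) -> str:
    # Plain SHA-256 as a stand-in commitment; in a real circuit this
    # hashing step is itself proven, and is a big share of proving time.
    return hashlib.sha256(data).hexdigest()

def grayscale(pixels):
    # A toy "edit": average each RGB triple down to one channel.
    return [sum(px) // 3 for px in pixels]

# Prover side: only the prover ever sees the raw image.
image = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
c_in = commit(bytes(v for px in image for v in px))  # published input commitment
edited = grayscale(image)                            # transform, in the clear for the prover
c_out = commit(bytes(edited))                        # published output commitment

# The ZK statement would then be: "I know a pre-image of c_in such that
# grayscale(pre-image) hashes to c_out" -- revealing neither image.
```

The verifier only ever sees `c_in` and `c_out`; everything in between is inside the proof.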
The other problem is that there's actually a lot of computation that happens internally within the model, and representing that within ZK is also very expensive. One of the details that I can go into is that basically there's unfortunately two types of computations that happen in machine learning models: these linear algebra operations, and then these things called non-linearities. The basic problem is that the proof systems that are really good for matrix multiplication aren't good at non-linearities, and vice versa. So this is actually a really big challenge when it comes to doing ZK ML, and it's also one of the things we're actively exploring how to mitigate in terms of the performance implications. Tarun (41:04): Actually, maybe a very stupid, naive question, but you know, in the same way that, if we go back to 2016 or 2015, the idea of switching kind of like the smooth non-linear functions like a sigmoidal activation with like a softmax or just max type of, you know, ReLU type of thing. Do you think that there's going to be some tailored non-linearity for generating ZK proofs that actually ends up being adapted to the proof system? Like some piecewise linear thing that's easy to use, so maybe it doesn't get the same performance, but you're trading off proving time versus model accuracy, and you end up changing the architecture to do that. Yi Sun (41:44): There's some challenge to doing that. The reason is that to generate a zero knowledge proof, you have to somehow transform your computation to one where every variable is an integer modulo a very large, cryptographically chosen prime.
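One common workaround for the non-linearity problem Daniel describes is the lookup table: precompute the function on every quantized input once, so that "evaluating" it inside a circuit becomes a table membership check, which lookup-argument systems such as Halo2 support natively. A minimal sketch of that idea, with an illustrative 8-bit quantization:

```python
# Quantize inputs to signed 8-bit integers and tabulate a ReLU once.
# In a circuit, applying the non-linearity is then a lookup (a membership
# check against this table) instead of arithmetic constraints.
TABLE = {x: max(x, 0) for x in range(-128, 128)}

def relu_lookup(x: int) -> int:
    # In a real circuit, x would first be range-checked into [-128, 128).
    return TABLE[x]

assert relu_lookup(-7) == 0
assert relu_lookup(42) == 42
```

The same trick works for any single-input non-linearity (sigmoid, exp, etc.) at the cost of table size growing with the quantization precision.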
But the core problem is that doing arithmetic, say addition or multiplication, over that prime field is not really close to a differentiable operation, and the core premise of machine learning, or at least deep learning, is that your model should be differentiable, or at least a discrete approximation to a differentiable model. So although different non-linearities could have somewhat different costs to implement in ZK, you're always going to have to pay to somehow reconcile this fundamentally non-differentiable prime field object with, in deep learning land, something you're fundamentally using to discretize a real analytic function. Tarun (42:41): That's true. But then I always think about the fact that, like, Nvidia moved everything to 8 bit, and you know, it's not really as smooth as you think it is, and you could argue that the 1980s theory of neural nets got everything wrong because it assumed you had to be extremely smooth everywhere, and the theorem only works in certain limits, right? Whatever. But there's some sense in which maybe there's some extra wiggle room there. That's, I guess, sort of my question: do you think that there is that wiggle room, where you custom design architectures to be proof-friendly, or to have certain properties? Daniel Kang (43:22): I think there's actually two answers to this, but going specifically to your Nvidia point, as it turns out, the weights are often stored in 8 bits, say the floating point 8 (FP8) version, but the activations are often basically blown up to higher precision in the intermediate steps, especially for the non-linearities. For example, if you're doing softmax, there's been a lot of work showing that the softmax is very numerically unstable, so you need higher precision in the intermediates.
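The differentiability mismatch Yi points to is easy to see in a toy example: field arithmetic is exact, but it wraps around the modulus, so a function that is smooth and monotone over the reals can jump arbitrarily over the field. A small sketch with an illustrative tiny prime (real systems use primes around 254 bits):

```python
P = 97  # toy stand-in for the large cryptographically chosen primes

def fmul(a: int, b: int) -> int:
    # Field multiplication: exact integer arithmetic, modulo the prime.
    return (a * b) % P

# Squaring is monotone on positive reals, but not over the field:
# a tiny change in the input produces a huge jump in the output.
assert fmul(9, 9) == 81
assert fmul(10, 10) == 3  # 100 mod 97 wraps around
```

Fixed-point encodings (scale a real by 2^k, store the integer in the field) bridge the gap, but they only work if range checks keep every intermediate value well away from the wraparound point.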
I actually do think that there are potentially ways to basically bridge the gap between differentiable ML and ZKPs. But one of the challenges is that if you want to deploy this, you're going to have to convince practitioners to use a different non-linearity, and machine learning is somewhat of a, like, it's kind of black magic, and convincing people to use a different non-linearity that hasn't been tested over the course of like 10 years is going to be a real challenge. There's basically like five or six that people use, and making them switch is going to be quite challenging. Yi Sun (44:22): On the bright side, I would say some of these quantization techniques are already the focus of a lot of work for people doing machine learning on edge devices, basically to save power on your cell phone, and it turns out that, very roughly speaking, the difficulty of implementing inference in ZK can be proxied in some way by how much battery power a model actually takes. So people have been working on quantizing these models and reducing the compute, and we can actually leverage a lot of that work to pick the best model to put in ZK. Anna Rose (44:56): I actually wanted to ask you a little bit about, I know you have new work, and you sort of mentioned that at this point to actually run these things locally would be kind of impossible, but the other two works, are there tools that exist today that can actually do that? Or are those also theoretical, given the tooling that exists? Yeah, could we, for example, prove that a larger service is actually querying the model correctly behind an API? Daniel Kang (45:24): So we can do that for certain kinds of models.
The models that we can do this on today are models that people actually use in practice, but they're not, for example, GPT-3, and we're actively working to bridge that gap as well. So for example, for small classification models, we can just run this today, and that's what we showed in our initial paper on ZK ML. Anna Rose (45:45): And what kind of tools do you use, though, in terms of the ZK tools? Are they the tools that we, you know, know well on this show? Like, is it the Circoms and the Halos, or is it something else? Daniel Kang (45:57): Yeah, so the proving stack that we used was the Halo2 proving stack. Anna Rose (46:03): Cool. Daniel Kang (46:04): It supports lookup tables, which are really helpful for the non-linearities that we discussed a few minutes ago. Yi Sun (46:09): One thing that we realized is that these generic proving tools that weren't really designed with machine learning or neural networks in mind have actually reached a point where you can kind of adapt them and actually get to some scale for neural networks. So a lot of prior work in ZK ML would often handcraft a proof system for, you know, matrix multiplication or other basic operations, and obviously that gives you a huge advantage, but what we realized is that using something actually put into production, like Halo2, gives you access to better tooling and sort of real world implementations that, you know, the ZKVM teams and other ZK teams are trying to optimize for performance, and somehow empirically we found that that makes up for the trade-off. So to add a bit of context to what Daniel said on the image classification, we were able to scale some things up to a model that can handle a dataset called ImageNet.
So this is probably the most cited dataset in all of machine learning, and it's essentially 224 by 224 size images of different real world objects, and for context, all prior work was dealing with models that worked on much smaller images, things like a dataset called CIFAR-10, which is 32 by 32. So although 224 by 224 is still not where we really want to be, it's a significant step up in scale that really was enabled by this more mature engineering for Halo2. Tarun (47:45): You know, you brought up this point about, instead of actually evaluating a function, you just store pre-computed lookup values, and it sounds like you're basically doing round to nearest neighbor and then looking up the function value. How long do you think it'll take for a DSL like Circom to actually have those features? Like, you write a non-linear function, but it converts it to a polynomial, you know, rational approximation or something. Like, if you look at a lot of math libraries in the rest of numerical computation, there's a ton of stuff that's hidden from the end user, which does all this type of stuff, does some of the pre-computation, does some of these polynomial approximation type of things. Do you think it will get to a world where the machine learning applications drive the library design, you know, kind of like you have with PyTorch and TensorFlow? Or do you think that it's still too abstract for that? Yi Sun (48:42): That's actually something we're thinking about at Axiom.
We're targeting the on-chain user, and, you know, although maybe we could offer neural networks, really what people want is math functions like exponentiation or squaring, and we're trying to write exactly fixed-point mathematical libraries to do this sort of thing, and I think it's still a ways away from a point where the ordinary smart contract developer is going to write, you know, X times Y, and then under the hood it does some crazy quadrature thing to develop a lookup table for that, but I think in maybe one to two years we will reach that point. Daniel Kang (49:22): Yeah. And for ML specifically, I'm actually working on open sourcing the library that we built for basically taking models in TensorFlow and turning them into zero knowledge proofs, and we have basically some of the techniques that you described in this library. Tarun (49:38): The reason I ask this is, when you think about how programming languages evolve, sometimes it's the first applications that are built around a programming language that dictate everything that's in it. Like in Solidity, right? Why are address maps the primary data structure? Well, that just happened to be a historical vestige of the first applications being tokens, in some ways, that gets all the special priority in the compiler and stuff like that. And so I'm just kind of curious, the reason I was asking this was more like, hey, if ZK is moving in a particular future, which application dictates how the languages get developed? And it sounds like to some extent it might be these math functions first, rather than other data structures that people use. Yi Sun (50:25): Yeah, definitely.
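The fixed-point math libraries Yi mentions boil down to representing real numbers as scaled integers so that everything stays in exact integer arithmetic. As a hedged illustration (not Axiom's actual library), here is e^x computed from its Taylor series entirely in Q16.16 fixed point:

```python
SCALE = 2**16  # Q16.16 fixed point: the value x is stored as the integer x * SCALE

def fx_mul(a: int, b: int) -> int:
    # Multiply two fixed-point values, rescaling once to stay in Q16.16.
    return (a * b) // SCALE

def fx_exp(x: int, terms: int = 20) -> int:
    # e^x via the Taylor series sum of x^k / k!, all in integer arithmetic.
    result, term = SCALE, SCALE  # both start at fixed-point 1.0
    for k in range(1, terms):
        term = fx_mul(term, x) // k
        result += term
    return result

e_approx = fx_exp(SCALE) / SCALE  # e^1
assert abs(e_approx - 2.71828) < 1e-3
```

Division and truncation here are the places where a circuit version would need explicit range checks, which is part of why "just exponentiate" is still hard on chain.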
In Circom's case, it's heavily shaped by the fact that it was developed to write the Hermez zk-rollup, and you can even tell by the functions in circomlib: there's a lot of very specific elliptic curve verification, a lot of very specific indexing into arrays, and there's definitely no math. You can add some numbers, multiply some numbers, and that's it. I think with the newer proof systems like Halo2, the user base is a bit more diverse, and so we're now seeing many different groups develop libraries on top of Halo2 for their specific application, and I think it's exciting to see the proof system be flexible enough to really handle all of these. Anna Rose (51:10): It does make me think that there might be something coming down the pipeline, some sort of Halo2 equivalent that is maybe built more with this in mind. Interesting. I mean, there's Plonky2, but I don't know if that is more suited for that or less. Yi Sun (51:26): I would say at the proof system level, Plonky2 and Halo2 are pretty close in adaptability to machine learning, but I think these proof systems that are more tailor-made for operations like matrix multiplication are going to make a resurgence, now that there's a bit of production interest in machine learning. There's more of an incentive to develop the tooling around those systems and maybe make a more fair comparison between them and, you know, PLONK-based systems like Halo2 or Plonky2. You definitely don't want to be winning just on engineering. Anna Rose (52:00): Are you thinking about stuff like Spartan? Because I think that's only MSM now, right? Yi Sun (52:05): Yes, stuff like Spartan, as well as some of these more sum-check related systems that actually have matrix multiplication as a primitive operation. Obviously, that's not generically useful unless you really want to do some sort of, you know, machine learning or numerical type algorithm. Anna Rose (52:22): Cool. So what else do we see kind of in the future of ZK or ZK ML?
What else are you guys thinking about? Yi Sun (52:29): Yeah, so at Axiom, we're talking with a lot of smart contract application teams about their on-chain needs, and so one obstacle to actually deploying something like a big neural network on chain is that if you want to prove that you correctly did inference on an image, you have to be able to access that image on chain, and if you've written any Solidity before, you know that accessing Ethereum state is the most expensive operation of all on chain. And so we think the models that are going to make sense on chain going forward are much simpler ones, perhaps not at the scale of ImageNet, but much closer to things like linear regression or just traditional ML things, like ranking algorithms or PCA. And so the space is still pretty early for that. I think developers haven't really realized that's even possible, but we're pretty excited to explore what people come up with. Anna Rose (53:24): What era, like if you were to look back in the history of ML, what year are we at on the blockchain, or in ZK ML? Is it like 1986 or something? What is it? Yi Sun (53:37): Maybe better than that. Maybe 2001. Anna Rose (53:39): Ooh. Yi Sun (53:40): There's a standard book, you know, The Elements of Statistical Learning, that prior to deep learning was the canonical reference for what you would learn if you wanted to do applied machine learning, and I think many of those algorithms can be applied on chain today with the scale of data that is available. Anna Rose (54:00): Cool. Yi Sun (54:00): And so we just need developers to actually want to put them into their applications. Tarun (54:04): We don't have the Hastie book yet, though, for this. We still kind of need that book, and then the online course that maps to it. I don't think the Berkeley ones come anywhere close to the influence that that Trevor Hastie and Tibshirani book had.
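The simple on-chain models Yi describes, things like linear regression, come down to a dot product in integer arithmetic, since the EVM has no floats. A hedged sketch of what such inference looks like, with an entirely hypothetical fitted model and a conventional 10^6 scale factor:

```python
SCALE = 10**6  # fixed-point scale: the value v is stored as the integer v * SCALE

def predict(weights, bias, features):
    # Linear model inference: dot product plus bias, computed in SCALE^2
    # units and rescaled back down exactly once at the end.
    acc = bias * SCALE
    for w, x in zip(weights, features):
        acc += w * x
    return acc // SCALE

# Hypothetical fitted model: y = 2.5 * x + 10, everything pre-scaled.
w = [2_500_000]  # 2.5
b = 10_000_000   # 10.0
x = [4_000_000]  # 4.0
assert predict(w, b, x) == 20_000_000  # 20.0
```

The same shape of computation, integer weights and a single rescale, is what both a Solidity implementation and a ZK circuit for these "2001-era" models would look like.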
Yi Sun (54:26): Yeah, I mean, it's still early days. You know, to even implement the logarithm or exponential on chain is a tough task these days, and to do it in ZK is also pretty challenging. So we have a long way to go before we're all doing, you know, logistic regression. Tarun (54:42): Maybe it might be good to kind of conclude by talking about other applications, maybe outside of just some of the work you've done so far, and sort of imagine what happens if we zip forward to 2012 in ML, when, you know, sort of ImageNet was created and we kind of reached that nexus. What does the world look like? What are the applications that exist? Daniel Kang (55:08): Yeah, I think if you look at the history of ML, 2012 was sort of a seismic shift, so it's really difficult to guess what will happen next, but I do think that there are some crazy things that you could imagine. So one thing that I've just been toying around with is that you might imagine taking this idea of biometric identification and literally just putting it on chain. What you can do is you can basically put a hash of, say, some information about your face so that people can identify it, and then use that to authenticate a smart contract if you have some really sensitive operation that you want to do. But beyond that, I think there's also a bunch of interesting applications. So for example, you might want to have a data marketplace where you sell private data to different customers. (55:50): But if you want to prove how valuable this data is, today you need to show them the data, which sort of defeats the purpose of selling the data. They can just take it and run. And similarly, I think you can also do things like prompt engineering marketplaces. So if you have some really cool prompt for Stable Diffusion, you can prove that you've run Stable Diffusion on that prompt, and then do something with that as well.
We're very early days, so these are just things I've been toying around with. I'm sure there'll be plenty of other ones. I'm really excited to see what people come up with. Anna Rose (56:21): Cool. All right. I want to say a big thank you to both of you for coming on the show. Thanks for sharing with us sort of your histories, and also all the work you've been doing on ZK ML and what might be coming up in the future. Yi Sun (56:34): Yeah, thanks for having us on. Daniel Kang (56:34): Thanks for having us on. Anna Rose (56:35): I want to say thank you to the ZK team, Henrik, Rachel, Adam and Tanya. And to our listeners, thanks for listening.