[00:00:05]: Anna Rose: Welcome to Zero Knowledge. I'm your host, Anna Rose. In this podcast, we will be exploring the latest in zero-knowledge research and the decentralized web, as well as new paradigms that promise to change the way we interact and transact online. This week, Guillermo and I chat with Ying Tong and Bryan Gillespie. We get to know both of these researchers and what brought them to work on ZK specifically. We talk with Ying Tong about her work on Zcash, working on the Halo2 system, and some of her subsequent contributions at Geometry Research. We learn from Bryan about his journey to working on ZK, also via Zcash and his work today at Inversed Tech. We then dive into their recent work, which examines programmable privacy in distributed systems. We tease out the classifications and framework that they introduce to better evaluate and compare different applications that offer limit order auctions using ZK or related technologies. Now, before we kick off, I just want to let you know about our upcoming event, ZK Hack Montreal, happening August 9th through 11th. In case you don't know, ZK Hack is another project that I'm involved in. It's a hub for ZK learning, which is very much community driven, and as part of this we produce virtual as well as IRL ZK-focused events. The next one of these events is ZK Hack Montreal, bringing our IRL hackathon series to my hometown. Now, ZK Hack Montreal will be happening as mentioned, August 9th through 11th, and it promises to be a very special one, being the first time we host our hackathon series in North America. Some of you may be attending SBC the week leading up to the ZK Hack weekend. Luckily, it's just an hour flight from New York to Montreal, so you can easily do both events. Plus, you get to hang out in beautiful Montreal for the weekend and hack on ZK tools, meet experts in the field, and maybe meet your future collaborators or co-founders at ZK Hack Montreal. You can apply and also find out any updates about partners and judges over on zkmontreal.com. I've added the link in the show notes. So yeah, we hope to see you there. Now Tanya will share a little bit about this week's sponsors. [00:02:17]: Tanya: Aleo is a new Layer 1 blockchain that achieves the programmability of Ethereum, the privacy of Zcash, and the scalability of a rollup. Driven by a mission for a truly secure Internet, Aleo has interwoven zero-knowledge proofs into every facet of their stack, resulting in a vertically integrated Layer 1 blockchain that's unparalleled in its approach. Aleo is ZK by design. Dive into their programming language, Leo, and see what permissionless development looks like, offering boundless opportunities for developers and innovators to build ZK apps. This is an invitation to be part of a transformational ZK journey. Dive deeper and discover more about Aleo at aleo.org. And now here's our episode. [00:03:00]: Anna Rose: Today. Guillermo and I are here with Ying Tong from Geometry Research and Bryan Gillespie from Inversed Tech. Welcome both of you to the show. [00:03:09]: Ying Tong: Hi. I'm so glad to be on ZK podcast. [00:03:12]: Bryan Gillespie: Yeah, it's awesome to be here. [00:03:14]: Anna Rose: Hey, Guillermo. [00:03:14]: Guillermo Angeris: What's up? [00:03:16]: Anna Rose: So, Ying Tong, I feel like this episode came about because... I mean, we started talking about having you on. I've known you for a few years, but I'm really curious to find out from you what got you excited about cryptography in the first place. What was the thing that brought you to work on these kinds of problems? [00:03:33]: Ying Tong: Well, I was trained as a physicist, and I really got into cryptography via the Ethereum project and just the shared hallucination of a world computer. And when I first joined the Ethereum Foundation, I worked with Barry Whitehat on implementing, I think, one of the first zkRollups. [00:03:57]: Anna Rose: Oh, cool. [00:03:58]: Ying Tong: From there, that was the gateway drug really into zero knowledge. And when the Zcash team published their findings about Halo and recursive proofs without succinctness, that got me really excited about the possibilities of zero knowledge. I joined Zcash, worked on Halo2. Really, a lot of the workshops and educational outreach was centered around Halo2, and that was really fun, really productive. And I'm continuing that journey now at Geometry Research, as well as continuing to work on these problems of programmable privacy, the design space that Zcash remains very interested in. [00:04:51]: Anna Rose: I had not realized that you had actually first started at Ethereum and then worked at the ECC. I remember meeting you, I think it was for a ZK Summit, you gave a talk, and at the time, you were more part of the Zcash world. Going back to that early experience at Ethereum, though, how did you connect with those folks coming from physics? [00:05:15]: Guillermo Angeris: That was about to be my question. I was like a fellow physicist. How did you come to the dark side? Or the light side, depending on how you look at it, I guess. [00:05:23]: Ying Tong: Well, the Ethereum Foundation had a headquarters in Singapore where I was studying, and they came around my school recruiting not just computer scientists, as you would expect, but also physicists. They explicitly stated that they wanted physicists. And I think a lot of the physics training is not directly transferable. However, their reasoning skills and the ability to pick up new concepts with a high level of rigor, I think were very valuable and helped me a lot in learning cryptography and zero knowledge. [00:06:04]: Anna Rose: Nice. That's so cool. I want to come back to, for example, the Halo2 workshops and stuff, but I want to first ask Bryan to share a little bit about his background. Tell us a little bit about what you've been working on and why you got excited about this. [00:06:20]: Bryan Gillespie: Yeah, absolutely. So I'm coming from the other side of theoretical academic backgrounds. I have a mathematics PhD and eventually decided that cryptography was the best thing ever. [00:06:32]: Anna Rose: Cool. [00:06:32]: Bryan Gillespie: I, too, have a connection with the Zcash side of things. The blog posts by Ariel Gabizon and others on the SNARK technology in Zcash was one of the first things that inspired me to start looking in this direction. Kind of the first and second year of the pandemic I realized that this was just the coolest thing ever, and so I took the time to pick it up and learn some of these things. Taught a one semester course, upper division undergrad at the university near me, and then I got a job through Twitter. [00:07:07]: Anna Rose: Through Twitter. Nice. What kind of math were you doing before, though? [00:07:11]: Bryan Gillespie: So my PhD was in the area of Algebraic Combinatorics. This is looking at kind of discrete mathematical objects, objects that you can enumerate and count, and trying to understand them by looking at algebraic constructions like polynomial spaces and more complicated things you can say about them. And so it actually really relates a lot to the constructions that you run into in ZK technology. I'm finding that actually my background really relates fairly well to the sorts of things that we do when we're studying zk-SNARKs or multiparty computation, or other types of cryptography that you're encoding data into these mathematical forms. [00:07:53]: Anna Rose: So cool. [00:07:55]: Guillermo Angeris: So, a random quick question. So just to get a little bit more sense, algebraic combinatorics would be things like generating functions or generating polynomials would be one object that you might study in this particular field, I assume. [00:08:08]: Bryan Gillespie: Yes, absolutely. And the idea is to use heavier algebraic constructions, like people talk about algebraic geometry as a basis for cryptography. You can also use algebraic geometry to understand these other discrete objects in a really kind of deep and useful way. [00:08:27]: Anna Rose: What was the first company or project that you got involved in when you jumped in? [00:08:31]: Bryan Gillespie: My current job is actually my first job in the space. [00:08:33]: Anna Rose: Oh, nice. [00:08:34]: Bryan Gillespie: Aside from that, course that I mentioned earlier, and I taught at university. So Inversed Tech, it's a relatively recent cryptography consulting upstart. And I just reached out on Twitter and said, hey, just wanted in case there's anybody out there looking for somebody with my background to say that I'm available and looking for work. And my boss, Daniel, reached out and we got in touch. [00:09:01]: Anna Rose: Nice. [00:09:02]: Guillermo Angeris: Incredible. Don't just let your dreams be memes. It's very important that you really can keep posting through it. I love it. An inspiration for us all. [00:09:10]: Bryan Gillespie: Absolutely. [00:09:10]: Guillermo Angeris: That's amazing. [00:09:11]: Bryan Gillespie: Yeah, I think this was before Twitter became X, but the point remains. [00:09:15]: Guillermo Angeris: It's still Twitter in our hearts, I think. [00:09:17]: Anna Rose: Yeah. So this is Daniel, who was also one of the people who kind of started the ZK proofs event series. He used to be part of a group called QEDIT, but, yeah, so, I mean, we've known Daniel for a long time. That's really cool that you landed there. What kind of work do you do at Inversed Tech? You sort of mentioned it's consulting, but is there any particular problem spaces you're tackling? [00:09:40]: Bryan Gillespie: It's a really wide variety of different specific types of work, but broadly, it's in the privacy space. We're really interested in contributing to open web technologies that present opportunities for people to maintain their personal privacy. So this research project that we're working on for programmable privacy is part of that. And helping out with other technologies that are trying to provide people with the opportunities to stay private and keep their data out of the panopticons out there is really like a big part of what we care about. [00:10:14]: Anna Rose: Cool. All right, I want to now circle back to you, Ying Tong, and talk a bit about those Halo2 workshops that you mentioned. Because to me, when you were hosting those, I think you were doing it with the 0xParc folks, right? I think your workshops influenced a whole batch of projects who have since at least started building on Halo2 after learning it from you. I just want to go back to how that workshop idea even came about, what it takes to put something like that together. Like you were part of the team that developed it, so I imagine you have... you're in it, like you're already deep in it. But yeah, what did you have to do to that material to turn it into a workshop? [00:10:54]: Ying Tong: Yeah, I think 0xParc has this incredible energy for finding talent and making these resources available to them. And really, I was just one person in a really motivated team. So, yeah, what it took really was just reaching out to people. A lot of them came from universities, a lot of them came from the open source projects that we're already familiar with. And just a lot of like random hackers as well, preparing the resources and just getting the people together to show up every week, be it virtually or at a physical location. We had a few pop ups in New York City as well as Boston and San Francisco, and it was fueled by this enthusiasm and this sort of eagerness and willingness to learn, to explore, to build together. So at that time, yeah, Halo2 was perhaps one of the more advanced ZK proofing stacks out there. And since then we've had lots of variations on the original Halo2. We've had significant forks that are popularly adopted, and we've had completely new proof systems become open source and available, such as Plonky 3, for example. So, yeah, I do hope that this culture of learning and sharing continues to grow, and I'm very excited for the next generation of proof systems. And I'm excited to see, for example, client-side proofing become a reality. Definitely this tradition of building in the open and sharing information and resources with each other is something very precious in this space, and I look forward to seeing it continue. [00:13:08]: Anna Rose: That's so interesting. Yeah. Just as an aside, I just today actually met with Zac from Aztec, and we were talking about that idea of the building in that open way. And I was realizing that we live, like we're kind of in this very special moment in ZK where there's well-funded efforts to build in a certain direction. And then because there's not an air of scarcity in terms of just people feeling funded and they're not scrounging for being able to live, they actually have the space to live, and then they can just do the research they want to do, you've had this acceleration on the ZK front. But it's not replicated in other industries. It's not even necessarily replicated in other parts of cryptography or math. It's just this really cool moment where people have been able to drive it really quickly. But I like that because I think it shows something. It's kind of an example, potentially to other fields that there is this way that you can accelerate things that you think are impossible much, much faster than you would have imagined. And it's not just by having big companies do it, which I think in the past, you kind of required the capital-intensive large corporation to make something like this happen. Here it's happening from lots of different companies, lots of different individual contributors, people also of all ages, who are kind of empowered to do what they want. Yeah, it's really cool. [00:14:32]: Ying Tong: Absolutely. I think it's a competitive advantage to build in the open, and to allow easy adoption of your libraries and your products. And there's so many benefits. For example, more eyes on your code means more bugs will likely be surfaced. That's one of the top advantages of making your code open source. I think as the industry matures, we're moving more and more towards standardization. And having your library accessible means that you're part of the standardization conversation. It means that your tools are not going to be just obsoleted with the next wave. So building in the open, building collaboratively, is such a competitive advantage. I think it just makes sense to do it that way. [00:15:23]: Anna Rose: I do think there's one other piece, and we've brought this up a couple times on the show, but this is about patenting things and locking them down. There are even adjacent fields where you see it's just more of an effort to patent early. And I feel like history sort of shows that a lot of times in fields like this, if you patent early, all it does is slows down that part of the research. And ZK, luckily, has sort of accelerated past any previous patent efforts and other patents, like... what was it like? Fiat- Shamir or... [00:15:58]: Ying Tong: Schnorr. [00:16:00]: Anna Rose: Or Schnorr. Like the patent expired, and all of a sudden, you could use it. And it just like you saw this boom happen. I think seeing that happen play out, should be an example to all practitioners in these adjacent fields, that maybe be careful with those heavy patents. Even if your company is locked down, people may not be able to use your tech in the long run. It's a topic that I'm seeing more and more clearly kind of come into view. And I feel like ZK is the counterexample of like this is what happens if you don't immediately try to lock this down early, but you actually leave it for a larger community to play with. Before we move on from the Halo2 work, I have sort of a dumb question that I'm pretty sure has been answered at some point on this show, which is, is Halo2 using FRI? Is there a connection between FRI and Halo2? [00:16:48]: Ying Tong: The original Halo2 uses the IPA, polynomial commitment scheme. So that's Zcash Halo2. Whereas there have been sort of forks of Halo2 that have adapted the front end to work with FRI on the backend, to work with small field FRI in particular. So that's one of the advantages of having a modular proving system, is that you can swap out the backend, and you can, for example, put in different lookup arguments. You can put in different fields, different polynomial commitments, but the original Halo2 used the IPA polynomial commitment. [00:17:35]: Anna Rose: Okay, this might be why, actually, I didn't know the answer to that, is like I probably have asked, maybe back then it didn't use FRI. But then I've heard it did use FRI. Okay, this is exactly why. I don't know if you're still following that, but did you see the STIR work? [00:17:52]: Ying Tong: Yes. [00:17:53]: Anna Rose: Yeah. I'm just curious if you have any thoughts on it, given that it is sort of interrelated with some of the Halo stuff. [00:17:58]: Ying Tong: I've heard that STIR can just work as a drop-in replacement of FRI. I think it would be really a great improvement if someone were to implement that. I will say one caveat. It's also why standardization is so important. So when composing different elements from across the proving stack, we have to make sure that certain security invariants are preserved. For example, whether or not your commitment scheme preserves zero knowledge. So these questions are what standardization efforts aim to guide and aim to provide a template for. But in general, it's been inspiring to see just the amount of experimentation that's been done on Halo2. Yeah, there's really any number of backends that have been swapped in in Halo2, even multilinear backends. [00:19:04]: Anna Rose: What was the system that you feel captured mindshare after, just in terms of the chronology of it, just timing-wise. [00:19:13]: Ying Tong: What's caught my eye is Plonky2, Plonky3. [00:19:17]: Anna Rose: Okay. That group. Yeah. [00:19:19]: Ying Tong: Yeah. For example, I think SP1 and Valida both use Plonky2 or Plonky3. I think, really, the speed-ups from the small field optimizations are very compelling. I do think that Halo2 belongs to an earlier generation of proof systems. I would like to see how Halo2 can be adapted to newer proof systems that are more efficient. [00:19:52]: Guillermo Angeris: By the way, very dumb question. When you say small field FRI, what does that mean? Is it a small prime that's below to the 32 or something? Or is it some binary field? What is a small field? I guess in this case, just for the people who have very little idea what ZK is. [00:20:09]: Ying Tong: The small field really can go down to 32 bits. That's the Baby Bear field that RISC Zero uses, for example. And it's called a small field because we're used to working with 128, 256, or 381 bit fields in cryptography. Fields over which the Schwartz–Zippel lemma is sort of very secure. Whereas small field FRI pushes the boundaries of that. And the reason they're able to get away with that, is that they don't rely on discrete log hardness. They don't rely on having access to a cryptographic group, and because they just use simple post-quantum primitives, namely hashing, they can get away with very small fields that are efficient on most computer architectures. [00:21:09]: Anna Rose: So I want to shift gears a little bit and start talking about a recent work that you released. It was focused on programmable privacy in distributed systems. This is also partly where this episode comes about. Ying Tong, when I reached out to you, you were like, let's talk about this work and we're bringing you on, Bryan, to kind of cover this? Yeah, I want to sort of dive into what this is. To me, it reads a little like us. I mean, it's a really good explanation of a lot of the ways of thinking about different systems, the choices that you're making in order to actually evaluate various projects, like companies using variations on some of these systems. I think of it as a bit of a survey. So, yeah, tell me a little bit about this work. What was the motivation? Where did the idea come from? And then we can dig into some of the topics that you cover. [00:21:59]: Ying Tong: So this SoK originated from the Zcash Foundation, wanting to explore options for introducing programmability onto Zcash. And really this survey was to get us up to speed with the state-of-the-art in the industry and also to make recommendations for the path forward for Zcash. [00:22:25]: Anna Rose: Got it. [00:22:26]: Ying Tong: So in my mind, there's kind of three flavors of programmable privacy that are sort of dominant in people's minds. The first one is this idea of selective disclosure, meaning that you can program what you reveal and to whom. So sort of the logical conclusion of selective disclosure would be witness encryption. Like, you specify exactly what condition someone must fulfill in order to decrypt some data. And this metaphor of selective disclosure to me, is closely related to purpose limitation and access control, which might be more familiar metaphors outside this industry. And the second kind of programmable privacy is you're programming the attributes that you verify about private data. So, for example, in Zcash, we want to verify ownership of a note, we want to verify that the note hasn't been spent before, we want to verify some range checks on the value of the note. And Aleo or ZEXE generalizes this statement to basically any NP relation. So programmability, in the sense that you can verify whatever you want about private data. And this is also the notion that's used in Aztec and Mina, for example. [00:24:07]: Anna Rose: Is that still in the second category? You have the first one with selective disclosure. The second one is this, like attributes, but like, to me... [00:24:17]: Ying Tong: It's like programmable verifiability. [00:24:19]: Anna Rose: Okay. Programmable verifiability. So you are putting Aztec, Aleo, all in the second category still? [00:24:25]: Ying Tong: I am. [00:24:26]: Anna Rose: Okay. Okay. [00:24:26]: Ying Tong: And I'm also putting all the zkRollups in there as well. [00:24:29]: Anna Rose: Oh, wow. So what's the third? [00:24:32]: Ying Tong: The third one is programmable computation over private data. So you think like homomorphic encryption, FHE, MPC, that you have some structured encoding of your data, on which you can actually perform any arbitrary computation. And this third category was actually the focus of our paper. We call it the Mediated computation phase. You'll hear more about it from Bryan later. But yeah, these are the three metaphors for programmable privacy that are most common in the industry. [00:25:14]: Anna Rose: I think it's really good that you just teased out those three too, because programmable privacy, yeah, I've heard that referred to lots of different kinds, and I think it's cool to see them really outlined here. You focused on this programmable computation over private data, and you sort of mentioned some of the other systems like MPC, FHE that are used for this, but I know that in the work you also do highlight ZK projects. So I guess there is a way to do this even in ZK systems. Is it more complicated? I know we're going to probably get to that, but does ZK actually offer an advantage or disadvantage when you're trying to tackle this problem? [00:25:56]: Bryan Gillespie: So, I think that the real bottom line is that ZK is a powerful toolbox for doing a certain type of computation, which is private off-chain computation, where you have access personally to the private data that you're computing over. So in some sense, the ZK class of protocols allows generic computation over your own private data. But what we found is that there's kind of a different class of cryptographic techniques. We mentioned FHE and secure multiparty computation and different approaches. [00:26:32]: Anna Rose: Maybe TEEs go in there too. [00:26:34]: Bryan Gillespie: Maybe even TEEs. Yes, yes. A little bit of a complicated debate over their usage, but there is a usage for them, but which allows computation over private data from multiple parties. So the real distinction is whether you have the ability to do this computation in such a way that the compute nodes themselves can't see the private data, which is very relevant in a lot of DeFi sorts of applications in particular, where you need to kind of combine these pieces in a synchronous way from multiple parties. [00:27:06]: Anna Rose: When you say off-chain, do you mean client-side? Is this like the proof and all of the privacy is remaining in the hands of the individual? Or do you mean it's in the hands of this off-chain system, but it could still be shared. [00:27:22]: Bryan Gillespie: It's kind of both, actually. zkSNARKs, I view them as kind of a glue between different regimes of trust. So one computational engine can do computations, and in that computational engine, you know certain things, and then you want to relay that to a different computational engine, say the on-chain blockchain virtual machine or something like that. The SNARK proofs allow you to demonstrate that the computation was done correctly without revealing any more than you need to. It's a connector between these different computational engines that allows you to separate out the different privacy regimes that are important. [00:28:00]: Anna Rose: So this work in particular, when I started to read through it, your focus was on limit order auctions, or DEXes more broadly. I don't know if you know this, but Guillermo, who's here with me, when we met, it was in 2021, and Guillermo, Alex, and Tarun had done some work, kind of almost like debunking that you could even have a private DEX. But now, in your work, you definitely showcased that there are efforts to do that. Guillermo, I don't know if you want to weigh in. [00:28:29]: Guillermo Angeris: Yeah, I mean, that was... The original version of it was just... I remember people being like, oh, just take a DEX and just put privacy on it. Like the Elizabeth Holmes do a chemistry version. That was cool, but it just totally doesn't work, because you can just back out from really basic information what the underlying thing was. Of course, in that work, we suggested some random things, like you should batch together trades, and that will give you some guarantees, depends on the size of the trades, blah, blah. And then we have a different work suggesting even crazier method, which cuts up a trade and then does a bunch of stuff and blends it together using some PRNG, and then you have some differential privacy. Which is like, that's like super... [00:29:09]: Anna Rose: Yeah. That was the differential privacy was sort of your... [00:29:12]: Guillermo Angeris: So that one gets quite deep into... Like very lost in the sauce in that case. But it's much... It's if you don't have this nice batching, you have a little bit more annoying type of thing, like how do you do it? And then Penumbra was also similarly, I think... After this work was Henry realizing that you can use kind of these, I believe it was some partially homomorphic encryption to kind of construct these batches automatically. And so you really can get private batch kind of exchanges. But it has weird properties, and so we have another paper analyzing what that means for arbitrage. But anyways, the point is, yes, it's very funny. This is actually how we met, and I feel like we've come full circle on this one. This is like a callback, you know? [00:29:57]: Ying Tong: Yeah. We saw this batch auction design in Penumbra, as you mentioned, and interestingly enough, also in Secret Network. Basically any implementation of an AMM where the prices are public. So in Secret Network, if you're trading on an AMM and your trade is the only transaction in a block, then it'll be super obvious exactly what the amounts of your trade were. [00:30:26]: Anna Rose: Oh yeah. [00:30:26]: Guillermo Angeris: Yep. [00:30:27]: Ying Tong: And yeah, it was interesting to me that it's the same batch anonymity set guarantee as Penumbra. So in Penumbra, if you're the only one in your batch, similarly, you're not going to get any privacy. Yeah, but I will say, that the order book implementations were... had different privacy properties. So we looked, for example, at Renegade.fi. So they implement an order book DEX where the prices are private, and you run a matching algorithm in a two-party computation. And if your orders do match, then you form a balanced transaction that is submitted on-chain that conceals the amounts of the individual orders. If your orders don't match, then nothing happens, and either party in the two-PC learns nothing about the other party's prices. [00:31:23]: Anna Rose: Interesting. You just mentioned three examples there. You had Penumbra, Secret, Renegade.fi. Was there another project that you looked at, or was it just those three that you were kind of comparing in this work? [00:31:35]: Ying Tong: The final project we looked at was ZSA swap. Similarly to Renegade.fi, they implement an order book DEX, where the prices are kept private. So ZSA swap is not a real protocol yet. Very much, it's in the design phase. [00:31:55]: Anna Rose: Okay, okay. But that's cool. I mean, let's tease out even more the differences between these, and I know, because you also introduced these other ways of thinking about computation. I don't know if each one of these projects uses a different type of computation, or if they're all kind of playing in the same area. But, yeah, maybe, Bryan, you can share a little bit about those three categories that you had in there. [00:32:18]: Bryan Gillespie: Yeah, absolutely. So we've mentioned all three of the techniques that these protocols make use of right now. So the Secret Network is leaning heavily on TEEs, or trusted execution environments. Which are secure hardware enclaves that allow computations to happen in a privacy preserving way, because it's very difficult to exfiltrate any data of any sort from within the hardware that's doing the computation. Some people have discussions about how much one should trust the hardware implementations from companies that maybe don't have the best track record of having totally bug-free hardware. But in many cases, this trusted execution environment approaches is very effective at just keeping a trustless network from having to trust too much. And so Secret Network does all of their main computations in the trusted execution environment, and that's how they allow private data from multiple parties to come together. The Renegade.fi approach is a really interesting one, as it uses secure multiparty computation. These are cryptographic techniques that allow different computational nodes to use mathematics of polynomials and cryptosystems to combine their data in a way that nobody sees anything except the expected output of the computation. It's really quite magical. The downside is that there are some inefficiencies in doing these sorts of computations. A lot of communication costs in particular between the parties tend to add up. But Renegade.fi's approach is just to do a secure multiparty computation between two different parties. This is another aspect of their protocol, which is limiting. Two party computation is often reasonably doable. If you want to do computations with more parties than that, then it gets complicated. But it does a really good job of doing a particular matching protocol for limit orders, and it kind of accomplishes the privacy characteristics that they're aiming for very well. Penumbra is really an interesting protocol from my perspective. What I really like about it is it puts together just the smallest kernel of computation in this, what we call Mediated computation phase. It's just a little part that allows you to do something with private data from multiple parties to achieve this differential privacy that Guillermo was mentioning earlier. All it does is it takes the requests for trading some type of asset for another type of asset, and it adds them all together in this very small multiparty computation. It's just a threshold decryption scheme. And so everybody encrypts their amounts to the same public key, and then a threshold of validators are required in order to decrypt the outputs after they've been added together. So kind of all of the trades that happen at the same time are added together before they're decrypted. And so it relies on the validator set, not deciding to decrypt early as a group. But trusting a group is easier than trusting any individual validator by themselves. [00:35:23]: Guillermo Angeris: So, a question you alluded to this part, I don't recall what you called it, it's Mediated computation. [00:35:28]: Bryan Gillespie: Yep. [00:35:29]: Guillermo Angeris: I think you have this very nice framework in the paper, which I really appreciated, which is, it seems that all of these protocols, kind of the way they work, if I recall correctly, I could have misread this, is they have kind of three... I believe, three phases for how they're doing... And I thought this was a lovely framework. So if you want to just go into how they all have the same structure, that I think is a really lovely framework for thinking about these things. [00:35:54]: Bryan Gillespie: The real big picture here is that there are different cryptographic techniques that can accomplish different effects in this distributed, trustless space. Ethereum started out by exploring kind of like the public replicated virtual machine, the global computer, and no privacy properties included. And then this ZK thesis of chains allow off-chain computation to occur because of either specialized signature schemes or general zk-SNARKs, which allow you to verify computation that happens off-chain. And kind of these two regimes of computation have a lot of study put into them and kind of have well understood cryptographic properties. But then there's something else that's needed to combine private inputs from multiple parties. Those two techniques can't accomplish this by themselves. And the problem ends up being that the techniques to do this are difficult. They're complicated, they're subtle, they take a lot of cryptographic sophistication to implement properly. And so what we found is that most computation should be put into the zk-SNARK off-chain computation side, or the on-chain global computation side, for purposes of cryptographic suitability and for purposes of computational efficiency. But new privacy chains are incorporating kind of some special sauce of some sort. And generically, what is needed is a separate step in kind of forming a block or forming a new computational unit, which does this extra task of combining private computations or private inputs from multiple parties. And so what we realized is that how blockchains are doing this is starting with off-chain, doing some kind of Mediated computation phase, which either is some separate protocol that's not even related to the blockchain that it's ending up on, or perhaps it's something that's provided by the blockchain, by the validator set, or by some separate software that validators also run in order to combine these inputs and then produce something that can be ingested by the on-chain systems for computation. [00:38:03]: Ying Tong: Yeah. As Bryan said, we notice this structure of three phases, Independent, Mediated, and Global computation. And the Independent computation happens off-chain, where only the user's private data is involved. And the Global computation happens on-chain, where the consensus and the network's validators are actually verifying, actually guaranteeing the correctness of the state update. And we noticed where these protocols differ most significantly is in the middle phase of Mediated computation. And the scarce resource here always is the network validators. It's the resources that the whole consensus has to be involved in. And, for example, Penumbra actually involves all of its validators in this threshold decryption effort. Secret Network as well, every validator runs a TEE. Whereas if you look at ZSA swap, they relegate the Mediated computation chain to some off-chain solver that is not creating this extra strain on the network's resources, but at the same time, the privacy guarantees that the users have are solely depending on this off-chain operator. So we did notice in this three phase structure that there is a trade-off happening in the middle phase of privacy versus expressivity, that if you move your private Mediated computation off-chain, you're free to perform a more expressive arbitrary computation. But your privacy guarantees are weakened and they rely on some entity that is not secured by the network's consensus and by the network's economic security. Whereas if you choose to perform the Mediated computation on-chain, you are placing burdens on every validator in the network. Burdens of synchronicity, burdens of compute. But from that you get this very strong privacy guarantee, for example, a k of n threshold guarantee that you wouldn't get in the first case. [00:40:37]: Anna Rose: Wow, that's an interesting finding. So you're choosing the sort of privacy expressivity access, but are there any other characteristics? Is there any security? Or is there other things that get affected by the choices that are made here? [00:40:51]: Bryan Gillespie: So one of the interesting characteristics of these systems that we observed is that very conveniently, the design can be put together so that the privacy properties are the main thing that is on the line for how the Mediated computation is designed. Generally, because computations are consumed at the consensus layer using some kind of proof scheme, the correctness of the applications involved is not at risk. [00:41:19]: Anna Rose: I see. [00:41:20]: Bryan Gillespie: So when you're looking at the Mediated computation techniques, a lot of the questions is, is it possible for this private data that users are submitting to be revealed to the compute nodes or to other parties? [00:41:32]: Anna Rose: Okay, cool, cool. So it's how private is it is going up and down, depending, but it's... yeah, the proofs are still proofs. [00:41:39]: Bryan Gillespie: That's right. [00:41:40]: Ying Tong: Another trade-off that we noticed was expressivity versus security, but on the developer experience level. So... [00:41:50]: Anna Rose: Like how easy it is to add a bug or not, kind of by accident. [00:41:55]: Ying Tong: Exactly. [00:41:55]: Anna Rose: Okay. [00:41:56]: Ying Tong: The more freedom, the more access you give the developer to this low-level privacy primitives, the greater the chance that you expose them to a foot gun or some pattern that is unintentionally disclosive. So, for example, if you let them vary the size of the encrypted inputs and outputs, that could leak information on what these inputs and outputs are. If you let them vary storage access patterns, for example, and your usual almost side-channel attacks on timing frequency, the size of queries. Another possible example is this branching pattern whereby, let's say in an auction, in a private auction contract, you have some winning bid that is updated every time a higher bid comes in. So how you would naively implement that is, if the new bid is higher, then update the secret bid. But the very fact that the commitment to the secret bid has changed tells me that the new bid was indeed higher than all the other bids. And so patterns like that that are trivially disclosive may be exposed with a too expressive interface. [00:43:21]: Anna Rose: Oh wow. [00:43:21]: Ying Tong: And this is true as well in MPC, you can write trivially disclosive MPC circuits. So yeah, this is another trade-off that we noticed. It was not the focus of our paper, but I would say for all practitioners and for anyone who's actually writing programs on these programmable privacy systems, this is absolutely priority for developer experience and for user privacy. [00:43:53]: Bryan Gillespie: Yeah, I honestly think that it's an important research priority to try and identify good design patterns and good language designs, which make it harder for developers to introduce these sorts of bugs, and also to understand the kind of computational engines that go along with difficult to introduce bugs systems. [00:44:14]: Guillermo Angeris: After this exploration, do you have any strong opinions about how languages should be designed to avoid such things? I don't know any high-level like, here are some invariants that it would be very cool if your compiler would check for you or your type system would check for you. I have no idea, I'm just... but I'm curious. [00:44:32]: Ying Tong: We actually made a start on UC style definitions of a bunch of protocols in our SoK. UC means Universal Composability. [00:44:45]: Anna Rose: Okay. [00:44:45]: Ying Tong: And it's a simulation-based proof paradigm. Basically, it simulates the view of an adversary in an idealized functionality versus the real-world protocol, and tries to ensure that the adversary cannot distinguish which version it's interacting with. So we made a start on UC style definitions, but we didn't actually get to UC proofs. I do think that an interesting line of further work would be to look at UC proofs and to look at them in a machine checkable way. So if we could somehow write our program in terms of their UC ideal functionality, then we could check the desired information leakage and check these privacy invariants. [00:45:45]: Anna Rose: This would make the coding on a developer's experience much better, because they could have something to just make sure it catches those things that you were worried might be happening. [00:45:55]: Guillermo Angeris: Like the code could output... The compiler could output guarantees about what's actually being upheld versus things that are being leaked, for example, is, I guess, what this is getting to. [00:46:03]: Ying Tong: Yes, exactly. If we recenter the developer experience on privacy invariants, then that would make very explicit the information leakage that's intended versus unintended. There is a line of work on privacy notions based on game-based proofs. So this 2018 paper called On Privacy Notions in Anonymous Communication. It defines a bunch of privacy properties, for example, Sender-Receiver Linkability, Sender-Receiver Observability, Sender-Message Linkability, Receiver-Message Linkability, and so on. It defines this in sort of this tree of implications. And we could make use of these privacy notions and have the developer specify which notion they're interested in preserving and which notions that in turn implies. The Zerocash paper also had a bunch of notions around indistinguishability. So, as Bryan said, it's a very interesting research question of how to incorporate these notions into DSLs, into interfaces that developers would find intuitive, into interfaces that would be safe for users. [00:47:34]: Anna Rose: Interesting. I want to just ask a little bit on who the developers here you're talking about are, because in the examples that you highlight, the ones that you really look at, Penumbra, Secret, I don't actually know the other two very well, but are you talking to the developers of those systems or developers that are interfacing with those systems? And we are talking also about this programmable computation environment. So I just want to make a slight distinguishing of who can make the mistakes or who are you speaking to with a lot of this stuff? [00:48:08]: Bryan Gillespie: Yeah, I think the real target audience for these sorts of tools are developer teams that do not have as much cryptographic sophistication as some of the primary developer teams of these protocols. [00:48:22]: Anna Rose: I see. It's like someone building another system like this. [00:48:24]: Bryan Gillespie: The guys over at Penumbra know exactly what they're doing with all of this cryptography, so we're not designing a DSL for them. But more so for protocols that are aiming to make these sorts of functionalities, these sorts of underlying computational engines available to development teams more broadly, who don't have an in-house cryptographer, or who don't have a big budget for that type of security and still want to produce applications that make use of the Mediated computation phase. So one interesting point where we see an opening in the space right now is that the main place where real programmable privacy is being implemented in actual blockchain protocols is primarily using TEEs as the foundation. So it's kind of skipping the cryptographic scheme entirely. If you you're thinking about multiparty computation and homomorphic encryption, it's just saying, let's use secure hardware and it'll work. But the problem ends up being that giving full access to the underlying compute node opens up these sorts of timing attacks or branching logic attacks, and it puts some of the burden of security on the development teams that are using these things. So, for instance, the Secret Network has a really useful and interesting page on things that you should keep in mind for making sure that the privacy of your application is maintained. And it's like very good stuff to keep in mind. But ideally, a DSL would abstract that complexity away from developers, so they don't have to think about it. [00:49:56]: Anna Rose: Cool. You sort of talk a little bit about intents, or that comes up, and I know the focus was this limit order auction. Maybe that's why that was mentioned, but there was one system that was not mentioned there, which is like Anoma, I know Namada is launching... it's not a DEX, it's not doing that. So I get it. But I was kind of curious if you even explored what they're building, or if you really see that as so different that it wouldn't have fit into this. [00:50:24]: Ying Tong: Anoma was certainly an inspiration for us in the SoK. The reason we chose limit order auctions as the specific motivating example, was to just minimize the complexity of the applications that we're studying, is to capture a minimal set of functionality that's still non-trivial. So as far as I understand it, the privacy of intents on Anoma is orthogonal to their main protocol. It's extra protocol, and it's sort of delegated to their solvers, who are matching and solving these intents off-chain, then submitting them as balanced transactions on-chain. So in that sense, Anoma is similar to ZSA swaps, Zcash calls it asset swaps, in that the privacy of the intents and the actual Mediated computation of matching and solving them is off-chain or extra protocol. [00:51:30]: Anna Rose: Got it. Cool. So I feel like... I mean, I've reached the end of the questions I had on particularly this work, but now that this work is completed and you're kind of looking forward, where do you see a lot of the systems that you just described, like the MPC and the ZK and the interactions and this sort of Mediated computation area, what do you see kind of coming from there? [00:51:52]: Ying Tong: One outcome that we want from this work is to further discussion on best practices, and safe designs in programmable privacy systems. So one trend that we did notice is that zero-knowledge proofs are used as a primitive in almost every one of these designs. So, for example, in Renegade.fi, the matching algorithm is carried out in a two-party computation, but both parties at the same time also produce a collaborative zk-SNARK of the validity of their computation. And this is posted on-chain and is publicly verifiable. So we see this again in FHE, where the encrypted input is accompanied by a proof of correct encryption, so that the validator nodes know that they're not receiving garbage, that they're receiving some input that was actually properly encrypted to them. And lastly, in TEEs, we've again seen proposals for getting the TEE to run a ZK prover and basically removing this dependency on the TEE for integrity. So we're limiting the responsibility of the TEE to just confidentiality, and we're asking it to produce as a publicly verifiable ZK proof so that anyone outside the TEE can check that it actually ran the program it was supposed to run. And there are multiple other ways to stack these primitives. So defense-in-depth best practices. This is one of the sort of outcomes we would like, one of the takeaways we would like to communicate with this work. Programmable privacy systems actually consider how they can make full use of these different cryptographic primitives. [00:54:03]: Anna Rose: Nice. I mean, it's interesting because the combinations of the ZK and FHE, ZK, MPC we've covered on the podcast before. But it's interesting here that you're positioning it much more as ZK being used as a tool. And not necessarily... because, I mean, we've also seen MPC be used in a ZK scenario. If you talk trusted setup or something like this. But here you're sort of saying in the MPC itself, you're using the technique of ZK to prove something about some computation that's happening in that private environment often. Yeah, I like thinking about it in all of those ways. That's cool. [00:54:38]: Ying Tong: Yeah, yeah, that's cool. [00:54:40]: Anna Rose: Before we sign off, I want to hear a little bit about maybe other future work that you yourselves are working on. Like kind of directions that you want to explore even further beyond what we've already talked about on the show. [00:54:52]: Bryan Gillespie: So one of the directions that I think is really interesting that I would like to spend some time investigating at some point, is looking at, in particular, more dynamic MPC-based Mediated computation. So, using multiparty computation with a trust based on a threshold of the compute nodes to provide some general computational functionalities. In particular, where you see MPC used in a lot of protocols right now is very specialized functionality, threshold decryption or some particular circuit that needs to be computed. And part of the reason for that is that you can really optimize the multiparty computation in that context to get good efficiency and good security properties. But I don't think that people have spent enough time thinking about efficient generic computation in MPC and how that can be used to really accentuate the privacy properties of these systems for distributed blockchains. [00:55:51]: Anna Rose: Cool. What about you, Ying Tong? [00:55:52]: Ying Tong: I'm actually interested in a taxonomy of classes of applications that would benefit from programmable privacy and relatedly, the best practices and the privacy-preserving design patterns that correspond to these different classes of applications. In this SoK, we chose to narrow our focus to limit order auctions. And I do think that covers a great deal of interesting dynamics in programmable privacy. But yeah, I think it would be really interesting and really valuable to define more generally the types of interaction required by different classes of so-called privacy-preserving applications, and then correspondingly the cryptographic primitives that they require at the minimum to realize. [00:56:50]: Anna Rose: That's interesting. I was actually going to ask you exactly that about looking past just the limit order auctions. But can you just share maybe a handful of types that you have already identified even. Like what would they be called? Are they always in the context of DEXes or trades? Or would it go into stuff like message passing or something that isn't DEX related? [00:57:13]: Ying Tong: I've seen lots of applications that are hidden information games. So someone once described it to me as you're hiding in a bush and I throw a spear at you, and I want it to be reflected in the global state that this event happened and that your health decreased. [00:57:35]: Anna Rose: Okay. Like for a game, I guess. Yeah. [00:57:37]: Ying Tong: Yes. So while games are so useful in surfacing these dynamics. Well, another application that's interesting to me is MTCS or Multilateral Trade Credit Settlement. And this is sort of extending the interaction to sort of a cycle of participants instead of just a pairwise. So this sort of atomic settlement in a cycle of participants. Yeah, it's an extension of what we've studied in this paper. And I guess we came across a bunch of applications that had really fewer requirements, so they had only commutative updates. For example, I think voting, you can achieve that just using zk-SNARKs. Each time you're just adding something plus one. Auctions, you could do that similarly. So yeah, there's certainly applications with greater and lesser requirements on interactivity of private state, but I haven't quite seen a rigorous classification of them yet, so I'm very motivated to make one. [00:59:02]: Anna Rose: Cool. That's awesome. [00:59:04]: Guillermo Angeris: Sounds fascinating. [00:59:05]: Anna Rose: And maybe that will be something we can talk about on a future podcast. [00:59:10]: Ying Tong: Yes. [00:59:11]: Anna Rose: Nice. All right, I want to say thank you to both of you for coming on, sharing a little bit about your backgrounds, what got you excited about this work, and then obviously, our dive into the Programmable Privacy in Distributed Systems, SoK. Yeah, it was so great to have you on. [00:59:29]: Ying Tong: Thank you for having us on. This was lots of fun. [00:59:32]: Bryan Gillespie: Yeah, it was awesome to be here. [00:59:33]: Anna Rose: Yeah. Thanks, Guillermo. [00:59:35]: Guillermo Angeris: I'm just here. Don't mind me. [00:59:37]: Anna Rose: All right. And I want to say thank you to the podcast team, Henrik, Rachel, and Tanya, and to our listeners, thanks for listening.