Sean: Hi, and welcome to PodRocket. I'm Sean, and with me today is Joyce Lin, Head of Developer Relations at Postman, here to talk about her recent conference talk, Reverse Engineering a Private API. Welcome to the podcast, Joyce.
Joyce Lin: Thank you. Glad to be here.
Sean: Yeah, really happy to have you. But before we get into how to reverse engineer a private API, would you mind talking a little bit about your work at Postman and why you decided to concentrate on API development and helping other devs develop their own APIs?
Joyce Lin: Yeah, so I work at Postman. I think a lot of devs know Postman. They're familiar with it as primarily an HTTP client, but now we're multi-protocol. We're an API platform used by 20 million devs around the world, and so I run the developer relations. I think LogRocket has a pretty robust developer relations program, so some people don't know what DevRel is. And my fun job is occasionally I get to go to conferences, I get to write blog posts, participate in podcasts like this. And so, APIs are super huge, super popular. Any dev needs to know an API. And so that's been the bread and butter for Postman. I'm running the DevRel team now, but occasionally I get to work on something a little bit fun. The reverse engineering project was something that I personally wanted to learn about, and so I'm excited we're talking about that today.
Sean: Yeah, no, and I'm definitely one of those 20 million using Postman, but haven't considered using it to do any reverse engineering yet. In the world of web development and APIs, what is reverse engineering?
Joyce Lin: Yeah, so I gave this talk at Cascadia and I polled the entire audience. Most people were web devs, and I asked how many devs have actually reverse engineered an API and been successful at it. And only five people raised their hand. And I was initially a little puzzled, but have you done it before?
Sean: I have not. No.
Joyce Lin: Well, so exactly. I'm kind of surprised. But then afterwards I was thinking about it and I think devs are usually tasked with building something, or they imagine something that they're going to build, and then they make it. And people like security engineers, test engineers, there's a whole different type of engineer that in their mindset they're like, "I'm going to break that thing, or I'm going to break it down just to see how it works." I'm not like that. I'm not a puzzle solver. I don't actually care how it works behind the scenes if it works immediately. So this was a stretch goal of mine. And I think a lot of devs don't think about reverse engineering unless they have something that they want to get to.
Sean: Yeah, I think most people think of what the happy path looks like for using this API. And then behind the scenes, there's all these other use cases that, like you're saying, unless you have that mindset you might not immediately be thinking about... And so, does hacking play a role in this?
Joyce Lin: Yeah, I actually quoted a prominent security, I don't know, influencer, Alissa Knight. And she said that with hacking, a lot of times there's a negative connotation. It's like, "Ooh, you're a hacker, you're doing something bad, maybe illegal," but it's just figuring out how something works behind the scenes. And so, hacking, reverse engineering, it's all the same thing, except there's also legitimate occupations like penetration tester or something like that, that are doing the same thing.
Sean: Yeah, so hacking more in the exploratory sense, and the term hackathon too doesn't necessarily mean hacking in the exploiting way. So yeah, definitely it's not just a bad connotation associated with hacking. Well, okay, so someone's made this API and is thinking about reverse engineering it. What kind of advantages would there be to doing that, versus just using it how they might have intended?
Joyce Lin: Yeah, so it's an interesting question because some people will never come across the need to reverse engineer something because... A company offers a public API. It's so beautifully documented, and it's so rich and robust, all the functionality's there. But that only happens in a teeny tiny sliver of instances. A lot of companies don't offer a public API. Shame, shame, but they don't. Or it's a little bit, here's two endpoints. You can GET everything, but you can't actually POST anything. And so one of the examples I had was TikTok. I'm on TikTok and they have an API, but it's not what you would expect it to be for. So I want to potentially be able to post TikToks, I want to get information, get metrics about my TikToks. And they don't have a robust public API to control your TikTok data. However, they have a mobile app, they have a website, and that means they have web APIs that you can sniff and figure out what's going on. So in that case, I selfishly want to poke at TikTok's web APIs to figure out if I can, again, selfishly build something for myself.
Sean: Yeah, that's really cool. There might be more functionality behind the scenes that they aren't publicly talking about, but it could help you use the app in a better way. Or in your case it seems like just a more privacy-focused way where you had better access to your data.
Joyce Lin: Yeah, so there's a couple of layers. I think the title of my talk was Reverse Engineering a Private API, but private API could mean internal, and then private API could also mean an undocumented public web API. Of course, I would never try to hack beyond their firewalls and hack their internal APIs. But the ones that are undocumented, that they're using, that are perfectly visible to any developer that knows how to look at them, they're available.
Sean: Yeah, and it also brings up a good point of making sure that we're reverse engineering an API in a way that adheres to the company's policy or something. So what kind of things should we consider in that regard?
Joyce Lin: Well, anytime I take a look at something and maybe run... like GET calls or POST calls that are not super well documented, I always take a look at the terms and conditions. I always make sure, okay, if I get thrown off of this platform, how painful would that be? Just because maybe I ignorantly didn't know that I was overstepping my bounds. And then of course, I mean I'm not a great reverse engineering person, I don't spend a lot of time doing that. But for the people that do, if you find a security vulnerability or some huge data leak, you always want to talk to the company and give them a chance to fix it. Because otherwise there's criminal and ethical liability.
Sean: Yeah. Of course it would always be good to disclose that information to them first, so before it gets out to the public, they have a chance to patch that and keep people's data secure.
Joyce Lin: Exactly. Yeah.
Sean: And I saw in your talk there was a fun post on Hacker News about someone reverse engineering an API to get, quote, unquote, extra bacon. Do you want to explain what that one was and how that person benefited from reverse engineering a private API?
Joyce Lin: Yeah, this was, I think, an old hack. So delightful because everyone can relate to it. I want something that the company, and I think this was Papa John's at the time, Papa John's doesn't want me to get too much bacon because maybe they think it wouldn't make a good pizza, but no, I know myself, I want extra bacon. And so just being able to inspect their data model by seeing, okay, here's the bacon object, but then here's also the add-on object. And then there's a couple of different ways you can possibly go in laterally to get more bacon.
That's a fun one because I think everyone can relate to maybe the UI, maybe their mobile app isn't great or they're just trying to keep me from my bacon.
Sean: Yeah, I think it also hits home because there's a literal real-world event that takes place because of modifying the data and sending that to the API, like someone at the pizza shop physically puts more bacon on the pizza, which I think is really interesting. And so what are some techniques that we could use to start? You said coming in laterally. Do you want to explain more about what that means, or other ways that someone could start to investigate how an API's working behind the scenes?
Joyce Lin: Yeah, so I think part of it is just having visibility. So if you're a web dev, you totally know opening up maybe your browser's dev tools and you can see network calls. If you're just getting started with development and you don't even know how websites work, that can be super eye-opening, seeing that stream of network calls coming through. And if you're really savvy with network tools, sorry, the network tab in dev tools, you should be able to filter, you should be able to manipulate that data quite a bit. But I work at Postman, and a lot of developers use Postman to fire off API calls. And so being able to copy a network call, a specific, maybe suspicious, maybe juicy network call in dev tools, and then bringing it into Postman. TikTok has a ton of query parameters, and they have very complicated expectations for payloads being sent to the servers and also being returned. So Postman lets you parse that a little bit more easily and see everything broken down into key-value pairs and such.
Sean: That's really nice. So once you copy it into Postman, you can then kind of modify it as you will without having to necessarily catch it while it's happening on the website if you're looking at the Chrome dev tools?
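As a rough sketch of the key-value breakdown Joyce describes, the snippet below takes a request URL as it might be copied out of the network tab and splits its query string into pairs using Python's standard library. The URL, host, and parameter names are invented for illustration; they are not real TikTok endpoints.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical URL standing in for a request copied from the network tab.
# The host, path, and parameter names are made up for this example.
copied_url = (
    "https://api.example.com/v1/feed/items"
    "?device_id=12345&region=US&count=30&cursor=0&lang=en"
)


def break_down(url: str) -> dict:
    """Split a URL into host, path, and a list of (key, value) query pairs."""
    parts = urlsplit(url)
    return {
        "host": parts.netloc,
        "path": parts.path,
        # keep_blank_values=True so parameters sent with empty values survive
        "params": parse_qsl(parts.query, keep_blank_values=True),
    }


breakdown = break_down(copied_url)
print(breakdown["host"], breakdown["path"])
for key, value in breakdown["params"]:
    print(f"{key:>10} = {value}")
```

Once the parameters are in a list like this, it becomes easier to change one value at a time and resend the request to see which parameters the server actually cares about.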
Joyce Lin: So Chrome dev tools is going to be a stream of network calls, but if you want to just pluck one of those out, or actually I think there's HAR files nowadays, HTTP archive, I forget what the R stands for, but a stream of network calls, you can actually bring in session data and just kind of save it. So in Postman, yes, you can tweak it, explore it, send off API calls, but you can also save a sequence of calls if you wanted to. So you can replay it later.
Sean: Okay, can I import a HAR file into Postman and that will kind of unpack, unarchive the requests?
Joyce Lin: Exactly, yeah. So you'll import the HAR file and then you'll see all the behind-the-scenes API calls that are going on that you might not have known about.
Sean: That's really cool. All the juicy details under all those requests.
Joyce Lin: Exactly. And it is juicy too, because the vast majority, let's say 99.9% of those network calls, are not being reflected in some way on the website or on the mobile app. This is all marketing analytics, it's going to be retrieving data, and it's a lot of tracking, to be honest, at least in the case of TikTok.
Sean: And also in the case of TikTok, well I guess you can use it if you're not logged in, but some sites require you to sign in first. Does that come into play when we're reverse engineering? Is there a way, within Postman or other tools, to sign in as if you are using that site through the app or through the website, and then also make the requests authenticated?
Joyce Lin: Yeah. So that's going to be up to the server, right? The service that you're trying to hit will have their particular flavor of auth. And so there's a really fun way of showing cookie-based auth, cookies associated with your web session. And you can either manually upload your cookies into Postman or you can sync them. So I think I showed, I want to say GitHub. So if you go to github.com, you're going to see a landing page, a prompt to sign up.
But if you're logged in, you'll see all of your GitHub repos, maybe an activity feed. And so if you can capture those session cookies and then include them with your API calls, you're able to see a lot more information than you would as a logged-out user.
Sean: Okay, I see. That's really neat. And then are there ways to code it or script it within Postman to have chains of API calls, the one that logs in, then grab the cookie, and then down the line do more requests in sequence?
Joyce Lin: Yeah, this is the next step beyond reverse engineering. Reverse engineering is figuring out how something works. But the next time you want to replay the sequence, again for selfish reasons, say it's that bacon on the pizza scenario and you're like, I'm hungry, I want bacon, bacon, bacon pizza, being able to replay in a programmatic way. It could be a bot, it could be... You can hook it up to a button that's sitting on your desk, but being able to fire off that entire sequence and then, for example, pass along the cookie, well I'm saying cookie with pizza, but pass along that session cookie and then add bacon to the cart, add another bacon to the cart. Cycle on that call until you've maxed out on bacon and then complete the order. Yeah, you can totally do API sequences.
Sean: We're just going to take a quick second to ask: if you're enjoying the podcast, please consider following us on Apple Podcasts.
Sean: Yeah, I've definitely seen posts online about people doing similar things with bots that will buy something before it goes out of stock or something. So you can kind of race the shopping cart and be the first to check out.
Joyce Lin: Yeah, if sneakers go on sale, you want to be the first one to purchase sneakers for yourself, but maybe you want to flip them and resell them. So how are these sneakers selling out so fast? Well, the answer is bots. And so I shared a story about...
This is one that I'm kind of proud of, but I also feel deep shame, so please don't give me any cruft about it. But I wanted to go camping at Yosemite. And Yosemite has good campsites and they have bad campsites, well not bad, but less good campsites. And it's so hard to get the good campsites. They sell out within seconds and it's because of all these bots. And I got so angry that I had to reverse engineer the website, the .gov website, to figure out how to reserve those campsites. And so that's one that I do feel deep shame about because I am so privileged to be in tech and know a little bit about this. But also I'm kind of proud because I got those campsites.
Sean: Yeah, it really is a leg up to have that inside scoop, that inside knowledge of how to game the system a bit. I've even seen Chrome extensions now that will bot certain sites, and I think at least some of the retailers are finally catching up, where they'll make you wait in a queue if you want to buy something and they'll send you a notification when it's your spot in line. And so they just kind of do it in an order. So hopefully the botting and things going out of stock really fast, that problem will be solved. And then once that's fixed, people maybe won't look at botting and reverse engineering in a negative light when they associate it with not being able to buy a PlayStation 5 or something.
Joyce Lin: Yeah, don't ruin it for the rest of us.
Sean: Exactly. So how do you see reverse engineering APIs affecting how developers create APIs in the future? Do you think it might enhance the quality of APIs, where if people expect them to be reverse engineered, they might just make them the way that the public wants to consume them in the first place? Or do you think it could have other effects potentially?
Joyce Lin: Yeah, I came across a couple of instances. The CouchSurfing and Starbucks APIs both got, I was going to say hacked, but I mean reverse engineered and publicly talked about.
And it was in... A lot of the desire there was, "Hey Starbucks, hey CouchSurfing, please offer a public API. You could do so much more with your service. And I love Starbucks, please let me do something with your API." And so reverse engineering could push the company to offer a public API or offer a better public API. But sometimes you don't see that happening because the pressure's not there, they don't need to. There's no incentive for the company to offer an API. And so I'm not sure if reverse engineering helps in that sense. But I think knowing how to reverse engineer something is a very, very powerful tool because developers can kind of feed themselves. Not feed, that's not the right verb, but they are able to be more resourceful with what's available. They don't need to rely on what the API provider decides: you can have this functionality but not this other functionality.
Sean: Yeah. I think that's a really creative use case of it too, because it's almost like they're developing on top of an existing product. If you go in and say, "Oh, I want to make a new API call that combines these two different ones or accesses one of the undocumented features," it's kind of like modding in a way. So yeah, I think the culture around that is really cool in terms of just adding features before the company that offers a product makes them.
Joyce Lin: And in that Yosemite example, I want to say that there's a couple of companies, startups, that started up because there was such a pain point in trying to find camping or be notified when it's available that people did build entire companies off of those web APIs. However, then the danger for your startup is if the Yosemite or .gov site decides to change their web APIs, it's going to be a breaking change for your business model.
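To make the replay idea from the bacon scenario concrete, here is a minimal sketch, assuming a made-up shop API: log in, carry the session cookie forward, add bacon until the server refuses, then check out. Everything here is hypothetical, including the routes, the cookie value, and the limit, and the "server" is a local stub so the script runs without touching a real site. Because undocumented web APIs can change without notice, the script raises an error when the checkout response no longer has the shape it expects.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import ProxyHandler, Request, build_opener

MAX_BACON = 3  # hypothetical server-side limit


class StubShop(BaseHTTPRequestHandler):
    """Local stand-in for an undocumented web API (all routes are invented)."""
    bacon = 0

    def do_POST(self):
        if self.path == "/login":
            self._reply(200, {"ok": True}, cookie="session=abc123; Path=/")
        elif "session=abc123" not in self.headers.get("Cookie", ""):
            self._reply(401, {"error": "not logged in"})
        elif self.path == "/cart/bacon":
            if StubShop.bacon >= MAX_BACON:
                self._reply(409, {"error": "bacon limit reached"})
            else:
                StubShop.bacon += 1
                self._reply(200, {"bacon": StubShop.bacon})
        elif self.path == "/checkout":
            self._reply(200, {"ordered_bacon": StubShop.bacon})
        else:
            self._reply(404, {"error": "unknown route"})

    def _reply(self, status, body, cookie=None):
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # silence default request logging
        pass


opener = build_opener(ProxyHandler({}))  # ignore any system proxy settings


def post(base, path, cookie=None):
    """POST with an optional Cookie header; return (status, headers, JSON body)."""
    headers = {"Cookie": cookie} if cookie else {}
    req = Request(base + path, data=b"", headers=headers, method="POST")
    try:
        with opener.open(req) as resp:
            return resp.status, resp.headers, json.load(resp)
    except HTTPError as err:
        return err.code, err.headers, json.load(err)


def replay_order(base):
    """Log in, add bacon until the server refuses, then check out."""
    _, headers, _ = post(base, "/login")
    cookie = headers["Set-Cookie"].split(";")[0]  # keep only "name=value"
    for _ in range(20):  # safety cap so the loop always terminates
        status, _, _ = post(base, "/cart/bacon", cookie)
        if status != 200:
            break  # the server says we have maxed out
    status, _, body = post(base, "/checkout", cookie)
    if status != 200 or "ordered_bacon" not in body:
        raise RuntimeError(f"unexpected checkout response: {status} {body}")
    return body["ordered_bacon"]


server = HTTPServer(("127.0.0.1", 0), StubShop)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
ordered = replay_order(f"http://127.0.0.1:{server.server_port}")
server.shutdown()
server.server_close()
print("bacon portions ordered:", ordered)
```

The explicit shape check at checkout is the design choice worth noting: a script built on someone else's undocumented API is better off stopping with a clear error than silently doing the wrong thing after the provider changes a response.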
Sean: And I think sometimes we see the businesses kind of give in, or at least with the aggregator sites like Expedia, or Google Flights has their own product now, where the airlines just will collaborate and send the data to those aggregation sites. Because you don't want to be the one who has locked down their API and isn't sharing their data, and now you're kind of missing out on that new traffic. So I guess potentially it's changing how people develop in that way, where they're more forthcoming with how they share data. Because people are going to kind of get it anyway.
Joyce Lin: Yeah. And that's leverage. If you have that level of traffic coming through the aggregators, then the flight providers need to comply.
Sean: Yeah, I guess Southwest still isn't participating in Google Flights or Expedia. I wonder what the business [inaudible] there is, I'm curious what they're thinking.
Joyce Lin: They're saying all roads lead to our website, you're only able to book flights on our website.
Sean: Yeah, exactly. And then there was another part of the talk that we haven't touched on yet regarding web proxies. So how does that come into the picture here with reverse engineering APIs and developing APIs? When might you want to use a proxy?
Joyce Lin: Oh, that was my favorite part of the reverse engineering, because the base case is visibility. Being able to inspect and replay a single API call. We already talked about HAR files, that's like an entire session captured in a single file. But web proxies allow you to capture real-time data. It's like sniffing, I don't know if that's the proper word, but web proxies are so cool. When you have one set up, you feel like a real hacker and you see a whole stream. So let me back up and say what the web proxy is. But the example that I went through was I set up Postman as a proxy for, I think it was TikTok, again.
I was browsing TikTok on desktop and I was sending, like filtering those API calls through Postman, and I could see those coming in in real time. And part of that is I can say, "Okay, stop recording. Okay, now I just got the last five minutes. Let's sift through all the network calls." I can use filter, search, that kind of thing. But also it was like 90% of those TikTok calls were POSTs. I wasn't doing anything. I was scrolling, I was barely clicking. What's with the POSTs, TikTok? And being able to drill into that. It's marketing tracking, that's the TL;DR, but 90% plus were POSTs.
Sean: That's really interesting. Yeah, you're not necessarily sending them any data, but they're somehow sending data. So I guess it's exactly what you're doing. Maybe you think you're tracking it. So in that scenario, is the web proxy also acting as a server running on your computer? So you're sending data to it and then it's kind of forwarding it to TikTok?
Joyce Lin: Exactly. It's a substitute server. So it's just kind of getting in the middle, like the man-in-the-middle attack. It's getting in the middle between the client, which is going to be browsing TikTok on, let's say, the Chrome browser, and then TikTok's servers.
Sean: Okay, interesting. Yeah. And then the man-in-the-middle thing is interesting because, I guess as a PSA for people, if you're in a coffee shop or on a public unsecured network, people could be doing the same thing there where they're sniffing the data that's going out. So yeah, keep that in mind, the man-in-the-middle thing. But in the proxy scenario, that's really cool because you can kind of filter down what requests you're sending to TikTok. So you mentioned Postman. Are there other tools that can act as web proxies? Are any of them free or open source?
Joyce Lin: Yeah, so I did have a slide, and again, reverse engineering is not my wheelhouse, but I did suggest some very, very popular ones. Postman is of course free.
There's man-in-the-middle proxy, mitmproxy, and Fiddler and Burp Suite. There's a lot of very popular proxies, very used by the developer community, that you can use to sniff traffic, pause traffic, step through traffic, modify it on the way to the server. So there's a lot of different options out there.
Sean: Nice. And that's super useful, again, from the development perspective to be able to take your time, understand what each request does, then kind of move on to the next one, versus, yeah, again, if you're just looking at the Chrome dev tools, the stream all happens at once and it's kind of hard to catch what it's doing while it's happening.
Joyce Lin: Yeah.
Sean: So something else you mentioned in the talk, one of the takeaways regarding reverse engineering, is called Hyrum's Law. Do you want to talk about what that law is?
Joyce Lin: Yeah, so this is the part of the podcast where we get to feel smart and really nod our heads, because I give a whole gamut of different types of talks, and Hyrum's Law really comes into play quite a bit. And once you see it, you can't unsee it. But Hyrum's Law, let me just paraphrase. It's a Google engineer called Hyrum Wright that said this, and said with APIs at scale, any functionality that is available will be relied on by somebody. Even if you didn't intend them to rely on it that way, it will be relied on. And that's what I'm talking about with the web APIs. The web APIs are something that, for example, the TikTok engineers are using. And yes, they know that you can access them, but they're not expecting you to use their web APIs. Not in that way. And so anything that is potentially offerable and available will become depended on. And so that's the businesses that are cropping up to do something with TikTok traffic. And if they change their web APIs, it's going to break somebody's workflow. So that's Hyrum's Law. Once you see it, you can't unsee it.
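A toy illustration of Hyrum's Law as paraphrased above, with all names hypothetical: the documented contract of `list_tags` promises only "the available tags, in no particular order", but the first version happens to return them sorted, and a client quietly comes to depend on that. A later change that stays entirely within the documented contract still changes what the client sees.

```python
def list_tags_v1():
    """Documented contract: returns the available tags, in no particular order."""
    return sorted({"travel", "food", "music"})  # happens to come back sorted


def list_tags_v2():
    """Same documented contract; an internal change alters the order."""
    return ["music", "travel", "food"]


def client_pick_default(list_tags):
    """A client that quietly depends on the undocumented ordering."""
    return list_tags()[0]


before = client_pick_default(list_tags_v1)
after = client_pick_default(list_tags_v2)

# Both versions return the same set of tags, so the provider broke no promise,
# yet the client's behavior changed anyway.
same_contract = set(list_tags_v1()) == set(list_tags_v2())
print(before, after, same_contract)
```

The same thing happens with undocumented web APIs: the provider never promised the behavior, but once someone can observe it, someone will build on it.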
Sean: Yeah, that makes a lot of sense. At scale, there's people who are going to be testing everything they can possibly do with the API, and if something works, people are going to start using it, building workflows on it, like you said, building businesses on it. So I guess it makes you think you've got to design the API first, or one way of doing it would be to think about the API first and then go and write the underlying code, because the way the API looks is how people are going to be using it anyway.
Joyce Lin: Sean, did you hear a different talk I just gave? Because I literally just gave a talk last week about API first. Is LogRocket API first?
Sean: So for some of our newer APIs, we've used OpenAPI, which is a really cool spec for designing APIs. And what I love about that is you can import it into Postman and then you can just start hitting send on those requests. So you don't have to manually make them, you can just define the OpenAPI spec and import it right in. But yes, I might have done a little bit of research and seen that you gave that talk. I think that's also a really interesting topic. We don't necessarily need to go too deep into it today, but for our listeners, if you're curious to see more of Joyce's work, you can check out that one as well. And yeah, before we leave, Joyce, is there anything else you want to plug, where people can find you or Postman-related things online?
Joyce Lin: I guess I'm on Twitter. Feel free to ping me on Twitter, especially if you have questions about reverse engineering APIs. I'm also on TikTok as of recently, clearly. But don't tell TikTok that I'm reverse engineering their APIs. Please.
Sean: We'll keep that one on the down low, hopefully.
Joyce Lin: Yeah.
Sean: Awesome. Thanks for joining us today, Joyce.
Joyce Lin: Yeah, of course.