The following is a rough transcript which has not been revised by Vanishing Gradients or Mark Saroufim. Please check with us before using any quotations from this transcript. Thank you.

Hugo Bowne-Anderson: Hey there, Mark, and welcome to the show.

Mark Saroufim: Hey, thanks for having me.

Hugo Bowne-Anderson: Such a pleasure to have you. I'd love to get started by going on a journey with you, getting an idea of your interests, how you got into the data world, and how you ended up where you are today with respect to machine learning in particular.

Mark Saroufim: All right, yeah, that's sort of a long-winded story, but maybe it's interesting. Let's see. I started out in AI pretty early in my career. I took an elective in undergrad and I was like, this stuff sounds cool, cooler than, I don't know, communication theory or whatever. So I took it and got very interested. And then I applied to UC San Diego for a master's program, because I felt their neuroscience program was very strong. I wanted to go into neuroengineering, build artificial brains; I was interested in a lot of these problems from a philosophy-of-language perspective. But very quickly, I would say about three months in, I started to become a lot more interested in the math of machine learning. This was before deep learning frameworks; Theano was just the new kid on the block. And my background back then was more that of a theoretician: I felt very comfortable doing math. I actually had a great experience at UCSD, but then I decided to go get a real job, because, you know, let's see what that's like. So I worked at Microsoft as a product manager. The pitch that was given to me was: oh, you're like the entrepreneur, you go convince other engineers at Microsoft to work on stuff and get them excited. So it's like you're at a startup, but all of Microsoft is your talent pool. I don't think that was entirely accurate, to be honest. And also, to be clear, I don't think I was particularly good at that job. I think I wanted to do more development work.

Hugo Bowne-Anderson: And did you have computational skills? Because you mentioned that at UCSD you were doing a lot of theoretical stuff. But were you interested in computers and programming at that point as well?

Mark Saroufim: Oh, for sure. I've been interested in computers since I was a kid. And if I'm being totally honest, I only felt like I started to become a competent engineer during my spare time, after my day job as a PM at Microsoft. I started to take things very seriously, looked at open source, spent a lot of time there. But initially it was very much an uphill battle, because I didn't really start coding when I was eight or something. Many times in my career I've wondered, well, should I really be in this coding thing? But it's just something I keep coming back to. I enjoy it, and I think there's a lot of deep stuff there. I also find it more fun than math, because I've always said that a piece of paper can't tell you that you're wrong. You can write whatever you want, and maybe it makes sense to you, and maybe you can convince other people that it makes sense.
But it may not actually be true. And it's not easy to visualize the implications. So that's what got me really hooked on programming and interested in it.

Hugo Bowne-Anderson: It actually reminds me of the old story, was it Pauli, who read a paper and said, it's not even wrong.

Mark Saroufim: Yeah. I had a philosophy professor with a joke that a physicist needs a lab, a piece of paper, and a trash can; a mathematician needs a piece of paper and a trash can; and a philosopher only needs a piece of paper. That always stuck with me: I do think you need tools to keep you honest. Otherwise it's easy to do whatever.

Hugo Bowne-Anderson: Yeah, it comes down to where you're building things as well. If you're building things conceptually, what does that world look like, as opposed to building software or hardware, and how it manifests in our shared reality. I am interested in your going from neuroscience to product, which seems like an interesting pivot, particularly as it then took you to working in tech and thinking about products and about a lot of machine learning science as well. So what prompted the decision to go into product, and then what happened after that led you to where you are today?

Mark Saroufim: It was a very strange decision, if I'm being honest with you, as my friends or family would probably tell you. I don't regret it, but I do feel like there was a misjudgment on my part. This was pre the deep learning boom. I loved deep learning and machine learning so much that I figured, well, I love this so much, and I could probably spend my whole life doing it, so let me go do something else and see what that's really like; if I really love it, I'll come back. The problem with this reasoning was that I had to spend a lot of time keeping my tech skills up to date. It was really hard. I basically had to work two jobs, both of which I was doing okay at; I wasn't shining at either. But the reasoning was, I always had this desire to be self-made and do my own thing, and I believed the pitch of "this is like a startup, but you go find employees." I'm like, oh, this is like a startup but with zero risk; of course I should do this, this is totally a great decision. And I guess my take on product is that it's hard to say I regret it, because it gave me a different outlook on a lot of stuff. But generally my take today is that if you're young and you want to go work in tech, I would not recommend you become a PM early in your career; I think it's a mistake. I do think any engineer would benefit from having PM-ish skills: are you organized, that's useful; can you organize groups of people, also a useful skill; are you a good writer, another great skill. They're all multipliers for your technical ability. But they're multipliers. And if that's all you have early in your career, then one, you don't actually know how to work with people yet, because you're young and you're dumb, you have no idea what you're talking about, you have no experience working with people. And at the same time, you don't necessarily have the technical chops to have a really profound vision.
So at least in technical products. Maybe from a consumer standpoint you could, but if, let's say, I'm working in data infrastructure, that's kind of hard to have a vision about unless you go learn about all the existing tools and what sucks about them. So I definitely think experience has an advantage.

Hugo Bowne-Anderson: Yeah. And "what sucks about them" means understanding the trillion paper cuts and the suffering involved in using tooling; that's how you start to really think about building good tools as well.

Mark Saroufim: I mean, it's all old problems too. SQL is just this incredible technology. It's lasted for decades at this point and could probably last another 50 or 100 years, for all we know, at this rate. That's the kind of staying power that I think some technologies have. So going in and saying, oh, I think I can do this better, I think you need a lot of upfront knowledge to do that, for sure.

Hugo Bowne-Anderson: So what happened after Microsoft?

Mark Saroufim: Yeah, also an interesting story. So I was a PM, and I'd switched teams to a team that was doing AI within Outlook. What does that mean? Basically figuring out which tasks in an email are important. Is this coming from your boss? Is it really important? So Priority Inbox-like features, which were cool at the time. I did that for not too long, maybe eight months if I'm being honest. And then I had the same problem, where I don't think my problem was working in AI, because I was still a PM there; it was not being a tech person, a scientist. So I went to my dev counterpart, Sebastian, who I love and really admire a lot to this day, and I told him, I'm thinking of switching jobs, I kind of hate being a PM. And he said, no, no, you shouldn't switch, why don't you just switch to my team and you can be a scientist. So I did that for a couple of months. We were working on a ranker, basically a totally new inbox feature where you would come in and it would prioritize your email by importance instead of by date, and show you all sorts of interesting metadata. And because I found myself doing a bit of iOS development, some math and science, some product, I thought, this feels like the skill set of an entrepreneur, so maybe I should give it a shot. And that led me to the other major journey of my life. I learned a lot, but it was probably one of the most unpleasant experiences of my life: starting my own company, which was Yuri AI. The story of Yuri was that initially, when I left Microsoft, I didn't really have a concrete plan for what to do next. I figured, okay, maybe I'll spend two years doing something; if it doesn't work out, worst case I'll go get a job, but in the meantime let me see what I can do that's interesting. The reason I got into programming in the first place was to make video games, and I realized that after six years of being in tech, I had made zero video games. I had a Unity book on my bookshelf, and I thought, oh, I should pick up that book.
And then I realized it had been on my bookshelf for two years without me opening it, even though I thought about it all the time. So something was obviously wrong here. So I started to make a bunch of games in Unity, and then I realized it was really, really hard to make good AI for games. And I thought, well, maybe that's the company: a reinforcement learning service for game developers. And again, that turned out to be a giant mistake for a few reasons; I can tell you about all the ways not to start a startup at this point. One, the technology itself wasn't at the point where it was really plug and play: you needed a lot of data, you needed a lot of domain knowledge about both the games and the various algorithms that work, and you needed to change game developers' engine code to be RL-friendly, which is not something they're all willing to do. I was doing cold outreach to large companies, and of course they didn't want to answer me, because who the hell was I? And when I would reach out to startups, they would tell me, AI isn't really that important to us right now. So, great. I was really stuck; I felt like things weren't progressing.

Hugo Bowne-Anderson: What year was this?

Mark Saroufim: This was four years ago, or three years ago, something like that.

Hugo Bowne-Anderson: Okay. And there was a general buzz around reinforcement learning at that point, that was the Dota 2 moment, in a way that isn't current; it seems to have faded away slightly for the time being, right?

Mark Saroufim: Yes, and there are a few reasons for that. I can tell you about the pitch, the reason why it would be interesting. Imagine a game like Dota, where you can have, let's say, a hundred characters.

Hugo Bowne-Anderson: Can you just remind the listeners what Dota is?

Mark Saroufim: Sure. Dota is a competitive multiplayer game where five players play against five other players to achieve objectives on the map; you want to destroy the enemy's base. But to do that you have to do multiple things: kill enemies, get gold, get items. So you have lots of miniature goals. It's a very complex planning game; I compare it to chess on steroids, because it's live, you don't really have time to think, but the decisions are equally heavy. So OpenAI specifically had this incredible achievement: they built OpenAI Five, a bot that was able to beat the world champs, OG, at the time. This absolutely blew my mind, because there are a few applications of this. One is, well, you can now train people to become really good at a game; I don't need access to players like OG to become good. You can automatically balance the game: you can see what a really good player would do, and then figure out how to change the space of strategies in the game. But also, say someone disconnects from the game: you could swap in an AI player with a similar skill set and the match can keep going. So there are tons and tons of applications. I think it's died out because people have realized that it sort of works if you're willing to put in a lot of work for it, but for the majority of companies it's probably still not that useful.
And back to startup mistakes: I think I got really married to this idea, when it would have been much better to explore it, make a small bet, bound the time, and if it doesn't pan out, do something else. The reason I have this perspective now is that about a year and a half into this, COVID happened. Most of my money was in Microsoft stock, so I saw my net worth collapse by half, and I had no income. So I'm like, fuck. Then I started to apply to jobs, and the people who had been very willing to give me a job pre-COVID were now saying, well, we're kind of in a hiring freeze. And I was freaking out. I was seeing all this negativity, I had lost confidence in my startup, and I felt like something needed to change; I needed to get a job quickly. So, almost as a consolation prize, I wrote an ebook called The Robot Overlord Manual. My motivation for that book was literally that I was getting nightmares from imagining myself talking to recruiters and saying, well, I did this thing for two years, and there was no outcome from it.

Hugo Bowne-Anderson: And, we might get here later, but I'll put something in the show notes: you've got a post called The Myth of Objective Tech Screens, which is, I assume, based on reality, a set of dialogues between you and hiring managers, which is fascinating, brutal, and beautiful in turns. I'd love to think about recording them with you at some point and releasing that, maybe, but I'll share it in the show notes anyway.

Mark Saroufim: For sure. I will say that one thing I really love about American culture is that people admire risk takers. But that's more in principle. What I quickly realized, interviewing...

Hugo Bowne-Anderson: Hiring managers are not risk takers, I'll tell you that.

Mark Saroufim: Yeah, not really, because people would assume, maybe not mistakenly, that because I failed at something like this, I would probably fail at whatever other ambitious project they had. The reason I found this very disingenuous is that of the people who interviewed me and gave me this brutal feedback, about all the mistakes I made and how if I'd done this one thing, things would have worked out better, none of them had started a company, you know? So, well, maybe I'm fast-forwarding here, but the best interview I had in my life was with a research engineer called Jacob at FAIR, when I was interviewing for my job at PyTorch. It was literally the first time in my life someone interviewed me by saying: great, let's assume this worked out. Now what? Tell me about all of the infrastructure you would have built. Tell me about all the components. Tell me about the impact. Tell me about how you saw these modeling problems. And I was just like, thank you. Finally, someone who gave me the benefit of the doubt. And in interviews, that's something I try to take to heart: it's best to show people in their best light instead of their worst. Otherwise you're punishing people who try to do things that are a bit different, which I don't think is the desired outcome.

Hugo Bowne-Anderson: In a lot of ways you set the conversation up for failure as well, by design.
Mark Saroufim: Yeah, I mean, a lot of these interviews are designed to not hire people who may fail, as opposed to increasing the rate of hiring people who may succeed.

Hugo Bowne-Anderson: Absolutely. They sacrifice, or accept, a lot of false negatives in order to reduce the number of false positives, right? The big filter, or whatever we want to call it.

Mark Saroufim: I think it makes sense for their purposes. From a startup's perspective, though, I think there's almost an opportunity, because you can find people who are mispriced in the market. And actually, as part of doing a lot of my work in open source, I find these people all the time. I find them all the damn time: these 17-year-olds who are ten times better programmers than me, who are absolutely gifted at writing and coding. And because they weren't necessarily born in the right place, or didn't go to the right universities, it's very difficult to get them hired through traditional pipelines. Even things like portfolios don't have a good way of being measured unless they're exceptional. If your portfolio is "I invented PyTorch," that's great. But if your portfolio is "I invented this random deep learning framework in Haskell," the reaction is: that's pretty interesting, and I think it counts for a lot, more than a degree even, but it's hard to measure, therefore I can't use it much as a signal.

Hugo Bowne-Anderson: Yeah, absolutely. So now you're at PyTorch and working full time there.

Mark Saroufim: Sorry, I skipped two very important steps before that, actually. Before PyTorch, I wrote the ebook, The Robot Overlord Manual. At the time I was a total anon on Twitter; no one knew who I was. Even at Microsoft I didn't have a widespread reputation; maybe some people knew me on the teams I worked with. And a couple of my book chapters made it to the front page of Hacker News, and that started to bootstrap a reputation as someone who explains technical stuff in a very clear way. For the most part it was, like I said, a consolation prize, but I guess it gave me enough so that when I applied to the next company I worked at, Graphcore, they were willing to have me. Graphcore was my first time doing what I would call a non-traditional role. My role there was in the field, helping customers onboard to Graphcore's hardware. So I got to be much closer to the sales pipeline than I usually had been in my career. That gave me a unique perspective on how long the damn sales cycles are: you're selling something really expensive, it takes months. It also gave me the perspective that, while I would often think about wanting stuff to be self-serve, people don't necessarily want that; they want you to be on call, solving their problems and proactively making things better. That's just not a perspective I had before from working on product or dev teams. So it was late into my Graphcore career that I wrote my first big Substack article, Machine Learning: The Great Stagnation.

Hugo Bowne-Anderson: Which we'll get to in brief a bit later, before we move on to more recent work.

Mark Saroufim: So a lot of people talked about that article,
so much so that it changed my life. I started to meet a lot of people on the internet doing all sorts of interesting stuff, and eventually that new reputation, I think, laid the seeds for the role I now have at PyTorch.

Hugo Bowne-Anderson: Before we get there, what I'm interested in teasing apart, and correct me if I'm wrong, is that your online presence and reputation became, I suppose, a large asset in hiring flows and getting jobs.

Mark Saroufim: It became real, because three out of the five people who interviewed me had read my articles beforehand. That was kind of interesting. We'd have a conversation, I would say something and mention that I'd said this exact same thing in an article I wrote, and they'd be like, yeah, I know, I read it. And I was like, interesting, that's never happened before. Before, I was always trying to justify myself: here's why I'm not a dumbass, here's why I'm useful. Whereas here I felt like I could be myself and say, look, I have these opinions. Some of them are strong. I think they're correct, but who knows, I'm happy to take criticism. But they were interesting enough to start a conversation. Obviously their interview process still had very much the boring stuff we all know about: the coding interview, the design interview, the behavioral interview, the manager interview. I did all that. The main difference is that when it came to those last few minutes of discussion where you're trying to build rapport, I felt it was easier than in past interviews I've had in my life.

Hugo Bowne-Anderson: Because of what you'd written. Interesting. So what do you do at PyTorch?

Mark Saroufim: At PyTorch I work on a team called the Applied AI team, and our main focus is to make it really easy for people to deploy PyTorch in production outside of Meta. At Meta specifically, it's actually really easy to deploy PyTorch, because we have really smart, highly paid people building really nice, complex, scalable infrastructure, and it's not obvious how to open source all of that. So you end up with a gap: PyTorch is great, but then how do you deploy it? When it comes to my day job specifically, I spend most of it maintaining TorchServe, which is a model inference tool. That basically means a tool that lets you take a PyTorch model and run model.forward, but do it in a scalable way, with batching, on Kubernetes, with metrics and all that good stuff. I also contribute to TorchX, which you can think of as an MLOps tool for job launching, and I work very closely with the Anyscale team. At a high level, if I were to summarize what I do, I try to find areas where I can collaborate with people in open source to make PyTorch easier to deploy. Sometimes this involves writing code, sometimes it involves writing blog posts, and that freedom is something I really appreciate. It's not something I felt I had in past jobs, where things were more hierarchical: the CEO said this, the product team said this, you go execute. Whereas the culture at PyTorch is a lot more bottom-up.
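To make the TorchServe workflow described above concrete, here is a minimal sketch of what exporting and serving a model can look like. The model choice, file names, and handler below are illustrative placeholders, and the exact flags may differ between TorchServe versions:

```python
# A rough sketch of the TorchServe flow: export a model, package it,
# and serve it behind an HTTP endpoint. Names and flags are illustrative.
import torch
import torchvision

# 1. Export a trained model (here an off-the-shelf ResNet as a stand-in).
model = torchvision.models.resnet18(pretrained=True).eval()
scripted = torch.jit.script(model)
scripted.save("resnet18.pt")

# 2. Package it into a model archive (run in a shell):
#    torch-model-archiver --model-name resnet18 --version 1.0 \
#        --serialized-file resnet18.pt --handler image_classifier \
#        --export-path model_store
#
# 3. Serve it (also in a shell):
#    torchserve --start --model-store model_store --models resnet18=resnet18.mar
#
# 4. Query it:
#    curl http://127.0.0.1:8080/predictions/resnet18 -T kitten.jpg
```

Once the server is up, the scaling concerns Mark mentions, batching, worker management, and metrics, are handled around that single model.forward call rather than in your own code.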
It's much more like: an engineer comes up with a prototype for something, gets people excited about it, and sees who's excited about it externally. So I get to do that with a focus on the outside world.

Hugo Bowne-Anderson: Awesome. So that bottom-up focus really does seem, in a sense, to import processes from open source; it's a very open source mentality. So I have two questions, and you can answer either or neither of them. The first is, what is Meta's interest in having open source technology, do you think? And the other is, why is open source important to you?

Mark Saroufim: Sure. I can briefly answer the first, and I'll answer the second in longer form. I think at the end of the day, AI is, I guess, too existentially important to purely outsource, and there are a lot of benefits to being an active maintainer of a project.

Hugo Bowne-Anderson: And when we're using the term AI now, are we talking about deep learning essentially, or just machine learning more generally?

Mark Saroufim: I think for now, for all intents and purposes, yeah, we're talking about deep learning; we're talking about language models and vision models. And these are so important to the products that matter that it makes tons of sense to basically in-source those things. I think there are also some side benefits, and this maybe keys into why I'm interested in it: the PyTorch brand is a very well-loved brand. That probably justified my reasoning for wanting to join Meta, because I just think PyTorch is amazing.

Hugo Bowne-Anderson: And it's had huge growth in the past several years as well. It's not just the absolute value of brand and vibe and all of that; it's the slope of the curve, right? We're accelerating into it.

Mark Saroufim: Yeah, I think Karpathy said it best: we're in Software 2.0. If you imagine that ML is going to be as commoditized as, I don't know, learning a build system, then this makes sense. There are a lot of problems you can solve, there are new architectures we're figuring out, people are figuring out how this fits into their existing infrastructure. So to me at least, and obviously I have some skin in the game here, I'm at PyTorch because I believe machine learning will grow exponentially in the future, and I think PyTorch will play a big part in that.

Hugo Bowne-Anderson: Great. So what's your interest in open source more generally, PyTorch aside? How do you feel about open source, why do you like it, what are your concerns?

Mark Saroufim: Sure. This is maybe a longer answer, so let me tackle it one by one. From a personal perspective, before I get to ideology, just Mark's perspective: one thing I really love about open source is that my portfolio is public. I love the idea of working in public, because it gives you a reputation that transcends the specific place you work at. It becomes a question of people being able to see your value, regardless of what you're doing or your specific relationships. I think the other benefit is that I get to meet lots of interesting people.
One thing I really struggled with early in my career, again as a PM at Microsoft, is that my interests weren't totally aligned with what other people were interested in, so when I would talk to them about stuff, they weren't necessarily that interested. But over time I realized that's okay. It's a bit like me expecting, I don't know, my mom to be interested in machine learning because I am. Well, not really; we should find other topics, and I should go find other friends who are interested in talking about this. And that's exactly why I like open source: it's really easy to meet people I would love to work with, people I genuinely admire, where I think, wow, this person is ten times smarter than me. And because I work in open source, I always have an excuse to go work with them, and I love that. So that's benefit number two. Benefit number three is that I think open source software tends to be better written than closed source software, so I think it makes me a better programmer. The reason is that I've worked a lot on internal tooling, where you're a team that builds something for other teams at your company, and your product isn't necessarily being used by choice. The problem with this is that you're a monopoly by default, which means that as long as your product works, your customer is okay with it; maybe they'll complain that they don't like some stuff, but that's fine. With open source, though, if your product is just okay, people will either tell you that your product sucks in a GitHub issue, and that's if you're lucky, or, if you're unlucky, they'll quietly go use something else. So back to ideology, and there's nothing about PyTorch specifically here: I think if you're a smart engineer and you want to get product-market fit for a product, and you happen to be in the dev tooling space, there's a very reliable algorithm for doing it. Step one, you learn about open source projects and write about them. You use what you wrote to meet people, figure out what they're contributing to, and contribute to their projects. Now you can get users for free. And if you can actually get users for free, that's a pretty strong signal that maybe you can build something where you charge people. So I think of it as the best practice. Early in my career, people told me, oh, PM is a good way of training yourself to be an entrepreneur. I don't think so. I really think that building dev tooling in open source is, because it's full stack. Back to my day on TorchServe: I wake up and see all these people complaining about stuff in GitHub issues, so I have to go read them. They're not always trivial; I need to get a repro, maybe there's some gnarly bug I need to debug, I need to set up an environment. But then, as a side effect, I become very efficient at setting up my environment. And then I think, wait a minute, if it's so hard, how can I make it easier? How do I remove this whole class of questions with better tools, or better utility functions, or by improving or pivoting on the product? What are the related asks that people have that I can group together and turn into a sub-offering within the main product? So often people tell me, oh, what should I build?
And I'm like, why are you asking me? All this information is out there: go to the forums, see what people are complaining about, then build something for them. And that's a dev tool, and maybe a company. So yeah, that's why I really love open source.

Hugo Bowne-Anderson: I love it. And one thing you're also speaking to is understanding the suffering and all the paper cuts that people experience when using tools, and being able to trial that in a setting where you're not putting your own product on the line yet. It's a different type of risk strategy, I suppose, in a lot of ways, and you're retaining optionality in some ways as well.

Mark Saroufim: For sure. It's a good environment, because you have a lot of other smart people around you and you're paid well, but you also get to explore what it would be like to be an entrepreneur in the dev tooling space, because it's just you: you don't need to get permission to fix an issue, you see it, you go fix it. The other thing, and maybe this is something we can talk about in longer form, is that I think open source projects are more likely to survive than closed source ones in the long term. One reason is the paper cuts problem: by virtue of being easier to use and fixing more bugs, because people can see what those bugs are, the product improves. But then, as a result of it improving, you make other people dependent on the product by having them contribute to it. So everyone now has skin in the game; they have a vested interest in making sure the project survives, and they're willing to stick it out. When the project has issues, or major design flaws get uncovered, that becomes a problem we solve together, as opposed to, hey, why haven't you solved this problem, we're going to cut our contract if you don't.

Hugo Bowne-Anderson: Absolutely. And this is something I think we'll get to later: really, what you're speaking to there is community building, worldview building, developing a sense of shared reality with a select group of people, and having them essentially be evangelists for you as well. One thing, just in all practicality: when telling your story, you mentioned there was always a challenge keeping your tech skills up to date. That resonates with everyone I speak to; I always feel behind, especially when packages and the stack are developing so quickly, and the tooling landscape is as tragic as having too many tools and not enough tools at the same time. Are there any words of wisdom, or failure modes to warn people away from, in terms of trying to keep tech skills up to date?

Mark Saroufim: So I have an answer, but I'll caution you that people don't like it.

Hugo Bowne-Anderson: Even better.

Mark Saroufim: There's something called the Lindy effect that was popularized by Taleb, who is my favorite author. He wrote The Black Swan, Fooled by Randomness, a lot of books on probability and risk.

Hugo Bowne-Anderson: Yeah, exceptional books.

Mark Saroufim: So he has something called the Lindy effect, which is, if something...

Hugo Bowne-Anderson: Isn't that the one where the longer you've been alive, the more chance you have of living even longer? Or did I mess it up?

Mark Saroufim: Exactly. It's basically: take something like Shakespeare, which we've been reading for hundreds of years.
That means we're likely to be reading it for another couple of hundred years. Whereas, let's say, a random self-help book published this year: do you really think it's going to be read in another hundred years? Probably not. In dev tooling, people always think that newer is somehow better, but my perspective is: what are the Lindy tools? What are the ones that have stood the test of time? You mentioned SQL earlier, right? SQL is the king. I don't know many others, maybe Unix and Linux, but I don't know many other technologies that have survived this long and continue getting better.

Hugo Bowne-Anderson: I think Excel is a nice example as well.

Mark Saroufim: Excel is amazing. I constantly use pivot tables for aggregations; it saves me from having to write an actual group-by. I also do all of my data preprocessing in Excel if I'm sanity-checking small data. It has been an absolute lifesaver in my career. Sure, you could do it with Python scripts, but that would take half a day, whereas it takes me ten seconds and I'm done. And I don't need to scale it, because it's usually some sort of one-off analysis, which happens quite often. Similarly, if you look at, say, ML frameworks: I don't necessarily know which ML framework will last, I kind of have no idea, but I do feel like Python will likely survive another 10 to 20 years at this point. The reason I think that's the case is that people have all sorts of opinions about Python. Well, it doesn't have types. Well, now it does. Well, that's not a real type system. It's pretty good, certainly getting better, and a lot of libraries are using it nowadays. They may say, well, Python is not performant. Well, it turns out you can build custom kernels; for example, you can call C++ code from within your Python. But then you say, well, I don't know how to write C++. Great, then you have people like the OpenAI Triton folks, where you can write GPU kernels in Python. So I'm seeing more and more developers bending Python and making it do what it's not supposed to do. And I think people who are very purist about software look at this and say, that's not the right way to do this. And my answer is, well, it keeps getting better, and that seems to be what people want to do. And by doing this, you get tons of users and tons of feedback. Maybe at some point it becomes unsustainable, and we end up having to build a different language. But maybe not; look at JavaScript. People kept making fun of that language, and it's probably going to be the language powering everything on the planet. And that's just, it's a bitter lesson.
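For readers curious what "writing GPU kernels in Python" with Triton actually looks like, here is a minimal sketch of the canonical vector-addition kernel; the block size and the wrapper function are illustrative choices, not the only way to do it:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

# x = torch.randn(4096, device="cuda"); y = torch.randn(4096, device="cuda")
# torch.testing.assert_close(add(x, y), x + y)
```

The point is the one Mark is making: the whole kernel is ordinary-looking Python, and the GPU-specific machinery stays behind a decorator.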
Hugo Bowne-Anderson: Absolutely. And also, the transmutation of tools is really important. A lot of the time tools aren't used for what they were originally designed for, they change in usage, and we co-evolve with them, right?

Mark Saroufim: I think the reality is, Bjarne Stroustrup, the C++ creator, had the best quote: there are sort of two kinds of languages, the languages that people hate and the languages that no one uses. I 100% agree with that sentiment. It's very simple. Again, back to the GitHub analogy: let's say you are building this very beautiful library, and you have zero GitHub issues, your code is perfect, you have 100% code coverage, and everything's amazing. Maybe you're very proud of yourself, and maybe you will build a community at some point, but you have not done that yet. However, if a library has thousands of people complaining about its performance, that's actually a good problem, because it means thousands of people are using this library so much that it's becoming a significant chunk of their COGS, and they now need to worry about making this thing performant. And if they do figure out how to make it performant and write a blog post about it, people will read it; if they create a tool around it, people will use it. So again, your complaints become exactly the problems you need to solve. Back to the power of open source. So yeah, I love it.

Hugo Bowne-Anderson: And I think there's also a high-level challenge, right, in terms of people not having great risk assessments with respect to adoption of tools, open source and otherwise. A business that adopts an open source set of tools is taking on some form of risk that I don't think they've actually quantified properly in terms of their business a lot of the time. Is that something you think about?

Mark Saroufim: So there was this funny package, I think on NPM, at some point; it was a package with a single function that multiplied a number by two, and that was the whole package. And then there was some sort of outage with NPM, and millions of systems all over the world failed. It goes to show you a lot of software is very brittle, and I think a lot of it is brittle because of its complexity. Realistically, nowadays I wouldn't be able to really tell you what happens when I run some piece of code. The best I can do is profile things and get a macro picture, or use print statements. And sometimes for prod code you don't have a debugger, you only have print statements, and you can't even print stuff because it may contain real user data. So, good luck. And then it turns out it's multiple systems, and what if your logs are gone? Good luck. Things are very complicated, and maybe part of that is normal: you wouldn't expect a random mechanical engineer today to fully comprehend how a modern car works. I think the important point here is more of an educational one: I do think it's important to continuously build libraries that are valuable from an educational perspective. There are two I'm thinking of right now, minGPT and micrograd by Karpathy. One is a 200-line implementation of a transformer; the other is a 100-line implementation of an automatic differentiation engine like PyTorch. Do they work? No, not really; they work for very simple use cases. Should you use them in prod? No, you should not. But the fact that this knowledge is accessible means that whoever can access it can maybe build another version that's more usable, more maintainable, easier to debug. There's always room for improvement.
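To give a flavor of what a micrograd-style educational autodiff engine looks like, here is a heavily stripped-down sketch of the idea; this is not Karpathy's actual code, just an illustration of a scalar value that records its computation graph and backpropagates through it:

```python
class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, _parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = _parents
        self._backward = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for p in v._parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# x = Value(2.0); y = Value(3.0)
# z = x * y + x
# z.backward()   # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```

The educational value is exactly what Mark describes: the whole mechanism fits on one screen, so anyone can read it, understand it, and then imagine a more usable version.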
But it's very hard to do that when things get very complicated, because then you need a lot of time to ramp up and learn what the problems are, and by that point maybe you have a vested interest in not changing what the tool is, because you feel like, hey, I can fix it. Whereas maybe a newbie's perspective is: this is not fixable, this is a total mess, I need to rewrite it from scratch.

Hugo Bowne-Anderson: I love this, and I love the idea of tools for education as well. I thought we'd chat about this later, but I'd love to now. Some of my favorite tools are tools that are, quote-unquote, production-ready and performant, but also beautifully designed, and whose intention is to educate as well; examples of this are fastai and Keras, I think. But where I want to go with this is what we can learn from the video gaming world. I'm not a huge gamer, but some of the best games I've played teach you how to play them within the game itself; they may have a little playground where you can't die, that type of thing. And there's a sort of gradient where at the start learning is a priority and risk is not, and then they slowly cross over, and over the course of maybe several hours you start playing the real game with less learning. So I'm wondering if you have this vibe with some tools, and what we can learn from gaming with respect to education.

Mark Saroufim: Game development is something I've thought a lot about; I have an article that's been in my drafts for over a year now on what I think tooling people can learn from games. I will say there are a few things people don't recognize about the complexity of building a successful game. One is that the majority of apps we interact with in the world are essentially forms.

Hugo Bowne-Anderson: What do you mean by that?

Mark Saroufim: So a form: let's say you go to amazon.com, you have a search bar, you fill out a form, and then you click "this is the thing I want," then you fill in your address and your name, you put in your credit card, and you click submit. Sure, there's a lot of complicated stuff behind the scenes, and scaling is hard, yada yada yada. But fundamentally, as an experience, it's a form. Whereas from a game's perspective, a form is just your menu; the game itself is a fully breathing world where you may have hundreds of people interacting with each other in real time. And oh, by the way, you need to do everything at 60 frames per second, otherwise people will call your game unplayable. A slow form is still a form: let's say Amazon took a second longer to submit something, you would certainly spend less time on it, and there are a lot of studies on that, but fundamentally the experience doesn't break; it may be unpleasant. But if you're playing, say, a soccer game or a shooting game, and it takes a second for your mouse movement to register, your game is not playable. It doesn't matter how beautiful it is or how great the art is; it's not a game, it's a bunch of software. It's not a game yet. So even just from a performance perspective, this is why I really admire game developers: in ML, I can leave a lot of performance on the table and it's fine, whereas in a game I can't.
But I think from a design perspective, which is what your original question was about: look at a lot of tutorials on websites. I think tutorials are a really difficult art form, for a few reasons. One is that a lot of engineers don't treat them as seriously as their code. It reminds me a lot of back in the day at Microsoft, when testing used to be a separate discipline, where the attitude was, oh, I'm so smart, I don't need to test my code, some other people will do it for me. It's the same thing with docs: people say, maybe I'll outsource that, or maybe we'll have someone who's a documentation writer, when often the developers themselves have the best understanding of what to do. And obviously I have role models here: I have a good friend and mentor, Hamel Husain, who I think is one of the best people at this, because he doesn't build something unless it's useful, and he writes the documentation alongside writing the product. And I think the secret of good documentation is twofold. First, you need to get the carrot very quickly; you need an immediate reward. You can't expect me to build from source and wait 40 minutes for things to compile. You can't expect me to wait 30 minutes for a Docker image to load. You can't expect me to set 20 environment variables. I need to just be able to click somewhere and see a result that makes me feel smart. And this whole making-me-feel-smart thing is really important, because if you make people feel dumb, they're going to say, well, I actually don't really have a need for this, and they're not going to come back. But if you make people feel smart, they're going to feel like, oh, now I see the possibilities, maybe I can integrate this with that. So, as much as possible, what good docs do is get the internet to think of applications on your behalf, as long as you explain the core in a clean way. So you need that first experience to be really nailed, but then you want to progressively disclose the complexity in a way that makes people feel like, oh, I can learn something here. I'll give you another good example. Another good friend, Max Pumperla, recently rewrote a lot of the docs for Ray. Ray is a distributed systems library, and you might think, do you really need distributed systems in your day-to-day life? Well, I don't know, probably not. And the reason I say probably not is that if you try learning about distributed systems the traditional way, it's kind of hard: you need to go learn C++, you need to learn about all these communication protocols. Whereas with Ray, it's just Python code, and then you add a remote decorator on top of your code, and then you can run distributed code. So it makes you feel smart. And I think one thing Max did really well when refactoring the docs was giving more of those moments throughout where people feel smart. Maybe most people won't realize it when they read it; they'll just think, oh, this is normal, good docs. But when I read it, I was like, you know, chef's kiss, this is a work of art. He did a really good job there.
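For context, the Ray pattern being described, where ordinary Python becomes distributed by adding a decorator, looks roughly like this minimal sketch (the function and data here are made up purely for illustration):

```python
import ray

ray.init()  # starts a local Ray runtime; point at a real cluster in production

@ray.remote
def preprocess(shard):
    # Ordinary Python: this could be any CPU-heavy per-shard work.
    return sum(shard) / len(shard)

shards = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# Each call returns a future immediately; Ray schedules the work across workers.
futures = [preprocess.remote(s) for s in shards]
print(ray.get(futures))  # [2.0, 5.0, 8.0]
```

The "makes you feel smart" moment is that the distributed part is two lines: the decorator and the `.remote(...)` call.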
Hugo Bowne-Anderson: I love it. And, you know, from my time in product, I usually refer to such things as a moment of delight for the user, and I think it's incredibly important. Something you spoke to in that example, I think, is that in this quote-unquote modern data stack, we should be allowing scientists to focus on the model building and the data and that type of thing, while giving them access to all the lower layers, which in that example were compute, infrastructure, versioning, orchestration: making sure they're able to write Python, and maybe a bit of configuration, because we kind of have to at the moment, but still have access to all the underlying layers that require more infrastructure. Scientists should not, in my somewhat humble opinion, have to think about all those layers constantly, because it's just cognitive overload.

Mark Saroufim: Realistically, few humans can think about all those problems concurrently. Maybe zero. It's like me, when I'm writing my PyTorch code, thinking about how my OS is going to schedule things and how the bytes are traveling on my GPU. Fundamentally, abstractions should not be leaky. The problem with a lot of MLOps tools is that I don't think that's the case. And the reason it's not the case is that Kubernetes, for one, built a really good abstraction, where they said, look, you don't have to worry about anything that goes on below us; just learn about us and you can do stuff. And that works great if you're willing to spend the time to learn Kubernetes. The tech gets a lot of flak because it's hard to learn, but that's a bit like asking why operating systems are hard to learn: because they're complicated, there are a lot of moving pieces. So I think the challenge when you're building, say, a Kubernetes competitor, or you want to build better abstractions, is that you need to guarantee me that the abstractions are not going to be leaky. I'll give you a very concrete example: if I get a failure, if I get a bug, and I have to Google issues that relate both to your technology and to Kubernetes, your abstraction is leaky. It means I have two problems: I need to learn Kubernetes and how you interact with it, and I need to learn about you. I don't have time for that; maybe I'm better off just learning Kubernetes. I think that's the challenge I've seen a lot of tools face in the MLOps space. I have a slightly different take when it comes to trainers, for example. A lot of companies are building an interface that looks like model.fit(data), and then maybe model isn't even a model; maybe you just say fit(data) because you use AutoML. And I think the challenge a lot of builders there face, whether it's Ignite, PyTorch Lightning, Mosaic, or fastai, is that they all have a trade-off between how flexible the framework is and how much default, out-of-the-box value it gives you. I think this is a ratio a lot of people get wrong: if it's less flexible and I don't get that much stuff, there's probably not a lot of value. But flexibility, in principle, is actually a super hard thing to deliver on. Because let's say someone tells you, hey, I want to be able to compute gradients before this function applies. Or, before you apply these gradients, I want you to log stuff to AWS. Or, before you log stuff to AWS, I want you to double-check whether you have access to these credentials. Oh, and the credentials are maybe stored on the model itself. Who knows; it's a very convoluted example, but there are potentially many requests like it, and they may well exist.
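None of the trainer libraries just mentioned is being quoted here, but a generic sketch of the kind of hook-based design being gestured at, where users can inject arbitrary behavior around the gradient step, might look something like this (all class and method names are hypothetical):

```python
import torch

class Callback:
    """Hypothetical hook points a trainer might expose."""
    def on_after_backward(self, trainer): ...
    def on_before_optimizer_step(self, trainer): ...

class LogGradNorm(Callback):
    # e.g. "before you apply these gradients, log something somewhere"
    def on_before_optimizer_step(self, trainer):
        total = sum(p.grad.norm().item()
                    for p in trainer.model.parameters() if p.grad is not None)
        print(f"step {trainer.step}: grad norm {total:.4f}")  # swap for your logger

class Trainer:
    def __init__(self, model, optimizer, loss_fn, callbacks=()):
        self.model, self.optimizer, self.loss_fn = model, optimizer, loss_fn
        self.callbacks, self.step = list(callbacks), 0

    def fit(self, data):
        for x, y in data:
            self.optimizer.zero_grad()
            loss = self.loss_fn(self.model(x), y)
            loss.backward()
            for cb in self.callbacks:
                cb.on_after_backward(self)
            for cb in self.callbacks:
                cb.on_before_optimizer_step(self)
            self.optimizer.step()
            self.step += 1

# model = torch.nn.Linear(4, 1)
# trainer = Trainer(model, torch.optim.SGD(model.parameters(), lr=0.1),
#                   torch.nn.MSELoss(), callbacks=[LogGradNorm()])
# trainer.fit([(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)])
```

Every hook you add is one more place the abstraction can leak, which is the tension being described: the more general-purpose behavior users can inject, the harder it is to keep the simple model.fit(data) story honest.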
And so how do you develop tools that may need to account for general-purpose programming, yet be easy to use? Learning to program was hard, and I don't think it's going to get much easier. So yeah, I just wonder what the outcome here is. I wish I could see the future; that's not much of an answer.

Hugo Bowne-Anderson: No, and I think in the short to medium term it may get significantly harder with the proliferation of tools as well. But a place I want to go now: something I appreciate very much about you, Mark, is that not only do you have a serious depth of technical knowledge, but you have a curiosity and passion for thinking about the broader movements and, let's say, macro trends in our industry. There are only a handful of people I know who have this inclination, and you do it in the blogs you write. You and I have a shared interest as, I suppose, armchair sociologists of our industry, so to speak. I'll link to it in the show notes, but you have a wonderful post that I think can set the scene for this approach, called Machine Learning: The Great Stagnation. I know you've talked about it so much that I don't necessarily want to go super deep on it, but I think setting the scene, in terms of how machine learning and AI have, in your mind, stagnated over recent years, would be instructive as part of this conversation.

Mark Saroufim: Sure. I will say, in hindsight, the article was fairly misunderstood. What I really meant when I said machine learning is stagnating was that academic machine learning was stagnating. And I started off in a very bombastic way; my intro was pretty much: machine learning researchers are basically the new medieval Catholic priests, telling people what will and won't work, when it turns out that just scaling things has been producing better results. This is something even exceptional researchers like Richard Sutton have said, in his Bitter Lesson essay: you can build all the complicated algorithms you want for reinforcement learning, but it turns out scaling often just beats all of those approaches.

Hugo Bowne-Anderson: And I think it will be instructive for me to just quote you, though I won't do your accent. You open the final paragraphs of your first section, on SOTA, state of the art, with: we've rewarded and lauded incremental researchers as innovators, increased their budgets so they can do even more incremental research, parallelized over as many employees or graduate students as report to them; this is graduate student descent, of course. Machine learning researchers can now, and this is key, can now engage in risk-free, high-income, high-prestige work. They are today's medieval Catholic priests. Mic drop.
Mark Saroufim: That was harder hitting than I remembered, actually. I wonder if I'd be able to write something like that again.

Hugo Bowne-Anderson: Well, I do have a dramatic tendency when I read aloud, so I think that helps.

Mark Saroufim: Yeah. So I think there are a few ideas here. On the idea of risk, let's talk about risk and prestige for a second. I've had a lot of people, even friends, tell me, oh, we're taking a risk with our work, while getting paid by AWS or Facebook or whoever to produce work that's highly relevant for those companies, and then also writing papers about it. So they can literally take whatever would be in their performance review at a big tech company, give it slightly different language, and make it seem like academic work. And this is where I felt like, hey guys, maybe cut the bullshit for a second. For me, what I would consider high risk is not "I'm going to scale this model so much, and boo-hoo, worst case I write a paper about lessons from scaling a large model that a lot of people will read." That doesn't count. It has to be something that, legitimately, no one will give a shit about if it fails. That's my measure of risk. And at least pre-deep-learning, when I was in grad school, I generally felt that was the attitude. People were more interested in all sorts of gnarly math ideas. Could we use quantum computation? Could we use spectral graph theory for machine learning? I felt like there were more ideas that came out of completely left field, where it was like, oh, you want to learn about this? Go read this 300-year-old manuscript, and go rethink the very foundations of the field. So if I were in academia, these are the kinds of problems I'd be excited about. And obviously, on the reward part, it's no secret: you can just go to sites that track AI salaries, or levels.fyi, and see how absurd tech salaries have gotten recently, especially for more senior researchers. For the most part that's fine; I think people doing valuable work should be rewarded for it. I'm not advocating for any redistributing-the-means-of-production sort of thing when it comes to compute or anything like that. What I am saying is that I would hope that people who have built a safety net with their reputations or income would actually take some real risks and do things that are kind of different, and not necessarily "ML applied to problem X," or "I used a minor result from field Y, applied it to ML, and improved result Z by Q percent." And honestly, I'm giving you this template because this is generally how a lot of ML abstracts read to me nowadays: deep learning is very important and is solving many problems in vision, language, and reinforcement learning; cite the first big application, AlexNet; cite optimizers like Adam. A lot of it reads exactly the same to me. And I'm thinking, hey, I want the ideas out of left field.
hugo bowne-anderson This is fascinating for me, because it reminds me a lot of my time working in biology. I mean, every grant application I had, I was working on something that would solve cancer or Alzheimer's — mostly cancer — right? That was in the grant applications. On top of that, the academic model — with wide variance, on average, I will say; I can already see people emailing me after I say this, and that's not the point at all — is: if you, for example, find a protein that does something cool to the mitotic spindle and you get a Nature paper or a Cell paper or a Science paper, the top three journals, then you get your lab and can hire a bunch of grad students to see if other proteins do it too. So you essentially have a rank-ordered list of proteins that you parallelize over grad students, with oversight by maybe postdocs, and maybe some master's students doing a bunch of the pipetting. That resonates not only with what you have to say in order to market it, or LARP it — which we'll get to — but also with the idea of parallel distribution, which, in the case of deep learning, can involve things like looking at a particular part of the hyperparameter search, or the grid search, or whatever it is. And I think this speaks to a point you've made, and maybe we can drill down into it a bit more: that deep learning is an experimental and empirical science rather than a theoretical one. mark saroufim I 100% agree. I had an algorithm in there as a joke, but I'm pretty sure it works, which I call graduate student descent. The premise is this: you start from an existing strong baseline — you pick some existing state-of-the-art work — you take this work, and then you make a bunch of random changes to it. And if any of those changes are positive, you publish a paper. The nice thing about this approach is that, as you pointed out, you can parallelize it over as many researchers as you want. But there's another feedback-loop aspect here, which is that having the infrastructure to try out different hyperparameters quickly means that once you've solved the initial problem of how to run the first experiment, it's much easier to run subsequent ones. So now imagine you're at a lab where people have already built the baseline for a lot of the experiments, versus a lab with no compute budget doing whatever it wants. It sort of explains why a lot of recent ML work has come out of large labs — it's a game that really benefits them. And I'm not saying they're doing it maliciously; no one's sitting in a room cackling and smoking cigars, saying "this is what we're going to do to ML research." It's just a natural incentive that arises out of differences in funding, differences in promotion cycles, differences in people's risk tolerances. And because of those incentives, we observe these effects. Incentives are something I love looking at — I guess it's the armchair sociologist in me that gravitates toward wanting to analyze them more and more.
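A tongue-in-cheek sketch of graduate student descent as Mark describes it — the config, the toy evaluate function, and the function name are all hypothetical placeholders, not anyone's real research pipeline:

```python
import random

def evaluate(config):
    # Toy stand-in for "train the model and report a benchmark score".
    return -abs(config["lr"] - 0.01) - abs(config["batch_size"] - 64) / 1000

def graduate_student_descent(baseline, n_students=10):
    best, best_score = dict(baseline), evaluate(baseline)
    for _ in range(n_students):            # parallelize over grad students
        candidate = dict(best)             # start from the SOTA baseline
        key = random.choice(list(candidate))
        candidate[key] *= random.choice([0.5, 2.0])   # a bunch of random changes
        score = evaluate(candidate)
        if score > best_score:             # any positive change -> publish
            best, best_score = candidate, score
    return best, best_score

print(graduate_student_descent({"lr": 0.1, "batch_size": 32}))
```

The joke lands because the loop body is embarrassingly parallel: once the first experiment runs, every additional student or GPU is just another iteration.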
hugo bowne-anderson Yeah, and also with your admiration of someone like Taleb — I mean, I think the way he thinks, and something I read earlier, you know, his book Skin in the Game: the idea that he hates nothing more than risk-free work, and people who aren't invested in their own positions. mark saroufim Yeah, I have the dubious honor of having been called an imbecile by Taleb on Twitter, which — I was actually very upset for the whole day. hugo bowne-anderson Yeah, I'm kind of jealous of you. I feel like, from him, that's almost a compliment. mark saroufim I'm not gonna lie, I was really upset for that whole day; I had to take a walk outside and clear my head — what am I doing? Wow. But the reason I took it to heart is because I really do think he's one of the smartest people alive today. I think his perspective is that a lot of people in multidisciplinary fields always have this problem where, let's say, a real sociologist may think my sociology is great, and then an ML person may think my sociology takes are not that useful. Multidisciplinary people often have this problem, and I think Taleb has it too, where people just don't get it. They look at fields like, I don't know, risk analysis, and they're like, well, we've proved with 99% certainty that we can't have a nuclear war anymore. And when you think about that statement a bit more carefully — what the hell does that even mean? You don't need a PhD to figure out that, sure, maybe the risk of nuclear war is very low, but if it does happen, it's so catastrophic that you have to do something about it. And a lot of stuff is like this: climate change, COVID. hugo bowne-anderson Absolutely. And risk is multidimensional. We project it down onto a single decision-making process, but I don't think we talk enough about how we need to think about likelihood and impact and all the different aspects of whatever matrix you're looking at. mark saroufim I think this is similar to looking at a stats book, understanding the formulas, and solving problems with them, but not necessarily being able to apply them to anything that actually involves you, for example. Yeah, exactly. I think another example is just survivorship bias. When people ask me for career advice sometimes, they'll tell me: hey Mark, you're saying ML is stagnating, so should I not apply to that ML internship at Google? And I feel like you'd be stupid not to — you're gonna make a lot of money, you're gonna learn a lot, you're gonna have really smart peers. But what I'm really telling you is: look, once you've made money, once you have a million or two in the bank and you're comfortable with your own reputation, maybe you can take some risks. Why not? Maybe the threshold is less than that. But if you've gotten to that level of income or savings and you're still saying "I can't take risks" or "I'm not allowed to," I think that's more of a mental block than an actual reality. And I think that blocks you from the next level of potential. Yeah, you could be a senior engineer, you could be a middle manager pushing paper, very Office Space-like — and that's fine if scaling a model is what gets you there, and just being part of a team that does that; by all means, do that early in your career.
But over time, I would encourage people to take more risk. Back to Taleb — he calls it the barbell strategy, which is: do a really safe thing and a very risky thing. So, for example, do the scaling work, learn Kubernetes — that's the lower-risk front. And then the high risk is, I don't know, go write a quantum machine learning paper, become a dancer — something totally different, because that way, if it pays off, the payoff is going to be very high. Where I think people get it wrong is they'll do things like: let me learn MLOps and Terraform, now let me learn MLOps and Kubernetes, now let me learn MLOps and Airflow, now let me learn it in Metaflow. You're learning all of these skill sets that are very similar, and none of them is going to make you that much richer incrementally once you know one of them. So maybe reevaluate the way you're balancing your learning portfolio. hugo bowne-anderson Yeah, absolutely. I love the idea of a learning portfolio. I want to dig deeper into your sociological tendencies — something I'm interested in is how you've started to identify emergent class structures in our industry. And I wasn't going to do this, but because I read a bit of you to you before, listeners, I'd love to read a bit more. I'd love to read the opening of the first essay I read by you — it's absolutely beautiful, and it moved me a lot — called A Working Class Deep Learner. Okay, I'm going to try to do this with a straight face, because it's hilarious, but it's also incredibly challenging. Actually, before I do, it reminds me: my dad, a while ago, read a relatively recent biography of Franz Kafka, and there's a section where the biographer talks about when Kafka and Max Brod and that whole crew were living in Prague. Every Friday night they would get together and Kafka would read them what he'd written during the week, and supposedly they were all on the ground laughing, clutching their stomachs with tears rolling down their faces, because they thought what he'd written was so funny. And I think we don't always appreciate the humor in Kafka, right? So I suppose the point of telling that story is that this may be funny, but there's always the flip side, which is deeply serious as well. So I've just done a critique before reading it, but without further ado, this is the opening of Mark Saroufim's Working Class Deep Learner. You interviewed with DeepMind. This was your moment. You were told you're a promising young lad. You aced the first couple of programming screens — kid, you're going places; you can work here for a year, and I can set you up with a PhD at MIT. But in that last interview, you stumbled on implementing vanilla breadth first search, spiraled into self doubt, blacked out, and woke up remembering feedback like "you need more keyboard time." And that was it. Your dreams of a life where you pursued your interests slowly evaporating, you begin to tell yourself that maybe you should follow your strengths and instead apply to ML engineering roles. You will learn about real businesses and solve real problems for real people. For all the dreams you had, with the confidence instilled in you by your loving mother, you now realize that in a room you're neither the smartest nor the dumbest engineer. You're nice, but not charismatic.
You never got around to writing that book for Packt. Your belly is slowly growing as your metabolism isn't — sorry — isn't what it used to be. You mostly keep your head down. You try to keep up with the buzzword du jour that senior management is pitching, and keep chugging along on your PowerPoint presentation so that they can finally see the value you bring to the company. You're squarely in the middle. New papers overwhelm you. New tools and languages make you feel your obsolescence. ML is growing like crazy, but for whatever reason your salary has barely kept up with inflation. If that's you, then this guide is for you. How do you manage life as a working class deep learner? mark saroufim I should get you to record my audiobook or something. hugo bowne-anderson Oh, dude, I'd read your audiobook — I'm so into that. I think that's actually an incredible opening to something which then delves into a lot of the reality of what it's like to work in these types of positions. And I'm interested in what you can tell us about the types of class structures you've identified emerging in our industry. mark saroufim Sure, yeah. I do want to preface something that was maybe clear to some but not to others: I'm often the butt of my own joke, so that person is actually me. And that's a true story, too — except it wasn't DeepMind, it was MSR. I did bomb an interview, and that is exactly the feedback I heard. And so it does feel like — I think what gets a lot of people into ML in the first place is that they're attracted to what OpenAI and DeepMind are doing right now. There are these insane demos, stuff like DALL-E or OpenAI Five, and you're like, holy shit, what is this? And then you look at your own work, and you're flipping through YAMLs and wondering why there's a file-not-found error in your logging system. hugo bowne-anderson Why am I still dealing with delimiters? Or why can I not get the correct version of TensorFlow installed? mark saroufim So I think the goal of that article was, one, to give a name to that kind of engineer, because I don't think ML engineer was really it, and I don't think data scientist was either — maybe ML engineer is sort of it. hugo bowne-anderson Well, it's an overloaded term as well, because a lot of the things you describe in that essay, some ML engineers do and others do not, right? mark saroufim And I guess what I really wanted to do was give people some hope that this work is actually kind of deep, even if it doesn't feel like it up front. You don't need to learn about compilers, but if you do, there's a lot of interesting stuff you could do. Profiling is one of the oldest disciplines in software and it's still not figured out — there's a new profiling tool coming out every day, and they're all pretty great. Yeah. hugo bowne-anderson And something I just want to make clear for the listener: we won't go into this in a lot of detail, but I'll include a link to that essay in the show notes. Mark, you do go into a lot of detail about the kinds of things it's useful to know, such as compilers, if you want to start doing this type of work. So you're not explicit about it, but it does provide a kind of rubric for what to learn if you're interested in getting into this type of engineering. mark saroufim Oh, I forgot your question, though. So your question is: what are the different archetypes in deep learning? Yeah.
So generally, the ones I've heard in the past are data scientist, ML engineer, ML researcher, cloud engineer, whatever. I never particularly liked those titles. The one that resonated more with me is the working class deep learner, which is what I call people who are worried about solving some application problem. These are people who need a very broad skill set, and their pay variance can be very high, because it depends where they work. Whereas with researchers — specifically the top researchers at places like FAIR or OpenAI — the growth isn't linear. It's a bit like musicians, right, where you have, I don't know, Cardi B making a lot of money — good for her. I wouldn't say it's as imbalanced with top ML researchers, but it is more power-law distributed, because for whatever reason the best work is consistently being produced by the same labs. It's kind of incredible: DALL-E, DALL-E 2, OpenAI Five, OpenAI Gym, GPT-2, GPT-3 — all the same company. That's kind of nuts when you think about it. It's not just something about the culture; I also think it says something about the talent that's there — there are just some wickedly smart people working there. So are they worth twice as much as a normal ML researcher? Depends — maybe it's closer to 10x, maybe it's closer to 100x. hugo bowne-anderson You're also asking a relatively deep question about what value is, and how we create value and how we price value. I don't know whether you've read this work of Graeber's, but the first book I read by David Graeber is Toward an Anthropological Theory of Value, where he very much goes into how we price and value things. mark saroufim I haven't read that book; I've read two of Graeber's other books, Debt and Bullshit Jobs, both of which I'd highly recommend — maybe Bullshit Jobs will resonate with people listening to this. hugo bowne-anderson Absolutely. mark saroufim I will say, in general, the way I think of value is the way — what's the name of that Austrian economist, the one whose name starts with von? hugo bowne-anderson Von Mises? No — Hayek, Hayek. And I'll actually include a YouTube video in the show notes: have you even seen the rap battle between Hayek and Keynes? I've learned more about macroeconomics from that rap battle, which I watch a couple of times a year, than from anything else. mark saroufim Hayek has a really interesting paper on basically what value is. I think the way he describes it is that it's information: a price is basically telling the market that some skill has low supply and high demand. That's what it is. For example, take a simpler case: I'm selling strawberries, and in San Diego strawberries are very expensive, but you live in Australia, where strawberries are really cheap. If they're so cheap that it justifies you exporting strawberries to San Diego and undercutting me, that is now the new value of strawberries. So my expensive strawberries are actually a signal to other people on the planet: hey, you should come sell strawberries here. And I think there's something similar here, where there is, frankly — hugo bowne-anderson I think it is signaling, but I also think that applies to certain forms of market economies.
And a lot of the time it isn't — think about speculative bubbles, right, which is kind of what we might get to. When we have these types of things, how do we actually define value, and does that definition still hold? I suppose in gifting communities, value takes on very different forms that aren't market-economy forms. mark saroufim It could be the case. I do think a FOMO aspect is driving these high salaries, because smart employees have leverage in that they can go work in other places, and if you exercise that leverage every couple of years, very quickly you can be making a small fortune. hugo bowne-anderson There's actually a joke that at least several people I know who've worked for FAANG find it difficult to get a promotion — maybe they can get certain raises at the company they're at, but in order to get the real job and the real salary they want where they are, they have to go to Apple for two years and then come back, in order to get that series of promotions. Which, once again, seems totally bent. mark saroufim I have a slightly different take — an addendum to that. I agree for 90% of people. My observation is that the people who built tremendous value at these places — and by tremendous value, I mean people like Soumith, or Ed, people like Adam Paszke, who built something like PyTorch — couldn't have built it without spending many years doing it in the same place. And once they do it, some of them — hugo bowne-anderson Well, look what happened when Chris Lattner went to Google to work on Swift, right? That fell apart, right? mark saroufim I'm not too familiar with why Chris left all those different places, but certainly I think he has the brains to attract really smart people to come work for him, and it did fall apart very quickly after he left — you're right. So losing your founder really matters. And the majority of software projects don't necessarily require a founder, especially if they're being maintained; they require more someone who's very consistent and very hard working. That's a different skill set from early projects, where it's closer to that of a founder. Obviously, being a founder is, on average, worse than being the CEO of a publicly traded company, but in the limit it flips, right — in the limit, being a founder is actually much, much more beneficial. So for most of us normies, as I would say, maybe that isn't the best strategy if you want to maximize wealth. hugo bowne-anderson So I want to get back to the emergent structures you've noticed, and then I want to get onto something we've been speaking to, particularly around the creation of value, and the idea of speculation as getting people on board before value is demonstrated — there's an argument that you create PyTorch and it's not valuable until it's adopted at scale — and what the role of marketing, like LARPing and kayfabe, is in this; we'll get to those terms in a second. Before that, though: we've identified working class deep learners, we've identified some people who are similar to medieval Catholic priests. If I were to push you and ask who the barons are, and the serfs, and the blue collar workers and landowners — these tropes we have from our vision of how class structures have emerged over the past several centuries — who would you identify? mark saroufim Serfs?
Yeah, I think in terms of serfs, it's basically anyone perpetually renting from the cloud providers, because regardless of how useful your ML job is, they're making money — it's usage, not utility, that they're charging you for, however we quantify that; maybe that's complicated. You could make the case that they have a perverse incentive to push large models or something; I don't think that's true. My take is that large models seem useful for the most part, large companies are willing to provide you services to launch those large models and leverage them, and they should be compensated for that. So the question becomes whether you're prematurely becoming a serf. Maybe you're just fine using your local desktop, in which case you've bought your property from Nvidia — you've bought a GPU — so you're more like a homeowner. But if you're just constantly renting, maybe that's fine while you're exploring, but over the long run... hugo bowne-anderson So wouldn't that make the cloud providers the landlords you're renting from? Or barons of some sort? mark saroufim Yes, the landlords, I guess. I don't know — what are Nvidia and Intel, what are hardware providers in this analogy? They're the contractors, or the builders. What do you think? hugo bowne-anderson Yeah, I like that. There's maybe a sense that they are — well, not landlords per se, but builders, or people in the hardware supply chain, right? Those types of vendors. mark saroufim So outside of that, you have all the typical tropes — PMs, middle managers, all the roles you'd see in any large organization; the same ones exist here. But the trope I'm most excited about is the ML slash cult leader persona. I think a couple of people at Hugging Face fit this profile, specifically Clem, the CEO there. I think they've built an interesting engine — I don't know how much money the company makes at this point, but as far as mindshare goes, it's an incredible company, and its growth in open source has been mind boggling. I think there are a couple of things people should pay attention to here. One take I've heard is: NLP is very useful now, that's why Hugging Face is growing this much. I think that's a very shallow take. I think the reason Hugging Face is growing so quickly is that they figured out how to leverage the creativity of the internet at scale. And by at scale, I don't mean they open sourced the code and gave you a CONTRIBUTING.md that says "we accept all contributions, good luck, make a PR." In fact, what they do is scope out contributions for you, the same way a good manager would if you were, say, an intern. So they're gonna say: hey, go to this class, it's called Transformer, create this new model; now, if you want to upload it, call model hub dot upload. Or hey, you want to create a dataset? Go on Google Images and annotate a bunch of data, use your personal photos, use your text messages, annotate them, put them in this format — great, now call model hub dot upload dataset. And that's it.
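A rough sketch of what that scoped workflow looks like in practice, assuming the Hugging Face transformers and datasets libraries; the repo names are placeholders, and you'd need to be logged in to the Hub first (e.g. via `huggingface-cli login`):

```python
from transformers import AutoModelForSequenceClassification
from datasets import Dataset

# "Go to this class, create this new model... now upload it."
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)
model.push_to_hub("my-username/my-tiny-classifier")   # placeholder repo name

# "Annotate a bunch of data, put it in this format, upload the dataset."
dataset = Dataset.from_dict(
    {"text": ["great movie", "terrible movie"], "label": [1, 0]}
)
dataset.push_to_hub("my-username/my-tiny-dataset")    # placeholder repo name
```

The design point is that the contribution surface is a couple of method calls, not a months-long ramp-up on a codebase.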
So by scoping contributions, and then making those people feel like they're valuable in the community, they make you feel like: I did something, I was part of something bigger than myself. And I didn't need to spend months of my life ramping up on a new codebase — versus, I don't know, try contributing to the Linux kernel, where it's like, go sign up on this mailing list... hugo bowne-anderson And to your point earlier, it also makes you feel smart, and welcome, and part of an in-group. mark saroufim 100%. I have an example I love giving, which is very subtle. Many months ago, I found an API that was deprecated in one of the Hugging Face tutorial examples. So I found the notebook with the issue, I fixed the notebook, and I submitted a PR. Turns out I was actually fixing an auto-generated notebook, so I didn't fix the issue — I fixed the symptom. So what they did was say: thank you for your contribution. Then, on top of my PR, they added a commit to fix the actual issue, and they merged the PR. That way I show up as a contributor to the tutorials, and I felt good and clever — as opposed to "this is not the real issue, I'm going to close it, here's the real issue," where you feel dumb and embarrassed: oh, I just embarrassed myself in front of a Hugging Face or PyTorch engineer, I'm so dumb, whatever. Whereas instead you're like: oh, I did something, I'm valuable, that's cool, that's awesome. I try to do the same thing with repos I manage nowadays. I don't do as good a job at it, but that's sort of my North Star. hugo bowne-anderson Beautiful. So we'll also link to an essay of yours called The Rise of Hugging Face, which goes through a few of these ideas. I do want to now move into the realm of simulation — I think it's time to enter the matrix, Mark. Yeah, why not the crazy stuff. What I mean by that is the role of live action role playing, and the role of kayfabe. I'm just going to give a quick definition of kayfabe for people — I'm reading this from Wikipedia: in professional wrestling, kayfabe, as a noun, is the portrayal of staged events within the industry as real or true, specifically the portrayal of competition, rivalries and relationships between participants as being genuine and not staged. There's an emerging literature on the interactions between reality and kayfabe — how certain things that happen in the theater of kayfabe and professional American wrestling actually become real in the participants' lives — which I think is fascinating. But really where I want to get to is the role of marketing, simulation and speculation in manifesting yourself and creating worldviews, shared realities, communities. I think Elon Musk is a wonderful example of creating a speculative vision of himself that he then tries to manifest once people come on board with it. To that end, I want to read the TL;DR of your Hugging Face essay, because it speaks to this, and it's enjoyable. So, TL;DR: one, online communities are replacing religions; two, dev tools need to be open online communities; three, Hugging Face is a great reproducible case study for machine learning startups; four, have fun, make friends, LARP more. So what's the role of LARPing, kayfabe and simulation? mark saroufim It's funny you mention kayfabe, because I grew up watching so many WrestleManias with my younger sister. hugo bowne-anderson You grew up in Lebanon, correct? mark saroufim I did, yeah. We'd have to get bootleg DVDs and wait forever for some new DVD, but we loved it.
hugo bowne-anderson I won't tell the Undertaker that you were bootlegging him in the 90s. mark saroufim Oh no, not the Undertaker. So the Undertaker — it turns out the Undertaker, this seven-foot-tall huge dude, arm sleeves, long dark hair, very brooding vibe — actually stayed in character the whole time, even when he was off set. He was the Undertaker. I don't even know what his real name is; I just know: that's the Undertaker. Right. So I never predicted, if I'm being honest, how big a role this would play in normal day-to-day life. But I think it went into overdrive when COVID started — maybe even before, but that's really when I felt it became the zeitgeist. I'm at home all day, I'm really scared of going outside, because if I touch surfaces I'm gonna die, you know? So I'm sitting there watching lots of YouTube, and I'm on Twitter a lot. hugo bowne-anderson And you'd spread it to everyone else as well, right — before you die, you kill everyone around you. mark saroufim Exactly. So in that setting, I started paying a lot more attention to certain online writers I found were picking beefs or inventing words. I have a few examples. I think one is Taleb — he's a good one, because he invents all of these characters, like Fat Tony, who's the down-to-earth — hugo bowne-anderson Yeah, the street smart guy. mark saroufim The street smart guy. Then you have Nero, who's academic and very clever, but doesn't have a great dating life and struggles to make money. And there were other examples that came up. There's one called Roon, who for all we know is just some random dude who works at Facebook; maybe he invented these terms. hugo bowne-anderson This is Roon on Twitter, right? I've encountered him as well — them, or whoever they are. mark saroufim So he invented two terms: shape rotator and wordcel. Shape rotators are people who are good at math, programming, systems building, and wordcels are salespeople, sociologists, English majors, writers, et cetera. And that's it — he just invented two words. They were kind of funny, and they just took over. hugo bowne-anderson But also, it's just a new take on whatever we thought was the left brain, right brain crap back in the day, right? It's not even new conceptually. mark saroufim Clearly it's not new conceptually, but it's such a nice reframing of it. hugo bowne-anderson It is, and it took the internet by storm. Wordcel is a great word, and it plays into so much of what we think about these days. mark saroufim There's another example: this guy called Mike Solana. He initially became kind of notorious for getting into fights with the mayor of San Francisco about how they're running their city poorly. And then, as a joke, he's running for mayor. And while he's running for mayor as a joke, the actual Mayor of San Francisco said, oh, these VC billionaires are doing such and such. He's probably not a billionaire, but then you just wonder whether he's like: yeah, I'm the billionaire, and I'm running for mayor, and I'm gonna do all the stuff.
And by all means, I think if he did legitimately run for mayor, he could actually pull it off, because he has this very loyal following online of people who think he's hilarious; he's obviously smart, he's a good writer. It goes to show you, I think, how much of a performance politics, at least, has become. hugo bowne-anderson Is 45 an example of this in some way? mark saroufim 45? I haven't heard of them. hugo bowne-anderson The former president, Donald Trump — 45th President of the United States — who, as far as I'm concerned, was as surprised as anyone at getting elected in 2016. mark saroufim I definitely think he distorted reality in his favor. hugo bowne-anderson A very kind way of putting it. mark saroufim I do remember when he got elected, I was kind of shell shocked — what the hell, this is a joke, and I'll wake up and everything will be back to normal. And then: oh, no, this guy actually did it. I think that's an example of what it's like to kayfabe for, let's say, more selfish reasons. hugo bowne-anderson Deeply narcissistic and destructive reasons. mark saroufim But I do genuinely believe the same principles apply, in the same way that marketing can be good or bad depending on what you're marketing. If you're marketing tobacco, that's bad; if you're marketing a tool that solves people's problems, I wouldn't say that's bad — that sounds like a good thing, right? hugo bowne-anderson Sounds great to me, particularly when there are so many tools. I want to ask one more, maybe slightly controversial, provocative example. Is there an element of kayfabe around the power of machine learning? For example, Google has a lot of very incredible machine learning, of course, as does Meta — Facebook. But I don't know whether, with Google and their machine learning, there's some sort of Wizard of Oz vibe, where really it's a huge infrastructural challenge and maybe the machine learning itself is relatively straightforward. mark saroufim I think the best answer there is: talk to the accountants. If you were to just look at public records of how much money those companies are spending on data centers focused on AI, it's in the billions of dollars. That's not kayfabe, because you can't fake it. hugo bowne-anderson But my point is, it's the infrastructure, not necessarily the AI, right? It's called AI, but most of it is very important key plumbing, essentially. I agree it's not thousand-layer neural nets and blah, blah, blah. mark saroufim There are a few aspects to your question. One is: if you copied the ML infrastructure at Google, would you be successful as a startup? Hell no — you'd probably be more likely to fail. The reason is that Google and Facebook and many of these other internet products generate a lot of usage because they're popular products — they've probably found product-market fit. hugo bowne-anderson Thanks to Metcalfe's law and a lot of network effects, right? mark saroufim Yeah. And I think, given sufficient network effects, I view machine learning as something that builds the moat — it takes your moat into overdrive. hugo bowne-anderson Let's just define what a moat is: for a business, it's what defines their competitive advantage and differentiates them from other people in the same space.
And we're using the moat around a castle as the linguistic tool here, with the crocodiles and stuff. mark saroufim Yeah, yeah, exactly — that's the picture, with drawbridges. So I think, from their perspective, because this is so important, and because usage is really the strategic asset, investing a lot in machine learning makes tons of sense. I do think, though, there's one potentially positive thing here for smaller companies, which is pre-trained models. When I talk to people, they view them as something that's only accessible to large companies, and I say the opposite — that's not true, because you can copy the weights. Now you have this model, this asset, that's infinitely reproducible. Maybe you can't train them from scratch — maybe that's true. And maybe there's going to be research showing large models aren't actually all that important; maybe that's the case, I don't know. But even if not — even if large models are the only useful models — the fact that you can just copy-paste the weights and share them freely is a huge thing for making this tech accessible to more people. It's also one thing I think people get wrong when they talk about the energy consumption of these models. I've seen studies say things like: training a large model is a bit like the lifetime ownership of a car. That's a lot, right — a car is burning fuel and spewing stuff into the air. So that's true. But on the other hand, is a car being used by millions of people? Is a car copyable — can you copy-paste a car with minimal energy cost? Not really. Can you use a car to drive in San Diego while the car is deployed in Antarctica? No. But you could train a model wherever you want on the planet, wherever energy is cheap, clean and abundant. I do think there's a lot of innovation that could be happening in the energy space. It's not something I'm too involved in — I don't know much about building data centers — but I do hope someone takes my take here, runs with it, and builds an amazing company. hugo bowne-anderson Yeah, absolutely. So I want to get back to LARPing. I've defined kayfabe; I just want to give a definition of LARPing for people who haven't heard of it. It's live action role playing — I'm reading from Wikipedia again: a live action role-playing game is a form of role-playing game where the participants physically portray their characters and pursue their characters' goals. I think probably the most famous example in our cultural consciousness is medieval LARPers, right, with their big medieval fairs and all of that. So how is that relevant to the modern world of marketing? We got into the time during COVID, so maybe you could dig deeper into that. mark saroufim So, for some context, I really enjoy DMing — being a dungeon master in Dungeons and Dragons. What that effectively means is I create some interesting world for my friends to play in, with characters and stuff to do, and I have to react to them — I have to improvise a story along the way. And I don't think any game, digital or otherwise, comes close to the creative possibilities you can have there.
A lot of people I've seen play Dungeons and Dragons for the first time are most familiar with combat, because it's very time consuming. I've always found that the most boring part of the game — the most exciting part is the stories you create with your friends. It's a bit like doing improv with your friends, but scoped so that it's easier for anyone to take part, even if they're introverted or shy. So, back to product marketing. A lot of the more legacy marketing I've seen has been dominated by case studies. People will say something like: are you interested in making your models faster? Click here to learn how company W used technology X to make their models Z percent faster. I don't think anyone reads those things. If the speedups are substantial — if it's 300x or something — people will read it. But if it's 5% or 10%, you're like, who cares — I probably have some bug in my code that accounts for more than that performance difference. So the motivation, and the stage for this, is that a lot of stories in traditional dev marketing are boring; no one reads them. However, if you look at successful researchers in ML — say people like Karpathy — they have these gigantic Twitter followings. And I think this is a very bitter pill to swallow for people in marketing, because it's like: wait a minute, this person is paid a lot more than me, they're very good at research and math, and they have a knack for marketing — they have the metrics to show for it — yet they were never trained in it and never formally did it. hugo bowne-anderson And on top of that, they have — and I'm going to put this in serious quotation marks — quote-unquote authenticity that I, as a marketer, could never have compared to someone like Karpathy, right? mark saroufim So when I think of the new marketer, either it looks like someone like Karpathy, or it's someone who can incubate people like Karpathy within their company, and create a culture where it's easier to have multiple voices inside a company. Again, back to Hugging Face: Thomas, and every single employee at that company, has a Twitter account and is constantly posting about their work. I've never seen any other company in my life do this. It's always: no, how dare you, you have to go through me, comms — I am the authority on what you can and cannot say. And I think people who are going to gatekeep content are in for a bitter lesson; I don't think those jobs are going to be around. But I do think there is going to be a role for the content VC, as in: you're a person who can identify promising talent and help them build interesting stories that benefit the company you're responsible for. That's how I view the marketer. And I think we're going to have the hybrid marketer — let's say people like Soumith, for example — who just naturally have a knack for building an audience while also being very talented developers. I think a lot more entrepreneurs and developers are going to look like that, and people who want to stay purely in marketing are going to have to take on this talent-incubation role more than a moderator role. hugo bowne-anderson And is this something you think is not necessarily particular to tech? mark saroufim Oh yes. It's interesting, right?
Like, whoever the product person at a company is, I guess, owes it to themselves to know a bit of marketing, because it's just an incredible value multiplier. hugo bowne-anderson But even if we think about the current state of broadsheet newspaper journalism, or if you're a tattoo artist, or a jewelry maker, or a musician, or something along those lines — the ability to market yourself, and to LARP yourself into existence, I suppose to fake it until you make it, is more pronounced given how much of what we do is online, and of course coming out of the pandemic. mark saroufim You're reminding me of my hairdresser in Lebanon. I never went to a different hairdresser my whole time there, and whenever I called him — yo, can I get an appointment? — he'd say, oh no, no, so busy today. And I'd say, please, please, could you fit me in? He's like, sure, sure, come in, but be here on time. And then I go there and no one is there. And I keep thinking, okay — I go with it, I never point it out, but he's obviously hustling, pretending, creating this sense of urgency and stuff like that. hugo bowne-anderson Absolutely — a sense of scarcity as well, right? That's part of Bitcoin or something. And it's how you build a business in some ways, right? There's the joke that if there are two food carts and one has a long line, I may take the time to stand in the one with the long line because I think it's more popular. mark saroufim Right — it's probably the better one. That's probably a more reliable signal than any other signal you could get: are people there? I think various industries have figured this out. Restaurants are now all on social media; a lot of creative work too — artists and illustrators on Instagram. A lot of armchair philosophers are more on Twitter — that's more the camp I fall in. hugo bowne-anderson Yeah, and I think cinema has done it as well — I think the Marvel universe has done it incredibly well. One of the first, most cynical takes I saw on it was the Hunger Games trilogy — this is going pretty far off, but it's actually key — where the second Hunger Games movie gave absolutely nothing. It didn't have to give anything, because it knew that if you'd watched the second one, you were going to see the third regardless. You've already seen two, and with all this sunk-cost-fallacy bullshit, you're going to pay for the third even if you hated the second, most of the time. So they actually didn't have to offer anything at all. mark saroufim Oh, they're not reliable, though. Like, I watched the first Hobbit and I couldn't do it — I can't justify a nine-hour movie for a 150-page book; something felt off in that ratio. But for sure, where I do think you're right is that products resonate with people in a core way. I think scarcity is one primitive signal we have, because we don't want to die, right — we want to have resources around us to be safe. So scarcity relates back to a very instinctual survival drive; that's why it's so powerful. hugo bowne-anderson And FOMO as well, right — it's deeply tied into FOMO.
mark saroufim But I think there are other needs, higher up on Maslow's hierarchy. For example, in Unity's case, their game engine pitches a lot to the first-time developer, even though it's likely that first-time developers aren't going to be the ones making a lot of money with their games. It's more that: I'm here at my corporate job, but look, I can make a game — and that's why I got into being a developer in the first place. I can create a world, and in this world I can have whatever reality I want. Isn't that amazing? I think this is very powerful. And back to dev tooling — let's say normal dev tooling, MLOps or something — I think this is why extensibility and community are so important. Because if you don't make me feel an emotional need, the best you will ever be is helping me solve a problem at my day job. And that's great, but given the choice, would you be at your day job if they were not paying you? I would imagine for the majority of people the answer is probably no. So that's why I think it's important to tap into a more basic need, and to recognize in your product what that is. hugo bowne-anderson And — I don't even know which post this is from; I've got a bunch of screenshots of your posts, so we can figure it out at some point — but the H2 is Dev Tools is Actualization, and you've written: people use software not only because it solves a business problem, but because it solves a core emotional need. Then you've embedded a tweet of yours, which I really love, that says: popular dev tools aren't just solving a problem, they solve core emotional needs — Hugging Face makes you feel smart, Unity makes you feel like a kid again, GitHub makes you feel seen, fast.ai makes you feel like you belong, VS Code makes you feel like a tinkerer. mark saroufim Yeah — GitHub and being seen. Look at LinkedIn, for example. LinkedIn is a portfolio that roughly measures prestige, because it looks at your seniority, the companies you've worked at, and your titles. So it's more of a filter — a very coarse filter, but that's effectively what it is. Whereas with GitHub, it's pretty much: let's say you're asked to do a tech screen, you can be like, hey, you know what, I don't think a tech screen is very useful in this situation — here's my GitHub project that shows I can do the thing you need for this job, here's a specific project. And that's kind of why I think open source will win, in the sense that it aligns people's needs for vanity and wanting more money with corporate needs, which is making more reliable software. It could be the case that closed source is actually better — companies like Tesla are more or less entirely closed source and they build pretty amazing stuff. However, engineers who work there need to rely more on prestige if they want to get another job, whereas engineers who work on open source have this uncancellable brand they can use. It's an asset — I think it's worth like 100k in your comp or something — and it's just not something people account for when they look at total comp on levels.fyi or something; they don't look at it that way.
But I certainly do. I think you should think of open source as a concrete benefit that will make you much more marketable in the future, and I think in the long run it will help you make a lot more money. hugo bowne-anderson Absolutely. There's one other post of yours I want to chat about a bit. It's called All Hail the Cloud King. We talked about cloud providers as being landowners in many ways. I don't know a lot about feudalism, but what I do know is that feudal lords never actually owned land — it was always leased or loaned to them from the crown, the king, and essentially God; that's how the structure worked. So when we're thinking of cloud providers as landowners, we can also think of them as countries, as deities in some ways, and these are some of the things you touch on in All Hail the Cloud King. I'd love your general thoughts — it's actually an incredibly nuanced essay that we can't get into all the details of — but in what ways can we view AWS, for example, as being its own meta-country? mark saroufim So, some quick background here. The kinds of civilizations we ended up with, and the forms of governance, are often very dependent on the kinds of technologies available at the time. For example, say you have farms: farms are really hard to secure, so you need a military, violence — but at the same time you're just a family; who's going to do that violence? So it makes sense for you to basically pay taxes for protection to your feudal lord, who then leases the land from your king. hugo bowne-anderson And it isn't only having the land and the farming — it's when you have agriculture at scale that you have surplus, and then you need to defend that surplus in some way as well, right? mark saroufim There are a lot of aspects to this, but 100%. I think the best article I've seen on this was by Nick Szabo, who was one of the early cypherpunks — the inventor of bit gold, one of the early Bitcoin people. He has an exceptional essay on this on his really, really good blog. hugo bowne-anderson The one called History and the Security of Property, yes — I'll include that in the show notes. mark saroufim The next example is industrial societies, where now you have these giant factories and you have workers, and because of that setup a government only needs to collect taxes from a few large institutions. So it makes sense for the state to be large, and also to have an unparalleled capacity for violence — it basically needs to be able to commit violence on a country-level scale to protect people. Where I think things get different is with cloud providers. The reason I called AWS a meta-country — what does that mean exactly? In crypto, people always talk about being able to build your own society, with your own currency, using decentralized protocols, and I think that will happen. But what I also wanted to say in this article is that I think this can entirely happen today with centralized services as well.
So, for example, you could have your own private satellites for internet, your own data center, your own private internet, your own identity, your own money on a blockchain, your own robots — and you can control those robots from a central place, so you can have your own military. And you don't even have to be that big a unit to organize a large number of people: you just need to be rich and a good programmer, and all of a sudden you can create this violent entity. The thing about violence nowadays is that it's asymmetric — you don't need to be as powerful as a large government to be really annoying to deal with. So if you can be violent enough to just be a pain in the ass, then you could, for example, negotiate not paying taxes, and live, let's say, in the middle of the ocean or in Antarctica. And yes, it's all a stretch — maybe there's a bit of sci-fi here — but it is plausible. What this means is that if a lot of people do this, it becomes very difficult to sustain something like a two-party system in general, because if you're not happy with the two parties, you can change your citizenship with a keyboard click. There are people like Balaji — another exceptional crypto writer — who talk a lot about this, about building your own crypto countries. I do think that's possible, and people have, I guess, grappled with the implications of such a world. I don't know if it's going to be better or worse, but it's going to be very, very different — that, at least, I can say with a lot of confidence. hugo bowne-anderson Absolutely. And if some listeners are new to, or uncomfortable with, the concept of sovereign states as implicitly or explicitly relying on coercion and violence, I can point out that it is a core concept of modern public law that a sovereign state is defined by a monopoly on the legitimate use of physical force. One of the earliest conceptions of this was Max Weber, who defined a state as a structure with a monopoly on the quote-unquote legitimate use of violence. I think that's a key conception when thinking through these things. I'd love to wrap up with a couple of more grounded questions about the space generally. What are you excited about in machine learning? mark saroufim I'm excited about the very abstract stuff and the very physical stuff. On the abstract side, I'm always excited about ways to make machine learning easier — whether that means people writing CUDA ops more easily with things like OpenAI's Triton project, or making it easier for basically anyone to be a compiler engineer in machine learning with projects like torch.fx or functorch. That's the representation side of what I find very exciting, along with libraries that make it easier to work with pre-trained models — there's a lot of cool stuff there. And on the applied side, it's pretty much anything that has more of an impact on the physical world. I love robots; I think robots are the coolest thing ever. I think robots can help us create a lot of abundance in the world and free up a lot of our time, in the same way that the dishwasher, for example, helped a lot of people get jobs, because you're not home washing dishes all day — you can do other stuff.
I think robotics in general will have that effect — it will let people do more things. And a big part of what will make robotics actually work, I think, is simulation. I really want to see more machine learning engineers become game developers as well. I think this may happen with companies doing metaverse stuff, though it'll take some time, because those are two difficult-to-acquire skill sets, so it's going to be hard for the same people to have both. But I'm optimistic, and I hope we live in a future of more abundance as a result. hugo bowne-anderson I'm glad you brought up the metaverse in that context. I don't necessarily want to focus on the metaverse, but one of the beautiful things about the potential of AR and virtual reality in particular is having abundance in things that aren't necessarily abundant in our real world — a non-controversial example, I think, is waterfront property. Everyone could have waterfront property in a virtual reality world, but it doesn't seem like that's how it's going to play out. In fact, it looks like we may get virtual reality worlds with a lot of enforced artificial scarcity rather than abundance. I think NFTs are another example — nuanced in their own way, but providing a sort of artificial scarcity for economic reasons. What are your thoughts on that? mark saroufim My quick thoughts are that I haven't seen a use case for digital scarcity that goes beyond collectibles, which maybe have some emotional value. I've spent, I don't know, 30 bucks on skins in video games, but I can't see myself spending a quarter of a million — or at least, maybe I would if I were really rich, but at that point, would it really matter? hugo bowne-anderson Though people did, in Second Life, for example, 20 years ago, right? mark saroufim But on the scarcity front, I agree with you: we're taking abundance and — let's say my life is okay in the real world; I live in a one-and-a-half bedroom. Great, in the virtual world I live in a mansion. Do I want to be living in a studio there? Not really — I want a cool life, a life that's better than the one I have in the real world. So I am disappointed that a lot of the NFT takes are headed in this direction. I do think digital scarcity around money, specifically Bitcoin, is something I've flip-flopped on over the years. I do believe it has a lot of value as insurance against states that are irresponsible with their monetary policies and their banking laws and capital controls and things like that. However, I can't tell you it's the greatest investment in the world or anything like that, because for all intents and purposes, recently, it's not an inflation hedge — that hasn't held up — it's very volatile, and it's still not used as a medium of exchange, so it's difficult to price transactions in it. I think all of those concerns are valid — to which, I guess, the response is "number go up, so your opinion doesn't matter." Maybe that's true. I do own some, but I'm definitely not as bullish as I was a few years ago — on Bitcoin, or any cryptocurrency, to be honest.
hugo bowne-anderson So, to wrap up, I'm wondering if you have a final call to action for all of our listeners — who, if they've stuck around this far, are clearly interested in machine learning quite seriously. mark saroufim Yeah. At the very least, feel free to subscribe to my YouTube and my Substack blog. You can also join my Discord channel if you want to ask me more questions one on one — I do a lot of mentorship there. In general, my observation of the job market in machine learning is that there's still a lot of wealth to be made, and a lot of good products to be made. My suggestion has always been to avoid tutorial hell, to find certain open source communities you want to be part of, and to contribute — this could just be memes, it could be answering people's questions on GitHub — and then eventually work your way up toward finding bugs, then fixing bugs, then proposing features. Once you do that, the world's your oyster — you can get any job you want; I would certainly hire you if I were a manager. So feel free to reach out directly to me if you actually follow this advice; I'd be more than happy to help. And I think the key is to invest in skill sets you think are marketable — machine learning and PyTorch and stuff like that, the uncontroversial stuff — but also to have some opinions that are unique to you, whether that's technologies you think aren't popular yet, mathematical ideas that aren't as popular, or sociological ideas you want to write about that you don't think get written about enough. A lot of the interesting work is at the intersection, and it's easier to be really great at the combination: it's almost impossible to be the best at a single thing, but it's actually pretty easy to be pretty good at three or four things. So that's my suggestion. hugo bowne-anderson Absolutely — that's great advice. And in terms of getting in touch with you, I'll link to your Substack, YouTube and Discord in the show notes as well, so please do reach out to Mark. Thank you, Mark, for such a wide-ranging, nourishing conversation. I always enjoy speaking with you, and this was beyond my wildest dreams. So thank you. mark saroufim Honestly, it's a pleasure. We should do this again sometime soon, once I get back to writing a bit more. hugo bowne-anderson I'd love that. Thank you. And there are a bunch of posts I wanted to talk about that we didn't get to, like The Wild West of MLOps — I'll include those in the show notes as well. mark saroufim Oh yeah, that's a good one. Thank you. hugo bowne-anderson There's also a lot more. All right. Thank you. Thank you. Appreciate it. Transcribed by https://otter.ai