The following is a rough transcript which has not been revised by Vanishing Gradients or Jeremy Howard. Please check with us before using any quotations from this transcript. Thank you.

hugo bowne-anderson: Hi there, Jeremy, and welcome to the show.

Jeremy: G'day. Nice to be here... virtually!

hugo bowne-anderson: Right? Exactly. Great to be in almost the same time zone as you as well.

Jeremy: Yeah, well, lovely to be in Australia, that's for sure.

hugo bowne-anderson: It is. And that's actually a good point, because today we're here to talk about the past, present, and perhaps even the future of data science and machine learning, what some might call AI.

Jeremy: Oh, is that all? How long have we got?

hugo bowne-anderson: We've got a couple of hours. I actually sent the invite through for 24 hours, so feel free to jump off at any point. We'll also talk about how to make deep learning accessible, and hopefully some things about the data tooling space. But first, as we're in Australia, there's something else that you and I have in common: a love of chocolate biscuits.

Jeremy: Of course. I mean, we're Australian, and we have good chocolate biscuits, so all Australians love chocolate biscuits.

hugo bowne-anderson: That's the point. And, you know, you and I have both lived in the US for quite a while.

Jeremy: And it's shocking to me how America has not discovered good chocolate biscuits. I mean, they're such an advanced country in other ways, but not in this most important way.

hugo bowne-anderson: Absolutely. I think it's because they have the cookie. And actually, Arnott's tried to market Tim Tams as cookies for the American market, because Americans didn't quite get the concept of the biscuit. It's sad to me that we're famous for things such as Vegemite (and I'm a huge fan of Vegemite), but I feel like the Tim Tam and the mint slice should be more well known.

Jeremy: So let me pump it up to another level and tell you I just bought a pack of mini Wagon Wheels, which, for those that are not Australian, is a cookie sandwich with jam and marshmallow in the middle, wrapped in a layer of chocolate. And tell me you don't need one of those more than you need an Oreo.

hugo bowne-anderson: Absolutely. And in fact, I think what Americans have done with the s'more is somewhat Wagon Wheel-esque, correct?

Jeremy: Right, but imagine you could just pick up a pack at the supermarket, covered in a perfect layer of chocolate, anytime, without needing a fire or anything. And the jam?

hugo bowne-anderson: The jam is so key. And the mini Wagon Wheels are great, but remember the big ones as a kid?

Jeremy: Yeah! Absolutely. Oh, well, you know, we also do, I don't know what you call them in America, chips? Crisps? They're called chips in America, but we do them better as well. You know, like the Burger Rings and the Toobs. And we do, you know, Shapes: Barbecue Shapes. So imagine little savory cookies with tomato powder and herbs and salt... oh, so good. There are so many reasons you may be really hungry for snack food, amongst other things.

hugo bowne-anderson: I do want to move on to discuss data science, but I also want to double down on biscuits as well. Do you remember (I thought I may have hallucinated this) the Arnott's biscuit scare of 1997? I looked it up, because I thought maybe I'd made this up. But in 1997, and I'm reading this from the internet,
and it's on Wikipedia as well: Arnott's biscuits was subjected to an extortion bid by a Queensland extortionist who threatened to poison packets of Arnott's Monte Carlo biscuits in South Australia and Victoria, and the company had to conduct a massive recall. And the reason I remembered is because my grandparents had retired to Queensland, to the Gold Coast, and my grandma couldn't have her afternoon Iced VoVos, and I remember how distressing that was for her.

Jeremy: Some people may not be familiar with the Monte Carlo. I guess it's kind of like two butterscotch cookies surrounding vanilla cream and jam. I do notice we manage to fit jam into more cookies than some other so-called developed nations.

hugo bowne-anderson: Absolutely. And I think it's part of our, um, probably British heritage, right? Because they do the jam relatively well, for sure. I mean, scones and cream and jam are probably one of the only great parts of it.

Jeremy: But, you know, thank you.

hugo bowne-anderson: So let's move on to cricket. No, I'm joking. We probably should get going, we've really gone for five minutes here, so maybe we'll circle back. Something I'm interested in, Jeremy, is that a lot of people say, you know, "I got into data science before data science was even a thing," and some of them are right and some of them are wrong. But I know that you've been working in different forms of data science, or what we call data science now, for decades, and I'd love to talk about what you've seen and how you've seen the space evolve. Perhaps before getting there, you could just give us a brief rundown of your background.

Jeremy: Sure. So yeah, it's been over 20 years for me. And a big challenge was I was doing a thing that didn't have a name, and nearly nobody else did it. And I knew I wasn't doing statistics, because I wasn't very good at statistics, and I couldn't talk statistics with statisticians. But I was certainly doing something involving analyzing data with computers. There was a brief mention of the idea, in my kind of early to mid 20s, that maybe there's a thing called industrial mathematics, which was an attempt at giving a name to this thing. And so yeah, I started doing that in the management consulting world. In the management consulting world, pretty much everybody has an MBA, and most people are not familiar with data science approaches to things. So one of the things I did at a company called A.T. Kearney, which is one of the big global consultancies, was help build a new global practice in what we called "leveraging customer information," which was our attempt at naming data science. And yeah, you know, I kind of went around the world teaching people how to create data warehouses, pull that data out in summaries, create OLAP cubes, and do statistical analyses to actually try to capture some value out of all that. I also did a bit of machine learning, a bit of neural nets, kind of 20 years ago, although they kind of disappeared, and then support vector machines took over, but unfortunately they're nearly useless.

hugo bowne-anderson: How were you even doing machine learning and neural nets back then?

Jeremy: The first neural nets I used commercially involved software from a company called HNC, which in those days made a plug-in card for your computer that did neural net calculations for you. And it also came with software.
So they were, you know, just fully connected nets in those days, but it was a good piece of software. It did all the kinds of stuff you would like to see nowadays, which in fact a lot of software still doesn't even do now, like showing you the derivatives against your inputs to figure out which inputs are most important, you know, stuff like that. And so we were doing this on retail bank data, mainly for marketing purposes. It cost a few million dollars and worked okay. You know, the other big thing around that time, or a little later, was single decision trees, so CHAID and ID3 and stuff like that, which I was never a fan of. I always felt like neural nets had better fundamentals. And then, yeah, the kind of gurus who came along during the '90s were Ripley and Bishop, who both wrote really great books that stood the test of time in a lot of ways. And actually, Ripley came out to Australia. I met him at the University of New South Wales and had lunch with him, and we both had a good complain about how stupid the way we build decision trees is, which was in this kind of greedy way. And so the biggest jump for me in terms of machine learning was 2000, I think, or I might have even seen it in '99, which is when the random forests paper came out. And as soon as I saw it, I was just like, okay, this solves the problem with decision trees.

hugo bowne-anderson: And that was Breiman, right?

Jeremy: Breiman, right, yeah. Who was also behind one of the underlying decision tree algorithms. CHAID, was it? Something like that? Yeah. I mean, Breiman's an interesting guy, because he was a math professor at Berkeley, I think, who got a bit sick of the academic nature of math. And he left and became a consultant for a couple of decades. And then he came back to Berkeley with this kind of newfound interest in practical applications, and really worked in kind of machine learning and data science, and developed these incredibly pragmatic, practical things, a lot of which are still largely undiscovered. A lot of it was kind of hidden behind technical reports, because he was in his 70s and had no need to get citations and stuff, so he was not, on the whole, publishing in traditional journals. Thankfully, Berkeley have maintained his original webpage, so you can go back and actually check out all the great work that he did, because it's still really valuable and underappreciated.

hugo bowne-anderson: Yeah. So already we have neural nets, and we have random forests, right?

Jeremy: Right, right, yeah. Which are both still great tools. And, you know, when the random forests paper came out, I was at the time running a company called Optimal Decisions Group, which was all about applying optimization to insurance pricing, very heavily leveraging machine learning, because we needed models to predict things like what price you're going to buy at, whether you're going to claim, and if you do claim, how much it's going to cost, with Monte Carlo simulation built on top of those things. And it's very hard to simulate and optimize on top of machine learning models, because optimizers tend to find the edge cases and take advantage of those. And so that was a tricky problem to solve.
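An aside on the feature Jeremy praises in that HNC software, inspecting the derivatives of the output with respect to each input to see which inputs matter most: in a modern autograd framework that takes only a few lines. A minimal sketch in PyTorch, on a made-up network and random data (not the HNC product itself):

```python
import torch
from torch import nn

# a tiny fully connected net, standing in for the HNC-era models Jeremy describes
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x = torch.randn(100, 4, requires_grad=True)   # 100 rows, 4 input features
model(x).sum().backward()

importance = x.grad.abs().mean(dim=0)  # mean |d(output)/d(input)| per feature
print(importance)                      # larger means the model leans on that input more
```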
Jeremy: And so I actually said to a couple of my staff, look, here's a new paper called Random Forests, I reckon we could have this implemented pretty quickly in C#. And yeah, within four hours they had it implemented, and we played with it, and we were like, wow, this is exceptionally fantastic. Fast, accurate, very little feature engineering. You know, exactly what I'd been looking for for decades. And just such a simple and, in hindsight, obvious approach.

hugo bowne-anderson: It's very elegant, isn't it? Yeah. So what happened then, did you go into insurance at that point?

Jeremy: That was my insurance pricing company, and at the same time I was also running a mail provider called FastMail. And FastMail didn't need as much machine learning or data science, but it still needed things like anti-spam and anti-virus. And so, particularly on the spam side, I worked pretty hard on, you know, basically simple regression kinds of algorithms, and on analyzing what signals were most useful on samples, to figure out where to invest in capturing data and stuff like that. And FastMail's anti-spam was always pretty famous. You know, people really liked it; it worked very well.

hugo bowne-anderson: So this was, among other things, natural language data.

Jeremy: Yeah, just a little bit. So the main kind of natural language signal, if you like, for anti-spam was something called Vipul's Razor, and I actually became friends with the inventor of that when I came to San Francisco, through a complete coincidence. But that was something that basically calculated a kind of hash of the words in the text, but a hash designed so that text with similar words would have similar hashes. And so then there's kind of a global database (there was another project, DCC, that did the same thing), a kind of global database of hashes. And so then for every email that came in, we'd calculate that hash and compare it against the database, and if it was similar, that would be a signal. But most of the signals were also Bayes, you know, so simple naive Bayes on the words. And a lot of the signals were from the IP addresses, or subnets, or headers. As you know, most spam is pretty stupid, so they're not capable of creating proper headers, stuff like that. So yeah, we ended up selling both of those companies. And then the next thing in my life, in terms of vocation, was Kaggle. You know, I've always been fascinated by machine learning competitions. I thought the Netflix prize was really cool, and KDD was the main one I knew about; KDD had a KDD Cup every year for a long time.

hugo bowne-anderson: And so this was the age of... probably it was called data mining or something then, yes? Okay, the KD stands for knowledge discovery. Is it knowledge discovery in databases?

Jeremy: The thing is knowledge discovery and data mining. Yeah, so data mining... originally that term was used in a kind of negative way, to suggest you're not actually doing anything careful and thoughtful. But, you know, it was very effective, so people just kind of said, fine, yes, I'm data mining, it's so good. So then Kaggle, yeah, it was kind of like, okay, well, let's turn this into a whole kind of industry of competitions. And I was particularly interested in the data scientist side. I really liked the idea that this was a way that machine learning practitioners could prove themselves and their techniques without having to show error-bound proofs or create complex mathematical derivations or whatever. They could just say: well, this works, end of story.
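For reference, the technique Jeremy's team implemented from scratch in four hours of C# is now a few lines in any modern library. A sketch with scikit-learn on synthetic data, not their insurance models:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# synthetic stand-in for the kind of tabular data his team was modeling
X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# fast, accurate, minimal feature engineering, as the 2001 Breiman paper promised
rf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
rf.fit(X_train, y_train)
print(accuracy_score(y_test, rf.predict(X_test)))
```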
Jeremy: And I felt like we'd always been missing that, except for things like the Netflix prize, which is a great example. Everybody had been ignoring probabilistic matrix factorization; nobody thought it was a great technique until it became an early winner of the first rounds of the Netflix prize, and then suddenly everybody was like, oh, we should do this for lots of other things. So yeah, I was a huge fan of the idea of bringing machine learning competitions to the masses, and making kind of champions of practically effective data scientists.

hugo bowne-anderson: And you got involved in Kaggle by participating in competitions and then becoming the president?

Jeremy: Yeah. So originally it was basically one guy, Anthony Goldbloom, who kind of started it, a fellow Australian. He was from Melbourne. And somebody at an R meetup told me I should check it out. I was interested in learning more about R, because I'd kind of used S-PLUS for a long time, but, you know, I felt like more of an executive techie, having spent a long time running companies. So yeah, I had very low expectations about my own capabilities. And somebody said, you should check out this thing called Kaggle, there are some competitions. And I was a bit intimidated, but I thought, yeah, just trying not to come last would be good. So I entered a competition in an area I didn't know much about, which was time series analysis. I kind of felt like, oh, a good way to learn something would be to pick something I don't know much about. And I ended up teaming up with another guy, and we ended up winning. And that was pretty surprising to me. And so I went back to the next R meetup, and people were like, I thought you said you were trying to learn, I thought you were just studying, how did you do this thing? And I was like, I was surprised too. You know, I didn't even end up using R; I ended up using C#, because that's what I knew pretty well, and I still think it's a great language. And I took a very pragmatic, common-sense approach to it. And I noticed, particularly, that the metric being used to measure the competition wasn't a symmetric or kind of fair metric. So I found some ways (I wasn't exactly cheating) I could take advantage of the metric they were using, even though my approach might not have been actually better in practice. And it turned out this metric was something the econometrics community was really getting behind. So we ended up writing a paper about this, saying: this is actually a pretty stupid metric, because it doesn't measure what you think it's measuring. And then I entered another competition, and I won the second competition I entered. And then at a meetup I actually met Anthony, who started Kaggle, and said I thought it was really cool what he was doing. And he was like, wow, it's really cool that you're winning all my competitions. And he told me there was a machine learning and data mining conference on in Sydney, like, the next week, and I said I'd come to that. And so we kind of hung out for a few days, and I agreed, along with another guy called Nick Gruen, to become one of the first investors in the company.
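Since probabilistic matrix factorization comes up as the Netflix-prize breakthrough: the core idea, learning low-rank user and item factor matrices whose dot products approximate observed ratings, fits in a short sketch. A minimal, non-probabilistic version in PyTorch on synthetic ratings:

```python
import torch

n_users, n_items, k = 100, 50, 5
users = torch.randint(0, n_users, (500,))    # synthetic (user, item, rating) triples
items = torch.randint(0, n_items, (500,))
ratings = torch.randint(1, 6, (500,)).float()

U = torch.randn(n_users, k, requires_grad=True)   # user latent factors
V = torch.randn(n_items, k, requires_grad=True)   # item latent factors
opt = torch.optim.SGD([U, V], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    pred = (U[users] * V[items]).sum(dim=1)       # dot product per observed rating
    loss = ((pred - ratings) ** 2).mean() + 0.01 * (U.pow(2).mean() + V.pow(2).mean())
    loss.backward()
    opt.step()
print(loss.item())   # squared error falls as the factors fit the ratings
```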
Jeremy: And I also agreed to take a look at the infrastructure. He told me about how it was set up on AWS, or whatever it was. It was using three instances, and, you know, it had started to get a bit slow. And I was flabbergasted. I was like, FastMail has a million people checking their email all the time, and we have less infrastructure than you have for people who are submitting one competition entry a day. That doesn't sound good. And he was like, well, probably it's because I don't really know how to code. So I agreed to take a look, and I was like, oh my God, this is horrible. For an economist, I guess, Anthony wrote good code, but it was not very efficient or maintainable. So basically the first thing I did was say, look, you actually don't have any indexes on your tables; that alone will make it 100 times faster. And he was like, I don't know how to create an index. So I added indexes to the tables for him. And at this point I wasn't allowed to enter competitions anymore, because I had access to the database. So that was a bit of a bummer.

hugo bowne-anderson: And what year was this?

Jeremy: This was whatever year Kaggle started, just a few months after, probably three or four months after it started. So a decade ago or so, at least. And basically, as soon as I did that, the load on the servers dropped by, like, 100x. And Anthony was like, wow, that was like magic, I'll get rid of my two unnecessary AWS instances. And so now, as an investor, I felt kind of an extra level of concern as to how things were going. And he told me there was going to be a $1 million prize coming up, and I was like, I don't think this PHP, you know, duck-typed thing is going to be able to handle it. So I basically agreed to rewrite the whole thing from scratch in C#, and then we hired a guy to help with that. So yeah, me and Jeff ended up rewriting the whole thing from scratch and migrating everything to Microsoft SQL Server and C#, which in hindsight was kind of weird, because I wasn't being paid. I was just volunteering, just helping out because it was fun, you know? But eventually Anthony told me he wanted to expand this thing, he wanted to get VC money, and I ended up coming to Silicon Valley and building the financial model and writing the deck, and we pitched it all. And I never quite thought through what this actually meant. So of course, when we actually raised money, I suddenly realized: oh, I guess we're partners in this thing now, even though I only had a small share because of this investment I'd made. So yeah, Anthony ended up making me basically an equal partner in the company, and we started a new American company called Kaggle Inc. together and moved to San Francisco.
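The missing-index fix Jeremy describes is easy to reproduce. A self-contained sketch using Python's built-in sqlite3 (a toy stand-in for whatever database Kaggle actually ran) showing the before and after of indexing a column you filter on:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER, user_id INTEGER, score REAL)")
conn.executemany("INSERT INTO submissions VALUES (?, ?, ?)",
                 ((i, i % 10_000, i * 0.1) for i in range(200_000)))

def timed_count(uid=42):
    t0 = time.perf_counter()
    conn.execute("SELECT COUNT(*) FROM submissions WHERE user_id = ?", (uid,)).fetchone()
    return time.perf_counter() - t0

before = timed_count()                                  # full table scan
conn.execute("CREATE INDEX idx_user ON submissions(user_id)")
after = timed_count()                                   # index lookup
print(f"no index: {before:.5f}s   with index: {after:.5f}s")
```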
hugo bowne-anderson: So, what is the role of these types of competitions in making data science and machine learning more accessible?

Jeremy: It really helps to bring down the barriers, the kind of gatekeeping barriers. So before Kaggle, pretty much the only way to prove yourself was to write papers or get credentials, a PhD, whatever, you know. And I am somebody who finds all that extremely uninteresting, and, you know, I don't like anything about the academic system. I think it's far too exclusive. And I don't feel like it's particularly pragmatic or practical, and I don't like how peer review works, and I don't like the way there's so much incrementalism; it really doesn't highlight bold approaches. So, like, that would never have worked for me, for example. And so for me, it helped make things more accessible, because people started seeing me as somebody who was good in practice at creating machine learning models, even though I had none of those credentials. You know, I have a philosophy major, a Bachelor of Arts; I've never published a paper. And there were a lot of other people like me who did well in competitions and ended up getting hired by companies and becoming superstars in the field. It's also great for highlighting techniques that actually work, you know, like random forests. When I started, random forests were not at all popular, and I worked really hard to make them popular. One of the things I did was to say: look, this competition-winning approach is based on a random forest, and here's how I did it. And random forests got a lot more popular as a result of Kaggle, and, you know, maybe partly the advocacy of people like me. And since then, they've added notebooks, which have a similar benefit: it makes it easier for people to highlight not just their ability to train models that are predictive, but to communicate difficult technical concepts to an audience. And it's judged by your peers; people vote for them.

hugo bowne-anderson: Yeah, the notebooks and the kernels, I think, were big.

Jeremy: Yeah.

hugo bowne-anderson: You know, one of the concerns with the at-scale adoption of these leaderboard competitions... let me kind of lead the witness in a certain direction. We both know that with machine learning and inferential models in general, there are lots of things you need to do, and you can break it down maybe into building, testing, deploying, and maintaining, and leaderboard-style competitions don't necessarily cover all of these.

Jeremy: Right, no, of course not. But I don't know why anybody would expect them to, or think it's a problem that they don't.

hugo bowne-anderson: Let me be clear, I don't think it's a problem that they don't. But I think there may be a challenge that, in the space, leaderboard-style competitions have gotten so much attention that we've had challenges in building good education around the other aspects of incorporating inferential models into business.

Jeremy: I don't know. I mean, to me, we've got problems all over the place, you know, not just on the inferential side. I feel like MLOps is a disaster. There's far too much focus on MLOps tooling, and all these startups getting MLOps money, and we don't actually have something equivalent to competitions for MLOps, something which shows what actually, practically works. Instead, we get these kind of overly complicated things full of buzzwords. So no, I don't think I agree with that premise. I think we finally found a way to actually get some appreciation for model-building methods that actually work in practice, and that's a good thing.
It'd be nice to have something like that in other parts of the pipeline as well. You know, the issue is that we don't really have a mature ecosystem for labeling, or for data validation, or for model maintenance, and so on and so forth. So it'd be great if we could find ways to do all of these things in more pragmatic ways.

hugo bowne-anderson: Yeah, great. And I think some of these things are things that you're actively spending a lot of time thinking about and working on with fast.ai. So I wonder whether we could continue the story of your journey from Kaggle to the types of things you work on and think about now.

Jeremy: So the next thing on my journey was Enlitic, which was the first company to focus on deep learning in medicine. And that basically came from my excitement at seeing, after 20 years, neural networks, which I'd always thought were going to be the eventual right path forward, finally becoming practically useful and actually achieving superhuman performance on very human tasks. You know, the first one was traffic sign recognition. And when I saw that happen, that was 2012, that was when I thought, okay, I've got to make this my full-time job now. So, to answer your question, Kaggle was definitely earlier than 2012, because that was when I was looking to leave.

hugo bowne-anderson: And was the diabetic retinopathy competition around that time?

Jeremy: So diabetic retinopathy was probably a year or two later. We had the cats and dogs competition on Kaggle; that was the first time Kaggle had an image recognition competition. And that brought the state of the art for recognizing dogs versus cats from a 20% error down to a half a percent error, which showed the huge gap between academia and practice. And then, yeah, diabetic retinopathy: Ben Graham, who was an unheralded but quietly brilliant academic, smashed it, absolutely smashed it. He had developed all his own techniques on sparse training, and his own libraries in CUDA from scratch. Just an absolute genius, and also extremely practical. And that really showed the kind of opportunities, I think, for medical imaging. And so I went to the medical image computing conference, MICCAI, I guess that was 2014, maybe 2013, as part of my research around where neural networks were going to be really useful first in practice. And I was just shocked to find that in a whole conference focused on medical image computing, there were no papers being presented that used neural networks. It was all, you know, graph cut, whatever, traditional techniques. And I just thought, wow, this is a huge opportunity, because for all the approaches I saw, I knew that neural nets would be far faster and far more accurate and far more interpretable.

hugo bowne-anderson: And you put this down to, like, siloed knowledge? And also, we've seen this again recently with neural nets in computer vision and natural language processing, right?

Jeremy: I mean, generally speaking, people hate neural nets, you know, particularly the older generation.
So, for example, since I've come back to Australia, I've done quite a few academic conferences to kind of help get the word out and help the Australian community, and literally at every one where I give a talk, there are multiple older men who will get up and just rant about neural nets, in ways that are totally ignorant. Like, they've obviously never trained a neural net in their life and have no idea how they work. But there's a huge amount of negativity towards neural nets. So if you'd gone to MICCAI that year and presented something about neural nets, it would have been totally shut down, because people hated them. They just had, and in many ways still do have, such a terrible reputation. And so it's partly siloed knowledge and partly a huge amount of kind of baggage.

hugo bowne-anderson: Inertia, resistance. But this is great, right? Because adoption, at least historically, can take decades a lot of the time. But something we've seen here in diagnostic imaging using neural nets is, from 2014 to, like, five years later or whatever, a really quick turnaround into massive adoption.

Jeremy: So I ended up doing a talk at RSNA, which is the big radiology conference. It's something like over 100,000 attendees, I think. And I gave the first talk there about deep learning for radiology, and they put me in the farthest corner room, in the very last session after most people had gone home. But even then we still got a pretty much full house, and I talked about what I thought was the importance of deep learning for radiology. And this was organized by a guy called Paul Chang, who's a brilliant academic, both a radiologist and a pathologist, with a lot of foresight. And literally by the next year's RSNA, the show floor was full of people hawking AI products. There was a whole AI stream; there were queues out the door at every AI talk. So yeah, it took a year to go from zero to huge, and RSNA created their own journal, Radiology: Artificial Intelligence. It was a very rapid adoption, which is exactly what I wanted. That was why I went into this field: because I wanted to see as many people as possible getting into it and making it huge.

hugo bowne-anderson: Yeah, great. And then only a year or two, or a few years, for these things to actually be deployed around the world and reported on by the New York Times and all of these types of things, right?

Jeremy: Yeah, that bit's taking longer, you know, because of regulatory issues, mainly. It's happening a lot in China; it's very widely deployed in radiology there in particular. I didn't get into pathology, and the reason for that was pathology doesn't even have a digital workflow, so it's all physical microscopes and physical specimens. There is a company in Australia, coming out of Queensland, that's basically been developing digital workflows and AI for pathology in China, which doesn't have the same regulatory and process baggage; they're developing everything from scratch. So they've been much more rapid adopters of AI in medical imaging than the West.

hugo bowne-anderson: So what happened after Enlitic?

Jeremy: I was very excited about the opportunities for kind of broad, or even universal, diagnostics, and even the creation of an action plan for patients, based on deep learning in medicine.
But Enlitic found its product-market fit in radiology, particularly lung CT, just because that happened to be the first thing we prototyped, and we prototyped that first mainly because it was the thing there was public domain data available for. And the sad reality for VC-backed startups is that once you've found some kind of product-market fit, it's very difficult to maintain a big, bold vision rather than just focus on that one thing. So I was not able to do the plan that I wanted to do. And, you know, worse than that, there were so many other things I wanted to do with deep learning. I'd kind of narrowed down the big early opportunities to robotics, geospatial and satellite, and medical. And I decided to start with medical, because I thought that was the area with the biggest opportunity for societal impact in the shortest amount of time. Yeah, but there were all these other things that I wanted to do. And then I also realized there were lots of other areas where people could make a great societal impact using deep learning which I didn't even know about, because I don't know most things. And so I felt like it was a mistake for me to just keep picking a vertical and doing that; I felt like I'd always be kind of underachieving, in the way that I felt I was with Enlitic. So my wife, Rachel, and I decided to start a company to help other people use deep learning to solve societally important problems. And so that's why we started fast.ai. And the other reason we started fast.ai is we saw that the people we knew who were proficient with deep learning were a real monoculture. You know, they were generally young white men from exclusive, highly technical academic backgrounds. Lovely people, doing a great job, and there's nothing wrong with that, but they weren't working on, or familiar with, the kinds of problems that I would like to have seen them working on: access to water, or access to education, or dealing with injustice, or whatever. So we felt that the way to get a diversity of problems being solved in a diversity of ways was to get a diversity of people. And so we felt we needed to create a company to help more people utilize deep learning for their own thing, whatever that thing might be, in as cheap and fast and easy a way as possible.

hugo bowne-anderson: So with fast.ai, and you've of course spoken about this before, I'd love to delve into your philosophy and practice behind it, particularly, I suppose, the four pillars of software, education, research, and community, and how these all interrelate for this project.

Jeremy: Yeah, so three of those things are in a loop, right? The way I've been seeing it since we started fast.ai was that I wanted deep learning to be in a similar place to where the internet is today. So, I remember I started using the internet when I was 17 or 18, and I needed a Unix shell account, and to interact on forums I used the rn newsreader, a console-based system which I thought was a brilliant, awesome program, but it did require fairly arcane commands. You'd have to read "man rn". And with email, a similar thing. It wasn't something you could use without a high level of familiarity with Unix and TCP/IP and so forth.
You know, today my mum sends me email and uses Skype and does her accounts in Excel and sends them to her accountant, and whatever. And so that's where I wanted to get to. And that is mainly about software, and about understanding, through years of research and development, what is the internet useful for? Who is it useful for? How do we make those things available to those people in a way that fits them? So we kind of felt like research and software would be the key. But, you know, where do you start, right? What were really the constraints in, what was it, 2013, 2014, 2015? Anyway, I get confused about years, pardon me, whenever the hell it was we started fast.ai. What was stopping everybody from using deep learning, really? And we figured we needed to start by understanding: well, what are the best practices right now? And what would it look like if we tried to teach people those best practices? So that's why we started with education. We started out by saying, okay, well, let's try to really understand what the best practices for training neural nets are right now, and what they're actually practical for right now, and teach people those things. So that became the first fast.ai practical deep learning course. And we decided to focus on coders, because you couldn't do it without code. But we did have a feeling you could perhaps do it without a huge amount of math. So this was, like, super speculative, because at that time...

hugo bowne-anderson: But it's also key to mention you're not a mathematician. Your background is not in maths in the way that Rachel's is.

Jeremy: Right, right. So Rachel's a math PhD, and she knows all that Greek letter stuff better than I do, and proofs and whatever else. I mean, I've discovered that I'm actually quite a capable mathematician in a certain way, but I never had that university training. So yeah, that's the thing, it was a very speculative idea, because at that time basically everybody doing deep learning was a PhD in math or computer science, or sometimes even both, and things were explained using math. The one person who was starting to show a different direction was Andrej Karpathy, who was developing his CNN course at Stanford, which we thought was a very exciting direction. But yeah, we had this speculative but unproven idea that maybe people could become proficient at using neural networks in a really practical way without needing a PhD. We didn't know it was true, but we had a feeling it might be. And so the goal of this first course was to figure out the best practices and teach them to people who were competent coders but, you know, possibly quite rusty at math.

hugo bowne-anderson: Were you teaching in Python at this point?

Jeremy: Yeah. So we started with Theano, which was definitely the best option at the time, and which did require Python. And we very much focused on computer vision, because that was really all that neural nets were the state-of-the-art approach for.
And then, to answer your question about the community: we thought, okay, well, we need a way for the people doing this course to share their work, talk to each other, ask questions, and talk to us. And we should make that something which lasts beyond the course; we wanted the idea that people who had been part of this exciting journey would remain part of the journey ongoing. And at that time, the Discourse software for forums was pretty new. But I've always been a big fan of forums, and before that of Usenet, and I thought, this is the way to do it, I think. So I tried setting up a Discourse server. And yeah, so we started a community through these forums, and people started sharing the work that they were building, and we were just like, wow, this is amazing; people were actually building really cool stuff. And a lot of these people were from places I'd barely heard of. So, like, one guy posted: I'm in the Ivory Coast, you know, we barely have an internet connection here, so I have to spend 48 hours downloading each video, and then I watch it. And does anybody have tips? All I've got is a six-year-old laptop; anybody got tips on how to run the notebooks? And, you know, we were like, wow, okay, we're really finding good, interesting people, and interesting people in interesting places are finding us. And then we started hearing about people building, like, malaria diagnostic systems or whatever, doing that work because their own family had been devastated by malaria. And we thought, yeah, this is happening; people are actually using deep learning to solve problems that we don't understand, using data sources we're not familiar with. It was very exciting.

hugo bowne-anderson: Fantastic. And then, on top of that, you realized that you needed to write your own software.

Jeremy: At that time, not too much. Basically, it started out with the classic utils.py. You know, there were certain little bits of boilerplate in Theano, or around the data processing, where I thought, ah, no need for people to type that out in full every time, I'll just wrap it up in a function. So I think in that first course I probably included a utils.py in the repo of notebooks, and just used it in the notebooks and said, okay, well, you have to import this, and that way you only have to type this one line rather than these six lines. And, you know, one of the things I was really pleased about was really focusing on notebooks from the start, so that people would be actually running interactive experiments and tangibly working with the models. And I thought a lot about where to host it. You know, were we going to use Coursera or edX or Udacity or whatever? Rachel and I thought, no. If we put our future in the hands of some external entity like that, God knows what will happen; VCs will probably end up monetizing these things in ways that hurt students and authors. And of course, that is what ended up happening with stuff like Coursera. So we thought, okay, we'll just chuck them on YouTube, but we'll create our own interface around it. So that all worked really well.
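The utils.py pattern Jeremy describes, collapsing several lines of notebook boilerplate into one call, might have looked something like this. A hypothetical modern example; his originals wrapped Theano-era code:

```python
# utils.py: a hypothetical example of the boilerplate-wrapping Jeremy describes
from pathlib import Path

import numpy as np
from PIL import Image

def load_images(folder, size=(224, 224)):
    """One call instead of the usual load/convert/resize/normalize dance."""
    files = sorted(Path(folder).glob("*.jpg"))
    arrays = [
        np.asarray(Image.open(f).convert("RGB").resize(size), dtype=np.float32) / 255.0
        for f in files
    ]
    return np.stack(arrays)

# in a notebook, six lines become one:
#   from utils import load_images
#   x = load_images("train/")
```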
Jeremy: And yeah, I mean, there wasn't really any choice other than Python at that time. Julia, I don't think it existed yet, or at least it didn't have any deep learning stuff yet. Whereas Python obviously had the ecosystem, and Theano was really quite a nice piece of software. One of the big challenges, though, was everybody had to set up their own deep learning server. So I had a video explaining how to set up AWS, and I created an AMI and copied it into each of the main regions and said, you can just use this AMI. But there was still a huge barrier to entry there, which was Linux system administration.

hugo bowne-anderson: But that's at least relatively solved now. I mean, in the course now, you use, or encourage people to use, services like Paperspace or Colab.

Jeremy: Yeah, Paperspace and Colab have solved that. And now there's the new kid on the block, the horribly named SageMaker Studio Lab, which is a similar thing to Paperspace and Colab, but much more of a classic JupyterLab environment, which I much prefer to the Colab version of it. Paperspace has never been as reliable as I'd like; it's always been better in theory than in practice, but some people seem to be able to use it successfully. I've never had a lot of luck. But yeah, it's great: on our course now, you just click a link and it opens up in Colab, and you start. There's nothing to set up.

hugo bowne-anderson: And free GPUs, which is...

Jeremy: It's just such a great service.

hugo bowne-anderson: Something we've mentioned a few times that might be worth drilling into is the importance of notebooks and literate programming. And perhaps you could even give us a bit of historical perspective, perhaps dating back to Mathematica. Is that where it was for you?

Jeremy: For sure, yes. So it was notebooks, yeah. I always had this love-hate experience with Mathematica. I felt like whenever I used it, I was using the thing that everybody should be using all the time for everything, both in terms of the kind of Lisp-like language, which is just so much more elegant than any other language that I had used, so much more expressive, such a good library, and, yeah, the notebook environment: being able to explain what I was doing as I went, both to myself, you know, to future me, and to my clients or customers or whatever.

hugo bowne-anderson: This is super important for science, right? Because, as you may recall, I used to work in cell biology and biophysics, and my biologist colleagues would have their journal books.

Jeremy: Yeah, journals, exactly, where they'd paste in their PCR results and so on. Absolutely. I remember when I was learning science that the journal is, like, every scientist... you would study their journal methods to learn, you know. And it's all about being totally rigorous in documenting every step. And, you know, like the story of how the noble gases were discovered: it was really thanks to that care in documenting each step. It's like, oh, there's a little bit of something left over, and then you figure out, well, we've done all the right steps, so that's actually a real thing, we should study it. Yeah, I find it weird that so many people are so against notebooks, when other sciences are so aware that documenting the development and experimental process is probably the most critical part of science.
hugo bowne-anderson: Yes. I think one of the challenges is, firstly, that too many of these conversations occur on social media, and there's some form of, like, schismogenic polarization that happens there. The other thing, though, and this is something of course we all think about a lot, is the deployment story, and incorporating software engineering best practices into this. Notebooks don't necessarily solve all of that without the existence of other tools. So I wonder whether you can tell us a bit about that.

Jeremy: I think, also, the problem is that for a lot of people, if you're having trouble with your code as a data scientist, or, increasingly nowadays, as some other kind of domain expert who happens to have to do data science, the person you ask for help is a software engineer. Notebooks don't quite solve the problems that classic software engineering has. You know, it's not necessarily what you're going to want to use to create a CRUD app, or scale up a chatbot to 15 million users, or whatever. But it looks like it's solving the same problem, because it's something that people write code in. And so software engineers then all kind of look at what their chemists are doing, or their astrophysicists are doing, or whatever, in their department, and go: dude, this is not how you write code here, let me show you Emacs or VS Code or whatever, and I'll teach you how to write code properly. So I think it's partly a confusion, misinterpreting the goal. People don't understand that code is used in different ways for different things. There is this fact that after some period of research and development, you end up with something you do actually want to build. You know, you're like, oh, this is actually working. And a lot of people seem to think that if you can't take the experimental notebooks you used to develop that and productionize them somehow, then the entire process was stupid. It's kind of like saying that if your Kaggle competition entry can't be directly turned into production code, the competition was a waste of time. Neither of these things is designed for that purpose. The fact that you then go through a period of developing a solution, which may well require breaking out Emacs or VS Code or whatever, does not mean the original R&D process was stupid. And so, notebooks were developed by a scientist, you know, Fernando Perez at Berkeley, for the purpose of helping with science, and that's the customer base, that's the user segment that's being targeted, and indeed they're great for teaching and experimenting with deep learning. So notebooks never developed the usual software engineering stuff: building modules, releasing libraries, running unit tests, refactoring, and all that. But for me, I still felt like that experimental, exploratory environment was where I wanted to do all my work. So I thought, okay, well, let's add the unit testing and the continuous integration and module building and all that stuff into notebooks, rather than have two tools for two separate things. Because, you know, anytime I had to switch over... I'm a huge vim user, and I love vim. I used to, you know, run vim tutorials.
And I've been using it for 20-plus years, you know, but I would much rather be in Jupyter notebooks. I don't like having to switch over to vim or VS Code or Emacs, all of which I'm very familiar with. So yeah, I decided I wanted to add all of that other software engineering stuff to notebooks, and that's where nbdev came from.

hugo bowne-anderson: And maybe you can say a bit more about nbdev, for those people who haven't heard of it.

Jeremy: Yeah. So nbdev is a framework for allowing you to create high-quality software artifacts from notebooks. And I find it a real delight to use notebooks for this purpose. So all of my libraries are built in notebooks, including even my servers, even my GUI tools. I get all that nice ability to communicate with future me, and also my customers and my co-developers, in a very rich way, with animations and plots and pictures and hierarchies of headings and all that. And I get to explore in that environment, and I keep the whole process of that exploration. So as I'm learning a new API, you can see me documenting it as I go. But those notebooks create Python libraries, and because we now have this nbdev step, the Python libraries can end up being much higher quality than your average Python library, because nbdev does all the things that normally we can't be bothered doing. So, for example, you should always have a dunder-all, __all__, defined in your modules, so that when people import star, they only import the stuff that you actually want to be exporting. But who can be bothered doing that? Well, nbdev does that for you. You should have your entire kind of introductory spiel in your readme, and on your documentation homepage, and in your conda description, and in your PyPI description. But who can be bothered doing that? Well, if you use nbdev, all that automatically happens for you. You should have lots of code examples, including their outputs, littered throughout your documentation, and again, who can be bothered doing that? Well, with nbdev, that's all done for you. So I get to enjoy working in an environment in which I'm extremely productive and which I find really enjoyable. And when I say extremely productive: three, four, five times more productive, like, really so much more productive. And I can do it for longer, because I'm having such a good time. And I end up with these really high-quality libraries. And all of your cells become tests, you know, as you go, and the tests are in the same place as the implementation, so people don't have to jump backwards and forwards. It's all automatically run through continuous integration on GitHub Actions. The documentation is all built directly from the notebooks, and it's all fully hyperlinked. In your markdown, anytime you have backticks to say "here's a code element," if it's the name of a symbol in your library, it automatically links to it. Yeah, it just makes everything easy. You can focus on writing code.

hugo bowne-anderson: Yeah, so what I'm hearing is: you focus on writing code in the place you want to write code, and it does all the stuff that you kind of don't enjoy doing.

Jeremy: Yeah.
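To make that concrete, an nbdev notebook cell looks roughly like this. A minimal sketch: the #| directive syntax shown is from recent nbdev releases, and earlier versions used plain # export comments instead:

```python
#| default_exp core
# the cell above tells nbdev this notebook generates the module mylib/core.py

#| export
def say_hello(name: str) -> str:
    "Greet `name`; the docstring flows into the generated documentation."
    return f"Hello, {name}!"

# an ordinary cell doubles as documentation *and* a test run under CI:
assert say_hello("Sylvain") == "Hello, Sylvain!"
```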
Jeremy: And you end up with better PRs as well, because you don't end up with a PR where somebody doesn't change the docs or doesn't add the tests, because they're writing in a notebook where all around the thing they're writing are docs and tests. And in the same way, you don't have to learn a new pytest framework, or Sphinx documentation; it's just notebooks. And so PRs generally come with good docs and good tests, and of course they work, because the CI has run when they push them.

hugo bowne-anderson: And we'll definitely include a link to nbdev in the show notes. I'm wondering, for any data scientists whose background is in science, what are a couple of software engineering best practices that you'd encourage them to become more comfortable with?

Jeremy: I think tests are really important. You know, anytime you run a cell in a notebook and output something, and you check that the output is what you're expecting, I would then copy and paste that output into the same cell and turn it into a test that checks they're equal. And so then, from now on, anytime you change your code, that thing you wanted to make sure was what you expected is still the thing that you expected. And notebooks make it very easy to create tests like that. I think, also, really getting into the habit of using that restart-and-run-from-top button in a notebook, to really make sure you can always run your notebook from the top. With nbdev, it enforces that: when you run the tests, they always run in that way. So that's a very good place to be. You don't want to be in a place where you need ten pages of instructions saying run this cell, then copy this over there, then run that. A lot of Excel spreadsheets, in practice, seem to end up like that. I'm trying to think... I mean, software engineering is such an interesting thing. I would say: really focus on decreasing complexity. So most of my functions are going to be about five lines long. Long enough to do something interesting, but short enough that you can look at it and understand what it's doing pretty quickly. Generally speaking, if there's a function that's much more than ten or so lines long, at least for the actual meat of what it's doing, I'd be considering breaking it up into smaller pieces. And I end up finding I hardly have any comments in my code, because I carefully name my variables and I carefully name my functions, and blocks that are kind of like "oh, this block's doing this thing," I'll actually factor out into a separate function. And now you can see exactly what the inputs and outputs for that block are, and it's got a name. And so I find I only really need comments in places where it's like, oh, this is working around a bug in this external API, or something like that, where it tells the reader something that they really need to know.
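Jeremy's paste-the-output-back-in habit, as a tiny self-contained sketch (a made-up dataframe; the point is that the eyeballed output becomes a durable check):

```python
import pandas as pd

df = pd.DataFrame({"a": range(5), "b": range(5)})
df.shape  # run the cell once and eyeball the output: (5, 2)

# then paste what you saw back in as an assertion; from now on,
# any code change that breaks this expectation fails loudly
assert df.shape == (5, 2)
```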
hugo bowne-anderson: I'd like to drill down a bit more into the fastai package, library, and ecosystem that you've developed. In particular, why did you start turning this into something bigger than, like, a utils.py? And perhaps you can speak a bit about the rationale behind having a layered API.

Jeremy: Yeah, so this was always the thing that we planned to focus on. As I said, from the start, we felt that what we needed to write was software that would allow deep learning to become truly accessible. So that first course was step one in us figuring out what that software would need to do, and it allowed us to create a loop that we ran each year (we didn't run it last year, mainly because of COVID), where we look at: what were we able to teach in that course, and what weren't we able to teach? What things were too difficult? What things took too long, too much compute, too much data? And how do we make those things simpler, faster, lighter weight? So that means after the course, the first thing we have to do is research. If this thing took too much data, how do we make it so it needs less? This thing took too much compute, how do we make it so it needs less? This thing required combining four different libraries, each one of which is complicated in weird ways, too much to get your head around. This one required explaining all this complex mathematical machinery. And then often the research would just be: okay, we didn't quite get a state-of-the-art result for this, why not? Or: this is a state-of-the-art result, but it seems less good than it feels like it should be, so how do we improve the state of the art? So the result of all that research we then put into a software library that basically packages it all up so that you get world-class results with as little work as possible. Basically, the idea is that you should only have to tell the computer what it can't figure out for itself. That is the basic idea. And when you use the library in that way, you should get basically state-of-the-art results. It shouldn't surprise you by giving you rubbish without you knowing it's rubbish and why it's rubbish. So that's why, after the first class, we started creating the library, which is called fastai. Now, the problem is that if you focus only on that top-level API experience, making sure it's fast and easy to do the main things we can think of that you might want to do, what happens when, using the library, you find it's not quite as fast as you need it to be, or the file format it's using isn't quite what you have, or you want to use it in a slightly different domain? Maybe it's designed for vision, but you want to use it for audio. You need to change things. And so that means we need to make sure, when you need to dig down that extra layer, that the next layer is just as easy to use as the top layer. So that's where the layered API comes in, right? When you dig underneath that first level of code to see, well, how was that made, and how do I change it, you should be looking at something which is just as friendly and helpful, and gives you best-in-class results by default. And so everything we write at that top layer is written in the mid-tier API, so we're using an API that we really enjoy using. And then, same thing: okay, you're using that mid-tier API to customize something, but then you want to use a totally different set of graphics primitives, or you want to target some different hardware, or whatever. How do you do that?
So we then create the bottom-tier API. We've written our own computer vision library, our own NLP tokenisation system, and so on. And you can go in and change any of that. Because each layer lives on top of another layer, you can change anything at any layer, and all the other pieces will continue to work, as long as you continue to follow the API that we've provided. So it means that power users can do their PhDs within this thing, creating totally different approaches to any part of the system they like. But it also means that total beginners can do lesson one of the fast.ai course in five lines of code, and they can just change one line of code to bring in their new dataset or whatever. So yeah, layered APIs are totally normal, almost universal, out in the software engineering world, but before fastai came along, they didn't really exist in deep learning.
hugo bowne-anderson I feel like the closest thing, maybe, and it's two different packages, is Keras and TensorFlow, or something like that, I suppose.
Jeremy Right, exactly, which is a very different thing, because TensorFlow was not created with a view to supporting Keras, do you know what I mean? And also, Keras has a single layer of API. So in year two we used Keras and TensorFlow, which had just come out, and that was great. But particularly for part two of our course, which is mainly about implementing research papers, there was just so much stuff that we couldn't do in Keras. And then I looked at the source code of Keras to see, well, how do I customize how this works? And it was all coupled up together in these complex ways. I couldn't do it. And that's a real shame: I'd have to tear out that piece of Keras, but then it meant I had to tear out all of Keras, because it just didn't decompose, you know. So when PyTorch came along, that was when we realized fastai had to become a much more complete, self-sufficient thing, because PyTorch was such a big jump over TensorFlow in terms of accessibility, and in terms of matching how developers like to develop and how data scientists like to experiment. But they didn't have anything like Keras. Before that, fastai was still a pretty thin thing which added a few pieces to Keras and tried to find ways to make it more flexible. But like I say, PyTorch came out just a bit before our part two course was due, and between realizing, okay, I can't write what I wanted to write for NLP in Keras, I can't make it work, and then trying to do it in PyTorch and finding it so much easier and so much faster, we ended up using PyTorch for that part two course. But I knew it'd be totally unsuitable for part one. I really didn't feel like you could start with: okay, here's you training your first model, where you have to create datasets and a data loader, then write a loop for the epochs, then write a loop for the batches, then calculate the gradients and zero them... I was like, oh my God, no way.
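(For readers who haven't written one, this is roughly the kind of raw PyTorch training loop Jeremy is describing; a minimal sketch with toy data and a toy model, not fastai code.)

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset and model, just to show the shape of the loop
X, y = torch.randn(1000, 20), torch.randint(0, 2, (1000,))
train_dl = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)
model = nn.Sequential(nn.Linear(20, 50), nn.ReLU(), nn.Linear(50, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_func = nn.CrossEntropyLoss()

for epoch in range(5):          # a loop for the epochs...
    for xb, yb in train_dl:     # ...and a loop for the batches
        loss = loss_func(model(xb), yb)
        loss.backward()         # calculate the gradients
        opt.step()              # update the parameters
        opt.zero_grad()         # zero the gradients for the next batch
```

Every line of that is boilerplate a higher-level library can supply, which is the gap fastai set out to fill.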
So yeah, that's when we decided, okay, fastai is going to have to become a total package, inspired by Keras in some ways, but with a very different approach to building it, with this layered API. And I was very lucky that at that time Sylvain Gugger became available, and he was able to work on that with me full time. So I basically had a whole extra person to work with, and the two of us built this package, and we decided to build the whole thing in notebooks. And yeah, it worked really well. We're really happy with how it came out, and with how much people liked it.
hugo bowne-anderson And you've also written and published a book together, in notebooks.
Jeremy Yeah, so we've got an O'Reilly book, Deep Learning for Coders with fastai and PyTorch. And when we started writing it, Sylvain actually started working on the first chapters not in notebooks. And I said to him, you know, why don't you try restarting that, but with notebooks, and let's pretend there's some way of turning it into a book. Within two days, he was like, oh my God, Jeremy, I love this. I was hating writing this book before, and now I'm loving writing it. And so we decided to create this thing called fastdoc, which is something that turns notebooks into a book. And that really did make writing the book a lot more enjoyable. It allowed us to write it in this very free-flowing way, where there's a conversation: look at this line of code; if we run it, this is what happens, you can see this picture appears; but if we did it this other way, here's what happens, we get this other picture. And we don't have any worries about people emailing O'Reilly's errata group and saying, oh, this code doesn't actually run, or it doesn't actually match up with this output. Of course it does, because...
hugo bowne-anderson It's actually the code. That's how it was developed.
Jeremy Yeah. So that ended up really good, and people are really, really loving the book. I sent it to a lot of people I admired, and a lot of them actually read the damn thing and sent me back messages saying they really, really enjoyed it. And five stars on Amazon is good.
hugo bowne-anderson That's awesome. And the book is all freely available online, so we'll include a link to that in the show notes. But if you like it, I encourage you to buy it as well.
Jeremy Yeah, I mean, reading Jupyter notebooks is a bit of a different experience to reading a physical book. And so even people who have read the notebooks have told me they do like having the book. But don't feel compelled: if the notebooks work for you, then just do that.
hugo bowne-anderson Totally. I also want to mention that the top layer of the fastai API is really nice, because it's very use-case specific: you really parse it out into vision, text, tabular, and collaborative filtering applications, which is very intuitive. And all of them, and you kind of hinted at this before when talking about loading data and datasets, eventually turned, I think, into your data block API.
Jeremy And for each of those four applications, you basically use exactly the same code. So you only really have to learn one API.
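(A rough illustration of that symmetry, sketched from the fastai v2 API; the datasets and hyperparameters here are illustrative, echoing the library's documented examples rather than anything said in the conversation.)

```python
from fastai.vision.all import *
from fastai.collab import *

# Vision: build DataLoaders, build a Learner, fine-tune
path = untar_data(URLs.PETS) / "images"
def is_cat(fname): return fname[0].isupper()   # in this dataset, cat breeds have capitalised file names
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, label_func=is_cat, item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)

# Collaborative filtering: different data, same three steps
path = untar_data(URLs.ML_SAMPLE)
dls = CollabDataLoaders.from_csv(path / "ratings.csv")
learn = collab_learner(dls, y_range=(0.5, 5.5))
learn.fit_one_cycle(3)
```

The text and tabular applications follow the same build-DataLoaders, build-Learner, fit pattern.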
Yeah, yeah. I mean, the data block API in particular is something I was pretty proud of. And it came out of me driving myself crazy by creating all these different classes: an image regression class, an image classification class, a black-and-white image regression class... and this is insane. I had thousands and thousands of lines of data processing code, and I stopped typing, stopped coding, and actually stepped back and tried to think for a bit. Took me a few months to realize I should do that. And I was like, well, what am I actually doing when I do this data processing? I thought: well, there's some kind of source of data, which I define. And then there's some way I find the labels. And then there's some way I split it into training, validation, and test sets. There's some way I process the data to create the labels and to create the features. There's some way I turn it into batches. There's a very specific number of steps. And so I split them out, and suddenly I just needed one class for each thing. And I realized, looking back at my old code, that I had this Cartesian product of every combination of those things I'd ever used. And now I just had one thing: here's the one line of code to create labels if you can get them from the folder name, for example; here's the one line of code to create labels if you can get them from a regular expression on the file name, or whatever. And it was also nice that we could then document that API and say to people: or you can create your own labeling function, or data splitter function, or whatever. And actually, one of the things that made it a lot simpler was when we tried to port it to Swift, because Swift is much more functional. As we went through that process, we realized that I had been using this fluent API; I've possibly always had an over-tendency to use fluent APIs, which I like from the JavaScript world, and even the VBA world. But when we redid it in a functional way, and I got some help from Chris Lattner and some of his team, and Alexis Gallagher, we realized, oh, this functional way actually ends up looking better in Python as well. So a lot of fastai actually switched to a much more functional approach after that Swift port, which definitely made it better.
hugo bowne-anderson Cool. And Chris even appeared in some of the course.
Jeremy Chris did appear in some of the course, yeah. Sadly, those efforts didn't end up going anywhere, because he left Google, and so Google had to close down the Swift project. But it was really sweet working with him and his team, because they're very cool people doing cool work, and I learned a lot, and it definitely helped our Python library.
hugo bowne-anderson When you're talking about the development, I mean, the suffering involved in loading datasets and splitting and all of these things: one way of framing this is that you're on a journey to find the best abstraction layer for the API, right?
Jeremy That's exactly right.
hugo bowne-anderson And this is one of the major challenges of developing any software, and I think part of the huge challenge in the deployment story: we don't even have the right abstraction layers yet.
Jeremy Yeah, yeah, no, absolutely. I've spent a lot more time in my life writing code than doing models.
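(As a concrete anchor for the decomposition Jeremy just described, here is roughly what those steps look like in fastai's DataBlock API; the pet-classification setup is a sketch echoing the library's documentation, not something discussed above.)

```python
from fastai.vision.all import *

pets = DataBlock(
    blocks=(ImageBlock, CategoryBlock),                         # the types of the inputs and labels
    get_items=get_image_files,                                  # the source of the data
    splitter=RandomSplitter(valid_pct=0.2, seed=42),            # how to split training/validation
    get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),  # labels via a regex on the file name
    item_tfms=Resize(224))                                      # how items are processed into batches
dls = pets.dataloaders(untar_data(URLs.PETS) / "images")

# Swap get_y=parent_label to take labels from the folder name instead: one line per decision.
```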
I guess much more of my time has gone into reading books about software engineering and API design than into reading math books. That's my thing, and I think that's unusual: there are not that many people who've spent most of their time focused on software engineering who are actually building deep learning libraries. So, yeah, we don't have enough people really thinking about the developer experience. Look at TensorFlow version one. No software developer in their right mind would have come up with all those variable scopes and name scopes, particularly somebody who deeply understood Python and realized it was literally recreating things that are already in Python. But it was created by brilliant, genius machine learning engineers who were figuring this out as they went. So there's a lot of room for people who are passionate about developer experience and API design to really rethink every stage of the machine learning pipeline.
hugo bowne-anderson Agreed. With your background, I think you're in a really interesting position to talk about incorporating data, inference, AI, and algorithms into business: logistics, project management, these types of things. These are two worlds which could clearly do with more of each other. A lot of data scientists complain that business leaders don't know enough about data science to ask the right questions. It's a two-way street, though, right? A lot of people in our discipline perhaps don't know enough about supply chains, logistics, or project management to have those conversations. So what are the challenges you've seen historically, and how can we fix them?
Jeremy It's much harder for a machine learning expert to become deeply familiar with medicine than it is for a doctor to become familiar with deep learning, you know. And ditto for law, or journalism, or activism, or just about anything you can think of. Those domains often take decades to understand, and they're not set up for lay people to come in and develop enough of an understanding to contribute. If you really want to understand law, it's very hard to do it without a law degree and some years of that process of going through clerking and all that other stuff. Similar with doctors: you do the medical degree, then you become a resident, and it takes a decade of training and apprenticeship, and a lot of stuff that's not really written down. A lot of it's about understanding the constraints: how does the hospital system work, how do people communicate with nurses, what does it look like to do rounds? Or at a deeper level: how do radiologists actually work with DICOM files? What's the system for how they bring them up, and how does their workflow actually happen? All this stuff is deep, implicit knowledge in the field. So we really encourage domain experts to try to develop some proficiency with deep learning: enough that they can train some models, and understand deeply the idea that machine learning means a flexible functional form with some parameters, which allow it to be tuned to solving a particular problem.
And after that's done, you end up with this function that you can then run independently, and it can extrapolate, but there are limitations to how much it can extrapolate. And you can understand how well it's doing with these kinds of diagnostics, and blah blah blah. That's not heaps; a few months of diligent study from an intelligent person can get you there. We've seen that many times.
hugo bowne-anderson And should the focus be on the techniques and the ways of thinking, more than the code?
Jeremy I think it's both. Without the code, I haven't found people develop enough of an understanding to be useful, unfortunately. I would love to fix that, because code is a skill that takes a while to acquire. But to understand code enough to be able to get the gist of it, even if you can't necessarily write it all from scratch yourself, is a much lower bar, and that's the bar I think people need to at least get to as domain experts. And then I think it's really important for data scientists to become as deeply immersed in the domain as possible. So, for example, at the Children's Hospital Los Angeles, CHLA, I don't know if they still do it, but they at least used to have data scientists doing rounds with the doctors, with the pediatricians.
hugo bowne-anderson Amazing.
Jeremy And so they're colleagues with the people who are going to be using the solutions they're building, and they're seeing how the data is captured. So when some field is populated in some way, they know it's because the doctor selected from one of those four things on their iPad, or wrote this thing from scratch. And they know that actually half the time it was written by a nurse, because the doctors rarely have time to do it, and the nurse sometimes just guesses, yeah, you know.
hugo bowne-anderson That story reminds me of when sports broadcasters, initially like half a century ago or whatever, started having statisticians or actuaries sitting next to them, telling them about the statistics of the game so they could tell the audience in real time. That was breaking down those silos once again.
Jeremy That's a standard part of sports broadcasting now, statistics everywhere. Yeah.
hugo bowne-anderson Fantastic. We've mentioned several other programming languages; I mean, you mentioned Swift and C#. A lot of people work in Python today. Is Python going to grow in usage for machine learning, or do you see diminishing returns at some point?
Jeremy I think the answer is yes to both of those. It'll probably keep growing as more and more people find it. At the moment the ecosystem for Python is so strong, and it's so flexible, and people are finding ways to really push it further and further, with things like Numba or CuPy, things which effectively take the Python code and recompile it in different ways, or indeed TorchScript, or tf.function, or whatever. But it's for that same reason that I think we're going to see, and are already seeing, diminishing returns. Python was not created for that purpose; indeed, it was created for almost the opposite purpose. It was created to be highly dynamic. And Python is incredibly elegant when you look at how little infrastructure is actually needed to create everything that's in Python.
Its metaobject protocol, for example: when you actually dig in and understand what all those dunder methods do and how they all fit together, there's incredible elegance in how OO works in Python. And it's also kind of elegant that you can throw it all out and create your own version by leveraging the same metaobject protocol. So it's highly dynamic, with a lot of reflection; everything can be replaced with anything else. And that functionality is used a lot inside the libraries that you're familiar with, even though most of us don't use it ourselves. The problem is that that dynamic functionality is totally at odds with compiling something to run on a GPU, or compiling something to run in parallel on a multi-core processor, or having careful type checking on top of it, or whatever. So we end up with crummy versions of all of those things. For example, look at really elegant type systems that were carefully built for that purpose, whether it be TypeScript, or C#, or, going a lot further, something like Haskell, or indeed Swift, which has a very interesting type system: they're designed from the ground up for that purpose. Whereas if you look at mypy, there are brilliant people working on it, but it's trying to shoehorn something in. It's a very deficient type system compared to those other ones I'm mentioning, in terms of its expressivity and how well it matches the Python code you actually want to write. Or compilers: again, comparing to something like Swift, Swift was explicitly designed by the creator of LLVM to leverage the capabilities of the LLVM compiler, so it really allows you to use the compiler in very smart ways. Python, on the other hand, was never designed for that. And so with something like TorchScript, sometimes you see what a hack it is. For example, last time I used it, if you had a comment that started in, I think, any column other than the first, you would get a syntax error, because they had to reparse Python from scratch and then do something totally separate with it. If you used the tuple symbol with a small t, rather than importing Tuple with a big T, which both refer to the same thing, again it wouldn't work, because literally they were doing string matching. And it was machine learning engineers doing the string matching; they weren't compiler writers or parser experts. So I compare this to something like Julia, which was developed with a very small, elegant base, written in Scheme, actually, that's designed to be highly expressive, with deep metaprogramming capabilities, but metaprogramming that's very much built with an eye to maintaining the type system: a really interesting and well-thought-out type system, from the ground up. That's a platform which feels like you can build on it and build up and up and up. Whereas with Python, it feels like we're just hacking hacks on top of hacks. Very clever hacks, and it's amazing what people are doing with it. But that's not really how I would want to spend my time programming: figuring out how to hack around the deficiencies of a language, because you're using it in ways it wasn't designed for.
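(A tiny illustration of the dunder-method elegance Jeremy is pointing at; a made-up class, nothing to do with any particular library.)

```python
class Interval:
    "A number range that behaves like a built-in, via Python's 'dunder' protocol."
    def __init__(self, lo, hi): self.lo, self.hi = lo, hi
    def __add__(self, other):           # `a + b` dispatches here
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __contains__(self, x):          # `x in a` dispatches here
        return self.lo <= x <= self.hi
    def __repr__(self):                 # how the REPL prints it
        return f"Interval({self.lo}, {self.hi})"

print(Interval(0, 1) + Interval(2, 3))  # Interval(2, 4)
print(2.5 in Interval(2, 3))            # True
```

The same hooks that make this possible also let libraries rewrite behaviour at runtime, which is exactly the dynamism that makes ahead-of-time compilation of Python so hard.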
hugo bowne-anderson For sure. And something I think about quite a bit is the fragmentation of the tooling space, which simultaneously feels like we have both not enough tools and too many tools. For example, in the first few lessons of your course, it's incredible how, after several hours, you get people deploying a machine learning model that they've built. And to do so, they're not only using fastai, Jupyter, and Colab or Paperspace, as we talked about before, but they're also using ipywidgets, Voilà, and Binder, right? So it's fantastic that you can do this in several hours, but there's huge cognitive overhead with all of these things. I mean, you do it in a way that introduces people to just the bits they need, which I think is fantastic, but there is a huge amount to learn in the space. Do you see some form of consolidation happening here?
Jeremy Something like Voilà is so cool: with ipywidgets, you can interactively create a GUI using stuff that you're pretty familiar with. But it suffers from this problem that they're wrappers. Just like PyTorch suffers from the problem that it's wrappers around CUDA code, Voilà and ipywidgets suffer from the problem that they're wrappers around JavaScript code. And in the end, you end up a less capable front-end developer than somebody who actually understands JavaScript, because all you can do is use what's been given to you. It's the same problem in PyTorch: you can't end up doing what Ben Graham did with his sparse CNNs, because that requires writing CUDA, and in PyTorch you really just use what's been given to you. So yeah, that worries me, and it bothers me. I don't think data scientists should have to learn lots of programming languages; I just don't think that's feasible, particularly when a lot of people doing data science are not even full-time data scientists. They're virologists, or microscopists, or whatever, who happen to be doing some data science. So I am keen to find ways for people to use one language for everything.
hugo bowne-anderson And another example, I think: you made very clear, in your last lecture of part one, that the machine learning models with the highest impact are ensembles of decision trees, so random forests and gradient boosting machines, and multi-layered neural networks learned with stochastic gradient descent, right? To use these three, you need essentially three different packages; there aren't even two packages that cover all three, right? You'd probably use scikit-learn for random forests, XGBoost for boosting, and then fastai or PyTorch or whatever it is. (A rough sketch of those three side by side follows below.)
Jeremy Yeah, but even that, though, it's so nice to be able to say: well, here are the three things you need. I feel like it's so different to what you tend to get in academia, where there's a lot of gatekeeping: well, there are 400 different techniques, and we're going to teach each one as if it's a totally new thing. Here's logit, and here's Probit, and here's discriminant analysis, and blah blah blah, here's Gaussian processes, and here's SVMs. That's what all the books were like when I was starting out: just chapter after chapter after chapter of totally separate things, each one introduced as a whole new thing.
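(The sketch referenced above: the three families Hugo lists, side by side, on toy data. The dataset, hyperparameters, and the commented fastai snippet are illustrative, not from the conversation.)

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Toy tabular data, purely illustrative
X, y = np.random.rand(1000, 10), np.random.randint(0, 2, 1000)

rf  = RandomForestClassifier(n_estimators=100).fit(X, y)   # bagged ensemble of decision trees
gbm = XGBClassifier(n_estimators=100).fit(X, y)            # gradient-boosted trees

# The third family, neural nets trained with SGD, lives in yet another library,
# e.g. fastai's tabular application (assuming a DataFrame df with a 'label' column):
#   from fastai.tabular.all import *
#   dls = TabularDataLoaders.from_df(df, y_names='label')
#   learn = tabular_learner(dls, metrics=accuracy)
#   learn.fit_one_cycle(3)
```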
And it is nice to be able to say: look, there are basically three things you need. And I think you can probably get by with scikit-learn for all of your tree ensembles nowadays; I mean, they have wrappers, and they do have a consistent API. So I do think things are a lot easier than they were. But I still think a lot of courses and books take people down this path of a hundred different tools and different things, which I think is a terrible idea.
hugo bowne-anderson We've also mentioned several times academic research versus implementation. Something you seem to have done wonderfully at fast.ai is taking current state-of-the-art research and figuring out how to port it, in a very small amount of time, to very usable models, requiring users to write only a few lines of code to develop state-of-the-art stuff.
Jeremy Something I do, which seems less common than it should be, is that anytime I'm interested in a paper, I always write it from scratch myself. I don't feel like you can really understand a paper until you've done that. And in the process of rewriting it from scratch, I tend to work from a surface-level skim of the basic ideas, rather than a really careful analysis of exactly what the paper says, because afterwards I'll go back and compare it to the paper and to their implementation. And often I'll find there are a few little decisions here and there that I'll have done differently that just happen to turn out better, or at least I'll realize, oh, there is a decision to be made there, and maybe different choices suit different data types or situations. And as a result of that, I end up being able to integrate it better into the rest of the software, because I'll often realize: oh, actually, three quarters of it is just this type of training loop, so I only need to implement a module with the remaining quarter. And oh, actually, now I think about it, that's kind of an API for doing X, so I'm just going to make a new API for this, and this will be the first thing in that API. And then that will often make me think: what else could I do with that API? And so then I'll create Y and Z as well, and then realize, okay, these are actually a bit better than the original version. So the research and the development have a nice, circular kind of relationship.
hugo bowne-anderson Because part of the point is, you have to read the research, right? And as you mentioned earlier, when you were starting off thinking about diagnostic imaging, people clearly hadn't read research that could be relevant across different fields.
Jeremy Yeah, and so I do think people over-specialize. You know, ULMFiT, which was our idea of training a language model and then fine-tuning it for other problems, which has since become very popular, entirely came out of the idea of saying: well, let's just use what works in computer vision in NLP. And I asked lots of people in NLP before I started, because I didn't want to waste my time, and I said: do you think this is going to work? And everybody in NLP said no, it won't work. And then I said: can you convince me that it won't work? And nothing could convince me that it won't work.
It was just this belief that NLP was too difficult.
hugo bowne-anderson And to be clear, your work on ULMFiT planted the seed, if I recall correctly, for what are now the GPTs, right?
Jeremy Yeah, exactly. So Alec Radford, who's a genius machine learning engineer and developer, had done a lot of great stuff in NLP before, but he told me he had always assumed that you would have to pre-train on an in-domain dataset. So with ULMFiT, when he saw that we could do it with just Wikipedia, that was what inspired him to create GPT.
hugo bowne-anderson Amazing. Somewhere I'd like to move now, briefly: I feel like some of the early research techniques seemed like proofs of principle. Is there an argument that they've been sucked up into big tech companies? And, on the converse, that we all live in the shadow of big tech companies and use a lot of the tools they develop, which may not be quite appropriate for our reasonable-scale ML problems?
Jeremy Yeah, definitely. It's a constant struggle. The big tech companies get all the PR, and lots of researchers, particularly young researchers, want to be like them, or they want to use the tools they're using, or perhaps to be acquired by them as well. You end up with everybody thinking you need a room full of TPUs to solve a problem, when actually you could just do it on an old 1080 Ti graphics card and run it overnight. People think they need terabytes of data without realizing that you could do it with just 100 data points. And in general, and this is not just an ML thing, we get startups with 100 customers using Kubernetes, so they have some kind of global failover design for a billion customers. I think premature optimization is a big problem, and in this case it's basically premature scaling: premature optimization for what happens when we hit a billion customers.
hugo bowne-anderson And does it introduce more complexity into the system as well? Is that part of the challenge?
Jeremy It massively increases the chance that you're going to fail, and it makes you much less able to rapidly iterate. And that's the number one thing you have as a startup to beat the big guys. So if you go and replicate the infrastructure of the big guys, you've now lost your number one benefit. If you want to beat Google, you should be trying to be as unlike Google as possible: leverage what you can do which they can't, which is that you can build something and run it on a VPS for $5 a month, make it incredibly simple so that you can rapidly change it, do lots of stuff in entirely manual ways, and really figure out a fantastic product that everybody loves.
hugo bowne-anderson I want to move back to thinking about making deep learning as accessible as possible. I don't actually know what my question is here, but I have a sense, so I just want to point in a certain direction. You've done a huge amount of work trying to help people understand what's happening with COVID over the past several years, and I'm sure it's a lot of backbreaking and frustrating work, particularly when it seems like there's a severe lack of both data literacy and statistical literacy, and also a lot of other motivations at play.
But I just wonder how you reconcile these things: trying to increase deep learning accessibility, while recognizing that people need to understand the base rate fallacy, what precision is, or false negatives, what to actually report... sampling statistics.
Jeremy I mean, they're pretty separate things. I'm hoping we've reached this level where using deep learning is as easy as using the internet on your phone: it just becomes part of the background. In certain areas it's already true, right? You can already leverage sophisticated deep learning for computational photography by taking a picture on your phone. At least on my Pixel 6 Pro, you can say, oh, there's a power line in the background I don't like, and you can tap on it and delete the power line. Watching TV on an NVIDIA Shield Android device, you get deep-learning-based super-resolution by literally clicking a box. So yeah, they're fairly different things: leveraging deep learning effectively just should not require statistical sophistication.
hugo bowne-anderson What about, like, understanding class imbalances? For the deployment stuff you're talking about, you're absolutely right. But for domain experts, for radiology and that type of thing, it seems like...
Jeremy That becomes trivially obvious when you design the workflow, the experience, and the visualizations correctly. Class imbalance is something which just shouldn't be a problem, in the same way that, for example, with fastai, reporting metrics on your training set is never a problem, because it's impossible to do with fastai. That's an example of something that people continually get wrong, and you could say it requires some statistical sophistication. No, it doesn't: it requires software sophistication. There's literally no way to create the data for a fastai model without having a validation set, and there's no way to get the metrics on something that's not the validation set. Same thing with class imbalance. I haven't spent as much time thinking about that as I should, but I'm sure that, particularly for a specific domain like, say, radiology, there are ways to report the correct metrics in the correct way, and visualize them in the correct way, so that there just isn't an option to accidentally read it incorrectly.
hugo bowne-anderson That makes sense. So, going back to the challenges with data literacy and statistical literacy: how can we facilitate people understanding more about these things?
Jeremy I have pretty low expectations, honestly. I don't know if that's fair, but I feel like I've spent 20-plus years trying to do that; particularly when I was in consulting, I spent ten years trying to do that. And, oh gosh, sorry, 30 years: it's over 30 years now, so 30 years ago I really started doing data science. I find that people who are not interested, you're not going to convince them. People are naturally convinced by narratives, not really by data, and there's confirmation bias. Our brains are what they are, and you can spend your lifetime trying to train your brain to be data-driven and work around all of its foibles, which is something I've tried to do to my brain with some success.
But most people aren't going to do that. So I think, instead, you have to try to present things in ways where the correct answer is the intuitively obvious answer, and package up stories accordingly. I think we should be trying to change what we do based on how people's brains are, rather than thinking we can change people's brains.
hugo bowne-anderson And is that something you've attempted in your work on masks, for example, Masks4All?
Jeremy Yes, although I would say that the vast majority of that was done for me by the work in the Czech Republic. The very phrase "masks for all" came out of the Czech Republic, and really all of the ideas about how to communicate it. Thanks to my research into communication and influence and so forth, I was able to recognize that all of the key techniques had been used, but I don't think I would have been able to think of how to do them so well myself. I did have the background to be able to say: oh, we should do exactly what they're doing. So "my mask protects you, your mask protects me" comes from the Czech Republic; "masks for all" comes from the Czech Republic. So that definitely helped, but I would say I'm not particularly good at coming up with these things myself; I'm pretty good at recognizing it when somebody else has nailed it.
hugo bowne-anderson So we've chatted a lot about the past and present of deep learning. To wrap up, I'd just like to get a sense of what you'd like to see the future of deep learning bring, in any timeframe that you're excited to discuss.
Jeremy Yeah, I would love to see all of us really leveraging deep learning in our day-to-day jobs, in ways that dramatically reduce the menial work we do and make us all more productive, without requiring us, or almost any of us, to understand gradients or activation functions or whatever. I would really love to see deep learning being used in much more thoughtful ways around things like making labeling exponentially more effective, by having much better interplay between humans and algorithms. And I really hope that as more domain experts become proficient deep learning practitioners, we'll see deep learning embedded into domains in ways that really improve people's productivity, in ways that hopefully might reduce the massive resource shortages we have. For example, in medicine, I really believe that medical practitioners could be ten times more productive by leveraging deep learning; that is basically the shortage we have of medical experts in the developing world. So if we could do that, then we could resolve the shortage of medical experts in the developing world. Stuff like that would be great outcomes.
hugo bowne-anderson And do you envisage a no-code or low-code future?
Jeremy Oh yeah, for sure. I mean, there's always going to be a need for code too, just like we still need people writing code that works with subnet masks and IP addresses and whatever else.
hugo bowne-anderson I really mean for, like, some Pareto law: most people doing deep learning.
Jeremy Yeah, most people doing deep learning, just like most people using the internet don't have to create a PPP configuration file or whatever. In some ways, you can say this is already true.
I mean, most people using deep learning are using it accidentally, through their phone's computational photography, or their photo album's ability to find all the pictures of your daughter, or whatever. And hopefully these things will become more and more flexible. There needs to be a lot of work done on what the optimal UI ways of interacting with a deep learning model are; those don't really exist yet, and at the moment not many people are working on that. To me, those are the areas which are the most exciting.
hugo bowne-anderson You built something called platform.ai, which played around with that for some time. Is that right?
Jeremy I didn't build that. With one of our engineers at Enlitic, for my talk on ted.com, I built an early version of that for a particular domain, which is image classification. And it's exactly the kind of thing I'm talking about: it's something where a domain expert can label hundreds of thousands of images in ten minutes by interacting with a model. And then platform.ai, yeah, was built to commercialize that idea.
hugo bowne-anderson We'll include a link to that TED talk in the show notes. Actually, that's a nice way to wrap up, because I do recall, though I haven't watched it in some time, something you mentioned there which I think about probably a bit too often: that the machine learning and deep learning revolution is qualitatively different from previous revolutions, such as the industrial revolution, in terms of this idea of exponential technology. So maybe we could wrap up by you telling us a bit about that, and what it means for us as a labor force, and the approaches we can take to reconfiguring society so it nourishes and raises up as many people as possible.
Jeremy Yeah. I mean, it's particularly something I didn't talk about in that TED talk: there's an interplay between the huge rise of corporations and this huge increase in productivity from deep learning, and also this kind of natural monopolization that occurs, with companies owning the data and the resources being able to provide better products and services as a result. I definitely worry about that kind of dystopian outcome where all the resources end up in the hands of a tiny handful of people, even more than we have now. And again, this is something where I feel like making the technology more accessible hopefully means that more people can play in this area, and maybe we avoid a dystopia of huge income inequality and centralization of power. But I would say I'm not particularly confident.
hugo bowne-anderson These are all very serious concerns. I mean, Mary Gray has a beautiful book called Ghost Work, which delves into the outsourcing of labor in the gig economy. So I think people are becoming vastly more familiar with this type of stuff.
Jeremy And a great book is Manna, by Marshall Brain.
hugo bowne-anderson I'll check it out and put it in the show notes.
Jeremy I think it's from 2002 or so; he basically predicted everything that's happened today, particularly involving how Amazon works. He predicted that, and I could imagine it continuing to go in the direction he thought it might, which is basically machines timing everything that most of us do.
And if you take more than three seconds longer than you're allowed, more than three times, the machine automatically fires you.
hugo bowne-anderson It's really a pathological endgame of, like, Fordist production-line Taylorism stuff.
Jeremy Which is already what we have at Amazon. That's how a lot of the jobs already work. That's why you have people peeing in bottles: not because somebody told them to pee in bottles, but because the algorithm has figured out how long each job can take, and everybody has to take that amount of time to do each job.
hugo bowne-anderson I discovered a term recently, ludic capitalism, which is the gamification of financial markets. And I know that Uber drivers, for example, receive bonuses for going to a particular region at a particular point in time, and it unlocks new features for them, that type of stuff. The fact that this is obscured behind veils of their versions of the APIs, which we don't necessarily have access to, is really important to recognize.
Jeremy I hate the way we're all increasingly becoming, yeah, pawns to these giant corporate machines.
hugo bowne-anderson There's actually a wonderful book on this note by someone called Alex Rosenblat: a book called Uberland. It's a tech-sociological study of the current labor force at Uber and similar companies, and she's done a lot of on-the-ground research with Uber drivers. The final thing is, you mentioned in your TED talk, on one of your final slides, and I've actually just brought it up here, that with these types of concerns, what doesn't necessarily help is better education and more incentives to work, but what does help is to separate labor from earnings, and the idea of craft-based economies.
Jeremy Yeah. As we have less and less need for human inputs, human labor becomes lower and lower value, which we're already seeing: obviously, we're seeing a lot of people working for not enough wages, and that's going to get worse and worse. And so people often say, oh, we should invest in education. But that doesn't actually help with the underlying problem at all, because you now have more people building the things that increase inequality and decrease the need for human inputs. For an individual, like my child, for example, I would say: yes, definitely focus on education, because individually it helps you avoid obsolescence for longer. But it doesn't help society avoid the problem. Currently, we have our economy structured on the assumption that human labor is a scarce resource, and that it's therefore something we should associate with money, to the level where, in many jurisdictions, if you don't have a job you can barely live, or sometimes you can't live at all. That seems like a fundamentally incorrect premise. So we could use something like negative taxation rates to allow people who are not currently making enough money to have enough money to live. Australia is generally pretty good at this: we have far fewer people in desperate need in Australia, and as a result we have an environment that I enjoy being in a lot more than most environments in America. You don't need as much in the way of gated communities, or all that kind of stuff.
hugo bowne-anderson And you lived in San Francisco at a time when that type of inequality increased tremendously as well.
Jeremy Yeah. So, you know, it's nice to live in a society without desperation, and where things aren't getting worse and worse. Negative income tax helps a lot with that: the people who are making the least money don't just pay no income tax, they actually receive extra. And then over time, as inequality increases more and more, that negative income tax can actually turn into UBI, into a basic income everyone receives. Although, at the moment, I'm surprised in some ways that people are so focused on UBI rather than negative income tax, because negative income tax is such a scalable thing: it's a parameterised thing that you can gradually change. Whereas UBI is, I don't know, a much less elegant tool; it's much more a case of throwing this huge bunch of money at people, which doesn't feel like quite the right approach at the moment.
hugo bowne-anderson I do think maybe part of it, and I haven't really thought about this deeply at all, is due to our focus on universality, and what equality means, and that type of stuff.
Jeremy I don't know. I mean, we have progressive taxation, so negative income tax just means making it more progressive.
hugo bowne-anderson Absolutely.
Jeremy Nobody seems to be trying that. I don't know why not; maybe I'm missing something.
hugo bowne-anderson Jeremy, it is always fun to talk with you. Thank you for spending this time and giving us so much insight into your work and your thoughts. I deeply appreciate it. Thank you.