Carl Gold: [00:00:00] The first law of power is that you need to make your boss look smart, not yourself. It's a very good lesson for any smart person embarking on their career to remember that you're not going to get a lot of points for making yourself look smarter than everyone else. You're actually going to get more points for making the other people look smart. Particularly, making your boss look smart is the number one career advice.

Harpreet Sahota: [00:00:40] What's up, everybody? Welcome to The Artists of Data Science podcast, the only self-development podcast for data scientists. You're going to learn from and be inspired by the people, ideas and conversations that'll encourage creativity and innovation in yourself so that you can do the same for others. I also host open office hours. You can register to attend by going to bitly.com/adsoh. I look forward to seeing you all there. Let's ride this beat out into another awesome episode, and don't forget to subscribe to the show and leave a five-star review.

Harpreet Sahota: [00:01:31] Our guest today is a former Wall Street quant turned data scientist who is leading the battle against churn using data as his weapon. He holds a PhD from the California Institute of Technology, a.k.a. Caltech, and has first-authored publications in leading machine learning and neuroscience journals. As a data scientist, he uses a variety of tools and techniques to analyze data around online systems, and his expertise has led to the creation of the Subscription Economy Index. Currently, he is Chief Data Scientist at Zuora, a comprehensive subscription management platform and newly public Silicon Valley unicorn with more than one thousand customers worldwide. In his role there, he's been fortunate to analyze subscriber churn at over 50 companies in a large variety of industries. He's also an author and shares his experiences fighting churn in hopes that it can help other companies thrive in the subscription economy. So please help me in welcoming our guest today, the author of Fighting Churn with Data, Dr. Carl Gold.

Harpreet Sahota: [00:02:46] Dr. Gold, thank you so much for taking time out of your schedule to be here. I really appreciate you coming on to the show.

Carl Gold: [00:02:52] Thank you, Harpreet. Thank you for having me.

Harpreet Sahota: [00:02:54] So talk to us about how you first got into data science and what drew you to this field.

Carl Gold: [00:02:59] That's complicated, because I got interested in data science before it was data science. I was doing a master's in computer science. I'm going to date myself: back in the 90s. And I learned about AI and machine learning and the state of the art back then. And yeah, I got really interested in that. So I went on to do a PhD at Caltech, and it was in an interdisciplinary program that combined neuroscience and machine learning.

Carl Gold: [00:03:28] Now, that doesn't sound so revolutionary now, because everyone's trying to combine neuroscience and machine learning. But back in 2000, when I was doing it, it was a little bit more of a fringe pursuit for some academics. Machine learning hadn't really made a splash in the industry yet. So, yeah, that was how I got started. Although at the end of my academic years, for personal reasons, I left academia, and as you mentioned, I became a Wall Street quant for a while. Back then, that was actually your main exit from academia.
If you were a quantitative scientist or something back in the 2000s and you didn't want to stay and do a postdoc, going to Wall Street was pretty much your best option. Nowadays there is data science as a career, and so data science is probably the number one option for academics leaving academia. And maybe, I would guess, finance has been pushed to number two. But it's actually a funny question: what's the number one choice for academics leaving academia now, data science or finance?

Harpreet Sahota: [00:04:30] So what led to your interest in churn, and what led to the creation... I guess, what is the Subscription Economy Index, what led to its creation, and how does that fit together with churn?

Carl Gold: [00:04:42] All right. So two separate topics, actually. So the Subscription Economy Index is the analytics and statistical products that we make from the Zuora database.

Carl Gold: [00:04:55] So Zuora's customers basically run their business on Zuora, meaning their customers, their customers' subscriptions, and all their invoicing and billing and financing. So we completely anonymize that data. I should stress that. So we're not giving away any company secrets here. But we completely anonymize and aggregate that data to answer questions about benchmarking. The types of things we can calculate from that database are things like a company's growth rate: how many new customers are they getting, how much is their revenue growing? Also, because we have the customer database, we can calculate a churn rate for every company in our database, so we can provide benchmark numbers for what's a typical churn rate that we see. And then we can specialize it, for example, to different verticals. For example, the average churn rate is usually higher for consumer companies than for B2B, or business-to-business, companies. So we look at things like that. And because Zuora also controls our customers' pricing and packaging, we do a lot of analysis around what's the best pricing and packaging for subscription products.

Carl Gold: [00:06:03] So churn is just one of the things that we look at in the Subscription Economy Index and the studies based off of it. I got interested in churn because a friend of mine asked me about it, really, when I was still a Wall Street quant. I was thinking about getting out of finance at the time because... it's a long story, but after the financial crash, the Wall Street world changed and I wanted to get out. And I knew that data science had become a thing. I read the newspaper, so I was reading the news: oh my God, machine learning is now widely used in industry, especially in the Silicon Valley area where I'm living.

Carl Gold: [00:06:43] So I was interested in getting into data science and making that lateral career move. Having done machine learning and had journal articles in machine learning back before it was cool, it wasn't too hard for me, but I had to find the right opportunity. And I ended up working with a friend who was trying to make a product to do customer analysis, including churn. It was a small startup that was trying to make a customer analytics product for customer success and use cases like that. So he got me looking at churn problems because they were actually doing... it's a long story, but they were doing something else at the startup and this was going to be a pivot. He had a junior data scientist at the time and he was like, hey, Carl, can you look at this churn data with my junior data scientist? So I started helping them on that.
And one thing led to another, and less than a year later, I quit my Wall Street job and joined that company to try to make a customer dashboard for churn and customer success. And that was where I got into analyzing churn. And I did something like my first 15 or 20...

Carl Gold: [00:07:48] I can't even remember now how many churn projects we did at that startup. I think between 20 and 25, actually, I did at that time.

Carl Gold: [00:07:56] And that was honestly where I made some of my first mistakes, and I started learning how hard it is to deliver a usable churn analysis.

Harpreet Sahota: [00:08:04] So that's pretty interesting. You had quite the path into where you are right now as a full-fledged data scientist. You started off with the PhD studying machine learning with neuroscience, and now here you are applying it in industry. I'm wondering, how much more hyped do you think the field has become since you first broke into it?

Carl Gold: [00:08:26] Oh, it's ridiculously hyped. Honestly, I'm someone who is in the anti-hype camp. I think we in the data science field are overpromising. Well, some elements of the field, I should say. And it's a lot the media and journalism: they seize on one small result and they promise a revolution, although a lot of people other than myself have pointed this out. And actually a lot of academics even encourage this. It just helps their academic stature and their grant-writing success to kind of hype up the accomplishments.

Carl Gold: [00:09:04] And the media just eats it up.

Carl Gold: [00:09:06] But then when you get to being a practicing data scientist like us, we get unrealistic expectations from our customers. I mean, when I started out, of course, people thought machine learning was trash. This was when I first went to Wall Street. No one was that interested in machine learning back in the early 2000s. It wasn't until after Google essentially had shown how much they could do with machine learning in a production environment with big data.

Carl Gold: [00:09:34] So it was only around two thousand nine or ten, I think, that machine learning really became respectable.

Harpreet Sahota: [00:09:41] And where do you see the field headed? Where do you see the field of data science and machine learning... where do you see this headed in the next two to five years?

Carl Gold: [00:09:49] I think it's going in a great direction as it is. I mean, just because I think that we kind of overpromised in some cases in machine learning, I don't mean that that's a crisis in the field. I do hope that in the next few years... some expectations are already starting to come down, actually. Yeah. So I do hope that people get more realistic expectations going forward. And you can see this, for example, in driverless cars. I do remember back in 2015 when all these driverless car companies were like, oh, we're going to have driverless cars by 2020. Now, at the time I wasn't working in driverless cars, but I knew machine learning and I knew AI, and I just told anyone who would listen that that was complete B.S. and that we'll be lucky to have driverless cars by 2030 with the current understanding of intelligence and the brain. So the hype has definitely been sucked out of driverless cars, and there's probably a lot of other areas of machine learning that have been overhyped, where a little hype deflation is still needed. But at the same time, machine learning and data science are delivering great things.
I mean, first of all, when you talk about data science, it's not just machine learning, it's also statistics and advanced analytics. And what I show in the churn book is that advanced analytics can really do most of what most companies need. And machine learning is definitely appropriate for a lot of more advanced use cases.

Harpreet Sahota: [00:11:20] So what do you think would be the biggest positive impact that data science will have on society in the next two to five years?

Carl Gold: [00:11:27] Wow, that's a deep question. I hope that data science is making us more productive. I mean, everyone says, well, we should be more data driven. Well, what's the result of being more data driven? Right. Well, you should make companies and individuals who make better decisions faster. And I know it's making all these companies more productive. It's one of those confusing questions, because the economists say we're not becoming more productive as a society, but I work in an industry, and you do too, where I feel like I can see increased productivity constantly. And I don't know why it doesn't come out in the economists' numbers. Maybe it's because data scientist salaries are so high, it kind of eats the productivity or something.

Harpreet Sahota: [00:12:11] Yeah, definitely. I think it's enabling us to automate away some tasks so that it frees up other people's creativity to perform other tasks. Right. Which just compounds in itself.

Carl Gold: [00:12:22] And it should enable better decisions too, not just faster decisions. By getting the right data to the right people and giving them the right tools, we really should see companies making more optimal decisions.

Harpreet Sahota: [00:12:36] I definitely agree with that 100 percent. So what do you think would be the scariest application of machine learning and data science in the next two to five years?

Carl Gold: [00:12:46] Probably the scariest applications now seem to have to do with the spreading of false information. I mean, we're not quite at the point where we have Terminator robots hunting us down. That's one of those overhyped theories.

Carl Gold: [00:13:02] But definitely deep fakes and making misinformation harder to root out in the social networking world is definitely very scary, if you look at what all this disinformation is doing to society right now.

Harpreet Sahota: [00:13:17] I definitely agree with that. So as practitioners of data science and machine learning, what do you think would be some of our biggest concerns when we're out there doing our thing, doing our work?

Carl Gold: [00:13:30] Well, as a data scientist in the trenches, you know, your main concern is usually not to make mistakes. Not projecting your biases into your work is always one concern. But there's also, of course, the ethics question, which is getting a lot more talk. I mean, I was just listening to a webinar where someone was saying there should be like a Hippocratic oath for data scientists, which goes beyond just not wanting to make mistakes. It means that you shouldn't be working on those, you know, on those dangerous applications. Like if someone asks you, hey, can you help me make a better deep fake bot to spread false news on Twitter, you should say no. But the truth is, there's a lot of people out there who won't hesitate to see those types of applications as a legitimate weapon in their cause, even if it's harmful to the world at large.

Carl Gold: [00:14:30] Did that really answer the question? I'm not sure.
Harpreet Sahota: [00:14:32] Yeah, no, that's really interesting. Like a Hippocratic oath for data scientists. I think that makes one hundred percent sense; that absolutely should be something that we should be thinking about. Because as we are moving into this future where applications of data science are essentially ubiquitous, they're everywhere, injecting your biases into things and coming up with these weird deep fakes to mislead people... that's dangerous stuff. So a Hippocratic oath for data scientists, I think that is an awesome idea.

Harpreet Sahota: [00:15:02] Hey, artists, I would love to hear from you. Feel free to send me an email at theartistsofdatascience@gmail.com. Let me know what you love about the show, let me know what you don't love about the show, and let me know what you would like to see in the future. I absolutely would love to hear from you. I've also got open office hours that I will be hosting, and you can register by going to bitly.com/adsoh. I look forward to hearing from you all and look forward to seeing you in the office hours. Let's get back to the episode.

Harpreet Sahota: [00:15:48] So I want to get into your book here, a book on churn, which I thought was amazing. It is chock full of so many examples and coding examples, and I've never seen churn treated so thoroughly.

Harpreet Sahota: [00:16:01] So I was really excited to go through that book and then get you on the show for this. Let's start at the top with a very basic question: what is churn? Is that what we do when we make butter?

Carl Gold: [00:16:11] No. For everyone who doesn't know, churn means customers quitting or canceling. And the term comes from the churn rate, which is a metric, and they call it the churn rate because it's the turnover in your customer base. If, say, five percent of your customers cancel, then you have to replace those five percent with new customers before you can keep growing, so it takes out of your growth. They call it churn because churn refers to turnover or mixing. But now if you're in the business, the SaaS or software-as-a-service industry, people use churn as both a noun and a verb. Like, we will call a customer a churn. We'll say, oh, that account, they're a churn, don't worry about them. Or make a report on last quarter's churns, which is clearly using churn as a noun. And then you can also use it as a verb and say, oh, the customer is churning, or the customer churned, in the past tense. And some of us who really spend a lot of time on churn will even talk about our own churns and say something like, oh yeah, I'm going to churn from Hulu because I finished the last season of, I don't know, Survivor, whatever you're watching.

Harpreet Sahota: [00:17:27] So why is churn so hard to fight?

Carl Gold: [00:17:30] I think it's hard to fight, really, because it depends on actually giving more value to your customers, and there are no quick fixes for churn.

Carl Gold: [00:17:40] Or rather, there shouldn't be, for a mature product without serious bugs. There are no quick fixes for churn. And it's also hard to fight because the solutions need to be customized to the causes, and there are many different causes for churn. Typically, it could be either the customer is not using the product, or maybe they're using it incorrectly, or maybe they signed up for the premier plan but they're not taking advantage of all the features.
And you need to know which of those reasons is the reason for churn to do something about it. Which is actually why, in the book, I explain it's not always a great use case for machine learning, which just gives you a yes-or-no prediction. For most companies, if you give them yes/no churn predictions, it's not actually very helpful to them, because what they need to do is segment the customers based on the cause of the churn and then take some kind of targeted action. If you find a segment of customers that are not using your product's best feature, for example, then you want to reach out to those customers and say, hey, do you know about this great feature you're not using? But you don't want to send that same email to the customers who are already using the feature. So in churn, there's no one-size-fits-all solution, which means a simple machine learning approach of just predicting it doesn't really help. And that's another thing that makes it really hard to fight churn.

Harpreet Sahota: [00:19:10] So does that also make it hard to predict when a customer is about to churn, because there's no one-size-fits-all solution? Or what is it?

Carl Gold: [00:19:19] You know, sometimes you can predict churn pretty accurately, although it is often hard to predict churn. There are a few reasons that make the accuracy of churn prediction difficult. One is that you never have all the information that you'd like. I'll just make a very simple example. The amount a customer pays for a product is an important factor in churn. But it would be really helpful if you also knew their disposable income, for example, because then you'd know how much that cost means to them. Another thing: if it's a business product, well, you'd really like to know your customer's revenue, right? You'd like to know everything about their business, and that would help you know if they're going to churn.

Carl Gold: [00:20:01] Well, guess what? You're not going to get all that information, ever.

Carl Gold: [00:20:05] So you're always faced with incomplete information about churn. And there's also a lot of subjectivity in it, too. So if you take a consumer product, and this is related to the incomplete information, you'll never really know if someone likes it or not inside, in their subjective part. You don't know what they're really thinking. So you're relying on all these other variables that you can measure to infer things about what you can't measure, which is the customer's true satisfaction. And then, lastly, there's the timing effect: it can be hard to predict the timing of churn even when you know someone's at risk of churn. If you do the analytics or machine learning for churn, it's very easy to see people who are highly at risk because, guess what, they're the people not using the product at all. But then when you take those people who aren't using the product at all, it's still really hard to figure out when they're going to churn. Because we all have this experience, like, hey, I haven't watched anything on Netflix for two months. Oh, what a waste of money. I should cancel Netflix. And then you forget about it. And then a week later, you think about Netflix again and you're like, oh yeah, I meant to cancel Netflix. And many of us go on like this for months. Actually, Netflix introduced a new feature where they're going to auto-cancel everyone who hasn't signed in for two years. Right. That's nice of them. But that just shows there are people who haven't signed into Netflix
for two years and they still haven't churned. So how is a churn prediction algorithm or churn prediction system going to deal with that? So you've got incomplete information, subjectivity, and the timing has so many extraneous factors in it.

Carl Gold: [00:21:44] So it's hard to get a very high accuracy on churn prediction.

Harpreet Sahota: [00:21:48] This brings me to my next question, which is the importance of metrics in our battle against churn. So correct me if I'm wrong, but it seems like everything you just described speaks to the importance of metrics. Is there anything more about metrics that we need to know when it comes to our battle against churn?

Carl Gold: [00:22:04] Well, I mean, the point about metrics is really common in data science: just that your features are super important. And the features that you choose, in my mind, are really the main part of solving any data science problem, and not the algorithm. I show, actually, in my book that if you do a good job on your feature engineering, the algorithm that you choose is not that important for your accuracy. So feature engineering always has number one importance in data science. And in churn, you're typically working from raw data, where you just have a bunch of events that occurred in a data warehouse, and you have to design summary metrics of the customers. So it's very important how you choose those metrics, for your ultimate accuracy and also, even more importantly, for the understandability of the model. Because, like I was saying, to take action on churn, the people in the business, like the people who do the email campaigns or who call customers, they need to understand why people are churning and they need to understand what a healthy customer looks like. And so you need interpretable metrics which actually predict churn. The metrics have to be interpretable. And as I show in the book, you can make any metric predict churn if you design it correctly. Any one metric won't be a perfect churn predictor, but every metric that you use should be a churn predictor that the business people can understand.

Harpreet Sahota: [00:23:37] Yeah, I absolutely love that about your book. You really went deep in on metrics and feature engineering, and I want to talk to people about that a little bit later. But I loved how you had really thorough, well-designed coding examples that we could use to go from event data into metrics, laid out in SQL code and Python code. I thought that was the really amazing thing in your book that I hadn't seen done in most other books, at that level of detail.

Harpreet Sahota: [00:24:01] So while we're on this topic of metrics and defining metrics, how do we go from raw event data to metrics? Do you have an example for us so that we can kind of conceptualize what you mean by that?

Carl Gold: [00:24:13] Yeah, the most basic metrics are simply a backward-looking count or total of some kind of event. So your events might be things like... let's say it's a social networking app, and your events might be posts. So every time someone makes a post, it's an event. Now, a metric would be something like posts per month, or posts in the last month, for each customer. So it summarizes the raw data and it makes it into a feature that can go into a machine learning algorithm.
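To make that concrete, here is a minimal sketch of the event-to-metric step, assuming a pandas DataFrame of hypothetical post events; the column names, dates, and window length are illustrative, not taken from the book:

```python
import pandas as pd

# Hypothetical raw event data: one row per event (a post in a social app).
events = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "event_time": pd.to_datetime([
        "2020-06-02", "2020-06-15", "2020-06-28",
        "2020-06-10", "2020-05-20", "2020-04-01",
    ]),
})

# A backward-looking metric: count of posts in the 30 days before an
# observation date, computed per customer.
observation_date = pd.Timestamp("2020-07-01")
window_start = observation_date - pd.Timedelta(days=30)

in_window = events[(events["event_time"] >= window_start)
                   & (events["event_time"] < observation_date)]

posts_last_month = (
    in_window.groupby("customer_id").size().rename("posts_last_30_days")
)

# Customers with no events in the window (customer 3 here) simply drop out;
# in practice you would reindex over all customers and fill those with zero.
print(posts_last_month)
```

In practice this calculation is repeated for every customer and every observation date, so each row of the resulting table is one customer at one point in time.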
Harpreet Sahota: [00:24:45] So talk to us a bit about cohort analysis. How do cohorts help us analyze and predict and understand churn?

Carl Gold: [00:24:54] Yeah, well, this is, again... obviously you read the book. This is one of the main themes, which is that I introduce a technique that I call metric cohort analysis. Most people are familiar with cohort analysis, in which you observe a bunch of customers who sign up at a certain time and they form a cohort. And then you look at them over time, like one month, three months, six months later. And the idea is that you look at those cohorts and look at the churn in the cohort. So you're looking at the churn rate in the cohort at different points in time. For metric cohorts, you group the customers by the value of a metric. So if it's, say, posts per month, you would cohort customers based on the number of posts they make. So you might have a cohort of zero to 10 posts per month, and then the next cohort would be 10 to 30 posts per month, etc. And you always have a long tail of outliers in these, and you'll have a few people who are making thousands of posts per month. But the cohorts... you can just think of them as the percentile grouping of those customers. And then the interesting part is when you look at the churn rate in cohorts formed based on behaviors, and that's where you'll see that typically people who use the product the most are going to churn the least.

Carl Gold: [00:26:19] And then people who use the product the least churn the most. That's the method for empirically showing that your metric is predicting churn. And it's a great method because it's really easy to explain to business people. I mean, you can show the same thing with a regression, right? And you come back with a regression coefficient. But I find that when I go to business people and I tell them the regression coefficients and the statistics, they give me blank stares. But if you show them a plot that shows that, oh, people who post from zero to 10 times a month churn at a 20 percent rate, but customers who post more than 50 times a month only churn at a five percent rate, people really get that. People get it. Oh, wow, we want our customers to be posting at least 50 times a month so that they achieve that healthy, low-churn state of the heavy product users. So it's really, for me, a cornerstone of how I convey this information to the business users.
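A rough sketch of that metric cohort analysis, assuming a per-customer table with one behavioral metric and a churn flag; the column names, the quantile-based binning, and the numbers are made up for illustration:

```python
import pandas as pd

# Hypothetical per-customer data: one behavioral metric plus a churn flag
# (1 = churned in the following period, 0 = renewed).
customers = pd.DataFrame({
    "posts_per_month": [0, 1, 2, 4, 6, 9, 14, 22, 35, 55, 80, 300],
    "churned":         [1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0],
})

# Form metric cohorts: quantile-based bins keep cohort sizes comparable
# even with a long tail of heavy users (4 bins here for the tiny example).
customers["cohort"] = pd.qcut(customers["posts_per_month"], q=4, duplicates="drop")

# Churn rate per cohort: typically the heaviest users churn the least.
cohort_churn = customers.groupby("cohort", observed=True)["churned"].mean()
print(cohort_churn)
```

Plotting the churn rate per cohort is usually all it takes to show business users which range of the metric corresponds to a healthy, low-churn customer.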
Harpreet Sahota: [00:27:22] Yeah, you talk about the single most important concept in the book being the ratio metric. So you kind of touched on that, but just for the sake of clarity, can you define what the ratio metric is, or what they are and why they're so powerful?

Carl Gold: [00:27:37] OK, so I made the simple example of a posts-per-month metric. Now let's say there's another event in the social network, which is the ads viewed by a customer. This is an example out of the simulation in the book. So now, viewing ads is usually bad for customers, right? You expect that people who view more ads might be driven away from your service, if they've seen a lot of ads. But the funny thing that you find, usually, is that you don't actually see those negative correlations between the behavior and renewal; rather, it's positive. You don't see those correlations between a bad behavior, or what should be a negative behavior, and churn. And the reason is because there is both correlation and causation happening, which is that in any service you have power users who use it a lot, and they have both a lot of good events and bad events.

So, for example, a power user is going to see a lot of ads and make a lot of posts, and they'll probably get a lot of likes and a lot of shares and all those other things, too. But the power user has a lot of ad views, which you expect to be negative. But you find, if you look at a simple correlation between ad views and churn, you'll see that the more ads someone views, the less they churn. So you're like, wait, I thought ads were bad. How can people who view more ads churn less? Well, it's the correlation for the power users. So now, this is a long way of coming around to it, but a ratio metric is just a ratio of two other metrics. So it looks at one behavior in relation to another in a way that's very interpretable for normal humans.

Carl Gold: [00:29:20] So in this case, you might look at a metric like ad views per post.

Carl Gold: [00:29:26] So you take the ad views metric and you just divide it by the posts metric, and they were both, for example, measured on a monthly window. And the reason you do that is because it separates out what's going on by making it a relative metric. With the ratio you'll actually see: does someone get a lot of ads relative to their amount of use? And the same trick... I actually didn't start doing this with ads and posts, I started doing this with money and behavior, because you see the same thing in products with multiple price points, where there's like a basic plan, a standard plan and a premier plan. And the premier plan is the most expensive and the basic plan is the cheapest. So then what you find is that people who pay the most churn the least. Why? Because the people on your premier plan have self-selected. They're like a self-selected group of people who are really into your product. So that's why they're on the premier plan. So from this, you get the correlation that the more people pay, the less they churn. And you're like, OK, well, it actually makes sense when you think about power users. And it's the same thing with B2B products, business products, where bigger customers always churn less as well. So if you do enterprise sales, your bigger customers are usually your best customers. But again, you've got this correlation between paying more and churning less, which is actually... it's not that it's not true, but it's not the right point. To make the money case, you look at the ratio metric of dollars per use of the product. Like, let's say your product allows you to share videos, right, and you have a metric for the number of video downloads from your site per month.

Carl Gold: [00:31:08] Then you would look at dollars per video downloaded, and then you're actually seeing what kind of value the customer gets. And so the ratio metrics actually allow you to, in an interpretable way, normalize the behavior relative to the size or the level of the customer. And there are lots of other ways to do this in data science. For example, if you do dimension reduction with a principal component analysis, you get factors made up of the normalized differences between different metrics, and normalized differences are actually another way to get the same information into your model. It's the relative difference between two other metrics. Now the problem is in interpretability, and that's where ratio metrics really win. If you go to your business users and say, I made a feature which is the difference between normalized revenue and normalized usage, you're going to get blank stares, right? But if you say, I made a metric which is dollars per use, people say, oh, that's great.
A dollars-per-use metric! Because for some reason, ratios are very intuitive for the human brain. Now, this is actually getting out of my area into cognitive psychology, but ratio units are very easy to understand. In contrast, if you ever took a physics class, which probably most data scientists have, multiplicative units are impossible to understand, like kilowatt hours or gram meters. Any multiplication of two units is very difficult for humans to intuit, but we can very easily intuit ratio units. And I pretty much just latched on to that fact and centered my analysis around ratio metrics and ratio units, because it's very easy for the business people to understand. That's the long answer.
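A small sketch of the ratio metric idea in code, assuming per-customer metrics already computed over the same monthly window; the column names and the small-constant guard against dividing by zero are illustrative assumptions:

```python
import pandas as pd

# Hypothetical per-customer metrics, all measured over the same monthly window.
metrics = pd.DataFrame({
    "ad_views_per_month": [400, 120, 30, 900, 10],
    "posts_per_month":    [80, 20, 5, 200, 0],
    "mrr_dollars":        [50, 20, 10, 100, 10],   # monthly recurring revenue
    "videos_per_month":   [60, 15, 4, 150, 0],
})

# Ratio metrics: one behavior (or price) relative to another. A small constant
# in the denominator avoids division by zero for inactive customers.
eps = 1.0
metrics["ad_views_per_post"] = metrics["ad_views_per_month"] / (metrics["posts_per_month"] + eps)
metrics["dollars_per_video"] = metrics["mrr_dollars"] / (metrics["videos_per_month"] + eps)

print(metrics[["ad_views_per_post", "dollars_per_video"]].round(2))
```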
Harpreet Sahota: [00:33:09] No, no, that is absolutely great. I know that the audience is really going to enjoy that, because that's super insightful, and you go into such amazing detail in your book. I highly recommend everybody listening to check it out, especially if you're working in e-commerce or working anywhere where churn is an issue. This book is really going to provide a benefit for you. So I want to touch on some other things that are kind of tangential to your book, but I think the audience would love to hear your perspective on them. First, let's talk about outliers. You mentioned outliers a little bit previously, but why are outliers so problematic to deal with?

Carl Gold: [00:33:40] Well, they're only a problem if you don't deal with them. But I mean, they're really common, just because of the long tail of behaviors. You've always got power users who just hammer your product. Right? Your typical user might make 10 posts a month, and then you've got people making thousands. And they're problematic in many types of analysis, so it really depends on what you're doing. They're not a problem for a lot of machine learning algorithms, depending on what algorithm you choose. They're definitely a problem for regressions and averages, and they'll just blow away anything based on an average, because the outliers will dominate. I'm sure anyone familiar with data science has heard the many different problems with outliers. Although the thing is, there are different types of outliers that it's important to distinguish. There are false outliers, which ideally you would just detect and remove because they're bad data. But the thing is, in customer behavior you get this long tail of genuine outliers, and those you want to incorporate in the model and get information from, but without them ruining your numbers.

Harpreet Sahota: [00:34:57] Yeah, this kind of leads into my next question, which is: what are some common mistakes that you've seen data scientists make when it comes to dealing with outliers? And I think that's probably one right there, where the outliers are actually kind of useful.

Carl Gold: [00:35:09] Yeah, yeah. You definitely want to use them. I mean, the first thing to do is, of course, just normalize your data.

Carl Gold: [00:35:15] Although I should point out, not just a standard normalization; I use log-based normalization. So you take a log scale of the metric. If you have a lot of outliers, it's very helpful for your analysis to take a log scale of the metric, and then you can apply a normal standard normalization procedure, like subtract the mean and divide by the standard deviation. But it's that step of taking the log of the metric. And you have to be careful with zeros, too, so typically you do log of one plus the metric; that's what you do for that. And so then you don't have that problem with your data being so spread out, because you pull it all into a small scale. The other trick I like to use for outliers is actually trimming the outliers rather than removing them. And trimming the outliers is actually known as, I think, Winsorization. And this is one of those funny stories where, in World War Two, Winsor came up with a statistical technique to analyze the bombing raids or something. I think that's the real story, if I got it right, and they call it Winsorization. And what it means is that you trim the outliers to a high level, like the ninety-ninth percentile, without fully removing them. It's funny, you're making me realize something I didn't mention about the problem of outliers. I think a lot of statistics textbooks have it wrong for big data, because they'll tell you you can't have outliers or it's going to ruin your model.

Carl Gold: [00:36:47] I actually don't find that, really. If you have a decent amount of data, outliers don't usually affect your model too much, and what I mean by a decent amount of data is ten thousand or more examples. Then, if you have some extreme outliers, they honestly don't change your regression coefficients too much. But what I see is often another problem: you get unreasonable predictions on an outlier. So you might get a reasonable model, like a regression, even including outliers, but then when you turn around and make a prediction for an outlier customer, you'll get really extreme values. Like, it's never true that anyone is one hundred percent likely to churn, or ninety-nine percent likely to churn, or zero percent likely to churn, you know. But if you keep outliers in your prediction data, then you'll get these really extreme predictions. And business people just hate that. And you have to spend all this time explaining to the business people, oh, it's not really one hundred percent, it's just an outlier, but it really reduces confidence in the model. So sometimes what I've done is I have ignored outliers when fitting a model, but then when I predict with the model, if I'm going to return scores or probabilities to the users, I make sure to screen the outliers, or rather trim the outliers, in the prediction, because it just avoids stupid questions from the business people. That ninety-nine percent churn will turn into something like a 70 percent churn after you trim the outliers. So you'll still consider it a risk, but you just won't have to explain so much to the business people.

Harpreet Sahota: [00:38:31] Yeah, I absolutely appreciate that explanation. Thank you for that. And I definitely understand the need for having to put things in the perspective of the business. And sometimes you're going to have to do some things that maybe don't sit quite right with you as a data scientist. But hey, at the end of the day, you're communicating with business stakeholders and you've got to make the stuff approachable and accessible for them to understand, and not leave things that will make them question it.

Carl Gold: [00:38:56] If you do trim, say, an outlier churn probability from one hundred percent to 70 percent, right, 70 percent is really still a high churn probability. So you're still telling the business this customer is at risk, but you're not giving them a number that will make them question your modeling.
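A minimal sketch of those two outlier tricks, log-plus-one scaling before standardization and winsorizing at a high percentile instead of dropping outliers, assuming a pandas Series of a heavy-tailed metric with made-up values:

```python
import numpy as np
import pandas as pd

# A heavy-tailed metric: most customers post a little, a few post thousands of times.
posts = pd.Series([0, 1, 2, 3, 5, 8, 12, 40, 250, 4000], dtype=float)

# 1) Log scaling: log(1 + x) handles the zeros and pulls the long tail into a
#    compact range, after which ordinary standardization behaves sensibly.
log_posts = np.log1p(posts)
standardized = (log_posts - log_posts.mean()) / log_posts.std()

# 2) Winsorizing: cap values at a high percentile (the 99th here) instead of
#    removing the outliers, so extreme customers stay in the data without
#    dominating averages, regression fits, or the scores shown to the business.
cap = posts.quantile(0.99)
winsorized = posts.clip(upper=cap)

print(standardized.round(2).tolist())
print(winsorized.tolist())
```

The same capping step can be applied at prediction time, which is one simple way to keep churn scores for outlier customers from coming out as implausible extremes.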
Harpreet Sahota: [00:39:14] So we touched a little bit on feature engineering earlier. I want to dig a little bit deeper on that. I agree one hundred percent that feature engineering is really the most important part of the entire process. But what are some tips that you could share with our audience so that we can be more thoughtful, or approach it with more ingenuity, when it comes to our feature engineering?

Carl Gold: [00:39:34] Well, for customer churn, let me just focus on that. The first thing I tell people is to focus on features that are close to the value the customer receives. We kind of touched on the fact that the customer's value, in their mind, is subjective, and you're never actually going to know exactly how much they value the service. But if you have, for example, let's say it's some service where you create documents, right, and you have one feature for logging in to the system and one for creating documents... well, which one could be more important? Probably the one for creating documents, because that's the feature that is closer to the value creation than the login to the site. That's just a gateway they have to pass through. But generally, modern products are over-instrumented for customer data science. They instrument every little click and thing that the customer can do. Or maybe not every one, because no product can be completely instrumented. But you're generally collecting a lot more events than you actually want to use in an analysis. So the first thing is to just try to focus on events that are close to the customer value. The second most important tip is really about dimension reduction. And in the book I kind of advocate for a simple approach to dimension reduction. But the foundation of it is to look at the correlations and look at what behaviors are correlated with what. Because very often, again, in this over-instrumented product scenario, you'll have 20 to 50 metrics that are all highly correlated with each other, due to different parts of some process or behavior. Like, again, in the hypothetical document-editing application, you're going to have edit document, save document, open document, and all those...

Carl Gold: [00:41:28] behaviors are going to be highly correlated, because a customer who does a lot of one is going to be doing a lot of the others. So you've got to get to the bottom of those piles of correlated metrics by reducing the dimension. And it also gets back to ratio metrics, because if you do have a lot of correlated behaviors, there could be some interesting information in the relative difference between those behaviors. You would naturally pick those up if you used principal component analysis to do your dimension reduction. Then, of course, the problem with PCA is you can't explain it to everyone; even myself, I honestly have a hard time understanding the results. So the problem with that is that it's hard to explain. So that's where you get back into this area of ratio metrics, if you want to look at the relationship between behaviors but keep it interpretable for the end user. So the ratio metrics are actually another important trick with the feature engineering.
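A rough sketch of that correlation check, assuming a per-customer metrics table for the hypothetical document-editing app; the metric names and the 0.9 threshold are illustrative assumptions:

```python
import pandas as pd

# Hypothetical per-customer metrics for a document-editing product.
metrics = pd.DataFrame({
    "logins_per_month":      [20, 5, 12, 30, 2, 18],
    "docs_edited_per_month": [40, 8, 25, 70, 3, 35],
    "docs_saved_per_month":  [38, 9, 24, 66, 3, 33],
    "docs_opened_per_month": [55, 12, 30, 90, 5, 44],
})

# Pairwise correlations between the metrics. Clusters of highly correlated
# behaviors (edit/save/open here) are candidates to be combined into one
# average score, or compared through ratio metrics, before modeling.
corr = metrics.corr()

threshold = 0.9  # arbitrary cutoff, for illustration only
high_pairs = [
    (a, b, round(corr.loc[a, b], 2))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if corr.loc[a, b] > threshold
]
print(high_pairs)
```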
Harpreet Sahota: [00:42:31] So you talk about the common misconception that the choice of algorithm is the most important thing that contributes to model performance. Where do you think that misconception stems from?

Carl Gold: [00:42:41] Well, it definitely stems from academic work, where the data set is fixed as a benchmark data set.

Carl Gold: [00:42:47] I mean, you read all these academic papers, and I've written a couple of academic papers on machine learning: you start with a benchmark data set, and typically your goal is to show that your new algorithm is either better than or the same as the previous ones. And in these scenarios, feature engineering is actually not really an option, because with a benchmark data set you don't have the raw data. So it's actually almost impossible to teach feature engineering, because you need real raw data to do it with, and then you actually go through the process of making the features. So I think it's just the confluence of how data science and machine learning are taught and reported in academic papers. If you just read the literature, you would conclude that the most important thing was the algorithm, because that's what everyone is writing about. No one's writing papers about feature engineering. And again, that's partly just because they don't have access to the raw data, which would even enable them to study the feature engineering.

Carl Gold: [00:43:50] You have to be a practitioner out in the real world to actually have data to play with.

Harpreet Sahota: [00:43:56] Raw data, plus a little bit of intuition about the industry that you're in, about the use case, and you combine that together to come up with some really creative features that you could then use to build upon.

Carl Gold: [00:44:06] But it doesn't have to be that creative. I mean, basically, in the book I lay out a cookie-cutter process. It's like: start with count metrics on everything, maybe some sums and averages if you have appropriate features, then look at correlations, and then from your correlation analysis, think about what relationships might be interesting. So it definitely takes domain knowledge and some ingenuity. But I think once you study the process, I feel like it actually doesn't take too much real creativity, I hate to say it. I mean, it is creative to a degree. But honestly, this is partly what inspired me to write the book: just that we had come up with kind of a cookie-cutter factory process to come up with good metrics and deploy churn models. So in a way, the feature engineering... you mentioned that I really focus on feature engineering in the book, and honestly, my inspiration was in large part to share the feature engineering techniques. So in a way, I would say I wrote a book about feature engineering, but in the use case of churn.

Harpreet Sahota: [00:45:14] So that's what I really enjoyed about the book, is you gave every topic a nice thorough treatment. It's literally a Bible for churn. If you're listening to this podcast and you're interested in solving problems like this, this is the book to get, one hundred percent.

Harpreet Sahota: [00:45:28] So speaking about deploying models into production, I think that's something that doesn't get nearly enough coverage in any book, really: what to do once it is in production. So once we fit the model and we ship it, our work as data scientists doesn't stop there.

Carl Gold: [00:45:45] No, no, of course not. And I don't go too much into this in the book, although I point out the need: you should definitely set up your models with code and not with ad hoc procedures, because it's inevitable that you need to rewrite your model.
It's a little bit complicated, because there are studies that show that churn prediction models become less accurate over time, and so that forces you to refit the model.

Carl Gold: [00:46:12] And I mean, it makes sense, because the world is changing and the customers are changing. The competitive landscape is changing. And if you ever believe that there has been a big change in your competitive landscape and your customer behavior, you really must refit the model, because it's not going to be telling you the right thing anymore. It's a little bit harder to know how often to change the model in general, because changing the model has its costs as well, because you have to go back to the business people and explain everything to them again, and they're going to get confused. And generally, it's a best practice not to change your model more than once a year unless you have a strong reason to do so, just because the change is going to impact the users. There are even examples... there are companies I know where the customer success reps are actually monitored for how much they reduce churn risk. Right. So the reps are actually being monitored and expected to impact the model output by getting the customer to do better behaviors. Now, the thing is, if your reps are being graded based on their performance in improving this model output, no, you can't change the model mid-year. It's like you're moving the goalposts. Right. And that's a whole other problem. So you definitely need to refit the model. But then again, the papers will tell you, oh, your model is losing accuracy after just three months, and you have business needs such that you can't update your model after just three months, unless it's an emergency. Like actually right now, due to COVID,

Carl Gold: [00:47:53] I'm telling everyone to update their model right now. If you do a new churn model, you should really only use data since COVID, if possible. That will only be possible for a consumer company that has a lot of observations. Generally, business-to-business companies have a small-data challenge, because their customers are on annual subscriptions and they might not have that many thousands of customers, maybe even only hundreds of customers. Then you have a real challenge in collecting a new data set, and it would be impossible, really, for a business-to-business company with only a few hundred customers to create a new data set with only renewals since COVID. With only data since COVID, they'd hardly have any data left. So there are so many competing concerns with refitting your model. I mean, you're absolutely right that the job doesn't stop once you've deployed. We haven't even really talked about the technical aspects of deploying a model in production and monitoring its output. That is also something you need to do: you should continuously monitor your model's predictions for accuracy, because that'll actually give you the warning sign if it's been too long.

Harpreet Sahota: [00:49:05] And thank you so much for that. I was about to ask that as my next question: what are some things that we need to monitor and track, let's say specifically in the context of churn, to make sure that our model is doing what it should be, that it is performing as we've designed it to perform, from the data science perspective on the back end and from the business perspective?

Carl Gold: [00:49:24] Yeah, well, definitely in terms of accuracy, I advocate using backtesting, or as they also call it, cross-validation through time.
So to measure your historical accuracy, I actually advocate doing a historical simulation: you fit the model as if it were a certain point in time, and then you only predict on customer churns in the future from that backdated time. And that's to prevent what they call lookahead bias, which is using future information in your model fitting. That extends to live modeling, where you should be checking the accuracy of your model periodically. Always, of course, do that in a validation framework, so you test the model on data that you didn't use to fit the model, but you do that moving forward through time. And so that's going to be definitely one of your big things, to make sure that the model is still performing as it was intended to.
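A rough sketch of that forward-in-time validation: fit only on observations before a cutoff date and score only observations after it, then move the cutoff forward. The DataFrame columns, the logistic regression, and the AUC metric are illustrative assumptions, not the book's exact pipeline:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical observations: one row per customer per observation date,
# with a behavioral metric and whether the customer churned afterward.
obs = pd.DataFrame({
    "obs_date": pd.to_datetime(
        ["2020-01-01"] * 6 + ["2020-04-01"] * 6 + ["2020-07-01"] * 6),
    "posts_per_month": [1, 3, 10, 25, 40, 60] * 3,
    "churned":         [1, 1, 1, 0, 0, 0,
                        1, 1, 0, 1, 0, 0,
                        1, 0, 1, 0, 0, 0],
})

# Backtest: fit only on observations before the cutoff, score only those after,
# so no future information leaks into the fit (no lookahead bias).
for cutoff in [pd.Timestamp("2020-04-01"), pd.Timestamp("2020-07-01")]:
    train = obs[obs["obs_date"] < cutoff]
    test = obs[obs["obs_date"] >= cutoff]
    model = LogisticRegression().fit(train[["posts_per_month"]], train["churned"])
    scores = model.predict_proba(test[["posts_per_month"]])[:, 1]
    print(cutoff.date(), "out-of-time AUC:", round(roc_auc_score(test["churned"], scores), 2))
```

Tracking a metric like this on each new batch of renewals is one simple way to get the warning sign Carl mentions that a deployed model has drifted and needs refitting.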
Harpreet Sahota: [00:50:26] So you spoke a little bit about how COVID is messing up churn models. What is it about this situation that is going to make it difficult to predict going forward? Is it because all of a sudden people are just unsubscribing from things out of fear and panic, or what is causing this?

Carl Gold: [00:50:43] Well, it's all kinds of change, actually. I should stress that there's not a churn crisis due to COVID. It's actually very specific, because some companies are seeing a boom, like booming business, like Zoom video. I'll mention them because they're a Zuora customer. We're very proud that our customer Zoom has become a household name in the past three months. But I know from looking at their system data, I think they have something like four times as many subscribers right now as they did in January. So their business has exploded. But then on the flip side, you have companies whose business had to do with, say, sporting events or travel, and those companies really are having a churn crisis due to COVID. But in both cases, you have a big change that happens, which means that when you do your empirical analysis of churn and behavior, you're going to see, OK, what behaviors are typically correlated with churn, or what level of use of the product is considered a healthy level. But when you have a changing business environment, you have to discover the new normal. Just hypothetically, and I'm completely making numbers up here, it's not Zoom, this is not at all based on any real data from Zoom, but let's say you are a video conferencing service, and in twenty nineteen you thought that a healthy usage number would be five conferences a month, and above five conferences would be considered a healthy customer. Well, guess what, it's twenty twenty, and five video conferences a month is nothing. Now a healthy customer not at risk for churn probably should have 20 video conferences a month, because usage of video conferencing has gone up so much. So that's a positive improvement in the business: now people are doing more video conferencing. But if you now want to identify churn risks, you need to refit to the new normal of customer behavior. And it doesn't matter if your churn went up or down, you've got a new normal. So you need to redo the empirical analysis, and if you're using a model, refit the model.

Harpreet Sahota: [00:52:51] Thank you so much for going into a deep dive on the book. There's so much in this book that I wanted to cover, but we would have run out of time. There's 11 chapters, really clear examples, really well-constructed examples. I think anybody who is interested in or wants to learn more about churn should get their hands on this book. So, shifting gears a little bit here, I've got some questions I'd like to ask that are not about churn.

Harpreet Sahota: [00:53:14] First: do you consider data science or machine learning to be an art, or purely a hard science? And why?

Carl Gold: [00:53:21] Well, that's funny, because I have a PhD in hard science. I mean, it was in a biology division that I was doing computational work. I consider hard science to be partly an art, just in the sense that it relies a lot on intuition and creativity. I mean, the difference between science and engineering, in my mind, is that if you're doing science, you're doing something that no one else has ever done before. You're doing new research, and it could be new research on your own company. If no one has analyzed churn at your company, then you're actually doing science, because you're going to do an investigation that no one has done before. And I do think that pretty much all science is to some degree an art, because there are too many hypotheses to possibly test them all, right? So your first step in a scientific project is going to be to narrow down the hypotheses to the ones that make the most sense, and a lot of that is kind of an art. You know, it uses some gut instinct, it uses your domain knowledge, it uses your knowledge of the modeling, of what's possible. So I feel that a very important component of science is really an art. And I would say that extends to data science: whenever you're doing a new problem that no one's done before, there's an element of hypothesis testing, and like I said, you can't test all the hypotheses, so you're going to have to use your creativity, your domain knowledge, your intuition and your experience to come up with the best ones.

Harpreet Sahota: [00:54:52] I was watching a documentary recently about babies. My wife and I just had a baby about three months ago, almost.

Carl Gold: [00:54:57] Congratulations!

Harpreet Sahota: [00:54:57] Thank you. And just the experiments that these scientists were coming up with to test these various hypotheses were just super creative. And it's like, OK, that takes some level of creativity, for you to be able to test what is going on in a baby's brain using simple things.

Carl Gold: [00:55:15] Yeah, and that's the perfect example: it's not obvious how to come up with those kinds of psychological tests for someone who can't talk to you. Right.

Harpreet Sahota: [00:55:24] So what are some soft skills that data scientists are missing that are really going to help them take their careers to the next level?

Carl Gold: [00:55:31] Well, I mean, the soft skills everyone kind of needs are emotional intelligence and just being able to, you know, put yourself in the mind of the listener or the user. I mean, definitely I see a lot of problems with junior data scientists who meet with business people and they use a lot of jargon. They use technical terms that the other people won't understand. A very common example of this is actually the term feature engineering. It's a terrible term to use in a software company, because we've got software engineers and software features, and then when you talk about feature engineering, people think you're talking about something else.
And so this is just about putting yourself in the mind of the listener and remembering: what's their vocabulary, what's their understanding, and how am I going to explain this term to them? A lot of data scientists, it seems like they even want to impress people with the jargon, which is a very bad practice. I mean, unless it's other data scientists. If it's other data scientists, have at it. But if it's business people, you really need to put yourself in their shoes and convey the findings to them without the jargon and the math. And that's just a part of emotional intelligence and good presentation skills.

Harpreet Sahota: [00:56:51] So how could a data scientist develop their business acumen and their product sense when they first join a company?

Carl Gold: [00:56:58] Wow. I mean, it takes a while to get to know any company's products.

Carl Gold: [00:57:03] I mean, really, I don't know if there's a specific way to do it, but you have to really embrace it.

Carl Gold: [00:57:09] I'm in the camp that says domain knowledge is essential, but it's actually not that hard. And I'm not the first person to say this. Everyone says, oh, you need domain knowledge, you need domain knowledge, and you definitely do. But for someone who could learn all the data science, you know, it shouldn't be that hard to learn the details of a particular business and business process. But you do have to care and be invested in it. You have to say, OK, it's important for me to learn this business stuff. If you try to blow it off, that's the only thing you can really do wrong, I think.

Harpreet Sahota: [00:57:44] Yeah, I definitely agree with that sentiment. So what advice or insight can you share with people who are breaking into the field, whether they're fresh out of school or career transitioners, and they see some of these job postings that look like they want the abilities of an entire team wrapped up into one person, and they end up feeling kind of discouraged and dejected and they don't apply for the job? What advice or words of encouragement can you share with them?

Carl Gold: [00:58:10] Well, definitely apply. I mean, the thing is, you've got to look at those qualifications and realize that, you know, not everyone has them. And also, I want to mention from the literature that there is a real gender bias in this, with jobs that have so many qualifications. In general, I've read that men are more likely to apply even if they don't have all the qualifications, and women are less likely to apply. That's just from some studies. So it's very important, especially for women, to just apply anyway, because they're not really expecting everyone to have all those skills. I mean, unless it also says they want 10 years of experience. Right. If it's for a senior role and they want a lot of experience, then, OK, maybe they do want all those skills. But if it's a junior role, most of those skills are going to be nice-to-haves, and you need to figure out which are the core ones that they're really looking for. But the most important thing is to just get on the phone and apply anyway. Or not get on the phone, but get online, get on to the website. Hopefully you'll get on the phone in a phone screen after you submit your resume. And in the phone screen, you can tell them what your skills really are and what they're really looking for. But just at least get to that phone screen, you know?

Harpreet Sahota: [00:59:27] Thank you for that.
So last formal question before we jump into the lightning round here, and that is: what's the one thing you want people to learn from your story? Carl Gold: [00:59:36] I don't know. To learn from my story? Well, definitely, I feel like the points around the importance of feature engineering, and how the most important things in data science are not necessarily the most hyped ones. You can make a big difference at your company just by doing this kind of feature engineering, and worry about the modeling later. Worry more about bringing understanding to your users than about using the latest and greatest modeling techniques. Stay away from buzzwords and focus on bringing knowledge to your users. That's what I would say. Harpreet Sahota: [01:00:18] Absolutely love it. And I definitely picked up on that sense from your book, because you didn't present, like, these crazy cutting-edge, bleeding-edge algorithms. You focused mostly on simple, parsimonious algorithms, but really doubled down on the entire process from raw data to feature engineering, all of that. I like that approach, and I like that lesson to learn from your story as well. Harpreet Sahota: [01:00:39] So jumping into the lightning round here: if you could meet any historical figure, who would it be and what would you ask them? Carl Gold: [01:00:45] I don't think I have a good answer to that one. Maybe Isaac Newton, and I'd ask: what giants in particular were you standing on? He said, if I have seen farther than others, it is by standing on the shoulders of giants. Who exactly was he referring to? Harpreet Sahota: [01:01:04] A good one. So what do you believe that other people think is crazy? Carl Gold: [01:01:08] Well, in data science, I'll say I believe that deep learning models have very little to do, or nothing to do, with how the brain works, and that some years from now there will be a new principle that someone will figure out that will actually show how brains work and process information. But it's not going to look anything like deep learning with gradients. Harpreet Sahota: [01:01:31] If you could have a billboard put up anywhere, what would you put on it? Carl Gold: [01:01:35] Well, I guess I would put up a Fighting Churn with Data billboard next to the 101 in San Francisco, where all the tech workers drive by onto the Bay Bridge every day. Harpreet Sahota: [01:01:46] Good one. So what do you love most about being a data scientist? Carl Gold: [01:01:50] Well, it's that, you know, you're constantly doing new things. Like I mentioned before, doing science to me means doing something that no one's done before, and almost all of my problems at work are things that no one's really done before. So it's different to me than software engineering, where you're implementing a known solution, maybe in a new context, but you're not really doing new research. Usually in data science you're really doing something new, which is what I really like about it. Harpreet Sahota: [01:02:20] I really like that perspective as well. What do you wish you had known when you first started out in your career? Carl Gold: [01:02:27] Definitely what's in my book, because I wrote the book. This is like a memo to my past self. You know, if I could have gone back in time five years: instead, focus on empirical feature engineering. Don't worry about the predictive model except with very advanced customers. That would be the message I would give myself.
Harpreet Sahota: [01:02:49] What are you curious about right now? Carl Gold: [01:02:51] Lots of things. I have a job, but as I mentioned, I'm interested in how organic brains really process information, and I'm still interested in that. I follow neuroscientists, and I even do a little bit of coding of biologically realistic spiking neural networks to try to come up with my own ideas in that area. Harpreet Sahota: [01:03:11] That's awesome. So what's something you failed at? Carl Gold: [01:03:14] Oh, God. In the world of data science or my personal life? In data science, I have failed in getting my organization to always adopt best practices, and it's something I'm still working on. There are some data use practices that I see people in my organization doing, and I'm just like, no, no, they shouldn't be doing that. But I'm still trying to redirect them. So I've definitely failed at getting people to adopt those practices. Harpreet Sahota: [01:03:48] What is an academic topic, or maybe just an area of research outside of data science, that you think every data scientist should spend some time studying or researching? Carl Gold: [01:04:00] That's a good question. Outside of data science, although it's a little bit adjacent, I would definitely say design patterns in object-oriented programming. Very useful to have a handle on, even if you're not a software engineer. Harpreet Sahota: [01:04:14] That's a good one. So what's the number one book, fiction, nonfiction, or even one of each, besides Fighting Churn with Data, that you'd recommend our audience read? And what was your most impactful takeaway from it? Carl Gold: [01:04:25] That's a hard one. I didn't prepare for all these questions, but I'm currently reading The 48 Laws of Power. I don't know if you've heard of it. Harpreet Sahota: [01:04:35] Robert Greene? Carl Gold: [01:04:36] Yeah, yeah. It's a historical book, and one lesson that should be close to many data scientists is the first law of power, which is that you need to make your boss look smart, not yourself. It's a very good lesson for any smart person embarking on their career to remember that you're not going to get a lot of points for making yourself look smarter than everyone else. You're actually going to get more points for making the other people look smart; in particular, make your boss look smart is the number one career advice. Harpreet Sahota: [01:05:11] I agree with that so vigorously. I think you should be focusing on making your boss look good, making your teammates look good, making everybody around you look good. I'm a big fan of his protege, Ryan Holiday. You may be familiar with Ryan Holiday; he's written The Obstacle Is the Way, Ego Is the Enemy, Stillness Is the Key. Yeah. So that's, Carl Gold: [01:05:30] that can be my next read. Harpreet Sahota: [01:05:33] Yeah, yeah. He's a bit of a modern-day stoic philosopher of sorts, but yeah, they're interesting books. I highly recommend those in addition to Robert Greene. Harpreet Sahota: [01:05:41] So if we could somehow get a magic telephone that allowed you to contact 18-year-old Carl, what would you tell him? Carl Gold: [01:05:47] That's funny. I might tell him to study statistics and not engineering, because I went through this long process where first I studied electrical engineering, then I did a master's in computer science, then I did a PhD in this neuro stuff, and then I went to Wall Street. Especially when you get into advanced machine learning, you discover that it's all about statistics. Right.
And the same thing there: a lot of Wall Street is all about statistics. So I've wished many times that I had just studied statistics as an undergraduate instead of getting interested in it through all these roundabout other topics. Then again, I did all these other interesting things, so in a way, I don't know, maybe 18-year-old me should just hang up the phone and make his own mistakes. Harpreet Sahota: [01:06:35] I like that. I like that. Yeah. When I was in grad school, it wasn't called machine learning, it was called statistical learning. So it's like... Carl Gold: [01:06:42] Yeah, yeah. Statistical learning theory. Harpreet Sahota: [01:06:46] So what song do you currently have on repeat? Carl Gold: [01:06:48] Currently on repeat: the song Zen by X Ambassadors with K.Flay. Harpreet Sahota: [01:06:55] I'll check that one out. So where can people find your book? Carl Gold: [01:06:59] It's currently only an ebook, so you can get the book in PDF or Kindle or EPUB format from my publisher, Manning Publications. The hard copy printing has actually been slowed down by COVID, because Amazon and the book distributors like Barnes and Noble stopped taking delivery for a while this spring. They're actually delivering books again and, more importantly, they're accepting books from publishers again, but my publisher's production was slowed down, so it's still supposed to come out later this summer. But right now it's only available as an ebook from my publisher's website, Manning.com. Harpreet Sahota: [01:07:34] Awesome, I'll definitely link that in the show notes, and I'll also link a discount code that Manning has given me to share with the listeners so they can save 30 percent off the book. So I'll be posting that in the show notes as well. Harpreet Sahota: [01:07:47] So how can people connect with you, and where can they find you online? Carl Gold: [01:07:52] OK, well, definitely social media. I'm on Twitter at Carl24K, and on LinkedIn you can just search for my name. If you want to connect with me on LinkedIn, tell me that you're interested in my book. On LinkedIn I don't connect with random people, because they might be recruiters just trying to use my connections for recruiting purposes. But if you message me and say that you're interested in the book, I'll always connect with you. And of course, on Twitter you can connect with everyone. I've also got a blog website, Fight Churn with Data. You can actually connect with me there, and you can find more of my writings and information. And of course there's code on GitHub. The book has a repo on GitHub, so that's where you get all the code. Though I guess that's not really a way to connect; I don't know that people actually read messages on GitHub these days. Harpreet Sahota: [01:08:42] But yeah. Yeah. Your blog is awesome as well. I spent some time just kind of thumbing through all the writing you've done on there, and it's really, really helpful. I've already recommended it to seven people this week. Carl Gold: [01:08:57] Thanks. Harpreet Sahota: [01:08:58] So, Dr. Gold, thank you again so much for taking time out of your schedule to be on the show. I really appreciate you doing a deep dive into your book, and I really appreciate you sharing your knowledge and wisdom with us as well. Thank you. Carl Gold: [01:09:09] Yeah. Thanks for having me. Really enjoyed it.