HH79-29-04-22.mp3 Harpreet: [00:00:09] Let's go. What's up, everybody? Welcome, welcome to the Artists of Data Science happy hour. It is Friday, April 29th, and for some reason it's still, like, cold as shit and cloudy, even though it's supposed to be summer. Anyway, hopefully you guys are having a good week. I've been heads down just writing code all week — it's been a long, long time since I've done that, so it's been awesome. I'm super excited to have all you guys here. So far in the building we've got Russell Willis; Ken Jee and Luke Gross are in the house too. So, so happy to see you guys here. So, hopefully you guys do get a chance to tune in to the episode that has not yet been released — the episode releasing this weekend is with the one and only people's data scientist, Mr. Danny Ma. We recorded this episode live back in October, so it was live streamed on YouTube and on LinkedIn, but that's all good; just tune into the podcast anyway, because then you'll get to listen to it again. Yeah, this month has been pretty, pretty good, man — had an amazing lineup this month, just to remind you. Not only do I have Danny Ma's episode releasing this weekend, we had Chanin Nantasenamat, the Data Professor; we had Christina Stathopoulou; we had Natalie Mixon; we had Andrew Jones as well. Super, super stacked — a lot of great people came onto the show and we had a lot of awesome conversations. Harpreet: [00:01:40] So please do tune into that. If you're tuning in on LinkedIn, on YouTube, Twitch, wherever the hell you're watching this thing, man — if you guys got questions, please do let me know. Drop them in the chat, in the comments. I will be watching and keeping track of those questions. So, I told a friend of mine, Karen [00:02:00] Jean-Francois, that I'd help her out a little bit.
She sent me a survey with a bunch of really, really interesting questions, all about the concepts of data science and business alignment. And I said, oh, I'll do her one better than just me answering those questions, because I'm not that interesting — I told her I would ask these questions during happy hour and get people's perspective on them. I think they're great questions. We've got three of them; we'll get through a couple. And if you guys are watching, wherever you are watching, and you've got questions, drop them in the chat. The first question is this. Karen's asking us to provide an example of a data science or data analytics project that failed because it wasn't aligned with the business needs or objectives. How was that received? What were the consequences? These are interesting questions, so I'll start with that. I remember one of the first data science jobs I had was at this company here in Winnipeg called Bold Commerce, and I was one of the founding members of the data science team. And they wanted insights into the customer base. Harpreet: [00:03:15] And they had a Tableau license — one Tableau license — and they're like, oh, we got this thing, we paid for it, do something with it. And so what did I do? I went ham. I created, like, the most fucking dope, amazing dashboard imaginable. But people didn't know how to use it, didn't know how to interact with it; they didn't know what the hell it was they were looking at. And not only that, it was one license. So in order for anybody else to see this amazing dashboard, we had to play hot potato with the credentials and only give it to, like, one person at a time. That was really, really frustrating. But we've got more people coming into the room, so I'm going to [00:04:00] repeat the question and hopefully we get some responses. Ken, for this one — and Luke Gross as well —
I'd love to hear about any more stories you've got. But: an example of a data science or data analytics project that failed because it wasn't aligned with business needs or objectives — how was that received? What were the consequences? If you are listening on YouTube or on LinkedIn and you've got a war story you want to share, go ahead and let me know; I'll give you a link and you can go for it. Yeah, I got one that comes to mind. So when I was an intern in grad school, I had a data science project at a large manufacturing company, and it had to do with trains. Harpreet: [00:04:48] And so trains would come in, and a person had to decide if they should fully tear down the engine — take it apart, put it back together again, clean it, whatever it might be. And I was tasked with building an algorithm that would essentially replace that work. There was one person who did that job in each place where they would tear things down. And so the only person I could get data from was the person whose job this algorithm would, in theory, replace — which is not a good situation to be in. The other problem, though, is there was no ground truth in that scenario. You only have whether the engines were torn down or not, and that was based on this person's judgment. So the best model that I could build would just be a replica of what this person was doing. And so there was just an insufficient amount of data. The problem was scoped incredibly poorly, and I was coming in as an intern trying to solve this, and I produced absolutely nothing, because it was a flawed problem to begin with. So to me, maybe it's not about misalignment with the vision or misunderstanding the stakeholder, but it is a problem of misunderstanding the actual problem [00:06:00] that's at play by the people that scoped the problem to begin with. And so, you know, to me, that's about as bad as you could get, right? Speaker2: [00:06:10] Is that you're throwing this kid in.
Harpreet: [00:06:12] I was a kid at the time — at least I think — into a problem that, with the resources and the people and the understanding of data, was not realistically possible to solve. Ken, thank you so much. Luke, if you want to share a story at any point, just let me know and I'll be happy to call on you. In the meantime, let's go to Rashad. Rashad, you look drastically different from what I saw some 24 hours ago. It's because I decided to get a haircut — so long, Afro. I went to my favorite Dominican barber. Yeah, but anyway. Hey, no, it looks good. Clean face. So, there's a lot of people that just joined in. Pretty much what happened was a friend of ours — friend of mine, of yours, everyone's — Karen Jean-Francois sent me a survey. She had three questions, a little survey all around the concept of business alignment with data science. So we'll get through a couple of these questions, see where it takes us. But the first question I thought was interesting, and it was to provide an example of a data science or data analytics project that failed because it wasn't aligned with the business needs or objectives. How was that received? What were the consequences? So Rashad, if you've got a story, please do let me know. And if anybody else in the room here wants to share — I'm sure somebody's got some good ones. If Eric, Joe, Tom, Russell, or anybody watching on LinkedIn has a story, please do. Harpreet: [00:07:47] Rashad, go for it. Yeah, yeah. It's never happened. Every project I've ever seen has been aligned 100% to business. Right. So that works. Yeah. I can think of some in my [00:08:00] first job. So the first company I was with, where I got my career start, was basically selling data, and the idea is that we could use data science to impute values of key columns, right? And then we could sell it and say we're more complete.
And so the use case was to basically get lift in the percent of the data that we have filled in, and then maybe also have some sort of confidence score or something like that. The thing is, the business incentive there, you might say, is to fill in as much stuff as possible. But this is real estate data, right? So real estate is very local, and there's, I guess, emotion, there's personal feelings around it. And the data is being sold to people who know the areas inside and out. And so with them, basically, if you see a record — if you see a prediction and it's really off — it will dramatically reduce your trust, even a single record. That's just the way a lot of people in real estate, broadly speaking, think. They're like, well, of course I don't trust this nonsense; I have to physically go there and see the thing, right? And so essentially the incentive of the business — imputation — versus what the customers would react to were not aligned. Harpreet: [00:09:25] And so then it became this big discussion. Essentially it boiled down to: what success metric should we use or prioritize? And people kept going back and forth. And then that, combined with switching technical platforms multiple times, basically led to the situation of: we've built a model, it looks pretty good — is it good enough? No one knows, because we couldn't actually get access to the people this data was being sold to, to learn more about the customer. So yeah, it took too long to get to value. And then it led to people [00:10:00] getting bored on the data side, and the business being like, well, we read this article that said we need data science, but I don't really know much more, right? So it was kind of a start-and-stop thing. Sadly, something that happened to me early in my career too: I'd get so hyped up about a project — yes, we'll do it, it'll be awesome — and everybody else around me gets hyped up.
But the time to value is like this long tail, and it just doesn't bode well for anyone. Let's hear from you then. Anybody else want to share a story? Please do let me know — either raise your hand or drop in the chat that you'd like to be called on. Jolie, good to see you again. It's been a while — out there at the Berkshire Hathaway shareholder event. So go for it. Speaker2: [00:10:48] Yeah. So, like Rashad, I've never been on one of those projects before — but a friend of mine told me a story once about one of their projects. And it was kind of interesting, because early on in my career, the only way — especially on the data science side of it — the only way I could get anybody to do anything was I had to build out a ridiculous business case. So for the first three years I didn't run into this at all, because I had to spend so much time begging people to let me do data science, to open up their budgets, that it was always lined up with business value. And then — it was 2015 — I ran into a project, a social media one, where they wanted to improve social media engagement and get their brand's image up. You know, all the vague stuff that usually comes with early-maturity marketing projects. I started out with that client and they said, oh, you understand social media — hey, can you jump on this? Because, you know, you're on social media too; you'll figure this out. So I got in, and I made the mistake that I think everybody makes, where you get wrapped up in everybody who's really excited about something [00:12:00] and you don't know what it is, but they're excited. Speaker2: [00:12:02] So I got excited and I just started doing what they told me to do. And it wasn't until, like, two weeks later that I finally stopped and went: what was the business case for this? Why am I so excited? What am I doing? And that's literally where it was. It was three weeks later — I should have known better.
And we went to try to figure out the value of engagement. What was the value of a view? What was the value of a like? What was the value of a share? And it turned out their entire audience was wrong. They had bought followers — that's how they had grown their audience, from buying a bunch of followers. So most of their followers were actually not even interested in what they were selling, and they had to pretty much rebuild the entire following. It was a tragedy. But yeah, the only takeaway was, we finally analyzed their customer network and realized that their social media channels were not going to produce any value. Harpreet: [00:13:07] Thank you so much. Shout out to Shashank, who's in the building — I think, like, one half of all the data science creators are in the building right now. That's pretty dope. Good to have you here, man. Is he taking a nap in the other room? Is he taking the nap, Liz? Oh, Luke and Ken are here. Yeah, I think so. Sorry to put you on the spot — I hope you don't mind — but the question I kicked off the session with was: provide an example. So, background: this is for a friend's podcast survey thing. The question is, please provide an example of a data science or data analytics project that failed because it wasn't aligned with the business needs or objectives. How was that received? What were the consequences? You [00:14:00] got any more stories for us? Yes — similar, but not exactly. We had a project where, long story short, a bunch of directors would go into a meeting with a higher-up at the company — usually, like, a chief-whatever officer — and the directors had no data to actually talk about what they were doing.
So they'd all come to the meeting and just kind of shoot the shit, and not really talk about anything of use, really. Harpreet: [00:14:36] And what happened was we would have to create an entire deck for them — this was a client I was working for — to go into each of these meetings with. But it turns out that half of the metrics we were asked to collect were not actually tied to the business objectives, to what they actually needed to accomplish — or the metric itself became a target. And I'm sure you guys all know that famous quote: when a measure becomes a target, it ceases to be a good measure. Really, really important for something like social media, for example — like the whole fake followers thing I just heard about. Yeah, I've witnessed what that does to people's accounts: you think you're making progress until you actually have to do something with that following, and no one cares about anything you post. But yeah, we went ahead and created these metrics and everything for them, put it all up in a great format and everything. And then we find out that in this company there were a couple of directors whose departments were managed so badly that they didn't want us to actually release the results of the data we'd put together. Harpreet: [00:15:52] So I would say, yeah, the kind of stonewalling we received over there made sure that metrics that were [00:16:00] relevant to the business did not get released to the chief officer. And because it was a client, I was like, screw this, I'm not going to sit here and try to convince you to do what's good for you. If I was working at the company, it would have been a very different story.
But yeah, no, that was kind of an example of how we were stonewalled from showing the business metrics that were relevant to them. No idea what's going on with them right now — hopefully they're happy. Anyway, with that out of the way, we'll move on to the second question that Karen had. It's kind of the opposite: think about a data science or data analytics project you worked on that was aligned with the business needs and objectives. How was that received? What was the impact? Let's go to Joe Reis for this, and then Eric Sims, and then Russell will share. Apologies in advance if my voice or video is choppy — I'm at my grandma's, and the Internet is not great. It's a step up from AOL, so far so good. Harpreet: [00:17:10] So yeah — what happened when it worked? Everyone got to keep their jobs, so that was a plus. Yeah, I would say the weird thing is, when the business is aligned, it seems almost seamless. Everything clicks and it just works. A lot of my business is going into situations where it doesn't work, but I would say the companies where it does work, I never hear from them — everything's working great. So I think that's usually how it happens: when it works, you never know about it. Or they might write a blog post about how awesome it is. But those are the kinds of successes you want. It's like installing plumbing or something — when it works, you just don't care; it does its thing, so it's awesome. Curious what other people see, though. [00:18:00] Plumbing is a sore spot for me, sorry — they still have not yet fixed my basement. It's like electrical wiring: when it works... Anyway, funny story about why my basement has been taking so long; this would be an interesting whatever.
So the insurance adjuster that was in charge of my claim was on the verge of retirement — about to retire a month or two after taking our claim. Harpreet: [00:18:38] But then he had a bunch of unused vacation. So after taking our claim, he went on vacation, came back, retired — and our claim was lost in the process. Which is why it's taken so long to get the basement back up and running, which is extremely frustrating. But Eric, let's hear from you, man. Do you have a happy story for us? Or — I guess, like, projects could be aligned with business needs and objectives and still fail. So I guess... That's true. Real quick, speaking of claims and weird things happening: we had a place that had some basement flooding, but we were renting it out, so we were across the country at the time. And then we found out that the person who was responsible for the flooding — because it came from the neighbors — had actually died. And so we had to figure out, okay, how do we now get in touch with their insurance, or does it need to go through probate, blah, blah, blah. It was a headache and a half, and I was trying to figure it out. Then finally I managed to look up this person's kid — who's older than I am — online and gave them a call. It turns out they weren't dead. It was really, really weird and complicated. Eventually [00:20:00] we got it figured out. Harpreet: [00:20:01] I'm grateful that they're still alive. But anyway — data science analytics project. Okay, so how was it received? I actually have a project like this in process right now, and I would say it was received really well, because it was actually the business's idea. The product team says, hey, we have this feature that we think we could maybe leverage to improve our revenue, yadda, yadda, yadda. And so that's been something that I've been working on for quite a while to get into a good place.
And so the reception has been really good. Sometimes I feel kind of bad, because it's like, hey, I've got this cool project I'm working on, but I can't totally claim credit for the great idea of how to use it, because it came from my stakeholders, who just needed me to do some work so that we could do it. Which, you know, feels like — darn, I can't take credit for it — but it also feels like, yes, I don't have to sell it, because people are already sold on it. So that's in process right now; it should be deployed probably in the next week or two. So I'm pretty excited about that. Hey, nice to see you here, Tom. How you doing, man? Long time, brother. I'd like to give an announcement. Speaker2: [00:21:20] Y'all might have known I was lead data scientist at Strategy. Harpreet: [00:21:25] And that wasn't working out. Speaker2: [00:21:30] I just informed my CEO today that I was leaving to go join ECHO. Harpreet: [00:21:36] Global. Speaker2: [00:21:36] Logistics. And I'm super excited, because it has to do with the shipping industry and the logistics around that, and a lot of advanced stuff can be done there. But the amazing thing is, it was a very amicable leave. My CEO wasn't mad, my teammates weren't mad. We agreed that whatever I discovered I'd still share with them; [00:22:00] they'd reach out to me if they needed help. So it was really good to see how good-natured the whole thing was. The new group's treating me like a superstar and making me feel highly valued, which is always wonderful. And it's a big increase in pay too. So, just excited all the way around. And regarding the other thing — Harpreet: [00:22:24] I think I will never meet anyone that was more stupid Speaker2: [00:22:29] about making sure their data science work was aligned to a business objective than me, when I first started trying to do that stuff. Now, in fairness, I'm not sure the term data science even existed then. But I learned the hard way.
Harpreet: [00:22:45] Jeez, Tom. Speaker2: [00:22:46] You taught control system design for five semesters — couldn't you have abstracted those principles and applied them to getting frequent feedback from the business on whether what you were working on would really serve someone's needs? Or were you just over there trying to develop cool stuff that no one may know how to use? Oh man. That's why most of these hairs are gray right now. Harpreet: [00:23:10] Just— Speaker2: [00:23:11] That's my public confession for the day. Harpreet: [00:23:11] Tom, thanks so much. And also, congratulations — that's awesome. I'm very, very excited about this new opportunity for you. Any more stories out there? It does not look like it. Any questions coming in from you? I have a bad one, but not a good one. All right, let's hear the bad one. Okay. So, a bad one: I was assigned to a supply chain team for six months, and I was in charge of building out some sort of analytical tool to monitor supply chain performance. And [00:24:00] we found out that the data we needed was actually pretty difficult to get out of this other application. So, being the go-getter, bright-eyed and bushy-tailed at the time — I was a young data analyst — I was like, oh, I can build out a solution for that. And so, you know, I asked everybody on the team, hey, is this data going to be good? Are we going to be okay with that? I ended up building out this whole solution; it took me a few months to build it out, because frankly I'm not good at engineering, and brought it in. And then after we got it in and I started to make some of these dashboards and share them with the team, one of the members was just like, this data isn't good — because the metrics were showing that their part of the team wasn't performing well. And so it was like — Speaker2: [00:24:46] this data isn't good, we
Harpreet: [00:24:46] Can't use this data. And basically they went around me and convinced everybody that the data was really bad and that we can't use the data. And so yeah, so like months of work just like down the drain and not even. Speaker2: [00:25:00] Able to use this data and the. Harpreet: [00:25:01] Solution that I built. And yeah, as far as the impact, I mean, like I said, I was only there for six months so it was coming time to my end. Anyway, I transferred away and you know, the company just lost all that time and effort for me there. I mean, I learned some stuff about that engineering, but that was the only value out of that whole experience, unfortunately. Hey, you got paid to. That's how I got paid. Yeah. Yeah. And you got to share that story data out there. So that's that's awesome for us. If you guys have questions listening on LinkedIn, on YouTube, on Twitter, it is. Or you can hear the questions, please do let me know. I'm happy to take your questions. So we keep asking what are what stories are we sharing? So Karen Jean-Francois, who also has a podcast about women, did a podcast. She sent out a survey and she just asked me, Fill it out. And I was like, Oh, hey, well, I could just use these as questions for our space. And I was happy already. She's like, Go for [00:26:00] it, be awesome. So there's a list of three questions and questions ranged from provide an example of a data science or analytics project that failed because it wasn't aligned with the business needs or objectives. How is it received or the consequences? Think about a data science or data analytics project that was aligned with the business objectives. How was that received? What was the impact? And then also the final question here is if you could share an example of when an advanced solution was built but wasn't implemented because the stakeholders weren't receptive or wanted something more simple. But we had a good one for that. 
But we can go in any order — if anybody else wants to go ahead, raise your hand or say in the chat that you'd like to be called on. Speaker3: [00:26:52] Yeah. Well, I would also say, honestly — and this is going to make me sound like a terrible engineer slash data scientist slash everything — for all the projects I've done, there were more failures than successes, if I'm being honest. But it depends: I think sometimes a project not going into production is, honestly, insight in its own right. So part of it's also how we're defining failure. For example, I've done things like build dashboards or reports where people were like — here's a common one — we want to figure out the propensity of these customers to purchase these products, or even which sales reps we should recommend to sell this product. And then we dove in, and it's like, oh, that would be interesting — except all the sales reps are automatically assigned to a certain industry vertical, so already you won't have any [00:28:00] experimental data. Or, for example, on the health care side — and this is the part that's really kind of interesting — if you're at a company that has really, really big data, a lot of times you don't have to worry about sample size, because if you have millions and trillions of click streams or whatever going through, pretty much any change becomes statistically significant. Speaker3: [00:28:27] And for a company like Google, it might be worth it to have a 0.02% — no, a 0.05% — increase in click-through rates; that can be very meaningful to a Google-sized company. But for a health care company, if you're talking about emails being sent to a population of 1,000 Medicare clients, that is just not useful.
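[Editor's note: the scale point above can be sketched concretely. This is a minimal, hypothetical example — the click-through rates and audience sizes are invented for illustration, not taken from the conversation — of a standard two-proportion z-test, stdlib only: the same tiny lift that is decisive at web scale is indistinguishable from noise at mailing-list scale.]

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_pvalue(p1: float, n1: int, p2: float, n2: int) -> float:
    """Two-sided p-value for a difference between two proportions (pooled z-test)."""
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same 0.05-point lift in click-through rate (2.00% -> 2.05%), two audience sizes:
p_big = two_prop_z_pvalue(0.0200, 10_000_000, 0.0205, 10_000_000)  # web scale
p_small = two_prop_z_pvalue(0.0200, 500, 0.0205, 500)              # small mailing, 500 per arm
```

At ten million impressions per arm the lift is overwhelmingly significant; split across a 1,000-person list, the exact same lift yields a p-value near 1.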
So, something like that happened when I was working as a data scientist at Teladoc. For example, we would send out these email marketing campaigns — and email marketing is cheap — but we were also sending out physical paper surveys and sign-up forms. And basically what happened was that they weren't segmented, and the sample sizes were not correctly calculated. If you combine that with not calculating the effect size or the power, you've basically just spent a couple hundred thousand dollars on mail whose results you can't actually use to improve your future campaigns. So there's definitely been stuff like that. And I think it sucked when I did that project — less so because of something on our side, and more because we hadn't built the relationship with the business partner team where they felt confident, like, oh, we can bring you in, and you will help us design the experiment. And I think that was a fail, but [00:30:00] it was also a good insight: you can't just have an engineer or data scientist operating without context or domain — they have to have this really tight relationship. Speaker3: [00:30:11] And because we didn't invest in that relationship, and we kind of held ourselves apart, ultimately both teams suffered, right? Because at the end of the day, if you're at a health care company — unless you're at an insurance company, where the goal is to get as much money as possible — if you're at, say, a medical device company, your goal is to help people get healthier outcomes, right? It's to help them with their diabetes, help them with their chronic conditions, you name it. So any loss of revenue is also a loss in quality of life for them. So, you know, there's been that.
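[Editor's note: the power calculation that was skipped here is cheap to do up front. A minimal sketch — hypothetical response rates, the standard normal-approximation formula for comparing two proportions, stdlib only:]

```python
from statistics import NormalDist

def sample_size_two_proportions(p1: float, p2: float,
                                alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-group n to detect p1 vs p2 with a two-sided z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    # pooled-variance approximation of the required sample size per arm
    n = ((z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar)) / (p1 - p2) ** 2
    return int(n) + 1  # round up

# Baseline 2% mail response rate; we hope the new campaign lifts it to 3%.
n_per_arm = sample_size_two_proportions(0.02, 0.03)
```

This lands around 3,800 recipients per arm — so a 1,000-person Medicare mailing cannot support the comparison at all, which is exactly the trap described above.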
And also, we've sometimes hired people who had more of the big data mindset — like, let's just toss big data at it. It's like, oh, great, yeah — by the way, you only have 100 patients that you can actually do quote-unquote experimentation on. Harpreet: [00:30:59] So have fun with that. Speaker3: [00:31:03] Right. But one of the successes, for example, was actually what I was celebrating today, which is why I was late to this. So, Intuit has a week-long hackathon, and it's meant to basically create a demo, create a POC, get it as close as you can. You know, on my team there's a project I'd been working on for two or three months that essentially answers the question: can we get a data scientist's model out of development and into production in basically under an hour? Because right now, sometimes the process is, like, three weeks — even for a simple model — just because of the legacy tooling and all that. And this was a project I was dragging on for two or three months, and it just felt like everything was so hard. It was this octopus project, where on the surface it's [00:32:00] like, oh yeah, we're going to crush it, and then you just get into so much red tape. It's like, oh right, except all our tooling is tied to these legacy systems and you have to go talk to this team. And, oh, by the way, all this tooling is owned by four or five other teams. And so my big fail was I wasn't asking for help when I needed to — when, as I got deeper, I'm like, this is a project where we definitely need more hands on the project. Speaker3: [00:32:30] We need some buy-in and all that stuff. And so finally the hackathon was coming around, and I was like, okay, well, I could keep on holding on to the project and it's not going to go anywhere, it's not going to help the team. Or I could let go of the ego.
I could offer up the project for the entire team to work on; we'd crush the demo, and we might get that executive buy-in, extra investment, all that good stuff. And that's exactly what we did. We crushed it in front of a bunch of the VPs at Intuit. And to me, that's a big win. And so that's what we were celebrating. Because even though it's 98% there — there's the 2% where I edited the video, where it's like, okay, we have to copy-paste some configuration files; that's okay, hand-waving, it runs, right — we were selling the vision, and it's 98% there. There's like a few days of Google searching left. But to me that's a win, because people loved it. They're like, yes, this is what we want our MLOps vision to look like; this is a core part of it. Speaker3: [00:33:42] And on the one hand, it's kind of embarrassing, because I'm like, okay, well — we crushed, like, 75% of the project in a week, even though I was struggling with it for two months on my own. But I think, honestly, that's a huge win. I'm super excited for next steps. They [00:34:00] also told me I can go chill next week, so I'm like, yeah, you know? But the other part to point out was that the win was not independent — it was a team effort. We combined different teams and different experience levels: we had a tech lead, a senior, myself — I'm honestly between senior and junior — and then two juniors, all working on different parts, and we coordinated and came out with something really beautiful. So, you know, sometimes that myth of the 10x or the 100x engineer is just that — a myth, right? Sometimes it's not about the 10x or 100x engineer; it's about the 10x or 100x teammate. So I'm super excited about that. So, a win. There are a lot more losses than wins, honestly, in my kind of history.
And 50% of those losses, I think, were just validation that the business idea didn't need to go forward, or didn't need engineering effort, you know? So I'm. Harpreet: [00:34:58] Like. Speaker3: [00:34:59] I'm like stoked. Harpreet: [00:35:01] I'm super stoked about the win. That's a huge undertaking, a huge effort, to get something deployed. If you were starting from scratch, what would you use? Speaker3: [00:35:11] Yeah. I mean, not to give too many details, but essentially we have certain project templates that we use, and we had built some of these patterns into a full application. So the baseline is we use this legacy internal application to get a project set up, templated, deployed, all that. And it was really smart in a lot of ways, really witty and powerful and robust. But that also meant it was very painful when things broke. And then when the M1 laptops came out, there are certain things that Docker and TensorFlow decide not [00:36:00] to agree on. And like a third of our projects are TensorFlow-based. So what we decided was, we're going to just toss the laptop out of the pipeline, get rid of it. The model, the package, whatever, is never, ever going to touch the laptop. And then we're essentially going to lift some of the legacy parts out. We use a combination of a lot of GitHub and a lot of GCP stuff, but the point is they never touch the laptop. They can develop it, test it, it gets deployed as a container, and then from there we can run a bunch of apps. So that's a rough-level summary. But the key insight was that there was a lot of legacy stuff we had already built out that does work, but there was enough of it that wasn't working.
So why don't we just get rid of the laptop and do a complete remote dev environment? And so that's what we did, and it was really fun. Yeah. Harpreet: [00:37:10] So yeah. Speaker3: [00:37:12] Really. Harpreet: [00:37:14] That's awesome. That's interesting too, because I do all my prototyping on a laptop, so that would be a very different way of working. Let's go to Russell, then let's go to Ken, and then let's go to Eric. By the way, if you guys have questions, please do let me know, whether you're watching on YouTube, on Twitch, on LinkedIn, or even here in the chat. If you've got a question, let me know. So we'll go to Russell and Ken, and Eric has a question. Shout out to Keith McCormick, he's watching on LinkedIn. Keith, if you want to come into the Zoom room, let me know, we're happy to have you. Russell, go for it. Speaker2: [00:38:00] Thanks. [00:38:00] So firstly, I agree entirely with the previous point: usually far more failures than successes. If you're innovating at a good pace, you're likely to have far more failures than successes. And rather than any specific examples or stories, I've got a couple of lessons that I've learned by building data products. If the consuming audience has a low data maturity or data literacy level, it's far more likely that you're going to get poor uptake, or you have to be very, very rigorous in producing something that's optimized for their level. So that might mean you produce something that's 10% of what you know you're capable of doing. Now, you could put bells and whistles on some kind of data product, and it's going to be amazing, and your peers will be patting you on the back, and that's great, but it's not going to serve the purpose for the consuming audience.
And I struggle with that sometimes, you know, if I have to build something that's really basic when I know it can be better. But if you hit that target to begin with and then slowly iterate, you can build up the product along with the literacy and data maturity of the audience. And secondly, again tied to low data literacy and maturity, is the stability of the data itself. A common reason we get a failure of uptake is that something breaks, and the audience thinks, well, that's not working, so why should we even look at this? But the reason it's broken is that some of the primary fields have changed, unique keys or something else. Part of the handover of a new product we give is: the key [00:40:00] data has to stay unchanged, don't make any changes to the data. And then we get a call: it's not working. Have you changed anything? Nope, we didn't change anything. We look and see, and all these field names have changed. Ah, yes we did, we changed some of the fields actually, as I remember now. And they're not able to initially make the connection that changing the fields breaks the product, even though we made it very clear at the start. So when the audience is very new to data, it's very, very common to encounter obstacles with uptake. Harpreet: [00:40:37] Russell, thanks so much. Let's go to Kenji, then after Kenji we'll go to Eric with his question. By the way, if you guys are listening on YouTube and have a question, please let me know. Yeah, I've got quite an interesting one regarding overcomplicating the problem. So in my line of work, I'm dealing a lot of the time with athletes, former athletes, people that aren't inherently technical, who spent their entire life dedicated to a different craft.
That is not data, that is not business, that is not a domain with a lexicon we're familiar with, right? And I find that almost all of the solutions we propose can still be useful, but we have to communicate them in fundamentally different ways: with analogies, with case studies, almost never in aggregate. You have to say, hey, in this scenario, if this happens, then that. That is how you explain the entirety of the model, with these unique case studies, which is fundamentally different from what in theory we should be doing, because that's not how data works. It doesn't work on a case-by-case basis; all of our algorithms and models work when we have large volumes of information, whatever that might be. So I think that's [00:42:00] one component I find really interesting: I almost always have to, not dumb things down, but simplify things to convey what the information means to my audience. I have another very specific example of this. Oh, not of this, of a data scientist not understanding the business need, or overcomplicating the problem. My company collaborated with a very large technology company, maybe one of the largest in the world, owned by a guy whose name rhymes with Will Mate. We worked with them to help build a model that would predict, for golfers, the probability that they would finish with a certain number of points by the end of the year. And to me, that is a simulation problem, right? It's very easy, and it's the best results you're going to get: we just simulate what happens; there are known constant states, whatever they might be. From my perspective, very straightforward.
That is how we would solve it, and it produces really good results. But the data scientists on that team were trying to sell machine learning. So the answer to every single question we posed was machine learning. Machine learning is the answer; what's the question? And they spent two months working on this and never got anything. The first week, we had the simulation ready and it was good to go, and they completely spun their wheels all that time trying to fit a square peg into a round hole. And I think that a lot of companies actually do machine learning and AI projects for publicity, not for business value, and when things get muddled like that, we [00:44:00] get in a lot of trouble. So at least for me, that's a fun thing to think about: projects aren't always about the linear value that's created. Sometimes, at the largest organizations in the world, half of it is PR, and you have to think about the return on that investment too, right? Because if they said they had created a machine learning solution, even if it was worse, as long as it still worked, they could say they did machine learning to solve this problem. And I don't know, I just think that's a really weird and paradoxical approach. But it's not all about the linear value that's created. You have to appreciate that hammer-and-nail type situation; that one will stay with me. Let's go to Eric's question. By the way, I posted a link to the survey right there in the chat. We'll come back to this, but I'm happy to take any questions you guys might have. If you have a question for us, please do let me know. Let's go to Eric.
So I have been thinking about this for a few weeks and trying to apply it to my work, but I think today it finally came together into words, what the actual question is that I've been struggling with. And that is: how do you switch in your mind between root-cause thinking, why is this thing happening the way it is, and outward thinking, what could be happening differently? Because they're way different. In one, I feel like I'm looking at the data and trying to explain it; in the other, I feel like I have to turn and look out into the big, wide world of possibilities. And sometimes, you know, a GM will be like, great, thanks for getting to the root cause [00:46:00] of that. Harpreet: [00:46:00] What are we going to do about it? And I just don't know how to shift like that immediately. It's like task switching. How do you make that switch? Mark, and then we'll go to Richard. Uh, what's up? Can you hear me? I'm on a new device, I don't know if it's working well. Yeah, so I'm definitely still learning this myself. Something that's helped me a lot is talking to my manager, I have a good manager, and being like, hey, how can I make this switch? And the way I typically frame it is: I did this analysis and identified this root cause, but before I go to the stakeholder, I wanted to brainstorm with you. What do you think is a great next step to suggest to them? And I'll provide my manager three options: we can do X, Y, Z, we can do A, B, C, and so on. Then I give my reasoning: this is why I was thinking strategically that this matters, right? Or, this is the context in the market, or the politics within our company, of why this is happening.
And the reason I do this constantly is that it becomes a discussion with my manager, who typically has way more context above the work I'm doing, and who can guide me on what the interesting things are. Harpreet: [00:47:33] That way you pick things up, and by having that conversation over and over again, you start building the same sense they have, and you move away from conversations like, oh, well, maybe you should consider this, toward them just saying, yeah, I agree with you. And that's how you know you're starting to build that strong sense. So that's not very tactical in terms of how you actually make the switch, but that's the process I use to pick up those skills. As for the tactical side, how do I go from, I was digging in the data pipeline and, oh yeah, we have this issue here, to making a meaningful suggestion? I'm completely blanking on the name of the framework; I'm going to go find it and send it to you. But in my first data science job, my manager used to be a nurse, and many times nurses have to go to the physician and say, I did all these checks on the patient, and this should be our next step, which you should combine with your expertise. And there's a specific model they use, with specific steps for explaining the situation, communicating it, and then providing the next step. Harpreet: [00:48:46] I'm going to go find it for you, because it was so transformational in how I approach communicating. It really allows you to target exactly what's important for the next stakeholder: what do you need to worry about, what context do you not even need to consider, and what's the next step they can take today? That's really helpful.
And I think another key thing, and then I'll stop and let other people speak, is not focusing on the overall solution, but on what they can literally start doing right now. Sometimes those are two different things, and sometimes what they can do right now is just enough to let them keep moving while a better solution takes shape. Sounds almost like an Eisenhower-matrix type of thing. Are you describing the model from nursing? It was a researched model from nursing, yeah. I'm going to go find it; it's going to bug me now that I can't remember it. That'd be very helpful, thank you very much. Let's go to Rashad. And if anyone's ready to jump in here, type your question in the chat. Great. Let's go to Rashad. [00:50:00] And then if you guys have questions, do let me know, whether you're on YouTube or Twitch or even here. Harpreet: [00:50:08] Hmm. So first, let me just clarify: you're essentially asking, how do you switch your mind between thinking about the cause of things versus thinking about what you should do about it? Is that what it was? I was trying to follow it. Yeah, it's kind of like the cause of things as they are, versus thinking about what we could do differently. Like, how are we going to change this process? How are we going to improve our business? How are we going to get 30% more revenue next quarter, or whatever? Something where I can't just be like, oh, well, let me dig in and find the answer. It's more abstract, or creative. Mm hmm. Interesting. I have personally not experienced this dichotomy. The way you frame it does sound a bit like deductive reasoning versus sort of brainstorming, lateral thinking. So you're essentially saying: the root cause is this and that, and, okay, now what data do I have?
And you keep going down, and eventually you reach an end point. And it sounds like the other exercise, the creative one, is vaguer: the world out there. Harpreet: [00:51:29] There's an endless possibility for action, right? Which of the million actions could we take? I think there is a way to merge these. I actually think about it in terms of religion or metaphysics. The Gospel of John begins, "In the beginning was the Word," right? So people debate what created the universe. And some people would say you can make assumptions, or you could say, actually, you can't really prove it one way or the other. You have things in reality that you [00:52:00] can observe with science, and some people say that question is beyond the realm of science. So I think of thinking about causes as suggesting actions: concern yourself with the causes that you can do something about. And if I have that metaphysical understanding, maybe I would suggest different actions to take. On the business level, it'd be something like: you can concern yourself with the root cause, but diagnose those causes in such a way that they suggest actions. People are churning because of this, therefore I should do that. I can't think of getting to a real root cause and then having no idea what to do about it. But in terms of changing the way of thinking, I definitely think they activate different parts of the brain. Harpreet: [00:52:57] I personally found that I should engage in those at different times of day. This is supported by research in Daniel Pink's book When: The Scientific Secrets of Perfect Timing. It says most people tend to be morning or afternoon people, and that's the best time for very focused, logical thinking.
And then the lull in your day tends to be better for lateral thinking, so you might want to engage in that sort of exploration, directed but not fully directed, at those times, and maybe you'll come up with stuff. Also, just going on walks, getting away from the computer. They're very different actions you'd take, but I think ultimately you can link a diagnosis of a problem to a course of action. There's more on that sort of thing in Richard Rumelt's book Good Strategy Bad Strategy, which is one of my favorites of all time. So maybe that would help. Amazing book, I've read it before, can't recommend it enough. Which one, Good Strategy Bad Strategy? Yeah, absolutely, that book's a win. And Daniel, if you're listening, I think we might be able to do the podcast. Yeah, I've sent him countless emails; he didn't go for it. Speaker2: [00:54:24] If you get caught in a meeting and you weren't prepared for the, "So what are we going to do about that?", I'm going to out myself a little bit here, and hopefully no client of mine is watching: I always go, well, you know, I came here to get some expert opinions, and that's really where I wanted to start, because I find that if I ask the experts, I typically get a good starting point. So, what's the consensus here? 100%, if you turn it around like that, you will buy yourself some time to think, because people will jump all over themselves to meet the expectation you've just set of "I came to ask the experts." And especially if you're dealing with senior leadership, somebody at some point will jump in, because this is their opportunity to show off. And, I'm sorry, I shouldn't say this, but senior leaders do love stunting in meetings every once in a while.
So you'll get somebody who buys you a few minutes while you think it through, if you need to really switch gears from "I found a problem" to "now I need to find a solution." That approach of asking the experts actually isn't terrible. But beyond that, if you're about to go into a meeting and present a problem, like, I found an issue, you should always think to yourself: you're going to get asked, what should we do about it? You're always going to get asked for a recommendation. Speaker2: [00:55:58] But realize you've gone from describing [00:56:00] something. Your data is descriptive, it's telling you, hey, here's a problem; it's diagnostic, it's telling you here's what's going on. But now you're being asked to predict something, and that's a completely different model. And that's the trap a lot of data scientists find themselves in: they'll say, here's what's wrong, and then they'll try to use the same data and the same model to say, and here's what we should do about it. And you can't do that. Now they're asking you for a prediction, or a prescription of action, and the level of support you have to have for either of those is much higher than for just describing a problem or a situation. And sometimes it's a good teachable moment, where you can say: this is a good time for me to explain the difference between descriptive data and predictive and prescriptive modeling and analytics. It's going to take a significant amount of effort to really give you a good answer to that. It's a good educational moment, too. So hopefully those are two good answers: one, stall for time; the other, teach leadership that they're asking a very difficult question, one you can answer, but not right away. Harpreet: [00:57:13] I love it. I love it. Go for it.
And by the way, if you guys have questions, we're happy to take them, or if you want to chime in on Eric's question here, just use the raise-your-hand feature or let me know and I'll bring you in. Go for it. Speaker3: [00:57:36] Yeah, I'll just go real quick, because I think Mark found the thing he was looking for. I actually like Richard's point that if you're really close to the cause, the solution's probably not too far away, if you deeply understand the root cause of what's going on. But normally what I do at that point, if they want an answer right in 30 seconds or [00:58:00] a minute, is say something along the lines of: I think I have some ideas, but why don't we set up a follow-up meeting, say in the next day or two? And that gives you enough time to squeeze in some initial questions, figure your thoughts out, and do an exploratory dive. What I like to do is get Google Sheets out, and if I know the cause, I'll start using it as a pseudo decision tree. Because what I've read is that a lot of times, the way some people like to solve problems, especially visual learners, is to literally get a huge drawing pad and circle the problem right in the middle. Speaker3: [00:58:41] And then you start sketching out all the different things that could be impacting it, drawing a line out to each one. And then you follow through: okay, but that's relying on this thing, this thing, this thing. And eventually you get a stack rank of things that are short-term solvable versus things that are long-term, where you need to invest, and sometimes things that are out of scope.
So for example, sometimes we'll get random issues with NVIDIA packages or libraries, and it impacts some of our pipelines. What can we do about that? Well, we can revert to an older version that's stable. We can file a PR with NVIDIA, or whoever maintains the open-source package, saying, hey, by the way, this needs to be fixed, and hopefully they'll get to it. But at that point we're kind of like, yeah, we can't really do anything else about it; we can maybe do a fork, do a patch. It's different from an analytical problem, but it's the same idea. Speaker3: [00:59:47] There are things you can do very easily, there are things you can do not so easily, and then there are things that are just not within your scope of responsibilities. And ultimately, for figuring out that stack rank of what's easy [01:00:00] versus what's harder, I usually lean on my manager, or on other people who aren't in leadership roles but have been at the company for years and know the lay of the land. They know the landmines, they know the people to talk to, they know the backdoor conversations you should be having. So sometimes I'll run it by them, because they have boots on the ground; they know what's up. Sometimes the leaders don't, because they have the higher, bigger vision. So that's how I do it. But I really love drawing out trees. I love, love trees. I hate the actual tree algorithms, because interviewers have asked me to implement them live, and I go, you're not paying me enough to do that, now or in the future. But for ideation and all that, they're really nice. So that's how I like it. Harpreet: [01:00:54] Oh. Speaker2: [01:00:56] I just want to shout out mind maps. Harpreet: [01:01:01] Awesome tool. Yeah.
Mind maps are awesome. Yes, Mark, go for it. I just want to circle back real quick. I finally remembered: I made a post about it like two years ago, and I was able to find it. The model is called SBAR, which stands for Situation, Background, Assessment, Recommendation, and I found it really powerful. The reason my manager gave it to me at the time is that I was doing these really deep dives, I was working in data quality, and I was just overwhelming people, and people just weren't listening. And she's like, yo, take a step back, here's this tool. I implemented it, and people started listening to what I had to say. So to give an example of how you can use SBAR: the Situation is, people are trying to pull data, it's a month old, and now they can't run their analysis. The Background is, it's in this system, [01:02:00] it's touching these endpoints, it's stopping at this point, and these were the errors being thrown. The Assessment is, it's impacting 60 million records for this window of time, doing X, Y, Z. And then your Recommendation is: short term, we can run something on the side, outside of the pipeline, and manually put the data in, so those people can do their work right away. Harpreet: [01:02:27] But long term, we do X, Y, Z, right? And so that takes you from the deep dive and the analysis to "here's what you can do tomorrow." And what was really useful is that by putting things in that format and keeping each part concise, it was very easy for people to follow and know what's happening without going too deep into the weeds. So, thank you. And that's called S-B-A-R? Yeah, SBAR. And I posted a link in the chat, a blog from the nursing world. So again, it's all from nursing, but you can easily apply it to other domains. Awesome, thanks so much. Let's keep it going.
Anybody got any questions coming in from LinkedIn or YouTube? Of course, you can ask here too. Nobody's got a question? Okay, I have one. Go for it. Just a small question, probably with a stupid answer. Do any of you guys read the Morning Brew newsletter? I love Morning Brew; I read very few newsletters, but that is one of them. So, Morning Brew is awesome, and today they had a little word challenge. Yeah, they had a word challenge where it said: we have these four lists of letters, and [01:04:00] if you take one letter from each of them in a certain order, they will form two English words that are opposites. Harpreet: [01:04:10] And I thought, I could sit here and stare at this for a while, but there's no way I'm going to figure it out. But I could probably make something in Python that would make this really easy for me. So that's what I did first thing this morning, and it spat out a bazillion different four-letter combinations. I scrolled through them and thought, I can do one better than this. I found the enchant library, which I didn't know existed, which is a dictionary, and just said: look at these four-letter combinations, and if one is a word in the dictionary, show me it. And that narrowed it down to like 15 or 20 words, and at that point it was really easy to see that the opposites were "work" and "play." But, and this is where my question comes in, that's all just calculated stuff. Where the language piece comes in is: how would you go about comparing, say, 15 words and figuring out which two are the most opposite? How would you do that? I thought, okay, well, maybe it's a word-vector thing.
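The brute-force search Eric describes, enumerate one letter from each list in order, then keep only real words, can be sketched in a few lines of Python. The four letter lists below are made up (the actual puzzle's lists weren't read out on air), and a small hand-made word set stands in for the dictionary check; with pyenchant installed, `enchant.Dict("en_US").check(word)` would replace the set lookup:

```python
from itertools import product

# Hypothetical stand-ins: the episode never shows the real puzzle lists,
# and a tiny word set replaces the enchant dictionary check.
LETTER_LISTS = [list("wps"), list("ola"), list("rai"), list("kyn")]
WORDS = {"work", "play", "plan", "pork", "slay", "warn"}

def candidate_words(letter_lists, dictionary):
    """Take one letter from each list, in order, and keep real words."""
    combos = ("".join(letters) for letters in product(*letter_lists))
    return sorted(w for w in combos if w in dictionary)

print(candidate_words(LETTER_LISTS, WORDS))
# ['plan', 'play', 'pork', 'slay', 'warn', 'work']
```

With four lists this is at most a few hundred combinations, so the filter runs instantly, which matches Eric's experience of narrowing a bazillion combinations down to 15 or 20 scannable words.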
Harpreet: [01:05:21] But the problem is, one of the other words was "worm," and, I don't know if "work" even has an opposite, but "worm" is way different from "work," and "play" is not really that different from "work" sometimes. So I wondered what your thoughts were on that. Have you heard of word2vec? Yes, I don't know if that was one of the things I tried. So you take a word and find an embedding for it, and then on those embeddings you can do arithmetic, adding and subtracting, and probably more. For example, if you check king minus queen, you end up with a difference that would be gender, like king minus man plus woman equals queen. Yeah, something like that. I don't know if you can combine the vectors like, work minus boring equals play. I don't know. When you say words that are most opposite, do you mean semantics, or just edit distance between the letters? Semantics, which is why I'm a little unsure. I mean, is work really the opposite of play? This was just a silly word puzzle, but it really got me thinking. Sorry, yeah. Speaker2: [01:06:43] I've had some experience training GloVe, global vectors, from scratch, Eric, and if you'd like, we could do a one-on-one and then report back to the group, or record a video of it. But what happened was, pre-trained GloVe [01:07:00] models wouldn't work for the specialized corpus I was trying to get synonyms for. So it was a combination of three different machines. I called the whole thing Dung Beetle, because it was trying to make a good structure from crappy words. Speaker2: [01:07:25] Actually, I wish I had come up with the name.
A teammate came up with it when he realized what I was doing. But yeah, GloVe was a major component of it, because it's a little stronger on associative strength than basic word2vec. So if you want to do a session on it, I've got a good write-up on it somewhere. Harpreet: [01:07:48] Oh, that's the one I was thinking about. Thank you. Tom, yeah, go for it. So I'm just brainstorming right now, because this seems super fun. This is the brute-force way I would approach it, kind of a v1 that avoids all the ML stuff beyond the text recognition. So you said you already have the possible words. What I would do is create a dictionary of those words, where the word is the key and the value is the lemma of that word, to reduce the variability. So "ran" would become "run," "swam" would become "swim," things like that. Then I would take those lemmas and run them through an API, like the Dictionary.com API, or even scrape it from the website: look up each lemma and get the list of every antonym associated with it. Then from those antonyms, create the lemmas of the antonyms, and compare the list of lemmas I have against the list of antonym lemmas and see if there's any match. Again, very brute force, but I think that would get you closer to a potential pair. There would probably be a lot of edge cases, but when I approach these problems, I always try to see: what's the simplest thing you can do that avoids as much ML as possible, brute-force it, and then, how good is that? And if it's good enough, Harpreet: [01:09:18] I'm just like, all right, cool, great, I'll move on to the next problem.
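The lemma-and-antonym pipeline just described can be sketched as follows. Both lookup tables here are tiny hand-made stand-ins for illustration; in practice spaCy would supply the lemmas, and a thesaurus API (or NLTK WordNet's `lemma.antonyms()`) would supply the antonym lists:

```python
# Map each candidate word to its lemma, look up the lemma's antonyms, and
# report candidate pairs whose lemmas are antonyms of each other.
# LEMMAS and ANTONYMS are hypothetical stand-in tables, not real API output.
LEMMAS = {"work": "work", "played": "play", "worm": "worm", "warm": "warm"}
ANTONYMS = {"work": {"play", "rest"}, "play": {"work"}}

def opposite_pairs(words):
    """Return the set of (lemma, lemma) pairs that are mutual opposites."""
    pairs = set()
    for a in words:
        for b in words:
            la, lb = LEMMAS.get(a), LEMMAS.get(b)
            if la and lb and lb in ANTONYMS.get(la, set()):
                pairs.add(tuple(sorted((la, lb))))
    return pairs

print(opposite_pairs(["work", "played", "worm", "warm"]))
# {('play', 'work')}
```

Working on lemmas rather than surface forms is what makes the antonym lookup reliable: "played" and "play" collapse to the same key, so one thesaurus entry covers every inflection.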
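The embedding arithmetic mentioned earlier (king minus man plus woman landing near queen) can also be illustrated with toy vectors. These four 3-d vectors are hand-made so that the axes roughly mean "royalty" and "gender"; real word2vec or GloVe vectors are learned and have hundreds of dimensions, and in gensim the equivalent query would be along the lines of `model.most_similar(positive=["king", "woman"], negative=["man"])`:

```python
from math import sqrt

# Hand-made toy embeddings (axis 0 = royalty, axis 1 = male, axis 2 = female).
VECS = {
    "king":  (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "man":   (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

def analogy(a, b, c, vocab):
    """Word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = tuple(x - y + z for x, y, z in zip(vocab[a], vocab[b], vocab[c]))
    pool = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return max(pool, key=lambda w: cosine(target, pool[w]))

print(analogy("king", "man", "woman", VECS))  # queen
```

This also hints at why Eric's "most opposite" question is hard: embeddings place words that appear in similar contexts close together, so antonyms like "work" and "play" often end up near each other rather than far apart, which is why a thesaurus lookup can beat raw vector distance for finding opposites.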
But then if there's room to improve and there's a business case — well, there's no business case here, because the puzzle is for fun — you can try to optimize from there. Then you have these various steps and heuristics and logic, and you can focus from there: what's the bottleneck that's making this difficult that ML, or some other process, could improve? That's how I broke it down: instead of trying this matching thing end to end, what are the processes and steps to match if I were to do this visually in my head, and then automate those. I think someone [01:10:00] mentioned spaCy — oh yeah, okay, spaCy, cool. Yeah, you can use spaCy, you can get the lemmas from it. And I think the lemmas are really important, because they reduce the variability. Speaker2: [01:10:22] I was kind of wondering about GPT-2 also. It might be manageable in size, and it might be able to do semantics. I was just confessing in the chat: I would have liked to find a transformer that can do semantics, but I just haven't located one yet. You'd think there'd be one that could. Harpreet: [01:10:44] This is great, because a lot of times when I come up with kind of ridiculous ideas of ways to think of things, I think of Mark Rober, because he spends a ridiculous amount of effort to over-engineer a solution to make it something interesting. And that's kind of what I've been thinking about as I've been working on this, so I'm like, yes, this is good, I can learn. I mean, NLP intimidates me, so having something stupid-small to work on is like, yeah, this is good — let's dip my toe in that way. So, all great solutions. Speaker3: [01:11:17] It's funny, I would approach it differently.
I would ask my two kids to find those two words. I think that's the fastest way for me. Harpreet: [01:11:27] There you go. That could work, too. Speaker3: [01:11:30] I'm going to try. Harpreet: [01:11:30] The kids' rates — are they for hire? Speaker3: [01:11:34] Of course, yeah. They're willing to, well, make money. Harpreet: [01:11:39] Yeah, get them in there early. Okay, go for it. Speaker3: [01:11:44] It's funny, because I could legitimately see a FAANG-type company asking that kind of question in a technical interview, because you don't even need an ML solution. You could literally treat it as a data-structures-and-algos problem. And [01:12:00] I could reasonably see them asking, how would you do that? Like, we give you some structure that is literally, like, the thesaurus or something — what's the fastest way to look up and optimize what the pairs would be? I could 100% see them asking that. Speaker2: [01:12:25] I just realized — Eric, forgive me — the first part of dung beetle, this was a gold mine. I went to Thesaurus.com and I was doing an inspection on one of the pages. And guess what? There is an only slightly dirty JSON object, with the synonyms and antonyms at various strengths, built right in to each of those pages, for each word. And you have to do a very tiny amount of cleanup, and then, it's JSON — you can just use it at the JSON level. I used that as the first stage in dung beetle. Then there's a mixture of different online dictionaries you can use that will, like, if you hit a
spelling error, or you need a base word — there was a series of steps I went through to reduce down to the base word and then go get the synonyms from those online dictionaries, for the ones that did not occur at Thesaurus.com. And then finally I resorted to GloVe at the end. So it was a series of things. That's why I'm wishing there were a dang transformer that would just do synonyms. You'd think there would be one out there, but yeah, we can geek out if we get together on it. Harpreet: [01:13:53] So, great question, great discussion. Thank [01:14:00] you so much. Any other questions coming in? I don't see anything coming in on YouTube — I'll keep an eye on it in the background. Okay, no other questions. I actually have a question. So lately I've been realizing that I actually don't know the history of data warehouses. I was watching Joe Reis's video with Bill Inmon, I believe, talking about data warehouses and how a lot of vendors lately have moved away from the original data warehouse aspect of it. My question isn't necessarily about data warehouses, but: do you think it's worthwhile to pursue learning the history of different technologies and how they've transformed over time? Or is that kind of a fool's errand if you're just interested as a hobby, and you should be focusing more on the implementation today and the current context? The history of ideas, man — I think that's super, super critical, super important: to know the history of ideas and how those ideas came to be. I didn't start appreciating this until I started reading books about Marquis de Soto, or the Mexican Revolution, just to see how ideas change, or even George Lindbergh's book on geometry — just how ideas evolve and take shape.
But then again, there's only so many hours in the day. Harpreet: [01:15:36] So if this is something that you're interested in, and it brings you joy to study and pursue it, then by all means spend the time doing it. But is studying that going to push you further ahead career-wise? In that sense, I'll give you an example: I'm very interested in data [01:16:00] warehouses. I especially like analytics engineering, because I feel like it's a foundational piece to really drive ML strategy and data strategy. And so my thought process is: would knowing the history contribute to me knowing the pros and cons of different approaches? Or should I focus on what's currently working today, what's the current hot thing — not trying to chase trends, but, why is it worthwhile to go that deep into the weeds when I feel like this core piece of technology is very important? Yeah. To me, intuitively, I would say yes, but let's turn it over to Mikiko. Before I do that: I think history is a fun hook. As a content creator yourself, just think about it — if you write about the history of ideas, people come along on that journey with you. I think that's a good move. But let's go to Mikiko, and if anybody else has questions, do let me know as we start to wrap it up. Speaker3: [01:17:07] So I think there are two or three kinds of history, right? Some of which is worth knowing; some of which is nice to know, but I don't know if it's worth investing effort in. So for example, the history of the personalities within tech. Like I said, there's a couple of people I follow on LinkedIn who are, you know, IBM fellows, distinguished engineers, part of the IEEE or whatever society that I will never be a part of. You know, they're whatever, right?
And they'll have some really interesting stories of, like, oh, what was it like being around when the first, I don't know, real CPU was created, things like that. It's [01:18:00] really fun hearing those war stories. But at the end of the day, they were one participant at a very early stage — they were only part of this really long, storied history of technology development. So those folks I follow on LinkedIn — it always gives me chuckles, or when I hear the stories, like at the MLOps happy hours, stuff like that — I love hearing those stories. It gives me so much appreciation and empathy for the shoulders of the giants that I stand on. And I do think that empathy is important, right? And I'm seeing this sometimes, too: there are culture and generational clashes in the workplace, where I feel like I'm in the middle, in a way. I'm seeing some of the youngbloods come in, and they are sometimes a little disrespectful. Speaker3: [01:18:48] They're like, why are we doing this? — but we all did that, right? But they don't understand the problem that the tools were meant to solve. And so they want to yank something out and replace it, not understanding the blast radius, right? That's not you, right? That's not you all. But at the same time, I like learning from the people before me. So I think the second type of history that is super helpful is understanding the broad themes of some of the technology ideas. For example, some of the ideas around data warehouses — it's less about the technology. Most people don't care whether or not you use a B-tree or whatever, right? It's more about the scope of responsibility, for example, or the domain — like, for example, data mesh. We did that in my book club.
Half the engineers were like, okay, so you push the problem to the left — and they were senior staff, right. And the other half were like, oh, this is a really interesting, nuanced, innovative way of thinking about technology, and all that. But [01:20:00] they had the context behind how data mesh came about, right? And the thing with data mesh, right — the number one issue that some of the engineers had was, there's no suggestion on how it should be implemented. It's like, yeah, because it's not a technology solution. It's not about a technology solution; it's about a higher-level abstraction. But part of the reason I think they didn't quite understand it — didn't even understand it enough to say anything real against it — was because they didn't bother knowing the foundational history behind how that developed, right, and all that. So I feel like the first kind of history, the war stories, is nice to know, but I don't know if it's worth investing in beyond one-on-one conversations with people: what was your experience like? The second kind, the history of ideas, I do think is important, because, for me personally, even within the company we have some legacy technology. It was implemented to deal with a very specific problem, and some of the conditions of that problem still hold, but some of them don't. And where they don't, that's where we might want to retire some of it and consider other technology, you know. So there's that. And then there's the third part, about more specific technology solutions.
I do kind of feel sorry for a lot of people who I feel get hoodwinked by some of these startups that do middleware, because a lot of times they're literally managed services of really popular open source libraries and all that. Speaker3: [01:21:40] Which is fine, I guess — everyone has their own flavor. But I don't know. I kind of feel like, if you don't have that underlying understanding of what the technology is — so, for example, row-based versus columnar [01:22:00] data storage, right: that's important to know. It's probably not super important to understand the specific algorithm that's implemented to store the indexes and all that, but it is good to know, okay, roughly this is what row versus column is, and this is the difference in read and write performance. I don't know if it's that important to know every single vendor's various flavors, like GCP versus AWS — I think it's good to know one, then you can compare. But I do feel like history gives you a filter for understanding the more durable ideas, and also what they were meant to solve. I think that's the big part: what problem was this meant to solve, and why? And do those conditions still hold? I get so confused by a lot of vendor-driven content; it's very overwhelming. Harpreet: [01:22:56] That was really helpful, Mikiko — breaking it down into those three different areas really connected the dots for me. And definitely where I want to double down is on what conditions changed, because there's technology we have that's like five years old, and there are things where, for example, Airflow — I asked, why don't we use Airflow? They're like, well, Airflow was just coming out five years ago when we built this, and it didn't handle X, Y, Z. I'm like, oh, that makes sense.
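[Editor's note: the row-versus-columnar point above can be made concrete with a toy sketch — the same three records stored both ways. This is illustrative only (real engines store pages, compression blocks, and indexes), but it shows why a single-column aggregate touches less data in the columnar layout, while a whole-record lookup favors the row layout.]

```python
# Row layout: one dict per record -- a point lookup grabs a whole record at once.
rows = [
    {"id": 1, "name": "a", "amount": 10},
    {"id": 2, "name": "b", "amount": 20},
    {"id": 3, "name": "c", "amount": 30},
]

# Columnar layout: one contiguous list per field.
columns = {
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "amount": [10, 20, 30],
}

# Analytical query (SUM over one field): the columnar scan reads just the
# 3 "amount" values; the row scan walks every record to reach the same 3.
total_row = sum(r["amount"] for r in rows)
total_col = sum(columns["amount"])
assert total_row == total_col == 60

# Point lookup of one whole record: one access in the row layout, but the
# columnar layout has to reassemble the record from every column.
record = {field: values[1] for field, values in columns.items()}
print(record)
```

That trade-off — column scans cheap for analytics, record reassembly cheap for transactions — is the durable idea behind warehouse formats, independent of any vendor's flavor.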
Speaker2: [01:23:28] I just need to point out real quick that you hurt several of our feelings — that you would put wanting to be some distinguished technologist over the fact that we all fight over what Mikiko said in these happy hours. Harpreet: [01:23:44] What the hell? Speaker2: [01:23:46] Anyway, go ahead. Speaker3: [01:23:48] I don't even remember the names of these people that I follow, but I remember your names. I just see them like, oh, look, they, I don't know, said congrats to this other person that [01:24:00] joined their super-secret cool Mensa club that the rest of us aren't a part of. That's cool. But I know all your names. Harpreet: [01:24:12] Ben, did you want to go ahead? Speaker2: [01:24:18] So, I grew up with really old engineers. And when I say that — I worked at companies where, when I was in my twenties, people were in their late fifties, early sixties. And they remember where the Internet came from, that kind of old-school thinking. And I swear, if I ever hear another punch card story again — I never want to hear another punch card story. There's some stuff where it's just, you know — and that's kind of the danger of becoming the older trope, the older stereotype, where you're telling the same five stories over and over again about the same topics, and there's just no point. Don't learn that history; don't become that person. What you want to do is find stories that are educational, not only to you, but that you're going to be able to use in ten years. Those are really the stories you want to learn, internalize, and dig into. Because I had a boss who did, like, real engineering — you know what I mean?
When I say we call ourselves engineers — this gentleman was an engineer. He told stories about how they had to solve certain problems when they were doing nuclear tests, because you were trying to get data, but the wire was being incinerated — real engineering challenges. Speaker2: [01:25:59] And he [01:26:00] had these rich stories about how you go about solving something truly no one has ever solved before, where you're dealing with something most people are going to say is impossible, but you have to do it anyway, and you have to figure out how to approach a problem like that, you know? And so those are the stories you want to dive into, because eventually you're going to get a problem where you Google it and there are zero search results. You get that terrifying moment where you go to Google and there are like six search results, all in another language, and after you translate them, none of them have anything to do with what you're looking for. And you're like, no, that's not it — that's the horrible moment. And you want to collect stories that will allow you to begin to understand. And, you know, that's where data warehouses are kind of interesting, because if you dive into the history, data warehouses showed up before anybody had big data. Everybody had really small data, but they were all saying, we're going to be gathering these massive, massive data sets. And so data warehouses showed up and nobody needed them, and so they had to, like, rebrand. And you can listen to the stories of Microsoft, you know, and all of the early data warehouses and early adopters. And it's interesting — if you're going into marketing, those are the kinds of stories you want to tell, because as a marketer, [01:27:21] you need to understand: what do you do if you get to market too early? What do you do? How do you pivot?
And that's what's interesting about them: you begin to get into case studies, and this is where you're going to find yourself actually getting value out of these stories short term, because you'll come to points in your career where there's no frame of reference, and you have to go more abstract and start working from people who already figured out something similar to what you're now dealing with. And so that's where I'd say it's really important — those two areas. One: ten or fifteen years from now — you're in your mid [01:28:00] twenties, you'll be mid-thirties about ten years from now — when you get to the point of being senior, I call them senior-plus-plus, you're going to have these stories to tell. And part of them are stories that you heard, but other parts are stories you lived through. You're going to be talking about the birth of data science, having lived through it. Those are the stories you want to bring with you, because they're invaluable. You know, for me, learning about how you solve — we've got a nuclear blast happening, and it's not like you can walk up to the wire and figure out what's going on. Speaker2: [01:28:35] Those are the great stories you can take with you and teach people: I got taught about the real fundamental challenges you may one day face, where there are no guidelines. And then the other ones are really for you personally, as you're moving along — the stories you think are similar to things you might run into in the future, extrapolating out: what happens when I run out of ideas? What happens when I run out of help? How did these people do this? And the more complex the problems they solved, the more rigorous the methodology has to get.
And you can take pieces of their methodology and bring them into every project, and you start producing better results on a daily basis. And all of a sudden people are coming up to you like, how did you even — you know, and that's kind of the moment of Zen: when you've incorporated enough of other people's ideas into your workflow and your methodology, and all of a sudden people are like, wow, that's brilliant. And you're like, no, I'm just stealing from other people. But don't say that — because you came up with it, just like I did. Harpreet: [01:29:51] I steal from Ben all the time, just like, you know. Speaker2: [01:29:54] Well, it's good, because I steal from you. I mean, mine's behind a paywall, so you never see it. Harpreet: [01:30:02] Love [01:30:00] it, I love it. That's the beauty of open source — steal it, make it better, build on top of it. Love it. Tom, we have a statement. Navi, were you waiting to talk? I didn't want to steal your response. Speaker3: [01:30:25] Yeah, no, go ahead. I was just going to add on to what Mikiko said, so go ahead. Speaker2: [01:30:30] Okay. There's been a new, great improvement in our Zoom grid, and I don't want to see it go away. So I want us all to encourage and applaud this maturity change: the color of Harpreet's beard. I think it's a great improvement, and I don't want to see it go back. Okay, good — we're all feeling the same way. Harpreet, man, you're a good-looking guy, and you looked great before, but damn, dude, you look good now. Harpreet: [01:31:06] Thank you very much, I appreciate that. I was hiding behind the beard far too long.
It's just a little bit more evenly distributed now than it was when I first started doing it. Yeah — go for it. Speaker3: [01:31:25] So, on the original question about data warehouses: I suddenly felt old in this conversation, because the warehouses that I remember were where we were pulling data from UNIX, passing SAS commands, and using command prompts and whatnot, in those days. But, you know, I wouldn't say don't study them. I would say that the history of each industry is different. [01:32:00] The maturity of the warehouses across industries is different. And if that's where you want to go, then find the industry that you either work in or care about, and go backwards, because that is going to vary a lot based on how mature that industry is and how they've built their warehouses, or not, over decades. For me — retail, CPG, the Nielsens, the financial services — they have 50-plus-year-old warehouses, and their history is a lot more advanced in how they've evolved over that time. So, you know, you can go back there and see which are the more mature industries, where you can gather more, and see how they actually started and moved all the way to building algos, professionalizing them, using those models, getting into analytics, and so on and so forth. So there are places you can go, but I would cherry-pick based on what your interest is. Because you can get lost pretty quickly, too. And honestly, I agree with Mikiko: I don't know what you're going to do with it, but if that's where you want to go, then cherry-pick where you want to go. Harpreet: [01:33:36] Yeah, that's super helpful. I hadn't realized — but yeah, the financial industry is a really good use case, so I can totally do that. I'm kind of inspired to write a blog on this, so look out soon.
Maybe I'll be able to pitch it to someone who wants to have me write it. So, make it happen. Great question — great discussions with different people. Go for it. Speaker3: [01:33:59] Yeah, I think [01:34:00] what it comes down to is understanding the problems and the solutions, and how they took the tech form they did. Because when I think about it, you almost don't even need to know the time points or the epochs, right — you don't necessarily need the chronological history. So, for example, the thing that drove me nuts: when I was first getting into the more engineering practices, people were like, oh, you should use Docker. And I'm like, what's a Docker? And they're like, it's a container. I'm like, what the hell is a container? It's like, oh, well, a container is blah blah blah, and a VM, blah blah blah. I'm like, stop — what is a VM? Right? And they were answering the question — they were like, oh, it solves this problem. But if I'm new to the problem or the practice, please don't explain it to me as if I'm not smart enough to Google it and have seen that same explanation across 50 or 60 different tech blogs. Explain it to me as a new user, with fresh, brand-spanking-new eyes: why should I be using it? Why should I be using a container versus a virtual environment? What are they? What problems are they meant to solve? And I feel like sometimes there's that laddering: first came physical computers, then came VMs, then came virtual environments — which are basically just folders; anyway, sometimes there's a runtime, but they're basically folders, right — and then came containers. And it's like, I wasn't there for the prior two or three evolutions.
Speaker3: [01:35:43] So in my head, they're solving problems that I didn't even know existed, because I'm so new, right? So I do feel like there is a certain — I love that quote I posted from Designing Data-Intensive Applications, because it is kind of true, right? Wherever [01:36:00] you enter into your engineering or computing or tech journey, a lot of times there isn't really good, high-quality content or educational resources for that entry point. A lot of it just assumes you've already been there. So that's the way I think about it: if I can articulate the problem and the solution, and the chain of problems and solutions, then that's a good spot to be in. I think that's a good kind of history. Some of the other stuff you can take or leave — but it's also an opportunity if you want to create that content as well. I thought it was a great opportunity for me to learn more about containers and VMs when I was writing about them, because I was like, I wish someone had explained it to me this way, right? Not the war stories — not as if I were an enterprise company that had been in the trenches for 20 years — but explain it to me as if I'm new, I'm fresh, you know? That would be nice. Harpreet: [01:37:04] I think this is really, really helpful, because for me the reasoning goes beyond just learning because I'm interested: I actually want to apply that knowledge from the past to the future. And it goes beyond just the history of technology — what's the social and market history too? And a great example my mind keeps going back to is not necessarily history yet; it's happening now.
But why did it all of a sudden become super popular right now, even though it's been around before? Why did it become popular? I think I read some blog where, essentially, storage became super cheap, and data warehouses and cloud computing became very fast — like BigQuery and stuff like that. And so it's just much easier, cheaper, and faster to do transformations in data warehouses, and that's why people shifted towards that. And so I would love to know, what was that thought process like 15 [01:38:00] years ago? Why did people choose Hadoop, right? What was the problem or use case with big data — why did they choose Hadoop in that situation? And then finally, why did they leave it, you know, in such a quick time? Being able to understand those problem statements and the solutions they came up with, given that historical context — I feel like, as was said earlier, that gives me clues for when I face future problems, or want to combine things in unique ways, so I don't repeat things that just didn't work. Harpreet: [01:38:31] And then, too, I can actually identify what's actually novel. That's the direction I'm coming from, so hopefully that provides more context. But you all had really excellent answers that crystallized the words around that: the old, antiquated, outdated technologies versus something modern, something scalable, something that just works a lot better. Anyway — Tom says the name Kubernetes stems from an ancient Greek word for helmsman, someone who steers a ship, like a container ship. Yes. And the package manager in Kubernetes is called Helm. All right, cool. Does not look like there are any other questions. Thank you so much for that great discussion. So keep an eye out for the podcast, released probably sometime this weekend.
My editor ran into some issues, but we'll have that episode with Danny Ma. Livestream tomorrow with Mikiko and Mark and Zach — it will be at 1 p.m. Central Time. [01:40:00] So join in on that. This is an initiative we've undertaken, talking about mental health and data science. It's a great discussion, so definitely tune into that. Anything to say about this event? Speaker3: [01:40:20] You know, I think so. The group of us tomorrow, right — you, me, Mark and Zach — it's interesting, because I feel like at the same time all of us were individually having conversations with each other about mental health. I know we've all had our specific journeys, and I know we've also been there in each other's lives when we were going through burnouts. But it's fascinating, because a lot of times I do hear people saying — they'll say stuff on LinkedIn like, well, why does it matter? It doesn't impact your professional life, yadda, yadda, yadda. Right? But it is really important. And I think we all know people in our lives who would have benefited tremendously from not having that stigma — from having an open, judgment-free place to talk about their experiences. But more importantly, for all of us, our journeys manifest differently. My journey as a queer Asian woman is going to look different from your journey, from Mark's journey, from Zach's journey, and vice versa. And I think it's going to be fantastic — all four of us under the same roof. And I'm really hoping, if this goes well, that maybe we'll be able to do more sessions. And the other part that I think is really fantastic: a lot of times, when people talk about mental health, it's always in the context of
people who — I don't want to say obviously need help, but who fit a certain stereotype. [01:42:00] And so the other part that I really love about this group of us is that we're all achievers in different ways. We've all shown that, look, you can move forward, but you do have to take control of your mental health journey, and there are voices and experiences out there. So I think it's going to be a wonderful event. If this goes well, I'm hoping we have more of these talks in the future, because I know there are other people — first, second, third degree connections — who also have different experiences. So yeah, I hope people tune in, that they enjoy it, and that we'll have future sessions with different people, talking about different experiences. So yeah, I'm excited. Harpreet: [01:42:45] Thank you so much for driving the initiative for this great conversation. I'm excited for what's happening tomorrow. Cool, we'll wrap it up. Thanks so much for joining us today, thanks so much for being here — appreciate you. I'm looking for a guest host next week; if anybody is available, please do let me know. I've got a cool thing happening next week: I'm going to this place in British Columbia called Osoyoos, which is in the southern part of Canada's desert. We're going there for my son's second birthday — the kid's turning two years old, and here I am, turning 39 in a few weeks. That's crazy. But yeah, it's going to be fun. My parents, my wife's parents — the grandparents — all in one giant cabin, four bedrooms, four baths; that's needed. It's right there on the lake, so I'm excited to just chill and relax. So hopefully I can find somebody to help out next week — let me know. Guys, take care, have a good weekend, and I look forward to seeing you all tomorrow.