Harpreet: [00:00:08] What's up, everybody? Welcome, welcome to The Artists of Data Science Happy Hour. It is Friday, November 12th. Wherever you are, I hope you guys are doing well. Staying warm here in Winnipeg: we got literally three feet of snow outside. It is madness; I had the craziest blizzard I've seen in like three years over the last couple of days. It's officially the holiday season, I guess; that's kind of what rings in the holiday season for me personally. I don't mind the snow as long as I don't have to go outside or do anything in it. I'm sitting by the window having a nice beverage. But hopefully it's nice and warm where you guys are. Also, I hope you got a chance to tune into the episode I released today with a good friend of mine, the one and only George Firican. We talked about how to turn the lights on your data, so we had a great conversation all about data governance and a bunch of cool stuff from his course. Hopefully you got a chance to tune in. Just what's coming up over the next couple of weeks: I've got Steve Cardinal, on turning ideas into gold; then Kourosh Alizadeh, and we're going to talk about NLP and philosophy; he's got the Philosophy Data Project, so that's pretty cool. Hopefully you all have been enjoying the Medium publication pieces that have been dropping all week. Definitely let me know what you think of those written pieces: if you enjoy them, let me know; if you don't enjoy them, let me know. Shout out to everybody in the room: what's up, Vin, Serge, Eric, Mark, Matt, Jennifer. It's been so long, Jennifer; I haven't seen you in a very long time. And Russell as well. If you guys have questions on anything whatsoever, please go ahead and drop them right there in the chat; I will be happy to take your questions. Also, [00:02:00] big thanks to Avery for taking over the emcee duties for me last week. I really appreciate that, Avery, thank you very much; hopefully you guys enjoyed hanging out with Avery. So yeah, let's get it going. I released a piece today; well, it wasn't called "The Data Science Mindset," it's "You Might Know Data Science, But You Don't Think Like a Data Scientist." The whole piece was me laying out the data science mindset, which I said is a three-piece thing: engineer, business person, and scientist. So I'm wondering: what does it mean to you to think like a data scientist? Let's kick it off with Serge, and then go to Vin as well. Also, big congratulations to Mark; I've definitely got to shout that out. Mark got promoted to senior data scientist, so huge round of applause for our friend Mark; that's amazing. We'll get Mark to share some pointers on what it took to level up like that. I'd love to get into that.
Let's start the conversation first with this opening question: what does it mean to think like a data scientist? Then we'll get into the question here from Matt Diamond about sharing Jupyter notebooks with non-technical users, and then we'll move into some questions for Mark. Let's kick it off, Serge. Go for it.

Speaker3: [00:03:34] Hi. I think there is no data science without science. I think that's probably, to me, the key differentiator between being a data analyst and a data scientist. I mean, data analysts could have engineering skills and know a lot about the business, but a data scientist must be a skeptic. In other words, not trust the data, [00:04:00] not even trust the intentions behind the data or the data generation process. I think interpretation, experimentation, and statistics are all very important in disentangling all the different things that come together to make the data, and what the data represents in the real world. So although it's very valid to see it as a Venn diagram where you have all these different things that are needed, I think that's the differentiator in data science.

Harpreet: [00:04:36] Walk us through that real quick, just at a high level: what does it mean to run an experiment in machine learning? Does that simply mean, OK, I've got a problem statement about data, I've got a suite of candidate algorithms that I wish to test on this data, I want to put forth the hypothesis that I can actually model this data generating process, and here are my different models, and I compare whether I get a good MSE or not? You know what I mean? What does that look like in data science?

Speaker3: [00:05:06] Well, it's even before it gets to the modeling process. You can do it through modeling; sometimes there are things you don't understand until you model them. And sometimes, through unsupervised methods, you can learn a lot of things as well. In my work with weather data, I sometimes find through clusters: OK, whoever put this data point here, it doesn't belong here. It must be wrong, because it's just so far off everything else; there's no way the crop would have been planted here, things like that. I tend to find those that way. But if I were just trusting the data and running it as-is, I would end up with a certain percentage that would be poor quality simply because of that. So I have to put the science first and [00:06:00] try to run some scenarios, and sometimes I don't even know what those hypotheses are until I've explored the data enough. Or, if I didn't explore it enough, I go through the modeling exercise, I realize I have poor results, and then I have to trace it back: what misclassified, and why?
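A minimal sketch, in Python, of the kind of unsupervised sanity check Serge describes: cluster the records, then flag points that sit far from every cluster as candidates for bad data. The file name, column names, and threshold below are hypothetical, purely for illustration:

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical weather/crop observations; columns are invented.
df = pd.read_csv("field_observations.csv")
X = StandardScaler().fit_transform(df[["latitude", "longitude", "yield_kg"]])

km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X)

# Distance from each point to its assigned cluster center.
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

# Points far beyond the typical distance deserve a manual look,
# e.g. a crop "planted" somewhere no crop could have been planted.
suspect = df[dist > dist.mean() + 3 * dist.std()]
print(suspect)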
Harpreet: [00:06:21] Serge, thank you very much. Hopefully you guys took a lot from that; that's one you might want to run back on repeat. Serge, thank you. Actually, before I get to Vin, let's go to Vivienne on this. Vivienne, I don't know if you heard the opening question, but I will repeat it for you: what does it mean to think like a data scientist? This is coming on the heels of a blog post I published today, which is "You Might Know Data Science, But You Don't Think Like a Data Scientist." So I'm just curious, what does it mean to you to think like a data scientist?

Speaker4: [00:06:48] Oh, I wish that I had read that blog post already. I got the notification on my phone to read it, but I haven't read it yet. Can I not go next? I think I need to think about this, actually.

Harpreet: [00:07:02] Yeah, no worries. Let's go to Vin. And then after Vin, maybe Eric, and then we'll come back to Vivienne, then to Mark, because I like this twist that Eric put on the question: what does it mean to think like a senior data scientist? I think that is what I want to ask Mark. So let's go: Vin, Eric, Vivienne, then Mark.

Speaker5: [00:07:27] Thinking like a data scientist means a lot of different stuff now, because every domain has a different flavor of data science. It feels like we all do a lot of the same stuff, we all use some of the same core tools, but thinking like a data scientist now means understanding your domain. And I think we forget, in a lot of cases, that the domain knowledge of the people around you, the subject matter experts, is what helps you think like a data scientist who can build something for that part of the business [00:08:00] or for that particular use case. So I think Serge just killed it; I've got literally nothing else to add from a traditional data scientist standpoint. But the one thing we often overlook is thinking like a domain expert, thinking like a customer or a business owner or some stakeholder or senior leader whose job is on the line. There are so many different pairs of shoes you can put yourself in to understand the little things they're not telling you about the project, to figure out what questions you could ask, or how to use all the tools you have to build something that meets a need, and oftentimes meets multiple needs. That's, I think, thinking like a data scientist now, especially as we start supporting different parts of the business and more complex use cases and we're starting to touch customers. It means thinking not just like ourselves; it means walking outside of the machine learning team.

Harpreet: [00:09:00] Vin, thank you very much. Let's go to Eric, and after Eric, Vivienne, then Mark.

Speaker4: [00:09:09] I think my answer is probably going to be supremely unsatisfying. It's unsatisfying to me, too. As I try to think about it, I'm like, OK, what do I do that's different from, say, the people around me? And one of the things I thought of was: oh, we just ask an absurd number of questions. And I'm like, well, that's good.
And then I think: well, actually, I have a GM who asks me all the questions I haven't thought of yet. It drives me nuts, because I didn't think of that question. So maybe it's not just questions. Or maybe it's the tools that I use, or something. But thinking like a data scientist doesn't mean just using a tool that other people don't use. And as I try to compare myself to people around me who don't do what I do, to figure out what the difference is, the main thing I keep coming back to is: we [00:10:00] all try to think similarly. We all try to think about problems and break them down into little pieces, and how it's going to affect the customers or the bottom line or whatever. And because every organization is so different, I think that thinking like a data scientist means all of those things, but knowing how to apply them with the tools that you have that the person on the product team doesn't have, just like they know how to apply their tools in a way that I don't know how to do and am not going to learn how to do. So I think it's really organization-specific.

Harpreet: [00:10:36] Eric, thanks so much. Let's go to Vivienne.

Speaker4: [00:10:42] Something I was thinking about that is sort of unique about my role, compared to the other people around me, is that I'm sort of a connector between other roles, like the product manager and the data engineer. I'm the person that can connect all these people together. The product manager really wants to use data but doesn't have the know-how to dig into it. The data engineer has really specific expertise in how to get the data pipeline up and going. And I feel like I'm the connector piece: how to actually get this data and then make it meaningful for a product. So that's what I thought of while everybody was talking. I agree with everybody else, but that was the piece I wanted to add.

Harpreet: [00:11:47] Thank you, Vivienne. And Vivienne, can you talk to us about the metaverse at some point in the conversation? Because I would like to know what Facebook has planned for that.

Speaker4: [00:11:56] Uh, I don't know what Facebook has planned for that, [00:12:00] I feel like.

Harpreet: [00:12:03] Let's go to Mark. Mark, let's hear what it means to think like a senior data scientist.

Speaker4: [00:12:16] Yeah, thanks, everyone, for the shout out; I really appreciate that. Well, first of all, I'll answer with the main thing that came to mind about thinking like a data scientist, something that hasn't been mentioned yet: the role of trust as your main social currency in an organization. In one of my first data science roles, I presented completely wrong numbers, and I lost the trust of a lot of stakeholders; it took me months to build that back up. Everyone wants to be trusted as a data professional. So how can you create processes, define your assumptions, and test them out?
So even if you are wrong, it's: well, I documented this process, this is how we got to that number, and this is how we're going to fix it, right? Because you won't always get perfect answers; we work with a lot of ambiguity. That sting really stuck with me. It's like: all right, how do I create processes so that I trust my results and can share that trust with others? Once you get that down, the next stage is: how do I think like a senior data scientist? And I think the main thing there is that you move away from project focus to business focus as a whole. You start thinking: how does my work fit within the entire org and drive the needle forward? You can only do so much as a data scientist, so what you choose to work on determines how impactful you can be. Another piece that took me a while to pick up was: what's the right time to engage on something? Especially in a startup, there are so many things being reprioritized, so learning how to pick the correct thing to engage on was really [00:14:00] critical. And then, from there, it's one thing to say, "I have this idea and this is the right time to do it," but you have to be able to execute on it. A non-senior data scientist may bring the idea to their manager, and the manager may pick up all the pieces and bring them together: there you go, here's the project. As a senior data scientist, I'm the one going out and building those relationships; I'm selling the idea and getting buy-in, and I'm looping in my manager so they know the progress and can step in if, say, there's a change in the business, right? So: combining the ability to identify where the key business things are moving, to judge the right timing, and then to execute and get the people around you to deliver on it; not self-motivated exactly, but pushing it forward and driving it yourself. You definitely collaborate with a lot of people, but you take the lead on the project. That's the key differentiator I've noticed between my first data science role and my work now. And a big shout out to Vin; I mention this all the time, but his workshop was really good, because figuring out what to work on is what I learned through it.

Harpreet: [00:15:22] Yeah, that's a great workshop. I was at that same session with you, Mark. Vin, whenever you're doing that next time, please let us know; I'll be sure to link out to it. It is well worth your time if you're trying to get to that next level in your career. Mark, I want to talk real quick about the incident you mentioned, where you lost trust by presenting the wrong numbers. What does that process look like? First of all, presenting the wrong numbers probably feels shitty. How did you personally respond to that? How did you take it? And then, what was the process of regaining your colleagues' trust? [00:16:00]

Speaker4: [00:16:02] Oh, man, that was a rough time. That definitely sucked. To give context, I was working with ophthalmology data.
So, eyes. Essentially, you're trying to get user counts of people who present a disease. But a weird quirk of ophthalmology data is: is it the left eye or the right eye? That completely messes up your counts, because it can be one eye, both eyes, or the other eye. I didn't account for that, because the domain was new to me, and all my numbers were wrong. And it was for a major client, and the salesperson presented it for a large contract. Yeah, you live and learn. But essentially, the way I built trust back up was: I got really rigorous about my analysis plans. I had been trying to go straight into coding, moving as fast as possible. Instead, I took a step back and really thought through how I was going to solve the problem, and then went to the stakeholder: here's my thought process; at this point here, here, and here, I see your domain expertise being really helpful; do my assumptions hold up here? Documenting that, and getting buy-in from the stakeholder and the requester, instantly builds in a lot of trust. So I kept repeating that process. And it went from "oh yeah, wow, Mark totally messed up that eye-count thing" to slowly shifting toward "oh wow, Mark's really thorough, and Mark really brought me along this journey to see how he got those numbers for us." That was a long process. But here's the thing: I didn't win everyone back. Some people were like, "oh yeah, Mark's great"; others were stuck with that impression for a while.

Harpreet: [00:17:54] Mm hmm. So, thinking back on it now: if there's one bit [00:18:00] of advice you could give the Mark of the past, before he looked at that data, to do it more carefully, what would it be?

Speaker4: [00:18:11] The main thing would be: slow down. Slow down. I was so hungry; I needed to finish things quickly, because that was my mindset when I first started. Before data science I was in operations, so it was a matter of how quickly you can get things done, how many boxes and tickets you can complete. Data science is a completely different mindset: it's about the quality and value you can bring. Ten crappy things do nothing compared to one amazing, value-changing thing, right? I brought that old mindset into the job, and it was the wrong approach. Being able to slow down and really think through, not just what I'm coding but the assumptions behind it, saved a lot of time in the long run. And I have that analysis document as an asset that can kind of save my ass; though really, I don't need it to save my ass, because I thought things through so thoroughly, and the stakeholder feels comfortable because I put in so much effort up front.

Harpreet: [00:19:20] Mark, thank you very much. Let's move on to the next question.
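The eye-count bug Mark describes is, at bottom, a grain mismatch: the table holds one row per eye, while the deliverable needs one count per patient. A minimal pandas sketch of the fix, with invented column names and values:

import pandas as pd

# Hypothetical ophthalmology extract: one row per affected eye, not per patient.
records = pd.DataFrame({
    "patient_id": [101, 101, 102, 103, 103],
    "eye":        ["L", "R", "L", "L", "R"],
    "diagnosis":  ["AMD"] * 5,
})

naive_count   = len(records)                     # 5: counts eyes, inflating the figure
patient_count = records["patient_id"].nunique()  # 3: counts people, what the client asked for

print(naive_count, patient_count)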
Just a couple of comments here from Russell. Russell says: rule number one, disconnect ego, opinion, and emotion from all considerations in data science. Then he followed up with: strongly held assumptions and gut feelings? How dare you. Russell also says it can be mortifying, or more terrifyingly painful, to present inaccurate data, but owning responsibility for the mistake and the corrective action, plus making sure all stakeholders know when the correction is implemented, can go a long way toward regaining any lost trust. Russell, thank you so much for sharing that. Also, shout out to Santona, who is [00:20:00] in the building. Dr. Tuli, thank you so much for coming back; good to see you again. Also shout out to Christine and Ellen; good to see you guys here. If you have any questions, please let us know: drop them right there in the chat or send me a private message on LinkedIn. If you're watching on LinkedIn and have questions, let us know as well; happy to take them.

Harpreet: [00:20:23] Let's go to Matt Diamond's question.

Speaker4: [00:20:30] Hey, guys. Thanks, Harp. The question I have is how to get people who have used Excel for decades, with a lot of vested interest in the tool (it's a familiar interface, they're comfortable with it), to move on when it just doesn't do the job that I need it to do, and that my organization needs it to do. There's such a lack of familiarity with the ins and outs of data science that I'm stuck in a hard place. Taking the horse to water, proverbially, is one thing; getting him to drink has been a challenge. I imagine someone here has faced something like this. I'd just love to learn how to get people to understand what Jupyter can do, and the kinds of data work I can pull off, without overly intimidating people with writing code in Jupyter notebooks. Anything along those lines would be super helpful.

Harpreet: [00:21:27] Yeah, definitely. I want to start off with Vin for this question; after Vin, we'll go to Mark. But just one tool, right off the bat, that might be worth looking into is naas, notebooks as a service. It is Jeremy Ravenel's project, and it's meant as a way to make it really easy to communicate what you're trying to communicate to non-technical people. So that's a tool worth checking out. Jeremy, thanks for creating that awesome tool. Also, I know Jeremy just [00:22:00] had a baby a couple of weeks ago, so congrats to Jeremy. Let's go to Vin first, then we'll go to Mark. And if anybody else wants to jump in here, Serge, Vivienne, Eric, Jennifer, anyone, just let me know: raise the hand on the reaction thing and I'll add you.

Speaker5: [00:22:21] Excel is kind of like a cult, so you're in a really, really tough spot. The normal generic advice that I give doesn't work for some stuff; Excel is one of those niche areas.
It's like trying to get somebody that codes in C++ to acknowledge other languages exist, because there truly is this cultish devotion to it. So that's really what you have to do: somehow get people out of that cult mindset. It's not so much that they're scared of other tools; it's that they love Excel so much that there's this irrationality around it. It almost doesn't matter how cool the alternative you show them is; they really have to hit some sort of crazy, huge pain to break out of that groupthink.

Speaker4: [00:23:10] Yeah, I mean, the pain is survival, I'd argue. The industry is not going to be competitive; I mean, our position in the industry will not be competitive, because there are other firms using these tools and getting better results from them. But there seems to be, I don't know if it's as much fear as a clinging to it. I'm just trying to show that there is an alternative out there; but there hasn't been an infrastructure put in place, and there hasn't been a user-friendly explanation of what it can do without delving into the technical details. Which is fine; I'm happy to be that ambassador. But getting there is more about behavioral science than anything, I would imagine. [00:24:00] I'm completely empathetic to that, but I'm sure there are things I don't know, so I'd love to hear what other people have done. This is super helpful, Vin.

Speaker5: [00:24:10] Yeah, it's basic deprogramming. And the first piece of deprogramming is that you can't call out the person who's stuck in the cult for being wrong or stupid or crazy or anything like that. You almost have to just be there and go: yeah, yeah, no, Excel is amazing, but, you know... And you get enough of those "you knows," and you open up enough opportunities for conversation, and someone is finally going to walk through that door. The problem is, you want it to happen fast, and it doesn't. The way that feels like it will get fast results is to show how wrong and how lacking and how bad Excel is, and you can't really do that. They have to come to their own pain; they have to see it. All you can do is provide evidence and facts that support using Jupyter. Or, if you can find some intermediary, some gateway tool that'll get you there, that's another good technique. Like I said, this really is Excel's cult, and you have to deprogram people. Step one is just starting conversations where they can ask some questions and not feel like Excel is terrible, but feel like maybe there's something that can get bolted on top of Excel; and then you rapidly wean them off of whatever is going to continually pull them back. And that's the other problem with Excel: it gets you so comfortable that if you ever get out of your comfort zone in a notebook, within 15 seconds you're back in Excel. No matter that a task takes hours there and 10 minutes of Googling would figure it out in the notebook, they'll go right back.
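One concrete "gateway" move in the spirit of Vin's advice: hand Excel-first colleagues the notebook's results without asking them to touch code. A minimal sketch using nbconvert's Python API to render a notebook as a code-free HTML report; the notebook name is hypothetical:

import nbformat
from nbconvert import HTMLExporter

# Load an existing analysis notebook (hypothetical file name).
nb = nbformat.read("analysis.ipynb", as_version=4)

exporter = HTMLExporter()
exporter.exclude_input = True  # hide the code cells, keep their tables and charts

body, _resources = exporter.from_notebook_node(nb)

with open("analysis_report.html", "w", encoding="utf-8") as f:
    f.write(body)

Recent nbconvert versions expose roughly the same thing on the command line as "jupyter nbconvert --to html --no-input analysis.ipynb"; the recipient just opens the HTML file in a browser, no Jupyter required.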
Speaker4: [00:25:47] Yeah. I mean, luckily, I'm in a position where, if my team just starts running with Jupyter notebooks and using data science day in and day out, we're absolutely crushing other [00:26:00] teams and other organizations. As callous as this sounds, I'm willing to let those who want to cling to Excel cling to Excel, and stick with their convictions. I want to be part of the solution and not part of the problem in my organization. And thankfully, I've found like-minded people who are willing to support the effort. I just know that I'm going to face a ton of resistance along the way, but anything that helps with that is super helpful. So thank you, and thanks for voicing this.

Harpreet: [00:26:36] If only Dave Langer were here to hear all this; I wonder what Dave would say. Dave, if you're listening, this is your opportunity to come in, hit me up on LinkedIn, and speak up for Excel. Let's go to Mark after this; then from Mark over to Russell and Serge. By the way, can we just acknowledge how cool that silver streak in Vin's hair is right there? That is amazing. I wish I had that going on; instead, I'm just going bald. Let's go to Mark, then Russell, and then Serge.

Speaker4: [00:27:06] So, I have not implemented this; take it with a grain of salt. I'm pulling together different contextual things to brainstorm with you, so I just want to caveat with that. But, as I talk about all the time, there's the STARS framework; I posted a link to an HBR article that goes into it. STARS stands for startup, turnaround, accelerated growth, realignment, and sustaining success. And specifically, you're in realignment, based on what I'm hearing: you've had some success, things are growing, people are set in their ways; there's not a problem today, but there will definitely be a big problem in the future if you stay on this path. So you need to realign the ship, right? And that requires a different set of strategies and influence than if the ship were going down in flames, [00:28:00] where people would say: oh yeah, whatever, anything is a great idea, just pick something now. Things are working now, so you need to really figure out a way to influence people to make the change happen. For me, the way I think about it is entrepreneurship within an organization, innovating within your organization: I view myself as a startup of one, maybe my team as a startup within the company, and all my colleagues are my market, and I'm trying to capture this market and sell to them.

Speaker4: [00:28:31] So my mindset in that situation would be: OK, who are the champions, the key people I need to influence and get on board, so that they influence everyone else? Instead of me trying to convince everyone, I want to identify who has the most power to make a decision and convince people: actually, this is our new way of working; we're using Jupyter notebooks for, like, 30 percent of our projects, something like that, right?
And so my effort would be focused on: who are those key people to sell to? And then, thinking about selling: you want to focus on the problem being solved for them, and how your proposed solution fills that need perfectly for them. It's less about "don't you obviously see this is way better than Excel?" You understand their pain points, really deeply. They really love Excel; but when talking to them, ask: what doesn't work for you in Excel? Sometimes it's something that takes really long that you can accelerate, right? For example, off the top of my head (I know you work in finance, and I don't know any finance things), say it's pricing data in Excel. We could find a web scraping tool [00:30:00] and write a Python script to automate a pricing check four times a month, as opposed to getting some Excel monkey to manually check a web page, manually put the pricing data into Excel, and keep that file stored locally. That's the kind of thing I'm getting at. And I would argue: think even simpler at first, because you're already at the promised land, right? You already see it, but other people don't. So for me, it's something as simple as text data. Say you have organization names in this whole Excel file, and it's so painful to manually clean them. You can say: hey, you're totally used to Excel; fixing that column takes you, like, three hours. What if I showed you a way to do it in one simple notebook that takes, like, two minutes? It may take longer to code it, but to run it? A few minutes. And then: I can also show you how to download it as a CSV so you can put it back into Excel, right? You solve that one pain point and you get them hooked, and from there you can build. Again, just brainstorming, but that's the framework I think about for navigating and getting something adopted. I feel like Greg probably has great insights, too. Yeah, Greg, I see your chat; guys, I'll respond to you in the chat. This is super helpful, guys. Thank you.
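A minimal sketch of the "two-minute notebook" Mark describes: normalize a messy organization-name column with pandas, then hand the result back as a CSV that opens straight in Excel. The file, column name, and name variants are all invented for illustration:

import pandas as pd

# Hypothetical workbook with a messy free-text "organization" column.
df = pd.read_excel("accounts.xlsx")

df["organization"] = (
    df["organization"]
      .astype("string")
      .str.strip()                            # drop stray whitespace
      .str.replace(r"\s+", " ", regex=True)   # collapse doubled spaces
      .str.title()                            # consistent casing
      .replace({"Ibm Corp": "IBM", "I.B.M.": "IBM"})  # known variants; real maps grow over time
)

# Round-trip back to an Excel-friendly format, as Mark suggests.
df.to_csv("accounts_clean.csv", index=False)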
Harpreet: [00:31:27] OK, thanks very much. Let's go to Russell and then Serge. By the way, if anybody else has questions, please let me know, either in the comment section on LinkedIn or YouTube, or here in the chat. Also, shout out to Christine and Ellen; I see you guys here. Let me know if you've got a question. Russell, go ahead.

Speaker6: [00:31:47] So, Excel. Yeah, I think one of the main things to say about Excel is that it's been the main tool on the market for some [00:32:00] 30-plus years, so it's almost ubiquitous throughout most organizations in general, not just data science. Everybody has had it, normally bundled on their laptop; it's been there, so everyone has had to play with it, and it's really ingrained as a first line of defense for anything. I think that's why it's almost a comfort blanket for people. The way I approach trying to wean people away from Excel is to point out its weaknesses, because it does have some strengths; and again, as you said, Harp, it would be nice if Dave Langer were here to really go on about those. The weaknesses, as I see them, are the stability and fidelity of data in it, and the propensity for breaking, corruption, and fragmenting. Recent enterprise Office 365 has helped: you can have a single version, say on SharePoint, and share it, rather than having a file that's copied, saved-as, and emailed around a hundred times, so you end up with a hundred different versions of the same file and a version history that's very difficult to work out. But even if we assume most of the people you're talking to have that enterprise level, so they can share a single file, the weaknesses remain: even if you protect the file, it's still open to having supposedly protected cells inadvertently overwritten, because Excel is not very good at keeping protected cells protected. I don't know if you guys are experts on Excel, but if you have a protected column and you do an array paste that spans either side of that column, it will actually paste over a protected cell.

Speaker6: [00:33:47] So it's really irritating in that way. Point out its weaknesses, then really upsell the complementary strengths of whatever you want to propose as a solution. [00:34:00] And again, Dave Langer talks about Power Query being able to do some of this and push results back out to Excel; Python can do the same thing. I also do a lot of work with Power BI, which is kind of a successor to Excel, at least on the Power Pivot and Power Query side, plus other stuff. So I try to wean people off Excel onto the next best solution that still has a tangible connection to it, and move them over time; don't just cold-turkey them, don't say, "right, Excel is rubbish, you need to do this now." Let them find the better solutions, and coach them away from it rather than ordering them away from it. And even then, as Vin was saying, it's a comfort blanket: as soon as something doesn't work the way you want, you go straight back to Excel. One of the most common questions I get from people who aren't really comfortable with non-Excel solutions is: "Oh yeah, that's amazing, how did you do that? Can I export it to Excel and have a play with it?" Excel is just such a hard habit to break. I don't think you'll do it in days or weeks or even months; it's going to take a long, long time. Just keep pushing those iterative improvements, and eventually you can move people away. But sell the new thing as a supplement to Excel rather than a replacement for it.

Harpreet: [00:35:30] There is a certain satisfaction you get from actually manipulating the data yourself; it's one thing to look at it in a static data frame where you can't really get down and dirty with it. So I can see why some people are reluctant to give it up.

Speaker6: [00:35:45] Well, to play on that, the cynic in me, Harp, says most people want the data exported to Excel so that if they don't like a number, or disagree with a number, they can just overwrite the number. I mean, that's exactly what you don't want in data.
[00:36:00] That's the whole reason to move away, but it's also the reason people want it: "I can't tell my manager that; no, let me just change it." That goes on more than anyone would care to admit, and it's the biggest reason to move away from Excel.

Harpreet: [00:36:14] Indeed, Russell. So let's go to Serge next, and then Kosta and Eric, since you guys wanted to talk on this topic; we'll go to Kosta and Eric next, and after that we'll jump into Greg's question. You are muted, Serge.

Speaker3: [00:36:32] I keep doing that. Russell just hit the nail on the head; I was going to talk about that a bit more, but he beat me to it. I think the gateway drug to something better than Excel is definitely BI tools. As soon as they see what they can do with BI tools like Power BI or Tableau, they won't want to use Excel. Plus, it will let them use even more data than they can now, if they haven't already hit that limitation. Better plots, nicer looking too, and a cleaner user interface, although they might be a bit lost without that sheet-first approach. You could also point them to Alteryx or something like that; they might like the pipelines, and thinking about data that way is also a healthier way to look at it than as sheets, which I think is very limiting. To me, a spreadsheet is like presenting a three- or four-dimensional world in two dimensions, you know?

Speaker4: [00:37:51] Yeah, it's like Flatland.

Speaker3: [00:37:53] Yeah, exactly. And I think people that live in Flatland [00:38:00] probably don't know they live in Flatland, right? So that's my take on it.

Harpreet: [00:38:08] All right, let's go to Kosta and then Eric; after that, we'll jump into Greg's question. By the way, if you're in the room and have a question, please let me know; or if you want to contribute to a certain part of the discussion, it's not too late to raise your hand. If you're listening on LinkedIn and have a question, let me know. And if you're listening on LinkedIn and haven't already smashed the like button, you should probably do that as well.

Speaker3: [00:38:36] I just want to turn this question a little bit on its head, partially to play devil's advocate, right? Are you trying to solve a problem that isn't your problem to solve? Think about it: there are people who've been working with data in Excel spreadsheets for 20, 30 years, right? They've gone through their careers and seen all the benefits they need to see in their day-to-day lives. Is your job there to transform the whole business into using Python? I mean, is that a problem you necessarily need to solve right now?

Speaker4: [00:39:09] It's akin to asking: is it my responsibility to take the frog out of the boiling water before it realizes it's boiling? For my team, for my specific department, I'd argue yes. For the rest of my organization, absolutely not.

Speaker3: Right. But my point is: do you think that just by doing it within your team and showing the results over time, and this might potentially be over a few years, right,
other parts of the organization would come along by seeing those wins?

Speaker4: Potentially. But that's the rest of the organization's prerogative. I don't think I'll be here in three years, I can freely admit that, but I do want to showcase the benefit of what can be done outside of Excel, the benefits [00:40:00] that everybody here sees day in and day out, because it really is an existential crisis: the industry needs to move, it just doesn't know how. And I like to think that I can propose how we move in a way that benefits the firm and its clients, because other organizations are doing this; our competitors are doing this. We just don't happen to see exactly what they're doing, and I'm in a position where I can see that. I just want my team to be part of the solution and not part of the problem. Whether the organization decides to adopt our practices is up to it. But I don't want my own trajectory and my standing to lose out because of behavioral inertia and a close fidelity to Excel; that's just ridiculous, in my view. So that's the long-winded answer to the question.

Speaker3: Yeah, I mean, to my mind, in your case it sounds like that is kind of your job: to make sure that pushes through. I've fallen into this trap before, where I got almost up on a high horse, saying, "hey, this is the better way to do something," and I wasted a lot of time trying to convince people of the better way and not enough time focusing on the job at hand. That's the only reason I asked the question.

Speaker4: No, no, you're exactly right. There's a principal-agent problem that could develop here. What I'm trying to do is eliminate that possibility; I'm fixing what's been fucked up for a long time, trying to anticipate what the problem could be and eliminate it before it becomes one.

Harpreet: [00:41:48] Thank you very much; great comments, Kosta. Thank you, Matt. Let's go to Eric on this topic; and Serge, you had your hand up as well. Or was that by accident? [00:42:00] By accident, yeah. Cool. Let's go to Eric on this topic, and after Eric we'll go to the next question.

Speaker4: [00:42:10] So, I have a little to add to what everybody else has said. In a previous job, pre-data stuff, I was, I guess you could call it, a process improvement leader, and my job was to pull together cross-functional teams from different parts of the company to fix problems that needed to be fixed, in an intense, intensive blitz (to use a lean or Six Sigma term). I did not know anything about the problems, and I didn't even know what problems existed to start with; and a big part of my job, even though I had some tools that could be used to fix things,
was spent sitting down with people who knew what the problems were, asking them about the problems, and they would just tell me what they didn't like. It's like Paul Akers says: fix what bugs you. Everybody has something they don't like, and that's where you can start. But you don't know what people don't like until they tell you. And when you find the thing they don't like, you'll get all the input you could possibly need. Maybe some people are comfortable or complacent or whatever adjective you want to use, but somebody somewhere will have a problem that can be solved and will be open to it. And if nobody anywhere has a problem that needs solving, well then, fine, I guess you're off the hook; but it doesn't sound like that's the case. So it goes back to what I said earlier about thinking like a data scientist: ask an absurd number of questions, because eventually you'll get an answer, and that might give you a marching direction and something to do that lets you meet a need somebody really wants met, and feel like you're all paddling in the same [00:44:00] direction.

Harpreet: [00:44:02] Eric, thank you very much. Matt, lots of great tips and advice for you there; this will be released on Sunday, so make sure you go back and listen to some of that. Shout out to Richard Smith, who just joined us in the room. Also shout out to Ellen and Christine; you guys got any questions, let me know. And shout out to Santona. By the way, Santona is hiring for a machine learning engineer intern; if you're interested, visit her LinkedIn profile for more information. Let's go to Greg's question.

Speaker4: [00:44:39] So, forgive me if you've already talked about this one, but I can't help bringing the Zillow conversation here. I hear so much: "oh, it failed, blah blah blah, let's blame these folks, the data scientists," and we never really know what actually happened. I saw an article saying that the people using the tool were actually not trusting what the system recommended, and were changing the numbers, the sales figures and the buy recommendations, to meet their quota, while we sit here not knowing the true story. What would you have done differently, as a data scientist, to make this work, to make it more of a success story than it is today?

Harpreet: [00:45:40] I'm going to go ahead and shout out Vin for this one, because his article was exactly about this: Zillow just gave us a look at machine learning's future.

Speaker4: [00:45:47] I like that, I like that. That's why I brought it up. So I want to hear it, yeah, for sure.

Harpreet: [00:45:54] He also made it to the front page of Hacker News with that post, so that's awesome, Vin. [00:46:00] Let's hear from you on this.

Speaker5: [00:46:05] Well, it's interesting. I think everybody is kind of tired of hearing my posts about Zillow, so I'm not going to wax on too much, because legitimately everybody's seen it, everybody's seen opinions on it. I think the comments I got on Hacker News are potentially more entertaining than the post itself.
And they're informative, because you hear, and this is the part that I think hasn't been covered enough, this dichotomy in data science now between what I'm starting to call legacy data science and a new generation of data scientists who look at data science as a value generator and a revenue generator. But they're also cautioning: OK, if you're going to start booking cash on this, it has to be more bulletproof; not perfectly bulletproof, but more bulletproof than it used to be. And I don't think we do a great job of advertising when we hit that barrier. On one side, the models have a whole lot of safety nets: people or processes protecting the system from itself, because models fail. I don't think we do a good job of advertising when those get taken away, when we start really relying on models for revenue, not just a model supporting a feature or enabling something, but a frontline feature. Microsoft right now has monetized GPT-3 in three different ways, and all of them are the literal hardest ways possible to monetize the model. They've done the rigorous work necessary to create a product that won't bite them. They're doing things like Copilot, where they trial it and figure out where all the rough edges are before they stake revenue on top of it; they're even selling access to the model itself, to retrain it. [00:48:00] So they're doing it really intelligently. You can make money with large-scale models if your data scientists understand that fuzzy line between "we've got safety nets and those will protect us" and the other side of the line: "we are taking the safety nets away; this has to work a whole lot more reliably." And there's no one in the C-suite who understands how to ask the kinds of questions necessary to keep the company from getting obliterated from a business model standpoint, from walking into a setup to fail. So I think that's the piece we're missing: this branching of data science, where the legacy side of the field won't support revenue the way companies expect it to, and the newer side is the more rigorous, more scientifically supported side. It feels like we're not doing a great job of saying: look, this is when we need to start doing hard science, not just analytics anymore. And we're not doing a great job of advertising to senior leadership: hey, if we don't do this, it fails, you don't make money, your investors bail. We can almost point directly at Zillow and go: yeah, that happens. If you don't spend the money, if you don't invest the time, that happens.
And here's a better description of how this is going to work and how we can support it. So you can blame data scientists to a point, but you also have to ask the C-suite to be accountable: to know enough about data science, or to have somebody in the C-suite, in the strategic planning process, someone at the leadership level, who understands data science, if you're going to create a business model completely based on it. That would be like having somebody in charge of Microsoft who didn't understand technology in any way, shape, or form. I mean, sure, maybe; but it sounds like a bad idea to me.

Harpreet: [00:50:00] If you run a data science organization, you need to put yourself in a position where you're not fragile to volatility, right? You need to put yourself in a position where you have more to gain than to lose, where you get more upside than downside. If instead you put yourself in a position that requires a ton of information for you to be successful, there's a lot of asymmetry there, and you can't capitalize on that asymmetry. The future is complex. There's a saying, from Babe Ruth or Yogi Berra, that making predictions is hard, especially about the future. So maybe buying houses outright was not the right move; maybe the right move was buying options to buy houses, right? I'm sure an options market exists for that. Any time you're using data science or machine learning to make huge decisions like this, it's not the same thing as object detection, right? It's not the same thing as hot dog, not hot dog. It's a completely different use case of machine learning, and in this particular domain you need more information. So: make decisions where you need less information to be successful. That's what I would do differently as a data scientist there. I'd love to hear from Serge on this, if you've got any insight, or anybody else for that matter.

Speaker4: [00:51:22] Before Serge, real quick: if it's true that the people using these models weren't trusting them, that makes any model fragile, right? They don't trust what you've put together as a data scientist, especially if it's based on probabilistic inference, so they go with their gut feeling: "oh, there's no way this house is worth that much; we can purchase it for a bit lower," or a bit higher, or whatever; "let me tinker with it" and not trust the system. That's a recipe for failure, too. So yes, you can blame the data scientists; but did you do enough to promote trust [00:52:00] between the data scientists and the people who would be using the tools the data scientists built? I think it's a mixture of a lot of things that made this fail.

Speaker5: [00:52:11] Just jumping on top of that: there's a behavioral component, and I talked about that a little bit in my article. This is the part that I think many, many business models are going to fail around: they pretend that people are rational actors, and they're not.
So if you don't have a behavioral model supporting your inference, you're almost ignoring one dimension of the problem. And Greg just brought up the internal side, which I didn't even cover: it's not just your customers, it's the people inside who short-circuit this. You almost need another behavioral model to prevent your own people from doing all the crazy and stupid things that people who think they're smarter than they are do.

Harpreet: [00:52:57] It's almost like there needs to be a bit of responsibility on the data science team to say: sure, we did well on our training set, sure, we did well on the testing data, but is this actually a good thing for us to attempt? Does this project make sense? Should we not have invested that much money? Should we have maybe used a small proportion of capital to test this first, at least? You know, my colleague Dhruv has this great saying, and I absolutely love it: there is no such thing as the test set; the test set is the real world. So, yeah. Anyway, let's leave it at that. Kosta, go for it; and then after Kosta, Serge wants to jump in.

Speaker3: [00:53:46] Yeah, I think you hit the nail on the head. The question is how much power, how much ownership, you give a data science team: the ability to turn around [00:54:00] and give senior management the feedback saying, "hey, I don't think we're using this model for the right reasons within this business right now." That might be an insidious case, potentially: we're not using this model to adequately cover the domain we need to cover to make the right business decisions. It could also be a moral compass, right? One of the things that attracted me to the role I'm in right now is that the company set up a bit of an advisory committee, effectively a voice for the machine learning engineers and the data scientists to turn around and say, "hey, guys, I don't think we're quite going about this the right way." Having that two-way feedback, I think, is really important. So it's really a cultural thing; there could be a cultural solution to a lot of these problems, I think.

Harpreet: Serge, go for it.

Speaker3: [00:54:51] Yeah, I think there's also a technical solution, along the lines of what Harpreet said. It's all about experimenting. You don't go full-on and test it completely; you take a sample in different markets and see how it behaves, you know? You also have to be a skeptic and realize that your models have a feedback loop: they impact the market, and there are people tricking them. I think there are a lot of long-existing techniques that are under-leveraged, things like sensitivity analysis and A/B testing; these can be applied in every field. Whether it was the data scientists (I don't know the particulars of the case) or management, someone dropped the ball on that.

Harpreet: [00:55:50] Thanks so much. Anybody else want to jump in on this? I have a question that came in on LinkedIn from Ben Taylor; we can go to that question next if nobody else has anything to add.
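Serge's point about sampling markets instead of going all-in can be made concrete with a standard two-proportion test: pilot the model's decisions in a few markets, hold others back, and check whether the observed lift is distinguishable from noise. A minimal sketch; all the counts below are invented for illustration:

from statsmodels.stats.proportion import proportions_ztest

# Hypothetical pilot: model-guided offers in test markets vs. the
# status-quo process in control markets. Did the win rates differ?
wins = [132, 101]     # successful deals: model arm, control arm (invented)
n    = [1000, 1000]   # offers made in each arm

stat, p_value = proportions_ztest(count=wins, nobs=n)
print(f"z = {stat:.2f}, p = {p_value:.4f}")

# A large p-value (or a tiny, unstable lift) would argue for keeping the
# safety nets on before staking real revenue on the model.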
Harpreet: [00:55:50] Thank you so much. Anybody else want to jump in on this? I had a question come in on LinkedIn from Ben Taylor; we can go to that question next. Nobody else has anything to say here? Speaker4: [00:56:02] Thank you for your answers, guys. I just love that we were talking about economics and rationality here, because of course we all know that we underestimate the errors and overestimate how much we know about a local market, right? And to what Serge was saying: you just get greedy, because you haven't properly tested your market and gotten a big enough sample size and enough data over a long enough period of time. Harpreet: [00:56:28] A question coming in from Ben Taylor: what is the future of AI explainability? What new tools do you think will be available? It's a great question. I wish Denis Rothman were here to help with that. Go for it, Greg. Speaker4: [00:56:52] Well, my opinion is that explainability will never be one of those "oh, I'm going to have access to what's under the hood" things. GPT-3: who's going to know what's happening with those billions of parameters, right? It's more about the rise of observability, whether that's ML observability or data observability, which can tell you what changes have happened in the input data and what the results are in the output, map all of that, and flag things to explain why an output is changing based on influences acting on the input. To me, explainability is about trying different things with your model and understanding what that model does based on what you feed into it. It's not necessarily understanding what's happening inside that black box. If you want to, as an engineer, you can go ahead and understand it, build something like it, and do a deep dive into it, but not everybody will want to be there, while everybody wants some sort of understanding from an explainability standpoint. So the future is the rise of observability: tools that allow observability at scale, because you're going to have hundreds of deployed models out there, and you want some sort of scalable automation to help you trigger retraining and redeploy fast, because you don't have time to open each hood to understand what's happening.
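One concrete reading of "observability at scale" is automated drift checks on model inputs. Here is a small sketch using the population stability index (PSI), a common drift statistic; the 0.2 alert threshold is a widely used rule of thumb, and none of this is a specific tool anyone in the conversation is endorsing.

```python
import math
import random

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population stability index between training data and live inputs."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    edges[0], edges[-1] = float("-inf"), float("inf")  # catch out-of-range live values

    def frac(data: list[float], i: int) -> float:
        count = sum(1 for x in data if edges[i] <= x < edges[i + 1])
        return max(count / len(data), 1e-6)  # avoid log(0) on empty bins

    return sum(
        (frac(actual, i) - frac(expected, i)) * math.log(frac(actual, i) / frac(expected, i))
        for i in range(bins)
    )

random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(5_000)]  # feature at training time
live = [random.gauss(0.5, 1.0) for _ in range(5_000)]   # same feature in production

score = psi(train, live)
print(f"PSI = {score:.3f}")
if score > 0.2:  # common rule-of-thumb alert threshold
    print("Significant input drift -> flag this model for review / retraining")
```

In practice a check like this would run per feature, per model, on a schedule, with alerts routed into whatever triggers review or retraining.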
Harpreet: [00:58:39] Thanks so much. We'll go to Serge after this, then Vin, and then Shantanu, if you're around, I'd love to hear from you, because we talked about this a little bit the one time you were on my podcast. But I mean, does it make sense to drill further down when we talk about explainability? Does that mean we need to answer: why did this model make this prediction? How did the model make this prediction? And what does this prediction imply? Are those all the same question? I'm not sure; that's just something that's floating around in my head. But Serge, go for it. Then after Serge, Vin and Shantanu, if you're around, I'd love to hear from you on this. Speaker3: [00:59:16] Well, to expand on what Greg said: yeah, I think observability is important. Also, of course, drift, catching drift, and human in the loop, and not only for inference but also for training. Right now that's pretty manual, but as AutoML takes its course, you always have to have a human in the loop. And that's where all these tools come in: to give constant feedback to the machine learning practitioners so that guardrails are always in place and the system doesn't do anything crazy. And as you realize what the weak spots of the model are, you place more guardrails, because you understand the data distributions and that there are things the model cannot learn because they don't exist in the data; they're not represented there. So I think that's where it's heading in that aspect. Also, there are new tools being established with causal ML, things that also leverage counterfactuals. I think that will lead the way to a lot of exciting opportunities. Harpreet: [01:00:34] Speaking of causal inference and counterfactuals, be sure to tune into the episode I'm releasing, I think on December 3rd or December 10th, with Dana Mackenzie, co-author of The Book of Why. You guys will enjoy that conversation. Vin, let's hear from you on this. Anybody else who wants to jump in, please let me know, and if anybody has a question, let me know, because I think we'll begin to wrap up after this round of discussion. Speaker5: [01:01:06] Causal: get used to it, that's where we're going. And it's not so much a new tool as new applications for it. But the more I look at the regulatory frameworks that are coming out, and I think this is where, when we talk about explainable AI and reliability and robustness, I think we're going to have a couple of drivers. One is adversarial attacks: eventually you're going to see a significant enough model exploit happen publicly, and that's going to be, you know, another 50,000 views on my Substack. It's going to be another Zillow-type event, where someone exploits a model publicly with serious financial consequences. That's one driver that pushes you towards causal. The other is going to be regulatory, because the regulators don't understand what they're regulating, so they're overregulating the heck out of it. They're asking questions that there's just no real way to answer unless you go down a complex rabbit hole that involves some causality. I don't know which theory of the case is going to end up winning, but at some point we're going to have to start incorporating more hard science, and that's going to be causal inference or causal modeling mechanisms of some sort. We have to. Those two drivers make it unavoidable.
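Since both Serge and Vin point at counterfactuals, here is a toy illustration of the core idea: find the smallest change to an input that flips the model's decision. The model, features, and numbers below are entirely made up for the sketch.

```python
def approve_loan(income: float, debt: float) -> bool:
    """Toy stand-in for a trained classifier (entirely made up)."""
    return income - 2 * debt > 30_000

def income_counterfactual(income: float, debt: float, step: float = 500.0):
    """Smallest income increase (in fixed steps) that flips rejection to approval."""
    if approve_loan(income, debt):
        return 0.0  # already approved; nothing to explain
    delta = step
    while delta <= 200_000:
        if approve_loan(income + delta, debt):
            return delta
        delta += step
    return None  # no counterfactual found within the search range

needed = income_counterfactual(income=50_000, debt=15_000)
if needed is not None:
    print(f"Rejected as-is; would be approved with ${needed:,.0f} more income")
```

Real counterfactual tooling searches over many features under plausibility constraints, but the question it answers is the same: what would have had to be different for the prediction to change?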
Harpreet: [01:02:37] Vin, thank you so much. Anybody else want to jump in? Oh, Shantanu is around. Shantanu, do you want to jump in here? Just in case you didn't hear the question, Ben was asking: what's the future of AI explainability? What new tools do you think will be available? Speaker4: [01:02:56] Yeah, no, I heard the discussion. I'll mention two things. One is that, you know, I still think there's a lot of value in explainable shallow models, rather than running to deep models for everything. Having said that, of course, certain fields just don't lend themselves to making a lot of progress without deep neural networks. So that's one. The second is that folks have mentioned drift, model drift; I think what's going to be really important is data governance as well. So, not thinking about particular tools, but part of this MLOps and orchestration pipeline has to be data governance, data lineage, data sanity checks. At the very least that will help prevent things from going terribly wrong, and in the best-case scenario, if you're able to really understand your data, you can do a lot of the explaining from the data. You don't have to rely on the model too much. Harpreet: [01:04:15] Exactly, that entire data lineage, data governance, data management, all that stuff. At the end of the day, if we're building models, we're just consumers of data, downstream users of data, so having all those upstream processes in place is critical.
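Data sanity checks of the kind just described can start as simple assertions that run before anything downstream touches the data. A minimal sketch, with a schema and ranges invented purely for illustration:

```python
# Hypothetical upstream checks: validate raw records before they reach
# training or scoring, so bad data fails loudly instead of silently.
EXPECTED_COLUMNS = {"price", "sqft", "zip_code"}  # assumed schema

def validate_record(record: dict) -> list[str]:
    """Return human-readable problems found in a single record."""
    problems = []
    missing = EXPECTED_COLUMNS - record.keys()
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    if "price" in record and not 10_000 <= record["price"] <= 50_000_000:
        problems.append(f"price outside plausible range: {record['price']}")
    if "sqft" in record and record["sqft"] <= 0:
        problems.append(f"non-positive square footage: {record['sqft']}")
    return problems

batch = [
    {"price": 450_000, "sqft": 1_800, "zip_code": "94107"},
    {"price": -5, "sqft": 1_200, "zip_code": "10001"},  # should be caught
]
for i, record in enumerate(batch):
    for problem in validate_record(record):
        print(f"record {i}: {problem}")
```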
Harpreet: [01:04:32] Anybody else want to jump in here? Any last-minute questions, any last-minute comments, let me know. Hopefully Ben is satisfied with the answers. I don't see him in the comment section on LinkedIn, but I do know he's headed to the airport, or maybe even sitting at the airport right now. So Ben, hopefully that was interesting to you; let us know, and come back next week and we'll talk about it. If there are no other questions from anyone, we'll wrap it up, guys. Thanks so much for joining. Shout out to everybody that came through, Christine and Jude in the audience, you guys here. Hopefully y'all come back next week; let me know if you've got questions. An episode released just today with George Firican on data governance, right on the heels of what Shantanu just said about data governance being important. It's a good thing I've got an episode where we talk about data governance, so tune into that. You can also listen to the episode I did with Jonathan Knight, so search for that. Over the next couple of weeks a lot of interesting episodes are coming out: next week is with Steve Cardinale, talking about how to turn ideas into gold. Then after that, Kourosh Alizadeh; he has the Philosophy Data Project, which takes NLP and combines it with philosophy. He's got this really interesting web app, and we have a good conversation there. Greg, any last-minute questions or comments? Go for it. Speaker4: [01:06:07] It's always good to be here. Harpreet: [01:06:09] Awesome. Thank you guys so much for coming, and thank you for hanging out. Hope to see you guys next week, and hopefully you stay safe and stay warm. Like I mentioned, three feet of snow today out here in Winnipeg; it is wild, and I've got to go try to get groceries right now, so hopefully I can stay on the road. Take care, have a good rest of your evening, afternoon, whatever it is. Remember, you've got one life on this planet, so why not try to do something big? Cheers, everybody.