HH77-15-04-22.mp3 Harpreet: [00:00:09] What's up, everybody? Welcome. Welcome to the Artists of Data Science Happy Hours. Happy hour number 77. Last week we celebrated two years of the show, so around two years and now 77 of these happy hours. So, so great that we've been doing this for that long. I appreciate you being here. Shout out to everybody that is in the room so far. Russell, what's going on? What's going on? Eric Gitonga also is in the building. Eric, one of these days I'm gonna make it to reflective Monday night so I can say something worth reflecting on. Shout out to everybody. Super excited you all are here. Hopefully you get a chance to tune in to the episode that was released today. It's with the one and only Christina Stephanopoulos, the international woman of data. I say this a number of times, man: I got to get better at promoting these podcast episodes as they release. And I will, one of these days, when the baby lets me get a full night's rest and I can wake up early. That'll happen. But yeah, hopefully you get a chance to do that. It is a great, great chat. It was live streamed a few months back, might have been in October or September when we did that live. So if you're looking for it on YouTube, it won't come up as a new video. You'll have to search for her name on my YouTube channel. But definitely check it out. Harpreet: [00:01:23] Great conversation. I've known Christina for almost two years, but it's the first time we've actually ever sat down and chatted one on one. It was a great conversation, great connecting with somebody. You're going to enjoy that episode. In other news, I've completed my second week over at Pachyderm, just onboarding, learning the product and learning more about Kubernetes and Docker than I ever thought I would need to learn. But man, these are fascinating, interesting technologies, and it's a super powerful product. I'm excited to start building out examples and use cases with the product. 
So super excited [00:02:00] that hopefully you guys are going to see Boston. If you guys will be there in Boston, look for me, be at the pachyderm table giving out swag, saying hi to everyone, checking out some of the awesome events that are happening as well. If you do find yourself in Boston on the 19th around 5 p.m., go ahead, make your way to Trillium Brewery over at Fenway. We'll have a nice gathering of people there from from a number of different communities. It'll be good. So please do come through. As always, if you want to support the podcast, there's a link in the show notes that you can go ahead and support the show. I don't know when the hell my basement is going to be repaired. Nobody's been back in over a month to start working on it, so hopefully I have an office soon as I've been taking far too long. Harpreet: [00:02:53] So that's just been gas. So let's go ahead and kick off the conversation. I got actually I got a message from Karthik Power on email asking the question he wasn't able to join it because this is obviously an inconvenient time for people in India. But if you are in India and you do listen to this, you know that you can always send me and I'll be able to add your question into the session. So this is a breaking into data science question. You guys know that I'm kind of like trying to move away from answering those questions because we've done so many of it. But there's also something more important about this question that I want to address. Right. He's saying that should I leave my job to prepare for a career in data science? Now, if you kind of abstract that away, it's like, okay, should I leave my current job to learn something new, to break into another field? So I thought that was an interesting question, right? Me personally, I feel like you shouldn't ever quit your job just to break into another job. 
Like I would say, wake up at least an hour earlier, stay up an hour later, and just remove any distractions from your life and stick with it for a longer time [00:04:00] horizon. It might take you 6 to 9 months as opposed to whatever, three to six months, to get your first data science job. Harpreet: [00:04:07] I'm not saying it will take you that long to learn data science. It's preposterous to think you can learn data science in that short amount of time; I've got a graduate degree in math and I still don't know shit. But yeah, should you quit your job to focus completely on learning something new so you can pivot and break into that field? That is essentially the crux of the question. Let's go to Vin. I think I've given my perspective on this. My perspective is wake up an hour earlier, wake up 2 hours earlier, don't watch any TV shows after work. Stay up an hour later. Do what you got to do to get to where you need to be. But still make sure you're getting money to put food on the table. Do what you got to do. Vin, what do you think? Then let's go to Russell, and then anybody else can chime in. Eric. Also, shout out to everybody watching on LinkedIn. If you guys got questions on LinkedIn, let me know. I'll add your question to the queue. Or if you're watching on LinkedIn and want to come into the action live in the Zoom room, let me know. I'll send you a link to the Zoom room. Vin, go for it. Speaker2: [00:05:20] Trying to think about leaving your job before you have another one is always scary. But I mean, I understand where he's coming from. 
I understand where the question at least is coming from: if you don't have time to focus on a career path because you don't have time to do the educational side of it, I mean, I kind of see where that sentiment comes from. But I look at that as sort of a holdover from the college academic mindset where you have to be in school full time in order to learn a new field or in order to transfer jobs. And I think I'd look at it more like a night school type of paradigm, if you want to look at it that way, where you've got your day [00:06:00] job, and a lot of day jobs are actually fairly flexible when it comes to giving you an evening off or giving you some time off to take some classes. And I don't mean a ton of time off, more like you leave at 4:00 so you can get to your learning path. And you don't have to do everything in person anymore. There's so much that's offered online. I think you can do that sort of night school model where you're learning as you go, and if it takes a little longer, I mean, it's not like the field's going anywhere. If you're in a hurry for one reason or another, maybe that's a reason to leave your job. But I think that maybe instead of thinking I'm going to get there in six months, think about it as probably going to take about a year or two to do the transition. And your learning is going to be so much higher quality if you take the extra time. And the only way you can do that is if you have an income. It's really hard to say, I'm going to quit my job for a year or two years, but saying I'm going to stick with my job and then slowly, concept by concept, learn data science, it's just more feasible. It makes more sense that way. Harpreet: [00:07:04] And here's an interesting twist to Karthik's situation. He's currently working as a data engineer. So to me, I'm like, all right, well, you're already in the game, man. What else is there to learn? 
If you're a data engineer, you do, like, PySpark and Databricks. There's not that much else to learn. Pick up a copy of Hands-On Machine Learning with Scikit-Learn or Hands-On Machine Learning with PyTorch, study that an hour, an hour and a half a day, and you'll be good. But what are your thoughts on that? Speaker2: [00:07:39] I think there's a lot more to learn. And I think this is the interesting segmentation of data science. There's no such thing as the data scientist. If you're a data engineer, you are a type of data scientist. If you're an ML engineer, you are a type of data scientist. But when you start thinking about really, [00:08:00] where does our field go from a value creation standpoint, from an accuracy standpoint, from solving bigger, more important problems? Now you're going to applied research. And I think that's where, when you say I am a data scientist, you have to decide which kind, which flavor you're talking about. And each one of those career paths has a certain amount of longevity. I think data engineering has the least longevity, so it's a good one to pivot out of, especially if you have a year or two, because it's not like it's disappearing next week. It's going to take some time to go away. But there's so much automation in that space that I think you're going to see sort of a pullback from data engineering. And so that's not going to be a type of data scientist anymore. Machine learning engineering is going to take a whole lot longer to automate, if we ever do it. I think it's too complex. I know other people disagree with that sentiment, but I think it's too hard to automate machine learning engineering end to end in any way. But you could be looking at making a pivot to something where you make more money, as an applied researcher. So with data science, I think it's really important to make the distinction of what you want to do in the data science lifecycle. 
Harpreet: [00:09:17] I want to dig deeper on something there about data engineering being a good one to pivot out of because it might be going away. But before we get there: yeah, absolutely agree that you should probably specialize and figure out what it is that you want to do. The number one mistake that I feel people make when they say I want to get a job in data science is they all want to be data scientists. But it's such a huge spectrum. A huge, huge spectrum. But I'm curious, so data engineering, you think it's possible for that to get completely automated? How would that work? Also shout out to Cathy Bailey. Good to have you here. If you have a question, please do let us know. I'll [00:10:00] add you to the queue, and if you want to add something to the conversation, go ahead and use the raise hand icon. I'm happy to get you in the queue. So data engineering is something to pivot out of? That's something I've never heard. Speaker2: [00:10:13] I think automation is never like a 100% thing. I don't think we have many jobs that will ever be 100% automated. But I look at a field as a good one to get out of if more than 50 or 60% of your job can be automated within the next 2 to 3 years. That's where I'm starting to look at fields and saying, if I can do half of this with machine learning or with some sort of automation, it's probably a bad field to be in, because you're talking about a reduction in staff and headcount that's so dramatic you're going to be impacted by it, whether that's you lose your job because you're downsized or your salary prospects are lower than they should be. And so that's where I go. I look at data engineering as something that's shrinking, and I know that's dumb to say while the field is expanding. I know I'm calling it at a really, really stupid time. 
But when it comes to looking at what the roadmap is for companies like Microsoft who are offering these types of automation and infrastructure, automation services and product lines, yeah, they're going straight at what data engineers do. It's it's almost like they intentionally looked at data engineers and goes, Yeah, I don't like you guys. I'm going to automate you. It's like somebody got angry at them at some point along the way. And there are just a ton of companies that are trying to just automate data engineers. Harpreet: [00:11:39] So is okay. So should we be more worried about auto data engineering instead of AutoML? Speaker2: [00:11:49] Well, I think that's what AutoML mostly is. You know, AutoML, like AutoML is just, you know, I give a ton of users the ability to use their data better. But in order [00:12:00] to do that, and this is really the reason why there's so much of a push towards automating the data engineering side of it is AutoML is scary and dangerous if your data is bad. Like that's why you need such a smart group of people working in data science right now because I mean, it's not like the models are that complicated, but that data is trash that's sitting behind most of the models. And so you have to do so much like. I don't know. It feels like I'm doing some sort of dance. Every time I use a data set to. To make it work and not do a whole bunch of irresponsible stuff. And that's kind of the AutoML sweet spot is once you get really clean, reliable data that you can build models with. It doesn't take a whole lot more work to get good analysis out of it. And so being able to use AutoML as a user only works. If you can trust your data, then you can kind of you can take your hands off the wheel a little bit more. Harpreet: [00:12:58] So when it comes to jobs in data science, maybe from an abstract way like tech at large. What types of roles are ones that. Or kind of not easy to automate or automatable. 
Barring the research aspect, the research scientist aspect of it, because I know you'd go there first. So we take that away from you there. But yeah, what are some of the types of roles? Let's start with data science in general, machine learning in general. What are some roles that we simply just won't be able to automate? Speaker2: [00:13:33] So I'd break that question down into basically: you're going to be able to automate logical processes, but we're going to have a very hard time automating intelligent processes. And so when you look at any given job, since they're so different, it's really hard to say this one job will be resilient versus this one won't be. So what I would do is just break your job into the workflow [00:14:00] that you do and look at what elements of the workflow are intelligent, requiring a level of synthesis of what you know applied to novel problems in order to be able to implement that part of the workflow, versus what part of your workflow is just variations on a theme, where, I mean, if you gave it complex enough instructions, you could with a whole bunch of if statements automate that part of the workflow. You know, like I said, it could be like the world's most complex tree, but you could, using a tree, automate that part of your workflow. And so that's what I would look at: how much of your work is an intelligent process versus how much of your work is a logical process. And you can just apply that across the board to jobs in general and data science and machine learning. When you look at what data engineers do, there's a whole lot of logical process work. But you look at the architecture side of what they do? No, that's an intelligent process, because there's no one size fits all for every business case. 
And so you have to use a significant amount of synthesis of what you know and understand to look at the problem in a way that you can design and architect an optimal solution. So there's kind of those two sides, you know, the two roads in the woods, I guess: you've got your intelligent processes and you've got your logical processes. And yes, Eric, AI is if/else statements. Yes, that's what we do for a living. Why did you say that out loud? We're all going to take a pay cut now. What are you doing? Harpreet: [00:15:40] I don't know, man. I've seen the architecture and done just some initial kind of research on the DALL-E 2 model, the text to actual photorealistic images. Like, holy shit, that's got to be a shit ton of if statements to go from [00:16:00] natural language to photorealistic images. That's insane. Speaking of if statements, and you talked about tree branches: I saw a diagram on LinkedIn earlier, and it was about how Slack decides whether a user gets a notification or not. It's this extremely complex decision tree that we take for granted as a user because it's so seamless for us. But wow, it's a lot of engineering on the back end there. Thank you so much. We kind of strayed away from the question that Karthik has. So I do want to circle back, maybe hear from Eric or Russell or Cathy, if you're down to chime in here. But the question is, should I quit my job as a data engineer to upskill and learn what I need to learn to become a data scientist? Russell, then Eric, and then Cathy, if you'd like to chime in. Speaker3: [00:17:03] I don't think I have an opinion, only because I'm someone who's trying to break into the field. So I don't think I'm at liberty to give an opinion. Harpreet: [00:17:12] No worries. No worries. Russell. Russell looks frozen. So we'll go to Eric first and then come back. Right. So my unqualified opinion is, you know, it always totally depends. 
But if I'm going to imagine that I was in that spot, that I was a data engineer (well, where did all these big SQL muscles come from?) but I wanted to be a data scientist, I mean, in my parent company and probably some other places that I've worked, if I wanted to become a data scientist and I was working as a data engineer, I would probably have, like in my current position, management support to help me find opportunities to start working that into [00:18:00] my job, you know, stretch assignments or cross-training, whatever stuff to help me get those opportunities. And I think that if you're in a place that has both data engineering and data science functions, then you might be able to have your cake and eat it too a little bit to help you make that transition. But of course, if you're not in a place like that, then I guess you have to decide if you want to try and do the data scientist stuff on the side, like we always talk about. Harpreet: [00:18:30] I mean, it kind of sounds similar to anybody talking about wanting to break into data science. How do you do that? Well, how do you break into a different part of data science? It's kind of the same playbook to me, because, you know, there are certain things I really, really like, but I don't do them in my daily job. And if I wanted to get a job that would allow me to do that, I would just follow the break into data science path to do it. So I guess that's my answer, non-answer, to that question. Thanks so much, Eric. Russell, let's hear from you. By the way, you guys that are tuning in on YouTube and on LinkedIn, if you have questions, please do let me know in the chat and I'll add your question to the queue. And after this, we'll get to Eric's question about NFTs, or somebody had a question about NFTs. But Russell, you. Speaker4: [00:19:25] Sure. Okay. 
So I break this down into a couple of different layers. So firstly, if any person wanted to break into data science and they were in a non-related profession, say, I don't know, a kitchen staff or something, I can understand them more, wanting to quit their job, go train and go all in on that. If you're already in data engineering, I think you're kind of you're more than halfway there, to be honest. Data engineering and data science are inextricably linked. And I would defer to to Vin's previous comments, [00:20:00] you know, do some additional training outside of the work hours without giving up the job entirely. That being said, if you're one of the you know, if you're in a different profession and you feel that your only option is to give up or to quit your job and go all in on the train, be very careful about that. Consider how long it's going to take to the training. Make sure you've got finances in the bank to support you financially for not working for that amount of time. And not only that, however much you think it's going to cost, make sure you've got three or four times that amount as a contingency because you never know what the job market's going to be like. Speaker4: [00:20:39] Even for data science. It's a you know, it's a really buzz word at the moment. There's so many people going after jobs, you may really struggle to get it. So don't assume that as soon as you've completed the training, you can walk into a job and then circling back around. If you're in data engineering, yeah, you are over halfway there. And data engineering is a super interesting subsegment of the generalist data science, not the not the specific machine learning or statistical elements, but super interesting. So if you're in there and you're not getting the buzz from that, then maybe, yes, you don't want to be in data engineering to look for some of the other sub streams. 
But yeah, I think you probably would be better off trying to do something additional around your current work whilst you've got income, and then move into something at a more protected pace and timeline rather than, you know, pull the rug out from under your own feet to perhaps self-motivate. You know, you've got to do something, but try and find that motivation without pulling that rug out. Harpreet: [00:21:49] Yeah, 100%. You'd be surprised how much you can accomplish if you just wake up one hour earlier and just carve out half an hour or one hour of TV [00:22:00] at the end of the day. But 2 hours adds up to an hour's time. Seven 1052. That's a whole lot of learning that can get done. So the question that Russell had here is about the NFTs losing value. Carlos is here for Carlos and ages. But have you seen that Jack Dorsey's first tweet value plummeted nearly 1,000%? I haven't heard about that, but I'm not surprised. I'm not too up on the Web3 or NFT things. I know, Vin, you've been doing a lot of research into that, so I'll defer to you on that. Speaker2: [00:22:43] Yeah, I don't think... I mean, what was the value of that tweet in the first place? You know what I'm saying? You can put technology around whatever you want to, but that doesn't make it valuable. And I think that's the interesting piece of it: you put technology around Bitcoin. Does that make Bitcoin valuable? Well, what does it do? And if you look at some of the early experimentation in El Salvador, not much. It really flopped. You know, I think it was El Salvador that they put it in and, you know, the government got behind it. And the technology behind it, from whoever rolled it out there, bombed. It didn't work. People stopped using it. You know, transactions didn't work right. So is Bitcoin worth anything? Is an NFT worth anything? It's really an interesting question, because everything behind those is mostly technology based. 
And then on top of that, you have this: it's a piece of art. And if the piece of art or the piece of digital whatever doesn't have any value, then you have to look at the technology and say, does the technology have value? And if neither one of those things has value, I guess, you know, what are you buying? And that's what I have to ask every [00:24:00] time I look at an NFT: what am I buying? I mean, I get it. Speaker2: [00:24:03] There's a Bored Ape and I can put it on my Twitter avatar. I mean, cool. Does that ape have value? Like half a million worth of ape, really? And I think that's what everyone who's buying an NFT has to ask themselves. Same thing with cryptocurrency. You just have to ask yourself, what am I actually buying? And is that actually worth anything? Because at the end of the day, if you want to trade that in for something else, someone else has to agree with you. You have to have someone willing to buy it from you for that price. And it seems like, you know, Bored Apes on boats sell pretty nicely. And, you know, with Bitcoin it seems like people agree. So does that hold long term? You know, it's the same thing as any investment. Will it hold long term? I don't know. Elon Musk might tweet something wild next week and Tesla falls apart. You know, anything could happen when it comes to any sort of investment, no matter what you put behind it. So I think looking at an NFT right now, until you get something where the technology is so compelling as to be exceptionally valuable just on its own, you have to evaluate it like any other investment and say, maybe. Harpreet: [00:25:22] Yeah, there's that book Sapiens by Yuval Noah Harari. And he talks about this, why money exists. It's because of this intersubjective reality, the fact that humans can collectively come together, make up a story, and all buy into that story. Right. There shouldn't be anything useful about a paper. I can't kill anything with it. 
I can't eat it, can't do much with it. But we all buy into it that this thing is worth something. And you see that happening with Bitcoin, with Nfts, things like that. These intersubjective reality type of concept. And maybe people are no longer buying [00:26:00] into the subjective reality of nfts. I do not know. But. Well, if there's any any other insight here, Eric, you say you got concerned about the Twitter thing. Tell me about that. And by the way, if anybody has questions, please do let me know right here in the chat or on LinkedIn or YouTube. Happy to take your questions. We also have a cute baby in the room that takes. Sure. So it's not data? Well, I mean, it will be data related, I guess, but it's not directly data related. I'm just I'm not I'm I'm a little worried about, you know, if Twitter was to go private, I don't know what that's going to look like because people talk a lot about talk about free speech. And we should be able to say whatever we want, blah, blah, blah, blah, blah. But like, that's because Americans don't understand the Constitution and what the Bill of Rights actually means and crap like that. But yes, you're exactly right, Russell. Harpreet: [00:27:07] That free speech is like when we say we want free speech, what we mean is we want to be jerks and and not have, like, any consequence for it. And and so I you know, who was I talking to recently? That's what I was talking to a guy who's an American citizen, but he lives in a different country. He has for like more than a decade. And so he's kind of had an outsider's perspective looking in watching things. And he just kind of talked about like the the degradation of democracy and things like that have happened over the past 15, 20 years or whatever, you know, and I'm I'm young enough that I haven't really been paying attention to very many, like very many like election cycles and even fewer administrations as presidents are sometimes voted for the second term and stuff like that. 
But I'm just worried about what that will look like from a [00:28:00] free speech perspective, because also somebody made a good point saying if it was really about just being able to say whatever the heck you wanted, then the other Twitter kind of clone apps, Parler or whatever other apps are out there, would have done better, because you can say pretty much whatever you want on those platforms and yet they didn't take off. So clearly it's not an appetite for just being able to be a free speech person and say whatever you want, whenever you want. And, I don't know, Elon Musk is just enough of a loose cannon that I just have no idea what direction that's going to turn and what blowback would come from it. Harpreet: [00:28:39] I'm worried about that regardless of your political beliefs. I'm worried about that. Yeah. It's interesting. I wonder if it'll actually happen. Like, how does this process have to work if somebody wants to buy a publicly traded company and take it private? I figure there are some filings with the SEC, but do all the shareholders have to agree to being bought out? How does that work? Anybody? Anybody know? I'm not sure. Yeah, no clue here. I know, wasn't it like Vanguard just bought a larger stake than Elon Musk has? So, yeah, Elon Musk owns like nine point something percent of the company and they bought like 10.2 or something like that. And so I have no idea what enormously valuable machinations are at play behind the scenes here. It's way beyond my pay grade and mostly beyond what I care to know. But yeah, like, there's clearly, I don't know, are they on the same team, or are they just all thinking, oh, Elon is just trying to get everybody hyped up? Like, on Twitter, Mark Cuban thinks, you know, that Elon wants to get Twitter really valuable by getting everybody hyped, because that's what he does, and then just sell his Twitter stock and make a crap ton of money off it. 
So that's like a great plan B if he can't just get Twitter [00:30:00] for a song, you know? So I don't know. Then what are your thoughts on this? Speaker2: [00:30:08] In order to be bought, the board has to vote. And I know different companies have different definitions of board and shareholder rights and all that sort of thing. But I think in this case, what they have to do is get the board to go, yeah, we agree. We vote to sell the company for way less than it was worth like eight months ago because that somehow makes sense now. So you look at the offer and the fact that he said this is a one and final offer. And the cost of it makes you kind of wonder. I mean, is that a legitimate. Did you expect that to be accepted? And if it was, he said he had enough money at one point to take Tesla private and then he had to walk that back. His offer letter also says, you know, dependent upon financing or funding or something like that. So, again, you wonder if does he actually have the the funding to do this or is this exactly what you're saying, where, you know, this is a way for him to back out of something that he probably did on an impulse and maybe regrets, but you can't really speculate what's going on inside of, you know, the kind of mind that Elon Musk has. And so, you know, there's been a whole lot of people who have been trying to speculate about what it means and what he'd do. But I look at it this way and this is something I'm actually going to publish tomorrow, is if it was about free speech, why wouldn't you just use like Mastodon and build an alternative? And if it was really that awesome, people would leave Twitter and go to your awesome replacement, that you could build on a completely open platform and host at a pretty low cost. Speaker2: [00:31:56] I mean, for somebody, especially for somebody like Elon Musk, it wouldn't cost much. Wouldn't [00:32:00] that be the far more efficient way of doing this is to just make your alternative. 
And if it rocks so much more than Twitter, then, you know, you've represented free speech and everybody's happy and you don't have to worry about shareholders and profitability. It's on an open platform, so who cares, you know? So if it's all about free speech, why wouldn't you just make an alternative instead of forcing your alternative on people who, you know, if they're not leaving Twitter, they're pretty happy? Right. Because you look at alternatives like Parler and Gettr and all those other ones that are out there. They have to moderate people just as much, and they moderate, number one, to stay legally compliant. But on the other hand, they also do moderation because their audience gets angry enough and says, cancel this person. And so they do. It's all about who it is that ends up being the monetized commodity on your platform. If you have influencers on your platform and they get angry, well, if that's enough of your revenue stream, then you'll do whatever they ask you to. If it's advertisers and enough advertisers band together, then, well, you're going to do whatever your advertisers want you to. If you're a company like Facebook that straight up doesn't care, honey badger style, then you have enough money to be a honey badger, and you can put whatever you want on your platform and you dig data from whoever you want to. Speaker2: [00:33:27] You know, the EU fines you $1,000,000,000. Who cares? You made 30 of them last year, or whatever it is that they're making now. So yeah, I think that's the reality of every platform. And so when we talk about free speech, it's like that doesn't really even matter once you become an international platform, because our definition of free speech, China's definition of free speech, India's definition of free speech, Saudi Arabia's definition... you know, we keep going around the world. I've named a bunch of different countries and they're all defining free speech differently. 
Like there is [00:34:00] no consensus among any of them on what free speech means and what you're able to say and not able to say. So, you know, yeah, it's cool to be American centric and I like my country and I love the way that we built a constitution, but we don't have the right to impose that constitution through companies on other countries. So, you know, there's all of these different implications of Elon taking Twitter private, but at the same time, like, well, any of those, you know, it's awesome to want things. My dad used to tell me that, you know, it's good to want things, but it doesn't mean you're going to get them. It'd be great to think about this platform as a bastion of free speech, but can you actually do that? Harpreet: [00:34:46] It's an interesting point that I didn't think about, that sort of business kind of imposing its will on other countries. Very interesting. Something to digest. Appreciate that. Questions coming in here on LinkedIn, one from Patrice Johnson. Patrice says NFTs metal blocks that coalesce to buy virtual versions of each tract of land in San Francisco. That's pretty cool. Hopefully it doesn't cost as much as the actual tract of land. And Patrice is also asking, where do you see information architecture in the work that you do? It's a very interesting question. I'm not sure I know what information architecture is. Is anybody familiar with this term that wants to jump in and talk about it? Oops. Googling it right now. I'll go to my go-to person, Ben, because he knows more than all of us combined. Information architecture: can you talk about what information architecture is and how it fits into the work that a data scientist does? And [00:36:00] I've got to Google it myself. Speaker2: [00:36:03] If it's an information sciences term, I kind of understand where it's coming from. But I mean, it's a pretty open question.
I could spend 20 minutes answering it, so I'm going to avoid doing that because I'm pretty sure I wouldn't answer the actual question that Patrice was asking. So if she can, I don't know, narrow it down a little bit to help me understand a little bit deeper. Harpreet: [00:36:31] So just Googling the term, it says that information architecture is the structural design of shared information environments, the art and science of organizing and labeling websites, intranets, online communities and software. So it seems like it's a way to classify things and organize. Speaker2: [00:36:51] Yeah, that's what I thought she was asking. That's huge. That's a massive question. It's a good one, but it's a massive question. Harpreet: [00:37:02] Yeah. Patrice, I think you were here last week and I think you have the link. So if you want to jump in, please go ahead and let me know. Are you free to jump in shape in the building? Good to see you here, man. And the chat on LinkedIn. So it's good to have you here. You've got a question, so go for it. Yeah. Thank you, Harpreet. I've been meaning to come for some time, obviously, because I actually live in London, as you can tell from my accent. So it's around 11, pretty late, but I've found the time today to just work on myself and try and get some information. I just want to say a big thanks for the service you've provided for the last two years with lots of data science. I've learned a huge amount, so it's a really big service and I just want to thank you again for that. In terms of my question: I recently just started a role within data science and I'm a month in, and I was wondering [00:38:00] just from your side if you have any advice going into, like, the three to six month mark, how you would navigate it, or if you've actually been on the managing side, how would you like to see the support of your team do well in those six months? Do you have any advice on that?
Yeah, let's hear from from Eric on this. Harpreet: [00:38:21] I think it would have some good perspective here, so go for it. Yeah. So let's see here. I started my job last June, so it's been like, what, eight? Eight or nine months, something like that. And so my manager, when I very first started, said, there are three things I want you to focus on in your first 30 to 90 days. The first thing is know where the data is like. Be able to just make a notes, making a couple of notes here so I don't forget. There we go. So know where the data is. If you're going to go into the data warehouse, like do you know the fact tables? Do you understand how they how dimensions work and connect everything together? It's like, okay, I could do that. I got to find stuff. Second relationships. Like, Are you meeting people? Do you know people? After 90 days, have you established relationships? Do they trust you? Will they ask you if they have a question? Do they think that you know what you're talking about? Are they just like, do you have that that good working relationship with people? And I'll tell you, I have probably invested in that piece more than like anything else. Harpreet: [00:39:31] And like I don't I think it's probably the most valuable is the relationships portion because don't know where to find something. Your relationships can help you find it. Don't understand how something works. The people can help you find it. So, you know, it's like this sort of cheesy thing to be like, Oh, people are the most important. Well, yeah, they are, because they made the system. And so if you don't know the system, the people can help you. And then the last, the third one was understanding the levers that we can pull to affect [00:40:00] our KPIs. 
So like for example, like I work with a lot of marketing data and so understanding, well if we are coming up short on say lead volume for the month or sales volume or whatever, which of our paid channels can we impact or or which of our paid channels have the greatest impact? Or could we use to drive that volume most easily? And so like now I understand that better than I understood on day one, obviously, because I didn't know anything. So like focusing on those three things is like, where's the data, who are the people and what relationships do you have? And then what levers can we pull to drive KPIs or the most important things? Speaker4: [00:40:43] Excellent. Harpreet: [00:40:43] Thank you for your answer, Eric. Russell or Lynn or Eric Gitonga. If you want to chime in, let me know. Welcome to jump on. Russel, any any input here? The rest might be frozen or that, you know. Speaker4: [00:41:04] I'm here, but Zoom crashed on me, so I've just come back and caught the last bit of that. I missed the first part of the question. Harpreet: [00:41:12] Could you repeat it for me? Yeah, sure. So. In regards to the situation I'm a month in in that they say they sent throw I don't know if it's helpful at first to say the industry or domains of industries like financial regulation, if that helps. So one of the things that I wanted to know was now one month mark, how could I make it better, easier, the pathway for me in the next three or six months? Or if you find it easier as a manager, let's say if you're if you're looking after a big data science team, what would you want to see from this? From the subordinates, like your employees within that data science team? Speaker4: [00:41:57] Okay. Yeah, that makes a lot more sense. And I caught [00:42:00] the relationships comment from some others and I would echo that. 
Absolutely make relationships, know the team, create free movement of knowledge, knowledge sharing between everyone and the team beyond that, really get to know the data, understand the data, just, you know, live and breathe the data so that if you see something odd in a data field, even though you're not seeing the the analysis output, you maybe will be able to pick up some odd occurrences in the data at the raw level and then follow those, those rabbit holes to the conclusion. And yeah, be be open and positive with everyone, even if it means making yourself more vulnerable by identifying something that you might think being a weakness. And I would say an awful lot of people say this on LinkedIn, but, you know, identifying weakness and owning up to your weaknesses, that can actually be a strength, you know, providing it's not something that's completely, you know, antithetical to the work that you're doing. If you say, you know, I'm not I don't think I'm as strong as I'd like to be with this one element. I'd like to be you guys have obviously been here a lot longer. What would you be doing in this instance? You know, my my gut feeling is telling me to go down this way. But, you know, I'm here to do data science work. I don't want to rely on my gut feeling. I want to learn better ways to do this and see how that that type of open, open transition of information is. Is accepted in the wider team. And if there are if they're a proactive, open team, I think they'll respond well to it. Harpreet: [00:43:45] Thank you so much. So I won't go to them next. But before I do, I just kind of give my little bits of advice here. I would just get as curious as you possibly can about the business, how it works, how they [00:44:00] make money, who their customers are, who their competitors are. Again, ultimately. Become a fanboy of the business, right? 
Because your success isn't going to come from you knowing how to write the most efficient SQL queries or write the most amazing algorithms. Your success is going to be predicated on can the things that you do and the things you spend time on actually make money for the business? Either make money, save money, or reduce costs or reduce risks, right? So let's get super, super curious about the business, about how it works, about how they make money, about how they lose money, and also figure out what it is that your boss gets promoted on. What's what's your boss like? What are his metrics like when it comes bonus time or hurts or them when it comes time? Or what are their metrics like? How are they evaluated? Figure that out. And then also, I don't know if the company operates on OKRs or whatever, but really take those to heart, understand those that that goes along with kind of understand the business well. But month end, after you've met the people familiar with the data scape and all that stuff. I tend to just go in on the business if it's a publicly traded company, read the CEO's letters to their shareholders, if it's a privately held company. See if you could find any type of annual report, quarterly report, things like that, because I think that's what's ultimately going to drive the most success event of it. Speaker2: [00:45:41] Pretty much everything, Harpreet said. And then before that, everything Russell said and before that, everything Eric said, I would just agree kind of across the board. Those are all great pieces of advice. The only thing I'd add is something that worked for me when I was first starting out at a new company. Not necessarily like expected in your first six months, [00:46:00] but it helped set me up for some success is I figured out what the team hated doing. And I tried to figure out how much of that I could start doing myself, because that sort of work is no one wants to do it, so they're all bad at it. 
And if you start taking it on at an early stage, yeah, it's not going to be the most glamorous work, but it's going to get you like the most credibility as fast as possible because you're contributing in a really obvious way by taking pain away. And same thing with external group. You know, we talked about relationships, but find out what team, the team that you're on has the hardest time interfacing with. And you're going to realize that because you are new, you have like this magic glow on you where you're part of the team, but you're not really part of the team yet. And so you can go to that other group as like this brand new person who can talk to them without all the baggage that comes with the reason why that relationship wasn't awesome. And so you might be able to be the conduit between two teams that have had a hard time interfacing with each other just by being new. And so those are kind of the two things you can do when you're new that will earn you. Like I said the most, it's not a huge impact, but it's the most obvious impact because those are two big things that the team hates. They're painful. Harpreet: [00:47:30] Thank you very much, Ben. Thank you very much. And everything that I said. So, like I said, any other feedback on that? Any tips, words of advice or choice? Let me know. Shout out to everybody else that just posted to the building. Patrice is here. She said they might be here to double click on the information architecture question [00:48:00] of Jurassic and Dare George Barris going on. And I think that's a big deal in the room. What's up, everyone? A couple of things coming in here. I'm looking at Charlie Littleton, who's a data and analytics manager at Procter and Gamble says that as a tip, just stay humble and ask for help when you need it. I love that advice. Thank you so much, Charlie. If you want to join us. I'll send you a link. So go ahead. Let me know if you get follow up questions or anything. 
I think someone called Kathy was saying, how am I liking everything so far? So I don't know if this is a space to just comment on that. Go for it. Yeah, go for it. How are you liking your first month on the job? It's your first job. Yeah. No. So I think when you first started your podcast, I mean, this session, sorry, this happy hour, Harpreet: [00:48:53] I was in marketing at the time. And then one year after that I did a masters, pretty much, and I completed that. And in the last three months or so I just found this job. It's pretty cool. It's like a different domain, from marketing to the, what you call it, financial regulation. But the funniest thing I'd like to say is I didn't think I'd actually get there, but when I spoke to my boss, the reason why he said he liked my CV is basically I worked with Google Data a lot, so I worked in the marketing domain before. And I think that kind of just goes to show, like, even if you're not within that domain, you shouldn't be discouraged from it. You should probably just apply anyway because you might have something attractive or meaningful to kind of provide. So that's kind of one thing I would say as well. If people here are struggling in that sort of way, I mean, just go for it. If you have projects or if you have experience, then make use of it, I guess. And then more so in terms of the first month with what Kathy said, [00:50:00] it's quite interesting because I think one thing I was kind of expecting was, even though it's data science, I thought there would be a lot of querying skills. Harpreet: [00:50:11] So I kind of did prepare a bit too much on SQL when that actually wasn't the case. Ironically, it was more software engineering, and it's just something I'm still trying to build. I'm not the biggest software engineering geek or whatever. I still need to improve. But definitely they say that at the edge of your scope is where you learn the fastest.
And so that's one thing that definitely stood out the most, is like my manager who's been there for like five or six years, he's really, really good at like Python engineering and coding, but nevertheless he told me, like, I've been here for five years, there's no need to compare. And so just learning from people like that, I'm kind of happy with. But in other things, like the CRM, being involved in that, and so that's one of the projects I'm kind of doing, even though it's not pure data science, like it's not coding, it's just more about data literacy for the other team. Like I think Ben or somebody else said, it is the department that had some of the. Speaker4: [00:51:18] Other the most. Harpreet: [00:51:20] Pain points. They just need a bit more consulting from us where, like for context, the department is like 60 people and then five of us are like a data science unit. And so we're helping out that specific team, that's like ten of them, but they're not really good at data literacy. They find it hard to do data visualization. And so we're just coming in and being the hearers, really, and just documenting and making sure that whatever project requirements they have for that CRM are coming to fruition. Because the. Speaker4: [00:51:56] First solution. Harpreet: [00:51:56] That my boss said was, right, we might need to do Tableau, but after [00:52:00] the meeting that wasn't the case; it was just using a legacy CRM system and just reworking the reports they already have. So definitely a good learning process, good problems, and sometimes it's not always about Python programming. It's also like what legacy systems they have and just using your knowledge as much as possible. So I hope I kind of gave a good gist of what I was doing and what you can learn from it, I guess. Kathy, any other follow up questions about the first month of the job? Speaker3: [00:52:33] No. Thanks for sharing that insight.
I'm always curious about everyone's journey and you sharing that even though you're you might be underqualified, just shoot your shot. So that's what I kind of started doing since last week. I have some skill, some python, some tableau under my belt. I'm going to finish my master's science and data analytics this summer and has got some data science techniques and elements incorporated. So I'm just trying to see where the the where it lands. Harpreet: [00:53:05] I think I think you're ready to hit the ground running. Just remember, there's more in data science than just the job title data scientist, right? There's obviously data analytics, analytics things in there that might be something that you might be interested in. Product Analytics Insights Analyst is another thing I've seen. So there's all these jobs that don't have the title of scientist, but when we look at it like, Oh, okay, well, that's work I would enjoy doing, right? Like if, if I like doing this type of things, and even though it doesn't have that title, I would still enjoy that job. So when you do search for job that maybe look forward. If you do a search on LinkedIn, do search on skills rather than on actual job titles. Speaker3: [00:53:49] Oh, I actually never thought about that. Thanks for that. I'll add that to my little notes. Harpreet: [00:53:53] Yeah, not everything about having a job as a scientist. I used to have a job to be a scientist. Then I was a lead scientist [00:54:00] and now I'm not even data scientist anymore. I'm in marketing now, but highly technical, highly high leverage type of marketing. I make data scientist job easier by helping them with educational tools and things like that. It's awesome. Joey, thanks so much for coming. Thanks for the question, Patrice. Thanks so much for the follow up question. Let's circle back now that Patrice is here, because I think Russell. Had a bit of insight on the information architecture thing. Speaker4: [00:54:39] Yeah. Yeah. 
So I was saying I heard some of that before Zoom crashed to me, so I'm not sure if anybody else answered it, but the sector I work in is half data and half program, project management, project controls, PMO, etc. So we have a lot of different, shall we call them, work streams of information and information and data are kind of almost two, two sides of the same coin. Okay. So if we were to classify information into course categories such as cost or financial information, time or schedule information, quality information, those types of things then aim to productiveness and stabilize those streams of information. So you've got information, pipelines, as it were, and then transition those information pipelines to data pipelines, and they can then come into a data model. So you'll end up with both a data model and an information model, and they should gel 100%. And if they don't, you know, you've got an issue and you either have to look at resolving something in a data model or resolving something in the information model. And that could be in the way your production using and analyzing the information model itself, where it could be in the information pipelines. Very often it's the information [00:56:00] pipelines because data input quality is just it's the biggest challenge I think that affects everybody in data. But information also, as I said, because information and data are very, very much connected. So so that was my that was my comment. So, Patrice, I don't know if that if that was the way you were looking at information architecture. Speaker3: [00:56:21] Yeah. So I, I have come across information architecture as this new field that I'm really interested in and maybe have been doing in some ways under other titles and names. But my challenge is I run into people who are not in UX and they think they have an understanding from hearing information architecture of what that is. But then they hear me say what I am interested in. 
Then they're like, Oh, I was thinking of a different information architecture. So now I really want to know what the other things people mean when they say information architecture. Ah and I think you just described that from a, a very yeah. It was helpful to hear like oh this is, this is a thing that people might be hearing when I say information architecture that might be different from what I'm experienced or it might be the same in some ways, but it is at least inside of UX a not or it's not that it's not defined. It's defined by too many people in too many ways for there to be one, maybe for there to be one easily shareable definition, and it overlaps with a lot of other things. So yeah, any instances of information architecture that people are coming across are using or encountering in their work? I'm interested in hearing how that shows up. Harpreet: [00:57:52] Crickets. Russell, anything bad there or not? Anything bad there or anybody? Yeah. I'm [00:58:00] just. Speaker2: [00:58:01] So sorry. Go ahead. Speaker4: [00:58:01] Russell. No, I was just going to say I've seen the lengths that you've put in there, so I'm going to check those out. So then go ahead, please. Speaker2: [00:58:10] The reason why everyone has a different definition of information architecture is because you've hit on like the largest. There's an entire field that is that deals with different ways to architect information and to go from having data to having something that's usable. And you start going almost immediately to relationships between data and you go from your categorical taxonomies, you move one, and that's sort of the hierarchy and the structure. You go one step above that, and now you have ontologies which define the relationships between objects or concepts or, you know, really anything that's in information and ontologies contain sort of a crude domain knowledge. 
And that's why you're getting so many different interpretations of this, is because you ask anyone from like 15 different fields and you will get like you can ask a biologist and they're going to give you a different answer. I'm going to start talking about like hardcore ontologies that they build out to classify different types of knowledge and different types of biologists will define it's it's crazy how big the topic is. So when you say information architecture, I think I would always put like an asterisk after it and in parentheses put exactly what segment that you're talking about and to what extent. Speaker3: [00:59:45] That that would make sense. Yeah. Harpreet: [00:59:53] All right. Any other input on this topic? Let me know. Yes. [01:00:00] I'll stop. Speaker3: [01:00:04] Really? Yeah. I think the categories you make, the relationships you make, the things you put inside of other things and the connections that you make to them are that. Yeah. I guess that's a basis for user experience in terms of someone making their way through a product, but it's also the basis for how people can access or can't access information in a database or I don't know, maybe there's like maybe it's a basis for like ten more or 100 more things that. Different fields are built maybe in any field where you're building something, the the way people can get to what's available there could be considered how it's set up could be considered information architecture. Uh, behavioral ontologies. Interesting. Harpreet: [01:01:02] Yeah. This stuff goes way over my head. These guys talk about ontology. Speaker3: [01:01:09] At least the way I learned it was just the thickness of a thing. Is it a thing or not? If. If it's become a thing, then it's. It has an ontology or is an ontology. I'm not sure which one would be correct there. So I guess the way people do things would be a behavioral ontology by definition there. Speaker2: [01:01:29] What? What you're building out is an ontology of. 
Someone interacting with whatever it is that you've built out. And so that's a behavioral ontology because you're not interested in what it is that they classify it as or segment it as. You're trying to segment by behavior and you're trying to get a particular type of reaction out of them. You're trying to fit into a particular paradigm. So you're trying to get a response [01:02:00] from them by providing something, some particular type of experience. And so you're building behavioral ontologies, and that's going to end up being a segmentation because you can serve the same thing to two different people. And if they have different behavioral ontologies, they'll react differently. So and I think that's what you're trying to get at with UX is how to figure, to figure out how to know what to give someone in order for them to respond to the way that you want them to, to your particular design element or design aspect. At least that's what I'm guessing that you're using the behavioral ontology to do. Speaker3: [01:02:36] Yeah, because like, I guess one of the big challenges is a lot of a lot of products or even things that are I think this happens in real life too, right? People, they build a it happens in the built environment, but maybe more online because like it's there. But people don't know how to find it or don't know what it's called or don't know how to get to. And so that's actually different. Speaker2: [01:02:58] That's just a knowledge because you're doing a search and you have to figure out for that person's ontology, what does that search mean? Because if you do fancy restaurant, depending upon your ontology, that can mean a ton of different things. And so no, that's more of a traditional ontology is trying to figure out how that person connects objects together. So it's just concepts. And so if I say fancy restaurant, for me that might mean a really nice Italian restaurant. For somebody else, that might mean a steakhouse for somebody else. 
That's sushi for somebody, you know. And it could be that your ontology connects fancy with the type of food, or it could be that your ontology connects fancy with the price point. Or it could be that you have some combination of environmental factors, something that you look at in a fancy restaurant. And so that's really what you're. Yeah, that's the that's a very traditional ontology. Speaker3: [01:03:54] Yeah, I think it's also. So Abby Covert wrote a book in which she made the case [01:04:00] for this is something that everybody does like and any I know, any writer does this right. So say you write a book, you have to decide what are the chapters going to be called? Where are you going to segment them? How are you? Are you using chronological order or are you making your own order? That makes sense because of the way you of the things that happened in the story or some other way that makes more sense to organize them than to tell them in the order that they happened. So I think it's like a I think she would say information architecture is the way you organize things so that people can make sense of them very in in a very, very, very broad way. But I think if I I'm starting to appreciate, just like I knew there was something technical that had information architecture in it, like the people who build a, I don't know the basis for, I don't know, company computer systems or whatever else is getting built out there. But I think maybe way beyond that, there's many fields that have a way. They look at this and it may not even be called information architecture. They may not think of it as that, but I guess it's an organizing. Everything has a system of organization in some respect. Speaker2: [01:05:16] You can scare even hardcore scientists like Post. Harpreet: [01:05:19] Doc. Speaker2: [01:05:19] With the word ontology and creation of ontology, you can literally scare an entire room of scientists out where they will jump out and leave. 
And someone who can build an ontology is seen as like Iron Man or Wonder Woman. You are legitimately that level. If you can build an ontology in a scientific domain, it's like the top of the food chain in science. Speaker3: [01:05:48] Medical informatics. Harpreet: [01:05:51] Just did a quick search on Google: information architecture with data science. And it seems like there's a lot of research and just [01:06:00] topics and articles written on data architecture versus information architecture. So that might be something worth looking into to see kind of that intersection. Speaker3: [01:06:11] Yeah. You know, was it here last week? There's this distinction that people make between data and information in different ways too that I think is kind of fascinating. Harpreet: [01:06:31] Wisdom hierarchy? I think it is. There's the raw data, then information, and so forth, each building on top of the other. Great questions. Let me know if you guys have any other questions. Shout out to Mark Freeman in the building. What's going on? There's a highly specific question here from Cristiano about SHAP and XGBoost. He's asking about the base value. So let me ask you this. Let me set the question. How have you approached trying to find the answer to this? So I was explaining this to my manager who does not have any idea about SHAP. So when I was explaining the whole SHAP value to him, when I was explaining the force plot, I saw that there is something called the base value, and. Can I just project my screen? It would be just easier for everyone to see. Yeah, definitely. Look, I'm not allowed to share my screen. It might be system preferences on your end, because people should be able to share screen. But pretty much your question is on the SHAP [01:08:00] force plot and we're trying to find out what the base value represents. So if anybody has insight on this, this is a we're sure that we need him because he's a naturalist. But yeah, go ahead and expand on that.
But yeah, go ahead and expand on that. Anything. Harpreet: [01:08:15] So base value like in one of the articles I read was it represents the average probability for if you're doing a binary classification, it's the average probability of the entire training set that it might be either zero or one. It I see it as that threshold or that splitting point, but I was just not sure because when I explained this to my supervisors in my company, I saw myself stammering. So I just wanted to cross-check and, and this question kind of popped in my mind. So is it like is it like just is it anything that's on the left hand side would be classified as zero or the right hand side of the base value that would be classified as one? Is it like the threshold or how? My question. Anybody has any input here? Definitely. Feel free to chime in. Top of my head, I don't. Don't have any. To add that, I'm just curious, are your people that are reporting? Are they data scientists themselves? Or are they just business people? Sorry. Yeah, this is like the. Yeah, it's the business people and with little bit knowledge on data science. So I, I found myself like a little confused myself, and I was just trying to explain [01:10:00] that plot. Okay. Yeah. When we were trying. Speaker3: [01:10:04] To solve for. What are you trying to solve for? Harpreet: [01:10:10] So I was trying to solve a whether let's say an app, an employee is quitting. The company are not quitting the company. What is the average probability? So the base value basically represents the average probability of like quitting the company. Let's say it is 3.36 to. So on a scale, if you can imagine anything that is on the left hand side of 3.362, would how would you see it. If I after I build the model now I'm just trying to test it out on a test set. I take out one zero from the test set and I apply my model into it. The model says it is like a little bit on the left side of 3.362. 
Does that mean it kind of indicates that, let's say, this employee is unlikely to leave the company? If it is on the right side, the employee is probably more likely to leave the company. So is 3.362 the split point, or what? Mark, if you want to go. Mark, go ahead and explain, then Russell after that, because he's got some comments in the chat. Perfect. So this isn't going to be on the model side of it. From what I'm hearing, I guess the pain point is that you shared this information from the analysis you did with business stakeholders and it just did not stick with them. Right? Is that [01:12:00] correct? No. They asked me what it was, and I could not explain it properly. I was trying to explain it to my colleague and was myself not very sure about what the base value really represents on the scale. Okay. So my advice was going to be around adapting this kind of technical knowledge for business stakeholders. But based on what you clarified, that doesn't seem like the real problem. So it's almost a side thing. I don't think what I'm going to say is going to help. Russell, I see some comments here in the chat. Russell might be a bit frozen. Yeah. Speaker4: [01:12:48] No, no, my laptop's just hanging a little. Yeah. So this just seems to me like it's normalizing an analog proposition to a digital return. So whatever metrics are being utilized to determine the probability of a worker leaving, and for context, I don't know, perhaps you would look at their average salary, their appraisal scores, their annual leave percentage, and any number of factors that are being used to determine their score. And you've mentioned a 3.62 or something similar as the threshold value. So if they're less than 3.62, they're unlikely to leave, or less likely to leave. And if they're over that, they're more likely to leave.
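Aside (not part of the recording): a minimal sketch of how the base value in a SHAP force plot is usually read, assuming the standard additive definition in which the base value is the average model output over a background set and each feature's contribution moves one row's prediction away from that average. The toy attrition model, the feature names, and all numbers below are hypothetical, and the Shapley values are computed by brute force in plain Python rather than with the shap library. A value like 3.362 cannot itself be a probability, so the sketch assumes the plot is on a raw, log-odds style score; that is an assumption about the plot being discussed.

```python
from itertools import combinations
from math import exp, factorial

def raw_score(row):
    """Hypothetical attrition model returning a raw score (log-odds), not a probability."""
    salary, appraisal, leave_days = row
    return 0.5 - 0.04 * salary + 0.3 * (5 - appraisal) + 0.02 * leave_days

def sigmoid(z):
    """Map a raw score to a probability."""
    return 1.0 / (1.0 + exp(-z))

# Hypothetical background rows: (salary in thousands, appraisal 1-5, leave days)
background = [(40, 3, 10), (60, 4, 5), (80, 5, 20), (50, 2, 15)]

def expected_output(model, x, known, background):
    """Average model output with the features in `known` fixed to x's values
    and every other feature taken from each background row in turn."""
    total = 0.0
    for b in background:
        mixed = tuple(x[i] if i in known else b[i] for i in range(len(x)))
        total += model(mixed)
    return total / len(background)

def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(x)
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                with_i = expected_output(model, x, s | {i}, background)
                without_i = expected_output(model, x, s, background)
                phi += weight * (with_i - without_i)
        phis.append(phi)
    return phis

employee = (45, 2, 18)  # hypothetical test row

# With no feature fixed, this is simply the mean model output over the background.
base_value = expected_output(raw_score, employee, set(), background)
contributions = shapley_values(raw_score, employee, background)
prediction = raw_score(employee)

print("base value (average raw score):", base_value)
print("contributions per feature:", contributions)
print("base + contributions:", base_value + sum(contributions))
print("model raw score for this row:", prediction)
print("as a probability:", sigmoid(prediction))
```

Under these assumptions the base value is where the force plot starts, and a row landing to its left or right only says that its score is below or above the average. Whether that average also happens to be the cutoff between the two classes depends on the model and on the scale of the plot.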
So there's going to be a strength or a weighting to that. So basically, you want to assume that your 3.62 is your zero point, and anything that's less than that means they're likely to leave, and anything that's more than that means they're less likely, or vice versa, depending on how the calculation is done. So it seems to me that they're asking you to just predict, for any worker, are they likely to leave [01:14:00] within a period of time or not? And they're not worried about the strength of the return. So you could get someone that's like a zero, which is really quite far away from the 3.62 datum, or you could get someone that's 100, which is way more in the other direction, whichever way this is done. And it sounds like they don't want to know that. They just want to have a rough calculation across the overall workforce: how many people are likely to leave within a specific time frame, and how many people are likely not to leave within that time frame? So I'd classify this as simply normalizing an analog return to a digital yes or no, and I think it's as simple as that. Harpreet: [01:14:40] I like that a lot. Thank you very much, Russell. So Russell knows his SHAP values and what they mean. So there you go. It'll be there, recorded on YouTube right after this. If you need to run that back, go to the YouTube channel and just kind of rewind it and you can take notes. Let's go and see if there are any other questions or comments going on in the chats anywhere. Don't see anything coming into the chat here. Oh, George has a question, if you're still here. Yes. Okay. Go for it. All right. Thank you. Yes. Okay. I've got two questions. The first is: if you are in the training space, how do you prepare your curriculum so that it will reflect what is currently in the industry? And of course the industry is always changing and all of that.
So how do you set it up in such a way that it's also dynamically changing [01:16:00] or reflecting what is currently obtainable in the market? That's the first question. The second is that we're trying to reach some firms and companies to get their business stories. We would use these stories for a compilation. We're trying to compile this so that we can put it out there for people to learn from and all of that, and probably tell their story. So do you think it's something they will want to share, and if you were to approach them, how would you go about that? Those are my two questions. Harpreet: [01:16:40] Let's take the first question first and then we'll circle back to the second one. So the first question was: if you're in the tech training space as a trainer in data science, how do you make sure that the curriculum you create reflects what's going on in the industry? Good question. Mark, can you take a stab at that? Yeah, I can definitely take a stab at that. I think a key thing is there are passive ways of doing that and then there are active ways. A passive way is just being on LinkedIn, following the right people, reading newsletters, reading the current books coming out, reading the various research papers. And I say passive because you actually have to read it, but you don't have to identify people in the sense of getting time with them to talk. Actively, over a certain amount of time, you start building a network and you can start seeing what people are currently implementing in production. So another key thing to think about is what's being talked about maybe in academia, so those papers, versus what's actually being implemented within companies. And so building up a network and talking to people, learning about their tech stack, learning about what they're implementing and, more importantly, what problems they're facing and trying to solve.
[01:18:00] That's how you kind of get more into contextualizing your content for what you're doing. Harpreet: [01:18:07] And I think it's kind of hard in our space. Our space moves really fast. I mean, you have the basic stuff that's probably not going to move around, like statistics and train and test splits, right? But going to the more modern data stack, that's constantly changing, almost every quarter, right? And so being aware of who the thought leaders in that space are, or the companies in that space. So for example, Harpreet had this really great post about, was it Feathr, from LinkedIn, open-sourcing their packages. So being aware of things like that, that's kind of how you stay on top and make sure that it's really still industry standard. I think it just requires you to be embedded within the industry, and more importantly, by being embedded in the industry, that's like a service in itself. And that's the value you bring to companies when you do your courses. And how you can differentiate yourself is like, hey, I'm not just putting out random stuff. I'm embedded in the community, I'm embedded in the industry. I know what's really happening right now. I can adapt accordingly. But I'm curious what others have to say. Let's hear from Patrice. Patrice, go ahead. Something along similar lines to what you're saying, Mark. Speaker3: [01:19:24] Oh, just one thing that I've experienced as a student is the instructors will invite people who are data scientists, or who are working on whatever the content of the day is, to come in and talk about a problem or talk about a slice of the industry that they know. And I think that's a very effective way to take a day in the life of the industry and insert it right into the course.
And it's also, I [01:20:00] think, possible to find people in industry who want to make time to do that, because they maybe have been away from schools and instruction for a while, and they want to see what you're doing in the classroom as well. I don't think I have much more to add on that. I mean, I'll stick some people I follow on LinkedIn in the chat, but yeah, that's it. Harpreet: [01:20:26] Then, any input there? Speaker2: [01:20:31] As far as keeping classes and the content relevant to the industry? Yeah, it's really just about having practitioners and people who are building the tools that are being used in the industry, because with those people you're going to get kind of a diverse background and perspective. Whereas if you have a practitioner who's been working at company one for three years and company two for two years, it's going to be fairly granular. But if you're talking about trying to get a broader perspective of the industry, yeah, I'd bring people in who are building tools, because they're having to talk to a bunch of other companies and get those companies to adopt their stuff, so they understand a broader range of workflows. And they'll probably be able to interject, I mean, because they're really, really granular and really applied, you know, they're building a tool. And if their tool doesn't work the way that the people they're trying to sell it to work, there's no way it's going to be adopted. So you're going to get a good, realistic perspective of how things work if you talk to, like, a data scientist or somebody that's architecting those types of packages or those types of products. 'Scuse me. Harpreet: [01:21:46] Any other input? Anything to add there? I'll quickly add: I love talking to vendors who actually really [01:22:00] think about it, because they're talking about that go-to-market strategy all the time and they talk to like hundreds and hundreds of customers.
As I put in the chat, VCs are actually really great to talk to, because they're often evaluating and doing due diligence on many startups who are coming up. And to get to the VC level, not seed investing but VC, a startup has to validate a market need and actually have customers. And so they're seeing early trends way before a lot of people are. And I think there are some people on Medium who are VCs who run blogs based on what they're seeing from their deal flow. Medium blogs? Yeah, like a Medium blog. I think one of them is named Anastasia, I'm blanking on it, but she's a VC or works with VCs and she basically talks about up-and-coming things in data. Yeah, nice. And something that nobody else here said, but I would again just go back to the tried and true things that are always true in the data science field, and that's just teaching the basics. Make sure you teach people the basic fundamentals of data literacy, the skill and essence of how to communicate data to an audience using storytelling, and obviously how to write some code. Harpreet: [01:23:27] Any follow-up questions on that first question, George, before we go into your second question? Okay. You want me to ask the second question now? Well, if you have any follow-up questions on that first question... Not for now. No. The responses have been so brilliant. Thank you. Thank you, everyone. I really appreciate that. We were able to pick up so many points that will really be helpful. Thank [01:24:00] you so much. I appreciate that. So I'm okay with the first question. You want me to go to the second question? I can do that. Yeah, let's do that. All right. Thank you. So we're trying to do a compilation of data stories from companies, more like case studies, using them as case studies and all of that. So the question is about how you would approach them for their stories, to get their data stories out?
How they have become experts and made an impact with data. And also, do you think it's something that they would want to share? Okay. So if I can just make sure I understand, the question is: you're trying to come up with a bunch of case studies to kind of talk about how companies have extracted value from their data, and you want to figure out how you should approach companies to see if they're willing to talk to you and share their stories? Yes, to share their stories. Yes. Harpreet: [01:25:13] So I'd definitely put this over to anyone in the chat that has something to add there. But I would say, you know, maybe first start by just looking and seeing what people have written on their engineering blogs. So Mark mentioned how I posted something about LinkedIn engineering, right? So try to go to the blogs and see if you can pull stories from there, if you can't get anybody to disclose how they made the sausage, so to speak. That's one option. But let's turn it over to Mark. And then from there, whoever else wants to jump in, let me know. So I really love your point regarding just pulling what's freely available online. I think the one question to ask yourself is, when you get this collection of case studies, [01:26:00] what's the goal of that? Are you trying to create a free resource, or are you trying to create a product that you're selling? That changes the dynamic of the relationship. If you're trying to create a free resource, you know, they may be interested. They may see that, oh, it's a marketing opportunity; people will be aware of our service, right? But if you're doing a paid service, you might need to get into contracting, like who owns what regarding the copyright and things like that.
So it can get a little tricky in that sense. Regarding approaching them, I would avoid the kind of "hey, you'll get exposure" pitch. Especially when you're a company, you don't need exposure; you need to pay the bills and fulfill your fiduciary duty. Harpreet: [01:26:42] And some random person asking for something with nothing in return, right, that's not very appetizing. I'm blanking on the word, but it's not a very appealing kind of proposition. And so you have to identify what leverage point you have. What value can you bring to the people you approach through your services? So you may be like, hey, we're building these case studies, I'm just brainstorming right now, but we're building these case studies for this community that we've built. There are a whole bunch of data practitioners in it that we believe are your target market. We would like to collaborate and see how we can share stories that could benefit both of us, right? So that the conversation is really targeted towards building value for both sides. And so I would highly, highly recommend not approaching it like, what can you do for us? But more so think about how you can cultivate value for everyone involved. And that may require financial means as well to get those case studies. But let's hear from you. And if anybody else wants to jump in here, please go ahead and feel free to use the raise hand icon and I'll be sure to call on you. Speaker2: [01:28:00] Yeah. [01:28:00] I think the hardest thing that you can get out of a company is their case studies, because most of them are pretty guarded about them. A lot of times you have intellectual property tied up in a case study. Just as a consultant, it was so hard to get anyone to give me permission to publish a case study that was detailed enough that I just stopped.
I gave up, you know, between the NDAs that I signed and what their concerns were with privacy. I've never gone down this road because it's too complicated. So I think when it comes to getting case studies out of companies, what you're going to find is that with what they share with you, they're going to tell you, oh yeah, that was last week. But no, it was more like three years old. They are notorious liars. And this is something that the Big Five consulting groups run into, too, in trying to get those case studies written that you see on their websites. They have to offer a ton of freebies. I mean, it's a lot of what Mark said: trying to get that out of them is an expensive proposition. In some cases, what you can do to get high-level case studies for data is start watching some of the financial channels. Speaker2: [01:29:14] They always have like a tech hour. And CEOs will go on and talk about what they've achieved last quarter with data and you'll get some very high-level case studies. And you can dive into them then by following people within that business who will talk about individual pieces of it. Like there's a whole bunch of Peloton insiders that I followed to build out a post about what Peloton was doing and what they are doing with data. There's a whole bunch of insiders at other companies that I follow in order to create... Like Zillow was one that I followed a while back to create sort of this tapestry of what was going on inside of the company, because they won't tell you directly. But if you follow enough people in a company that's working with data, [01:30:00] you can piece together enough to create a case study. But I would just be really, really careful and anonymize it. Go by industry and company size. Don't use the company name. So, like, a major retailer in whatever country's marketplace, or a major manufacturing company. Don't actually use their name, because they can take exception to that, too.
Harpreet: [01:30:31] It's super hard to even make case studies from your own clients that have been successful using your product, to get them to share it. It can be very, very hard. So if you are looking for just case studies to kind of excite and inspire people, maybe just look for what's already publicly available on engineering blogs or on LinkedIn and things like that. I think that'll go a long way. There are some comments here. Russell said, avoid asking anything that relates to IP because that will get you shut down in a blink. To have any chance of interaction, the topic should be generic and nothing that stands to adversely affect the target organization. 100% agree with that. Cathy is asking who to follow on LinkedIn. Well, let's start with me. I think that's the start. And then it looks like there's a bunch of other people that Patrice has listed. So yeah, pretty much, go to my podcast, The Artists of Data Science, and look at everyone I've interviewed. There are 200 episodes. You're going to have to go through quite a bit of them and just follow everyone, because they're all quality people, as well as everyone in this chat. So that's a good start for Cathy's question. Awesome. Any other questions [01:32:00] coming through? I don't see anything on LinkedIn or on YouTube. I'll begin to wind down the hour then. Be sure to tune in to the episode that was released today, again, with Christina Stephanopoulos. So, Cathy, that's someone you should follow for sure: Christina Stephanopoulos. If you look up the hashtag #bookaweekchallenge, you'll see her all over there. Definitely follow Christina. She's awesome. Harpreet: [01:32:24] And tune in to the episode that was released with her today. Great episode. If you're going to be at ODSC in Boston from the 19th to the 22nd of April, so that's next week, holler, man, I'll be there. I'm looking forward to meeting as many people as I possibly can. I went ahead and got my booster shot.
So, you know, I'm triple vaxxed. I'll wear my mask and everything, so daps and hugs for everyone. So please come through and say hi. I'll be over there at the Pachyderm booth. Mark, ODSC London, June 15th to 17th? That is not on the agenda. I'm going to MLOps World in June, so I will probably be too tired to travel on the 15th to 17th, but maybe next year. Are you going to the one in London? I am. I am. I booked my tickets before finding out that everyone here was going to East, but I'm going to London, so that's fun. Are you presenting there at all? No, my company just gave us a stipend for events and conferences. I was like, cool, London. Nice, nice. Yeah, yeah. I probably won't go to London. That's two back-to-back trips, too close. If anybody lives in Denver, though, let me know. I'll be in Denver May 23rd to the 26th. I'd be happy to meet up with somebody for a beer. So shoot me a message. Send me an email if you're from Denver. That'll be dope. Again, be sure to tune in to the episode [01:34:00] that was released earlier today with Christina, and tune in to all the episodes. They're all amazing. The Artists of Data Science podcast is pretty dope. Harpreet: [01:34:08] I might be biased, but it's a good podcast. Listen to it. And also, if you haven't already, be sure to follow the hashtag 66 days of MLOps. I just completed day ten, and over the remaining 56 days we'll go more and more into MLOps. Over the next few days I will be listening to and sharing my takeaways from some episodes of the MLOps Community podcast, starring Demetrios Brinkmann and David Aponte. Great podcast. I love that podcast. Mark, I know you were on that show today, so I'm looking forward to your episode. I think that would be the first episode I would review and talk about. But I'll be sharing what I take away from there. And then I'm going to pick up specific topics and talk about what I've learned in those talks.
But then after that, I'm going to go deep on some technical stuff. We're going to start with Docker, move to Kubernetes, and then talk about all the different deployment stacks. We'll go from Kubeflow to Seldon. I'll obviously be talking about Pachyderm, since I work there, and then talk about how all those things fit together, and talk about the work that my colleague Dan Jeffries is doing over at the AIIA, that is, the AI Infrastructure Alliance. So keep an eye out for 66 days of MLOps. So go ahead and follow. That's it, y'all. Thank you so much for being here. I appreciate you all spending some time with me today. Remember, my friends: you've got one life on this planet. Why not try to do something big? Cheers, everyone.