HH87 - The Artists of Data Science Happy Hour #87-KUEpn6uiapM-128k-1656270688400.mp3

Harpreet: [00:00:09] What's up, everybody? Welcome, welcome to The Artists of Data Science podcast, The Artists of Data Science happy hour, that is. It is Friday, June 24th, 2022. I'm super excited to have all of you out here. Hopefully you get a chance to tune in to the episode that was released today. I finally released the episode that I recorded with Darlena quite a long time ago, so that's finally out. Definitely check that out on the podcast. We had a great conversation. I can't remember when this was recorded. Must have been late last year. It's been a while, but check it out. It is a great conversation. Just real quick, I want to talk about what's happening in the news today. So they overturned Roe versus Wade, and I think it's a sad moment in US history for sure. I think people, women especially, have rights to their body. This is just kind of a messed up situation. I can't really articulate it, but I know there are so many issues going beyond the fact that they're being denied rights to their own bodies. There are economic and societal implications of this as well. So I know that most of us are data people, and I was doing some research and came across a video by Vivian.

Harpreet: [00:01:33] Your Rich BFF on Instagram is her name, and she just threw out some statistics. I kind of want to put this out here to maybe help educate me, and hopefully educate us, about some of the societal implications of this. Right. So first, she says that abortion isn't just about women's bodies. It's also about money, economic security [00:02:00] and opportunity. So first, access to abortion increases a woman's probability of graduating college by 72%. Right. And not only that, studies have shown that delaying motherhood by one year due to access to legal abortion helps increase women's wages by 11%. So there are a lot of implications beyond that, and I want to bring that to light. Look, I mean, it's a touchy subject, I know, but if anybody here has any insight into this, I'd love to hear it. I know it's mostly men on this panel, but I feel it's kind of our duty to truly speak up about this. So I hate to put anyone in an uncomfortable situation, but let's go with Ken first.

Speaker2: [00:02:52] Yeah. I mean, something that I think is really important is not not just the the female implications here, but also the economic implications of a decision like this. One of the first books that really got me excited about data science and math and statistics was Freakonomics, which I read quite some time ago. And it's either the first one, Freakonomics or Super Freakonomics, that talks about the implications of abortion in different communities. And the relationship with that to crime, to economic upward mobility, to other opportunities was was pretty fascinating. I wish I had the exact numbers, but that's something I would really recommend reading up on and revisiting because it isn't just like, Oh, this is an individual problem. This is something that is a lot larger than that. And it's at least pretty scary on my front to think about how damaging this is, especially for people who do have less access to normal resources. [00:04:00] I think something really terrifying to me is bringing up children without that that are, for example, going to be brought up in scenarios where they're not going to be loved, where they're resented, where they're put into a really bad position and essentially set up to fail. And when we're creating a structural incentive to fail for a lot of youth, that is not a very good outcome. And in my personal opinion. So not to get too political, but I think that there is a lot of statistics and a lot of research done into the broader economic implications of a decision like this that I think should be explored a lot further.

Harpreet: [00:04:47] Ken, thank you very much. I appreciate you sharing some insight there. I remember reading that in Freakonomics as well. It's an interesting study that they had in there. Look, if anybody else has anything they would like to say on this matter, at any point in the conversation, feel free to drop a comment in the chat section or in the live chat if you're watching there. That being said, if anybody has any questions, please do let me know. Or if anybody has any comments, let me know. I'll open it up. Anybody wants to chime in here. Then I think.

Speaker3: [00:05:26] I got to be really careful what I say because. Not happy. But I mean, my first reaction. Good luck enforcing that. You are going to see the first time you try to enforce it. You're going to see what happens. I mean, they're in the F around stage and that's over now. Now they're in the find out stage. You're going to find out. It's that's where it is for a lot of us. It is. That's it. You know, it's gone from. [00:06:00] You're skirting around things and will let systems and hierarchies handle that to know. No. That's where I'm at. Nope. Sorry.

Harpreet: [00:06:16] Yeah. Thanks so much. So if you guys have any questions on any topic in particular, please do let me know. We're kicking it off, I guess. Let's go with what we've learned this week. How about that? If you have a question, yes, please go for it. Okay. Okay. So I am trying to understand database design schemas. I know there are different ones. There's the star schema, which I guess is the one I'm most familiar with, where you've got your fact and your dims. And there's the snowflake schema, which apparently has nothing to do with Snowflake and was probably around before Snowflake. But I don't exactly understand how it's that much different from a star schema, except maybe it's like a star that has legs coming off the points of the star. Maybe, I don't know. I read something about an activity schema that Narrator AI uses, and they think it's pretty awesome, but I don't really understand what it is. It just sounds like it's one giant table. So I was just curious, what have you worked with most? And that doesn't necessarily mean you would recommend it. What would you recommend if you were trying to put together, let's say, a fairly simple database set up for a company? Any input? Yeah. I'm not well versed in all the different schemas, to be honest. What have you experienced most? I've experienced [00:08:00] more of the star, but I don't know all the different alternatives, to be honest.

Harpreet: [00:08:05] I guess that makes me biased. I don't know. I think I think the database design should have a lot to do with, you know, not only, you know, it being, like, intuitive for those that are going to work with it. Because I've seen some some things that are done, you know, like pretty much to convoluted the whole process and make it what's the word very difficult for for people to use. And then the other one is efficiency with efficiency in mind as well. But there has to be kind of like a like a balancing act between both and another one that always gets me is like the naming of things, you know, proper naming conventions and everything, because again, otherwise it becomes a question of usability in the long run. Yeah, I don't have any more comments beyond that. I don't know. I, I, for a while I became somewhat well versed on this because even though I wasn't like the database database administrator, I wasn't a database, it was a part of my job. It was one of the hats I wore in web development. This is a long time ago. So yeah, I guess I just kind of feel like, like, you know, Mark Freeman a few days ago posted about like data modeling and sort of like reading through the comments because it was something I was thinking about as well.

Harpreet: [00:09:40] And I was reading through the comments, and there was one thread that was like 35 comments long, back and forth, and it was great. It was really interesting, and it was for the most part pretty civil, but it was these people who are clearly very well versed in the structuring of a database. And as I read, it's like people just [00:10:00] trash the star schema. But if people trash it, I want to know: is it because it's not actually that great? Because it sure seems to be pretty pervasive. Or is it mostly an academic thing, where we're saying we can do so much better than the star schema? But Serg, to your point, is the alternative going to be intuitive for people to use, and that sort of thing? So that's what I'm trying to figure out: is the star schema the scapegoat, or is it really just old and crappy and we need to use something better? Well, again, it's not my area, but I know a data modeler. His name is Francesco Puppini, and he wrote a book with Bill Inmon called The Unified Star Schema. It looks really interesting. It's strange the way it's laid out, but it makes sense, given the kind of problems that happen when you're doing SQL and querying all these different tables. I forget what he called it, but there is a phenomenon where, even if you do the join the right way, you end up duplicating your records.

Harpreet: [00:11:24] And that's why you see all these cases, I've seen them all the time, of people ending up putting DISTINCT in, even when it's not really needed. Just in case, right? There can never be enough DISTINCTs. I forget what he calls it, but look him up. He has some YouTube videos where he explains the concept and the reasoning behind it, and it's to avoid all these pitfalls that happen precisely by using the star schema. But I don't know the alternatives, [00:12:00] or whether they actually counter these problems in any way. But, you know, the fact that Bill Inmon got involved in drafting this new schema means there might be something to it. Found it here. It looks like there's a data vault YouTube summary, so that's going to be a good place to start too. Thank you. Yeah, Russell chimed in here. Let's go to Russell, and then after Russell, maybe Vin. And if anybody has questions in the chat, just drop a comment.
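A minimal sketch of the duplication problem described above, using Python's built-in sqlite3 module. The table names (orders, shipments, payments) and the values are hypothetical and are not taken from the book or the videos mentioned. Both joins use the correct key, yet the rows still multiply, because one order has two shipments and two payments.

```python
import sqlite3

# Hypothetical tables for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE payments  (payment_id INTEGER PRIMARY KEY, order_id INTEGER, paid REAL);

    INSERT INTO orders    VALUES (1, 100.0);
    INSERT INTO shipments VALUES (10, 1), (11, 1);
    INSERT INTO payments  VALUES (20, 1, 60.0), (21, 1, 40.0);
""")

# Joining two one-to-many tables to the same parent yields 2 x 2 = 4 rows,
# so every payment is counted once per shipment.
naive_total = conn.execute("""
    SELECT SUM(p.paid)
    FROM orders o
    JOIN shipments s ON s.order_id = o.order_id
    JOIN payments  p ON p.order_id = o.order_id
""").fetchone()[0]

# Aggregating payments to one row per order before joining avoids the fan-out.
safe_total = conn.execute("""
    SELECT SUM(p.total_paid)
    FROM orders o
    JOIN (SELECT order_id, SUM(paid) AS total_paid
          FROM payments
          GROUP BY order_id) p ON p.order_id = o.order_id
""").fetchone()[0]

print("naive total:", naive_total)
print("safe total:", safe_total)
```

Adding DISTINCT to the first query would hide the duplicate rows without fixing the sum, which is one reason it tends to get sprinkled in "just in case".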

Speaker3: [00:12:48] Okay. Thanks, Harpreet. So, talking about schemas, I think there's a lot of snobbery about schemas in the industry, perhaps because the star schema is one of the easiest to start with, especially if you've got a low number of data sets. You know, you've got your fact and your dim tables. It's very easy to set up something with, you know, three or four that all relate back to a single central point for the star schema. However, in my opinion, very often they're not implemented as rigorously as they should be. As Serg was mentioning, you might need to put DISTINCT in more often because the tables haven't been structured and linked as well at both ends. And it depends whether you allow cross filtering in both directions across either point of the central star as well. That can give issues. Very often I'll also add additional nodes onto the outer points of the star, so it'll kind of become a star connected to a star, or there'll be an extra fact table attached to a single dimension table somewhere. I'm sorry, the other way: an additional dimension table attached to a single fact [00:14:00] table. And if you do that a lot, it then migrates into what I think is the snowflake schema. And I think that's in the text that I'm going to put the link to in the chat. But yeah, I think it's more snobbery, because it's almost the most accessible schema for people first coming into data modeling. It's the first one you get to, so people want to consider themselves above it and move on to something else, even though it might well be the best for some basic databases.
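To make the star versus snowflake distinction concrete, here is a small sketch using Python's built-in sqlite3 module. All table and column names are hypothetical. In the star version the product dimension is flat; in the snowflake version the category is split out into its own table hanging off the dimension, which costs one extra join for the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: the product dimension carries the category name itself.
    CREATE TABLE dim_product_star (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

    -- Snowflake: category is normalized into its own table.
    CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_product_snow (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER REFERENCES dim_category(category_id)
    );

    -- One fact table shared by both layouts.
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, revenue REAL);

    INSERT INTO dim_product_star VALUES (1, 'bat', 'baseball'), (2, 'glove', 'baseball'), (3, 'cleats', 'soccer');
    INSERT INTO dim_category     VALUES (1, 'baseball'), (2, 'soccer');
    INSERT INTO dim_product_snow VALUES (1, 'bat', 1), (2, 'glove', 1), (3, 'cleats', 2);
    INSERT INTO fact_sales       VALUES (1, 1, 50.0), (2, 2, 30.0), (3, 3, 80.0), (4, 1, 50.0);
""")

# Star: fact -> dimension, one join.
star_rows = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product_star p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

# Snowflake: fact -> dimension -> sub-dimension, two joins.
snow_rows = conn.execute("""
    SELECT c.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product_snow p ON p.product_id = f.product_id
    JOIN dim_category     c ON c.category_id = p.category_id
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()

print(star_rows)
print(snow_rows)
```

Both queries return the same totals. The trade-off in this sketch is fewer joins and some repeated text in the star layout versus less repetition and more joins in the snowflake layout.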

Harpreet: [00:14:34] Cool. What you said there made me think. You're talking about how sometimes you get into trouble based on the design. So I guess the step back from that is, are we identifying our entities? The process of even getting to the point where we're going to lay out what we want this thing to look like once we actually pay money to build it. I would love to understand that process better as well. I'm sure there are frameworks and books all about that. Russell, thank you very much. Anybody got anything to add here on how we design a database? I can honestly say this is something that I have not spent any time thinking about in my career, so it's good to hear all you guys have discussions that are over my head. Vin, go for it.

Speaker3: [00:15:24] I would just say that it's optimization. At the lowest level, they're arguing over optimization. And Russell kind of nailed it when he talked about relationships and the complexity of the relationships. Beginning to understand your schema is really about the connections between your categories. If you think of every column and every table, you're thinking of a different category, supposedly. Some of those categories are related to each other by business logic. Some of them are related to each other by something a little bit more natural, just basic [00:16:00] domain knowledge and domain expertise. And the further you get down this rabbit hole, the more you start doing performance monitoring and analysis, and then that tells you what schema you actually need. It's not the schema you start out with. It's what about that schema doesn't work, and then you fall in love with one or the other. But remember, there's also a different type. You could do NoSQL and you could do graph, and sometimes having less structure is more optimal. Sometimes it absolutely makes your life a nightmare. And once you get to a certain level of complexity in the relationships between data, that's when you want to start exploring a graph database, because it is so much more elegant. You're no longer just querying along the lines of data, you're querying along the lines of relationships, and you don't necessarily want to learn about the data.

Speaker3: [00:16:59] You want to learn about the information contained in the relationships between data, and that's where you begin to evolve into the graph structure. So just always remember that what you start out with, like just build a relationship table and you're good. Like that's, that's step one and they'll tell you to build out like this. Most databases have tools where you can map the relationships and everything and get a visualization of it. And that will give you an idea of how horrible what you're encountering is. So at that point that you actually put together a database and start creating relationships. It is that visualization. After you have about 15 or 20 tables that you start realizing like the architecture and why we have all of these different schemas and different types of databases and everything else. So that's that would be what I would add is just realize that at the very beginning it's not that important except to begin to document relationships.

Harpreet: [00:18:01] And [00:18:00] so, hearing you guys talk about all this, I know there are probably some people listening, and even myself, wondering: is this shit that I should know how to do? As a data scientist, as someone who's primarily doing machine learning type stuff, is this a skill that we should have too? Well, then. Yeah. Awesome. So any questions? I don't have a qualified answer to your question, Harpreet. Let's get Vin's take. But one thing I will say is I didn't understand even the star schema. I mean, I got the idea of how tables are supposed to relate, but I didn't understand the logic behind that until like a year ago, when I started working in my job, and I was like, oh, this is brilliant. Why didn't I think of this before? Because I never needed to, for one thing. But if I had known about it, if it had even been mentioned during school or something like that... It just seems like pretty low hanging fruit to say we're going to think of data in just a little bit of a structure, where we're going to have a fact table and a couple of different dimensions, and that's how you're going to join things in ways that make sense. So I don't think it needs to be something like go out and build a database yourself. But I think even just having a good conceptual knowledge of that would have done me a lot of good. And in the comments you said that you love graph databases.

Harpreet: [00:19:42] So real quick, you also talked about graph DBs. So what do you love about graph databases? What have they made easier to do for you? Well, I think for lots of applications that [00:20:00] could leverage the kind of tools where you're seeing things in terms of nodes rather than in terms of records. Things like route optimization, you know. For my startup I used graph and NoSQL a lot. My startup had a lot of geospatial data because it was about places and events, so it was very important to know what was near. So it's really good for that kind of application. It really depends what you're working on, like what Vin said. It's not something you can apply to everything. Or at least, well, maybe you can, but maybe you shouldn't. I'm drawing a blank, but there's a vast number of different applications it's useful for. It could be social networks, where you're analyzing connections between people. Or maybe Ken here can tell me if there's something applicable in sports analytics, but I'm sure there is, because there are so many relationships between people. Like, if you want to ask the question of, what is it, the six degrees of Bacon, or is it seven? How close is one node to another, that sort of thing. It's really good for that sort of thing. I don't think you can do that as easily with a traditional, like, DBMS. Have you heard of the use of graph databases in sports analytics?
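The "how close is one node to another" question can be sketched in plain Python with a breadth-first search over an adjacency list. This is only an illustration of the idea, not the API of any particular graph database, and the people in the graph are made up.

```python
from collections import deque

def degrees_of_separation(graph, start, goal):
    """Return the fewest hops from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

# Hypothetical "who knows whom" graph stored as an adjacency list.
knows = {
    "ana": ["bo"],
    "bo": ["ana", "cy"],
    "cy": ["bo", "dee"],
    "dee": ["cy"],
    "eli": [],
}

print(degrees_of_separation(knows, "ana", "dee"))
print(degrees_of_separation(knows, "ana", "eli"))
```

In a relational database the same question usually needs a self-join per hop or a recursive query, which is part of why graph-shaped storage is appealing for this kind of lookup.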

Speaker2: [00:21:51] In some sense, yes. I mean, we do a lot of work with geospatial and positional data. Um, to be perfectly [00:22:00] honest, I haven't really worked with too many that I, that like off the top of my head I can think of. But there's definitely applications. It's more about adoption and then like what companies, what teams, what organizations are willing to experiment for either efficiency or ease of use.

Harpreet: [00:22:19] How does that work? You said geospatial and positional locational data. How does that work with sports analytics that like trying to say like if somebody is running across a field and they're striking taking a football or something like that or a soccer ball, where does it go? It's how to use that type of data in sports analytics.

Speaker2: [00:22:38] Yeah, well, so a lot of it is just the positions of all of the athletes on the court or the pitch or whatever it might be, and the position of the ball. And so from that, it's all time series. You can play out the entire game as a series of points, and you can predict what will happen next in a frame. You're using a lot of time series predictions to figure out what might happen next, or if there's some instance of injury that might happen in a certain circumstance based on a bunch of the moving parts. So there are a lot of opportunities. You can also get finite events from that time series data, like when someone takes a shot or any of those types of things. So you have a lot of different information that you can discretize if you really want to. At the most basic level, you have all these points, and they can be refined into specific actions.
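As a rough sketch of how raw positional frames could be refined into discrete events, the example below flags the moment the ball's speed first jumps above a threshold. The frame layout (time, x, y), the threshold, and the idea that a speed jump marks a shot or pass are all assumptions made for illustration; real tracking pipelines are far more involved.

```python
from math import hypot

def detect_fast_ball_events(frames, speed_threshold):
    """Return timestamps where ball speed rises above speed_threshold.

    frames: list of (t, x, y) tuples with strictly increasing t.
    """
    events = []
    was_fast = False
    for prev, curr in zip(frames, frames[1:]):
        dt = curr[0] - prev[0]
        speed = hypot(curr[1] - prev[1], curr[2] - prev[2]) / dt
        is_fast = speed > speed_threshold
        # Record only the transition from slow to fast, not every fast frame.
        if is_fast and not was_fast:
            events.append(curr[0])
        was_fast = is_fast
    return events

# Hypothetical ball positions sampled every 0.1 seconds.
ball_frames = [
    (0.0, 0.0, 0.0),
    (0.1, 0.1, 0.0),
    (0.2, 0.2, 0.0),
    (0.3, 2.2, 0.0),  # sudden jump in distance covered
    (0.4, 4.2, 0.0),
    (0.5, 4.3, 0.0),
]

events = detect_fast_ball_events(ball_frames, speed_threshold=10.0)
print(events)
```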

Harpreet: [00:23:36] Now I'm curious, like, how do you even even get collected? Is it like wearable devices or is like somebody actually going through a video and saying, oh, it was at this.

Speaker2: [00:23:48] Point they're.

Harpreet: [00:23:48] Not doing it right.

Speaker2: [00:23:50] So there are two ways they do it. One is video. So for each basketball game, for example, there's a company called Second Spectrum that has [00:24:00] their equipment set up, and it takes really high quality video of the court and maps all the players as they move around. In the NFL, they have trackers, like little chips in their football pads, that are tracking the position of all the players. In the future, I expect there'll probably be more sensors where you could get things like heart rate, blood sugar. Crazy stuff. I mean, we could do that now, but athlete unions are very much against that, because they believe that teams will use that to lobby against the players, or in contract negotiations or things like that. It would probably be a really good thing for the teams, but not necessarily a good thing for the athletes. It could help them improve a bunch, but it'd be one thing that could be held against them. So that's generally the way that information is collected as of now.

Harpreet: [00:25:05] Is that sort of like some type of, I guess, bringing this back to databases when you get that data? Is it just like a database like JSON blob or how does that get sent? Yeah.

Speaker2: [00:25:18] I've only worked with it in a database format as a like a normal SQL structured database. I would imagine that the companies that collect it don't store it in that format because it doesn't seem like that would be the most efficient way to keep track of all of it. I would probably have to ask them, which maybe I'll try and do on the podcast one day.

Harpreet: [00:25:41] Yeah. That's interesting. Ken, thanks so much. And you recently did something with SAS that was pretty interesting, right? Using machine learning to get better at swinging in baseball. How does that work?

Speaker2: [00:25:54] Honestly, that was sick. So they did this project to teach about data literacy. Essentially [00:26:00] they built this batting cage at the SAS campus. Well, they didn't train the pose estimation model. They used a pose estimation model to get embeddings for college baseball swings. And then they built a model that would compare your baseball swing, and the angles of your body at each point in the swing, to determine how far away you were from a college baseball player's swing. And then based on that, they told you how you could improve your swing. So something I've been talking to a lot of companies about, especially related to sponsorship, is that it's really hard to sell a machine learning product to someone who doesn't know a whole lot about machine learning. And this is a really cool and simple way to show the value that machine learning is creating, in something that a lot of people understand, which is sports. You go in, you hit balls, it tells you what's wrong with your swing, you make those changes, and in general you hit it harder. They were using, I think it was ten kids over six weeks, to document where they started and where they finished. They didn't have a control group, so I gave them a little flak for that. But the progress that some of these kids were seeing was really impressive. We're talking about 15, 20, 30, even 100% improvement in exit velocity when hitting the baseball. So if anyone wants to learn more, I made what I think is a very fun video about that.
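The comparison described above, measuring how far a swing's angles are from a reference swing at each point in time, can be sketched with plain geometry. This is not the SAS project's actual method. The keypoints (shoulder, elbow, wrist), the frame alignment, and the mean absolute gap as a score are all assumptions made for illustration.

```python
from math import atan2, degrees

def joint_angle(a, b, c):
    """Angle in degrees at point b formed by the segments b->a and b->c."""
    raw = degrees(atan2(c[1] - b[1], c[0] - b[0]) - atan2(a[1] - b[1], a[0] - b[0]))
    angle = abs(raw) % 360
    return 360 - angle if angle > 180 else angle

def mean_angle_gap(user_frames, reference_frames):
    """Average absolute elbow-angle difference across aligned frames.

    Each frame is a (shoulder, elbow, wrist) tuple of (x, y) keypoints.
    Assumes both swings are already aligned frame by frame.
    """
    gaps = [
        abs(joint_angle(*user) - joint_angle(*ref))
        for user, ref in zip(user_frames, reference_frames)
    ]
    return sum(gaps) / len(gaps)

# Hypothetical 2D keypoints, as a pose estimation model might output them.
reference_swing = [
    ((1.0, 0.0), (0.0, 0.0), (0.0, 1.0)),   # elbow bent at a right angle
    ((-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)),  # arm fully extended
]
user_swing = [
    ((-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)),  # arm extended too early
    ((-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)),  # arm fully extended
]

gap = mean_angle_gap(user_swing, reference_swing)
print(gap)
```

A per-frame breakdown of the same gaps, rather than one average, would be closer to the "here is what to change" feedback described in the conversation.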

Harpreet: [00:27:33] Definitely. I'll link to that right here in the in the chat and then also on the show notes. Like it's interesting because you do this with SAS and that's like I use SAS back in the days in insurance and when I was in bio stats and it's not something that I typically associate with machine learning. Did you get a chance to see how these data scientists were building? These models are doing. They're like right shoulder to shoulder and like seeing how they're doing all this stuff in SAS. And [00:28:00] if you did it look just complete different Python workflow.

Speaker2: [00:28:03] No. So I mean, honestly, SAS is making a really strong push right now to integrate well with open source. You can integrate with, I believe, the majority of their infrastructure with Python or with R. You know, everything was Dockerized, everything was put into these confines that people are familiar with and using now. The way I look at it is it's another platform that lets you use a suite of machine learning tools rather than having to specifically tweak them on your own. So, just like a lot of these other products that are more out of the box, maybe like a hybrid between what a data scientist would use and what an analyst would use, it's a little bit more of a GUI and those types of things. I didn't have hands on with the product. I was more just there to talk about the use case. But I will say their campus is outrageous. It was one of the coolest places I've been in forever. It was sick.

Harpreet: [00:29:06] Really, really cool. Vin, it sounds like you've been there. You've checked it out. So tell me about your experience using SAS over the course of your career. I'm sure you've probably seen some crazy things where people are deploying SAS models into production. What does that look like?

Speaker3: [00:29:23] So yeah, their campus is unreal. I've given a couple of talks there and it's, oh, massive. And you don't even realize how big it is from some of the conference areas until you walk around and it's just like, Oh my God, this keeps going. It's yeah, it's, it's huge. And my relationship with SAS, I'm just I'm not going to say anything because it's, you know, if they say if you don't have anything good to say, don't say anything. I have nightmares about SAS, I'll just put it that way. It has its place in the scientific community. It is very useful there. It's like [00:30:00] the stats field and economics field go to. But yeah, I don't know why you would do that to yourself and I'm sorry. Please invite me back. But yeah, I can't. I just.

Harpreet: [00:30:14] I can't.

Speaker3: [00:30:15] I it's hard. It's hard to use. It's hard to.

Harpreet: [00:30:21] Use. Yeah, I do have nightmares about SAS. The biggest pain point for me, I guess, in my transition, after having been nothing but a SAS statistical programmer and statistician for like five years, was changing the way I think so I could write in a more general programming language like Python, because SAS is really different. I can't even compare it. But luckily there's a page in the pandas documentation for SAS users, and that helped make that transition. I definitely don't have fond memories of using SAS and studying for the SAS exams. If anybody has any questions, please let me know in the chat or right here in the room. There are no questions then. Kosta, go for it. You are muted, Kosta. It happens. We've spent the last three years like this, and it's still like, oh, you're on mute, right? So, question for you guys. Have you guys seen, in the tech space, the data science space, and I'm talking a mix of short projects and long projects, four day weeks? [00:32:00] Have you seen it work? Have you seen it fail? Have you seen it executed? I mean, what's your experience with it? Do you think it could work? What are the signs of it potentially working, and what are the signs of it really struggling? I'm just curious. And do people actually see value in it? I'm just wondering what people, I mean mostly in the States, think, because there are different views of it across the world. Right. So. Yeah. Let's go to Ken, and then after Ken, we'll hear from Vin. And by the way, if you've got any questions or anything to add, please do let me know. Ken, go for it.

Speaker2: [00:32:42] So out of the gate, I would say with the whole remote work movement, there are probably a lot of people that are already working four day weeks and are just, I don't know, leaving Slack on or doing whatever it might be. I don't think that by any stretch of the imagination really hurts productivity if you're getting your work done. I mean, the nature of our work is largely project focused. If you're delivering the things that you need to be delivering every week, and you can do it in four days, who cares? Right. At least for me, I could probably do three or four hours of really good solid work each day. Those are the number of hours I'm putting in, and that's all I'm going to get done in that period of time. Why should I be in the office, or why should I be working and doing really subpar work, or work that's detrimental, during that time? So I'm a big fan of just saying, hey, these are the things we have to get done. If you can do it in four days, great. If it takes you five days, great. But we shouldn't be putting in an hours requirement. It should be based on business need. I don't think that's a common sentiment among companies themselves. But [00:34:00] every company that I've talked to or worked with that had a more flexible schedule, more of a "be here if you need to be here, don't be here if you don't have anything to do" approach, has, from what I've seen, worked relatively well. I'm sure that there are examples on the opposite side. I mean, Elon Musk is a stark proponent of the complete opposite, and he's done some relatively neat stuff in his lifetime. So I could definitely be off base on this one.

Harpreet: [00:34:33] But to add some flavor or some context to that. Right. So obviously there's you're right when it's like project based work and it's kind of time boxed and time limited, those kinds of functions are quite easy that I can say, Oh yeah, cool. Course you can finish that in four days. Great. Three days, great. Whatever. But when you're talking about providing proactive, providing like a long term service, for example, you're going to need support, like tech support. You're going to need when your product goes down, you're going to need all these site reliability engineers, you know, customer you know, customer success managers. How do you manage all of those moving parts? While I assume what we're going to see in the next ten years is a general transition to an understanding that four days is a normal thing, right? Is that like from an expectation management standpoint as well? You're trying to hire the best talent. You're trying to set a competitive a competitive difference, right. Where you're saying, okay, we are leaning into the four day workweek. That's better for you. So come join us because we're right. Where does that? Factor into it because these challenges are not small and they do impact you know, they impact clients, they impact projects products. Yeah. So that's kind of the context behind some of. Well, if I may interject, I [00:36:00] think at the moment four days might be enough, and I wouldn't want it to become a competitive on that end, because there becomes a point in which it's not really effective. It's like the the joke in that movie, there's something about Mary where there's like, you know, seven minute abs and they're like, I came up with six minute abs. And then, you know, like, what if someone comes with five minute abs? There comes a point where like, you can't really do anything in that amount of time.

Harpreet: [00:36:33] I mean, at some point there's not enough time for the workforce to continue to be productive. And that's all relative. You know, there are startups that have their employees, whether they want to or not, working 16 hour days, and sure, they're getting a lot of work done, but their employees are getting burned out. So, I mean, it's not a sustainable solution in the long run and they know it, which is why they have so much churn. But I think an established company cannot work like that, so they can start offering these things. But, you know, it's better if there's actually a convention. And as the tools and the technology progress and people in a certain field or an entire industry become more productive, there's already a standard of four days, or even three days as they become even more productive. It's a good thing, I think, in the long run. Actually, there are books by this author, Alex Pang. One book he wrote was called Rest and one book was called Shorter, which is all about the [00:38:00] four day workweek. He was actually a guest on my podcast as well. I think the episode, if you just go to my podcast and type in Alex, it should come up with work less, get more done type of thing. But yeah, his book lays out just such a good case for the four day workweek. His whole argument is that more rest leads to more creativity, more productivity. It's all around better. It helps improve work life balance. Check out the episode. Ken, I see your hand's up.

Speaker2: [00:38:35] Yeah, I wanted to. I don't know if it's necessarily playing devil's advocate, but I had a broader question about that. So I think one of the reasons why the four day workweek can become more attainable is through outsourcing. So there's a lot of essentially non-essential tasks that we don't need to have on prem anymore that we can outsource basically all over the world. And is the four day workweek just squeezing the bag, so someone else has to work more hours somewhere else and we're just working less? Or is it something where everyone can actually work less? Obviously, this varies by industry, by the type of roles and whatever that might be, but I feel like that has to be some component of it, doesn't it?

Harpreet: [00:39:26] Vin, certainly, go for it.

Speaker3: [00:39:30] Y'all, time to tell some truths: you don't actually work 40 hour weeks. Come on, look at me. Look me in the eye. Tell me you work 40 hours. Come on, let's have some honesty here. Everybody's already working a four day work week. Come on. You just show up for work for five days. I love this because data scientists pretend they work eight hour days. They don't. You spend at least an hour or 2 hours [00:40:00] waiting for some blue bar to go across the screen. I call it blue bar time. And it doesn't matter what you do in technology, there is blue bar time where you're waiting for a query to run. You're waiting for something to train, you're waiting for an instance to spin up. I mean, how much time do we spend waiting for something to happen? It's a ridiculous amount of compile time. There's just a ton of time. At one job that I worked at, what we ended up doing was we structured the week so that all of those tasks kicked off before the weekend. So we spent really the entire week prepping for long-running tasks and then we would kick them off either Thursday night or Friday night, depending upon how much time they needed to run. And that was it. Like, there's nothing else to do, you just wait till those tasks complete. And so you could structure a work week so that you had all those long-running things happening, all those things that you had to wait for anyway, especially if you're working with another team or another organization to get you things.

Speaker3: [00:41:00] You know, giving them Thursday and Friday to handle all of those. And, you know, you come back on Monday and all that stuff's ready for you and now you're ready to work. It just makes way more sense to admit reality. We don't work 40 hours. None of us do. And then, you know, sort of be smarter about how we schedule all of these tasks and all these deliverables that we will just be sitting there waiting for anyway. And that's what we ended up coming up with. And in another group, I ended up structuring it so that we had people that worked Monday, Tuesday, Wednesday, Thursday, and we had people that worked Thursday, Friday, Saturday, Sunday. And we ended up compressing project timelines because we were working seven days a week, but everyone was working four days. So we were working four tens. And everyone got three days off. They came back Monday actually refreshed, because you had time off, and you had a bunch of [00:42:00] people that didn't have kids. On their days off, no one was anywhere they wanted to be. And so, I mean, if you go on vacation for three days, you know, people talk about this, like there's no one at Disneyland Monday, Tuesday and Wednesday, there's no one going to a particular movie theater or any of that. And so, you know, from a life standpoint and from a productivity standpoint, you can make these things work, but only if we're honest. So everybody, come on, let's open up and be real. 40 hours, really?

Harpreet: [00:42:36] Listen. Go for it. But yeah. Okay, interesting, interesting point. But the counterargument to that, right, is how much of our time is like that in software. We're guilty of this, right? We got sick of these long compile times, long everything. We're impatient, if anything. Let's be honest, okay? That's our worst flaw, is we're super impatient. That's why we lean to software. That's why none of us in this room stuck to, like, electronics and mechanical design that takes, you know, three weeks of fabrication after you design the damn thing, right? So that's why we're working with software. We like quick turnaround. We like, you know, things taken care of instantly. Here's the catch, though: we start getting to the point where you've got your, as you call it, blue bar time. I'm going to steal that one, by the way. Blue bar time. That's a great quote. I'll keep it. You get to a point where your blue bar time is long enough that it's disruptive to your cognitive flow, and in order to make that blue bar time efficient, you've got to context switch to something else. You've probably got four blue bars running asynchronously at the same time. I find myself doing that all the time just to try and not sit there and wait on, you know, a compile. I've been in a team, it was one of my first forays into software as a professional, where we had this big monolithic web app. Right? And I had hit [00:44:00] a wall.

Harpreet: [00:44:01] I naively did the whole compile and test thing. But there's cheaper, smaller ways of testing the bits that you're working on, and me, not knowing that as a grad, just hit compile and test. I hit compile and the senior dev next to me was like, Wait, did you just hit compile on the whole thing? I'm like, Yeah. He's like, All right, so grab a coffee. We got 25 minutes. And I'm like, What? Right? So we've brought all that crunch time down to like 15 minutes, 10 minutes, and we find all this time to do other stuff and we just add cognitive load. So one of two things: either we actually, like you said, automate to the point where you're able to make these long scheduled tasks, something that you run overnight and come back to or something that you run over the weekend and come back to. And I did that in my previous workplace, where I'd say, okay, I want to do ten or 15 runs of training a machine learning model for object detection with different hyperparameters, different kinds of data set mixes. But I don't want to sit here, train one, wait around, train another, right? So I'd schedule it. So I'd actually have 30 runs run overnight, come back in the morning, and as the first task of the day, figure out what worked, what didn't work. And then within the first hour, I would set up 20, 30, 40 experiments and go about the rest of my day doing design work, dev work, engineering work to keep that flow.

Harpreet: [00:45:20] And then the scheduler would just trigger overnight to train, right? So I could get through 30 or 40, 50 experiments a day because I was able to optimize that workflow. So maybe if we're able to do that, I agree. But I don't know. There's also a lot of stuff where it's just like, oh, I've got to push this PR up. And then that's going to take GitHub Actions 10 minutes to do something. By which point, either I'm twiddling my thumbs for 10 minutes because it's blue bar time, or I'm off doing something else and I've context switched, I'm distracted and I lose focus and flow, right? So there's this weird balance where [00:46:00] we want everything to be so instant, but there are physical limitations and we're not able to get it down to that 30 seconds of blue bar time. It is inherently 5 minutes because there's a container image building somewhere in GCP that takes, you know, and needs a GPU connected or something. It's going to be a 20 minute build, fine. So I don't know, we've either got to cut out the blue bar time or, like you said, automate it to a point where we can do it reliably. So I think if we're going to do that and actually go down to four day weeks, we need to structure the kinds of work.
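
[Editor's sketch, not part of the conversation: the overnight batch of training runs described above could look roughly like the following in Python. The function names, the `fake_train` placeholder, and the grid values are all assumptions made for illustration; the speaker did not say which scheduler or tooling was used. In practice the script would be started by whatever job scheduler is available, and `fake_train` would be replaced by a real training call.]

```python
import itertools
import json


def build_experiment_queue(grid):
    """Expand a dict of hyperparameter lists into one config per combination."""
    names = sorted(grid)
    combos = itertools.product(*(grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def run_queue(configs, train_fn):
    """Run every config back to back and collect results to review in the morning."""
    results = []
    for run_id, config in enumerate(configs):
        try:
            score = train_fn(**config)
            results.append({"run": run_id, "config": config, "score": score, "error": None})
        except Exception as exc:
            # One failed run should not stop the rest of the overnight batch.
            results.append({"run": run_id, "config": config, "score": None, "error": str(exc)})
    return results


def fake_train(learning_rate, batch_size, dataset_mix):
    """Hypothetical stand-in for a real training job; returns a made-up score."""
    return round(1.0 / (1.0 + learning_rate * batch_size), 4)


if __name__ == "__main__":
    # Assumed example grid: 3 x 2 x 5 = 30 runs, matching the "30 runs overnight" idea.
    grid = {
        "learning_rate": [1e-4, 1e-3, 1e-2],
        "batch_size": [16, 32],
        "dataset_mix": ["a", "b", "c", "d", "e"],
    }
    queue = build_experiment_queue(grid)
    results = run_queue(queue, fake_train)
    finished = [r for r in results if r["error"] is None]
    best = max(finished, key=lambda r: r["score"])
    print(len(queue), "runs queued,", len(finished), "finished")
    print(json.dumps(best, indent=2))
```

[The point of the sketch is only that the whole grid is defined once up front and failures are recorded rather than raised, so nobody has to watch the runs while they execute.]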

Harpreet: [00:46:35] And that's going to vary, right? For data scientists, it's one thing. Back end, front end software engineers. What about V&V testers? What about reliability? All of those things. Right? So I see what you mean, but I'm still not sold about how we can execute on it well as a company overall, because there's going to be people that need to work sporadically across five days as opposed to necessarily four days and then take a day off. Or they might be more efficient working six days, working five and a half hour days, potentially. I don't know what the science is behind that, but I'm literally just tossing out numbers and ideas at this point. Just curious. It's kind of already like that. I feel like I'll work like whenever. Like, if I got to work Saturday morning for a couple of hours, I'll get that done, if that's when I feel like I have an idea and I can execute on it and do something productive. Of course, I like to describe it as a blog and somebody should unpack that first. But let's go to them and see if they have anything to add or riff off of. Ken, Russell, Maria, if anybody has anything, let me know. Those of you guys watching on LinkedIn, on YouTube, if you have any questions, feel free to comment right there in the chat section. I'll be happy to take your questions. Or, you know, Vin, just to tell [00:48:00] me that I'm flat out wrong because like.

Speaker3: [00:48:03] No, you're not wrong. That's why I'm saying we've got that second structure that I was talking about, where you split people up into two groups and you have one group work Monday through Thursday and the next group work Thursday through Sunday, and you end up just handing off, you know, so they're working anywhere between 32 and 40 hours a week, because it's like you said, you have these breaks, you have these sporadic stops. And so working more like a ten hour day makes more sense because you're not really working 10 hours. And once again, I'm just being honest. Maybe I'm not talking for everyone, but I know I'm talking for everybody who has these breaks, because I'll have 2 hours in the middle of the day to go work out. That's just how I structure the day, where there's time where I'm waiting for something to happen and I just structure it so that I can work out. And you can do that where, like I said, you have four and four. And so where you have to have people and coverage consistent, that's how you end up doing it. And like I said, you end up accelerating project times using those four ten hour days versus five eight hour days, and you're sort of making a nod towards reality.

Speaker3: [00:49:17] Whereas the other way around is if you have more consistent work where you can do the kind of scheduling I was talking about, where you're basically having these massive long-running tasks. When you have large models, you have these huge gaps where you kick something off and yeah, there's tons of other stuff to do, but you can't really do that because you're always staring at the run. You're like, Is this, you know, should I stop it now? I mean, that's always the thing in your head, is it messed up just now, and do I go do something else. And you don't really have that same anxiety when you have it set up so that it starts off Thursday night and you come back Monday morning and you get to review the results of a very long, very large run. So it just really depends on what type of [00:50:00] work you do, how you can structure it, what type of service level agreement you end up having. It's a balancing act, but you can make it work no matter what.

Harpreet: [00:50:14] Thanks so much. Russell, anything to add there or if anybody has any questions, please do let me know. Otherwise, we're going to wrap it up. Come on.

Speaker3: [00:50:24] I was just going to say, funny enough, I've had this conversation with other people in the last couple of days, actually, just talking about, you know, a four day working week, three day weekend, and whether the business industry as a whole wanted to maintain a five day working week model. And then you could have, as Vin suggested, I think, a moment ago, people that work four ten hour days, but a group of people did Monday to Thursday, a different group of people did Tuesday to Friday. And you run them like shifts so that the business maintains a five day working week, but no single employee works five days. That could be a good model to get everybody to experiment with this, you know, four day working week, three day weekend philosophy, better life balance. And the cost is really only an additional two days, sorry, 2 hours for every day that you work. I think most people would go for that.

Harpreet: [00:51:19] Awesome. Russell, thanks so much. I, for one, am a huge fan of the four day work week. I think it's awesome. I think four by eight or four by seven is a great way to work. More importantly, actually, you should be able to work whenever you want. If Saturday and Sunday don't work for you, then why does it have to be a weekend? Just take it off on Tuesday or Wednesday if that works better, or spread those two days out. Yeah. With all that said, I'm just a fan of just getting shit done on my own time. As long as it's within the constraints and confines of the project, it's not delaying anyone. You know, I [00:52:00] think it was in the industrial age where these rules were established, right? The factory worker kind of mentality. That's not good anymore. It's 2022. All right. Thank you so much for joining. Looks like there's no more questions or comments coming in. I do want to give a huge shout out to our sponsor for this episode. It is Z by HP. Get rapid results from the most demanding data sets, train models and create data visualizations with the data science laptop and desktop workstations. The Data Science Stack Manager provides convenient access to popular tools and updates them automatically to help you customize your environment on Windows two. You can find out more by going to Hpcl for data science. Y'all, thank you so much for joining us today. Appreciate you being here. Remember, my friends, you got one life on this planet. Why not try to do something big.