Speaker 2: (00:00) Be humble. This world is full of smart people, and there are much smarter people than you. This is a community: the more we share, the more we grow. As someone said, we stand on the shoulders of giants, so we are just taking from others, and we'll give to others as well.
Speaker 3: (00:16) [inaudible]
Speaker 3: (00:22) [inaudible]
Speaker 1: (00:31) What's up, everyone? Thank you so much for tuning in to the Artists of Data Science podcast. My goal with this podcast is to share the stories and journeys of the thought leaders in data science, the artists who are creating value for our field, through the content they're creating, the work they're doing, and the positive impact they're having within their organizations, industries, society, and the art of data science as a whole. I can't even begin to express how excited I am that you're joining me today. My name is Harpreet Sahota and I'll be your host as we talk to some of the most amazing people in data science. Today's episode is brought to you by Data Science Dream Job. If you're wondering what it takes to break into the field of data science, check out DSDJ.co/artists ("artists" with an s) for an invitation to a free webinar where we'll give you tips on how to land your first job in data science.
Speaker 1: (01:22) I've also got a free, open mastermind Slack community called the Artists of Data Science Loft that I encourage everyone listening to join. I'll make myself available to you for questions on all things data science, and I'll keep you posted on the biweekly open office hours that I'll be hosting for our community. Check that out at artofdatascienceloft.slack.com. Community is super important, and I'm hoping you guys will join so we can keep each other motivated and keep each other in the loop on what's going on with our own journeys, so that we can learn, grow, and get better together. Let's ride this beat out into another awesome episode, and don't forget to subscribe, follow, like, love, rate, and review the show.
Speaker 3: (02:06) [inaudible].
Speaker 1: (02:22) Our guest today has over 15 years of experience in software development, building products that consistently deliver value while maintaining the perspective of the end users throughout his career.
Speaker 1: (02:32) He's grown teams from scratch and has successfully played the role of both mentor and leader. He's experienced in using multiple programming languages and technology stacks, and he believes in leveraging that experience to paint a holistic picture of the product and ensure a positive impact for his end users. He's passionate about developing innovative software solutions, continuous learning, and applying new technologies and skills that can lead to efficient products. He believes in building products that are performant, scalable, and efficient, products that help drive value while delivering solutions to tough business problems. He enjoys collaborating with research, engineering, and business teams alike in search of the optimal solution. The engineer in him loves optimizing algorithms and prototyping solutions for efficient implementation, and he strives to follow engineering best practices when extending prototypes into fully functional, polished products.
He's a graduate of the highly prestigious Indian Institute of Technology at Delhi and has worked at companies such as Motorola, Hewlett-Packard, and ExxonMobil. Currently, he is the data science team lead at TradeRev, where he leads the design, architecture, implementation, and deployment of several machine learning services and products. So please help me in welcoming our guest today, ****Jane. Thank you so much for being on the podcast; I'm really happy to have you as a guest.
Speaker 2: (03:47) Thank you so much, Harpreet, for organizing this. I'm super excited to be part of this; it's a good platform to share my experience and talk about things.
Speaker 1: (03:57) I know that our listeners are going to be able to gain a tremendous amount of value and insight from your journey. Speaking of your journey, man, you come from a very strong software engineering background. Can you talk to me about your journey from software engineering into data science and machine learning? Maybe touch on some of the challenges that you faced along the way and how you overcame them.
Speaker 2: (04:16) Of course, I'll be happy to do that. My background is electronics and communication: I did my bachelor's in electronics and communication and my master's in wireless communication. The first job I had was in telecom, so it was a natural transition for me. I worked for Motorola, where I was primarily into networks, telecom networks, the 3G/4G spectrum of that. And that was all software: the network is basically running on servers, and that's embedded C, C++, and assembly language. I'm talking about the late 2005 timeframe, and at that time there was essentially no machine learning in the industry; it was mostly in the research phase. So from 2005 to roughly 2015, those ten years of my journey were primarily in embedded software and telecom, and from there across different industries. Around 2015 there was a shift, where slowly you were hearing a lot about deep learning and machine learning gaining traction, and that brought out my curiosity.
Speaker 2: (05:27) Okay, what is this, right? And the funny thing is, in my bachelor's, way back, maybe in the 2000 timeframe, I remember we had a subject called neural networks, and we never gave it much thought, just treating it as another subject, right? Now when I look back, we actually had neural networks in our bachelor's course, even though it was never a mainstream subject. So around that 2015 timeframe, I started looking at Coursera, the Stanford lectures, the MIT lectures; there is a lot of good open material, and over a period of time I gained some of the intuition about it. The main inflection point came when I joined my last organization, ExxonMobil. There I was into quite challenging work, basically related to image processing, signal processing, and seismic modeling of the earth, which is all mathematics, all equations.
Speaker 2: (06:24) And it is heavily statistics-based; numerical analysis is heavily used. And then we were exploring machine learning. So that role was a pivot in my career, where I was working very closely with researchers. I was taking their POCs, which were written in Python 2.7 and NumPy, and my role was to take them into production in C++. So that was my first big step into machine learning in industry. There were a lot of challenges.
It was never an easy journey for me, and it still is not an easy journey; it's a tremendous learning curve. A couple of the challenges I can summarize for you. The first is getting an intuition of what machine learning is. It's data, right? But with data you have outliers, you have a lot of preprocessing requirements, you have normalization, you have skew in the data. So getting that intuition of how all of it impacts the model was a big, big challenge for me.
Speaker 2: (07:17) And then there is understanding the Python code, the numerical Python code: the POCs are never focused on performance. They're always focused on showing that the thing works. Then you have to take it to production, and it has to be performant. As an example, on one of the projects I did at ExxonMobil, I later compared the Python code against the production code, and there was a 30x improvement in the speed of the system going from Python to C++. And that was not just C++; there were optimizations everywhere, trying to understand why there is a bottleneck here and there, because a POC is never concerned about that. My role is getting things into production, so I have all this experience of speeding things up and scaling things up. That's one of the challenges. The other one we always face is: how do we scope things out for the business? The business has a certain objective, and you have to ensure that objective can be met by the data and by the data science and machine learning teams.
Speaker 1: (08:18) That's interesting. A lot of the time when students are upskilling or breaking into the field, they're not really cognizant of these challenges that you faced once you started bringing something into production. Could you talk to me about some of the common challenges you've seen with these up-and-coming data scientists and machine learning engineers? You've had experience building up teams, so you've worked with freshers. What are some of the challenges you see freshers face when we're talking about taking something from proof of concept into production?
Speaker 2: (08:52) One of the biggest challenges I see is that their visibility is just the model. You have a business problem and you have a model; the model will solve the problem for you, but don't just stick to the model. Have a bigger picture of how your model is going to fit into the entire product. As anecdotal evidence: I was attending a conference last year, I think it was TensorFlow World, and one of the Google engineers showcased a slide where, in an overall product, only 10 to 20 percent is the machine learning. There are a lot of surrounding pieces around it. So getting that holistic view is something I always find challenging for new and upcoming people in data science. They're just worried about the accuracy of their model, and it's only 10 to 20 percent of the overall puzzle.
Speaker 2: (09:40) So that's one of the things. The other thing is that people reach for complex solutions rather than looking at simpler solutions. Maybe a scikit-learn model can solve the problem for you, or maybe a basic algorithm can solve the problem for you, rather than applying deep learning and complex networks. So start simple. It's good to showcase that you know Python and all those things and that you can build models with many layers, but can you do it simply? If it's simple, it's easier to debug and easier to explain later on.
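[Editor's note: a minimal sketch of the kind of simple scikit-learn baseline described above, assuming a generic tabular classification problem. The file name, column names, and the choice of logistic regression are illustrative placeholders, not details from the conversation.]

```python
# Hypothetical example: a simple scikit-learn baseline tried before reaching for deep learning.
# "training_data.csv" and the "target" column are placeholders for whatever problem is at hand.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("training_data.csv")   # placeholder path
X = df.drop(columns=["target"])         # placeholder feature columns
y = df["target"]                        # placeholder label column

# Scaling addresses the normalization concern mentioned above; a linear model is
# easy to debug and easy to explain, which is the point being made.
baseline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

scores = cross_val_score(baseline, X, y, cv=5)
print(f"Baseline accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```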
Say you have a recommender system: you can actually track, based on your history and your future use, whether the recommender system is working. Netflix basically runs on recommender systems; they recommend you the shows, right? And if the recommender system is bad, their revenue is going to go down. So getting a holistic picture involves looking at your models, the monitoring pieces, and the data pipelines. These are some of the things people can look at.
Speaker 1: (13:57) Everyone listening, I hope you're taking notes on what he was just talking about. Very often when you're upskilling, when you're learning, you're only dealing with, like you said, one half of the story. The other half, I like to call it, is the monitoring and evaluating. What are some things a fresher can do to develop their awareness and skill when it comes to monitoring a model's performance?
Speaker 2: (14:18) I think it's tricky for a fresher to do that in live production. It can be thought of this way: assuming they're part of a team and the team has some metrics designed, the easiest way to begin is to look at the dashboards that say, okay, on real-time instances, what was the prediction accuracy of the models? And if they don't have one, then maybe try creating one. To give a real-life example: let's say we have a regression problem such as house price prediction, and you have an inference API. If the team doesn't have a monitoring dashboard, take the initiative and build one; it can be a very simple dashboard. You can segregate it by, let's say, month: in the first month you had, let's say, 10,000 trades, and those 10,000 trades gave you some set of predictions. Now can you do a back test and see whether those match your expectations, and what was the error there? These things will help you gain a better picture. It can still happen that you discover outliers, that maybe your model had a huge error because of one of the outliers, and based on that you can take outlier detection into the next training.
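[Editor's note: a minimal sketch of the monthly back-test idea described above, assuming predictions and actual outcomes are already being logged somewhere queryable. The file name and column names are hypothetical placeholders, not details from the conversation.]

```python
# Hypothetical monitoring back-test: compare logged predictions against actuals, month by month.
# "predictions_log.csv" and its columns stand in for whatever the team actually logs.
import pandas as pd

log = pd.read_csv("predictions_log.csv", parse_dates=["timestamp"])
log["month"] = log["timestamp"].dt.to_period("M")
log["abs_error"] = (log["predicted_price"] - log["actual_price"]).abs()

# Simple per-month dashboard numbers: prediction volume, mean absolute error, worst case.
monthly = log.groupby("month")["abs_error"].agg(count="size", mae="mean", max_error="max")
print(monthly)

# Flag potential outliers: rows whose error is far beyond that month's typical error.
threshold = log.groupby("month")["abs_error"].transform(lambda e: e.mean() + 3 * e.std())
outliers = log[log["abs_error"] > threshold]
print(f"{len(outliers)} predictions look like outliers worth inspecting before the next training run")
```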
Speaker 1: (15:28) Yeah, thank you very much for that, man; it's really valuable information. So let's talk about agile and scrum methodology for data science teams and machine learning projects. How have you noticed it play out? Does the agile methodology play out differently from software engineering teams to data science and machine learning teams?
Speaker 2: (15:46) That's a very insightful question, and this is one of the places where a lot of teams are struggling. I just want to go back and trace how software engineering has evolved. Earlier there was the waterfall model: you have requirements, you work on them for six months, you go back to the customer, and iteration never happened. To improve on that there was the agile manifesto, and then you had faster iterations and there was scope for feedback. That was a good step forward for the software world. I think over the last 20 years or so agile has been thoroughly tested; scrum has been implemented in many teams, and I've been using scrum for the last 15 years or so. Now comes data science. Data science is research, hardcore research, and sometimes you don't know where you're starting. So somehow I feel data science is closer to waterfall than to scrum.
Speaker 2: (16:38) Scrum is more oriented towards delivering software. We had these challenges some time back, and what we figured out, what we have been trying, is to separate the research phase from the production phase. The research phase can be in Kanban: you just have work items and you timebox them to some extent. I typically like to call it doing a breadth-first search: just explore what the available options are, rather than going depth-first and taking one approach all the way down. So given a timebox of one month, let's explore the options. Once that is done, then you can go to scrum. Scrum in the research phase is a recipe for disaster, because you have to deliver within, let's say, two weeks, right? My personal experience has been that in two weeks you sometimes just discover a failure, maybe you don't have the data, and having a timebox forcing that [inaudible] ending is not helpful. So separating the research phase from the production phase can be a good approach to working in a data science and machine learning production team.
Speaker 1: (17:41) Awesome, man, that was really insightful; thank you for sharing. That kind of segues into the next question I had. When you're taking on a new project, what are some of the steps you take to keep yourself on track while you're navigating the ambiguity of a data science or machine learning project?
Speaker 2: (17:57) Okay, so this is not very different from a typical software project. I basically start with day zero and understand what the business wants and what the business objective is, and then time-travel to a time when we have the actual product. I'm in the present; the business comes and says this is what they want. Then I visualize that, let's say, in six months we have the product, how it will look, and then backtrack from that. If I have a timeline of three months or six months and I want to be at this stage, what are my intermediate milestones? So backtrack from there: okay, six months down the line I want this, so where should I be at five months, at four months, at three months, at two months? Then I start building my plan around that. What is the real blocker? Is data the blocker, or is something else blocking? Wherever the blockers are, I try to tackle them in the first few months, and the low-hanging fruit is sometimes just delayed to the end. This is one approach, and with clearly defined milestones we can always navigate and track. But of course real life is very different: unexpected issues, dependencies. Based on that, we keep iterating, keep modifying, and the scope will change.
Speaker 1: Thank you, man.
Speaker 1: (19:16) Are you an aspiring data scientist struggling to break into the field? Then check out DSDJ.co/artists to reserve your spot for a free informational webinar on how you can break into the field. It's going to be filled with amazing tips specifically designed to help you land your first job. Check it out: DSDJ.co/artists.
Speaker 1: As somebody who's walked the path from fresher to now being a team lead, what are some steps someone can take to go from aspiring data scientist to data science or machine learning team lead?
Speaker 2: (19:56) One of the basic things is: do you really want to do it? Do you really want to be a leader, or do you just want to be an individual contributor?
Are you happy doing just a single job, which can be coding or researching or something, or do you actually want to lead? Being a leader involves a lot of responsibility. I would say this is the disadvantage: I carry the burden of mentoring people. If I mentor someone really well, their career will grow; but if I make a mistake, the person can be devastated and might feel discouraged, right? So I always call it wearing a crown of thorns. There is a burden: even though I think I'm doing a good job and my team is happy, I still have that fear that if I do something wrong, it's going to come back on that person.
Speaker 2: (20:43) So it is very important to know: can you handle people? If you can, and you think you're good for this role, the first thing you should do is figure out whether you can mentor someone. It doesn't necessarily have to be your peer; it can be a junior in any field. Can you mentor, and can you take criticism? Are you open to criticism? It's not just about what you deliver; people will have issues, and they'll come back to you. Once you do that initial mentoring, you will know whether your skills are there or where they're lacking. Technically, you have to be strong: whether it is coding or architecture, that has to be strong. The other thing you need to learn is delegation. As a leader, you are not supposed to do all the work yourself; your job is to get the work done.
Speaker 2: (21:29) To give an example: given the chance, I would do the entire piece of work myself, but then I might be spending 24 hours a day on it, right? So can you delegate? Can you learn to delegate work, and can you identify priorities? These are a few things to think about: mentoring, delegating work, thinking about priorities, and, I think one of the most important pieces, business impact. Can you think about how your work will have a business impact? If you can keep those things in mind, then over a period of time you can become a leader. And always keep in mind that as a leader, people depend on you; you have to be accountable and trustworthy, right? That's what makes a leader.
Speaker 1: (22:09) Very insightful, man. I'm not sure if you're much of a reader, but there's a book on leadership that I absolutely love. It's called Multipliers; I think the author's name is Liz Wiseman. It's an extremely amazing book on leadership and what it means to be a leader who brings out the best in their employees. I definitely agree with everything you're saying there, man.
Speaker 2: (22:35) I've heard about it but haven't read it.
Speaker 1: (22:38) Yeah, it's awesome. So, kind of touching back on that, we've talked about the obvious technical skills somebody needs to break into the field, but what would you consider to be an essential skill someone needs to have so that they can be and remain successful as either a data scientist or a machine learning engineer?
Speaker 2: (23:03) I think one of the main skills would be adaptability and flexibility. You have to be flexible. If you just say, "I'm a data scientist, I'm only going to do research," maybe that will work in big teams, but it will not work in smaller teams, because you may not have a data engineer, and your models will not work if you don't have data.
So you might have to go to the other side and do some data engineering work, or you have models deployed but a bug comes up and you have to go figure out where the bug is. The more flexible and adaptable you are in working across the full stack of data science, right from data ingestion to the final delivery, the better your scope is to gain across-the-spectrum skills and remain successful in the job. And curiosity: you have to be curious about what something is and how it is solving the problem. Those are some of the soft skills. This field is maturing; I think the research is in good shape. The next thing is production: we have to take research to production. The more flexible you are, the more successful you will be.
Speaker 1: (24:17) Definitely, man, I like that. You kind of have to get rid of that phrase "it's not my job" from your [inaudible].
Speaker 2: (24:23) I think that's the right way to put it.
Speaker 1: (24:25) What are some characteristics that you look for in up-and-coming data scientists and machine learning engineers?
Speaker 2: (24:32) Teamwork is one of the main things. Here, you are not working individually; you have to work with the team, it's collaborative work. The second thing is that technically you have to be strong: given a problem, even if you cannot solve it, you should have a clear thought process and know what steps you could take. In interviews, typically some people are able to solve the problem and some are not, but if they have a clear thought process, that given this scenario, this could be implemented, that's good enough. And some ability to be flexible. I always say in interviews that this is the expected work, but maybe that will change when you join; you might go onto another project with a different skill set within the same domain, like machine learning or data science. For example, your expertise is in, let's say, recommender systems, but we may not have that, and I might give you some regression problems instead. Are you open to that? So those are the skills I look at.
Speaker 1: (25:29) Apart from your stunning technical skills, what are some qualities you feel have contributed to your success as a machine learning engineer?
Speaker 2: (25:38) I think persistence and perseverance. There have been times where it was very difficult for me to deliver a successful product, but I was very persistent, working on it day in and day out, nonstop, and trying to upgrade myself every now and then. I usually don't get much time to do that, but whenever I'm commuting I try to figure out what's a good lecture to look into and keep updating myself. And on the leadership side, I think one of the things that has really helped me is constant interaction with the team and the managers for feedback. I'm very open to feedback, and I have a fail-fast approach: if there is something wrong or something is not on track, let's look at it right now, or maybe a day later, rather than waiting for six months. So a fail-fast approach has helped me a lot.
Speaker 1: (26:31) Awesome, spoken like a true linchpin. So hey, before we jump into the lightning round, what's the one thing you want people to learn from your story?
Speaker 2: (26:39) Stay humble.
This world is full of smart people, and there are much smarter people than you. Just stay humble; appreciate your success, but don't be arrogant.
Speaker 1: (26:48) Excellent advice. Let's go ahead and jump into our lightning round. Python or R?
Speaker 2: (26:52) Python.
Speaker 1: (26:53) Any particular reason?
Speaker 2: (26:55) My prime focus is taking things to production. Python has a lot of libraries, they're so easy to take into production, and also many people are comfortable with it.
Speaker 1: (27:05) Same here, man, I'm on Python all the way. What's your favorite algorithm? I know that's kind of hard to answer without context, but what would you say is your favorite?
Speaker 2: (27:14) Okay, so remember you're asking this of a software engineer.
Speaker 1: (27:18) Yeah.
Speaker 2: (27:19) So from software engineering, my favorite one is binary search: a very basic, very simple idea, and it does the job. In machine learning, I'd say the convolutional neural network. There are *** architectures, but CNNs, I really love them; they take an image and capture the spatial information.
Speaker 1: (27:39) What's a book that every data scientist or machine learning engineer should read?
Speaker 2: (27:43) I really loved Aurélien Géron's book, Hands-On Machine Learning with Scikit-Learn and TensorFlow.
Speaker 1: (27:51) That's a great book, yeah. I'll definitely include the title and the author's name in the show notes. What's your favorite question to ask in a job interview?
Speaker 2: (28:00) On the technical side, I tend to ask binary search. It basically gives an idea of how the person thinks: can they work through the edge cases? A small algorithm basically opens up their mind on those. On the leadership side, I tend to ask: tell me about a scenario where you failed, what was the reason, and what was your learning from it? That helps me see what the person has been through and how they approached the failure, because this is a research field and we fail a lot. If someone says they have never failed, that's a lie.
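[Editor's note: for readers who want the algorithm mentioned here spelled out, below is a minimal Python binary search written with the edge cases an interviewer typically probes in mind. This is an illustrative sketch, not the guest's own code.]

```python
# Illustrative binary search over a sorted list; returns the index of target, or -1 if absent.
# Edge cases an interviewer tends to probe: empty input, target not present, targets at the
# boundaries, and off-by-one mistakes in the loop bounds.
def binary_search(sorted_items, target):
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

assert binary_search([], 3) == -1             # empty input
assert binary_search([1, 3, 5, 7], 7) == 3    # target at the boundary
assert binary_search([1, 3, 5, 7], 4) == -1   # target absent
```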
Speaker 1: (28:32) What's the strangest question you've been asked in an interview?
Speaker 2: (28:35) I have to think about that, man. I think it was maybe some time back. The question that was asked wasn't really a technical question; it was a nonsensical one: you have a bar, and the bar has a number of different types of liquor in it, and there are like 10 people there. Can you give a rough permutation and combination of what a person can drink, and after that, can you predict who will be more drunk? That's partially what I remember. I was like, why the hell am I being asked this question, it doesn't make sense to me, but later I think I realized the person wanted to know whether I can think rationally. It was actually a good question.
Speaker 1: (29:16) Or maybe she wanted to see what your drink of choice was.
Speaker 2: Yeah, maybe.
Speaker 1: How can people connect with you?
Speaker 2: (29:25) So I'm pretty active on LinkedIn; just search for Ahmed and TradeRev, I'm always there. I'm on Twitter as well; my handle is ***, so I can share that with you and you can put it in the show notes.
Speaker 1: (29:40) Yeah, definitely, man.
Speaker 1: (29:41) Well, Ahmed, thank you so much for being so generous with your time. I really appreciate you taking the time to answer some of these questions. I think that our community and our listeners are really going to benefit from your insight and your experience, so I can't thank you enough for taking time out of your schedule to be on the show, man.
Speaker 2: Oh, thank you so much. I think it was a very nice conversation with you, and I'm always eager to share, eager to help people who are getting into the field or who are on a similar journey. I always have the feeling that, you know, this is a community: the more we share, the more we grow. As someone said, we stand on the shoulders of giants. We are just taking from others, so we'll give to others as well. Okay, well, thank you so much. Thank you so much, Harpreet.
Speaker 3: (30:29) [inaudible].