Giuseppe Bonaccorso-2020-07-05.mp3 Giuseppe Bonaccorso: [00:00:00] I often repeat that data scientists must think like artists when finding a solution, when creating a piece of code. But of course, an artist also - imagine an architect, for example - has to know physical laws, because otherwise that dome will collapse. So it's clearly hard science, but it's not purely hard science. And there is room for creativity, lots of room for creativity. Harpreet Sahota: [00:00:45] What's up, everybody? Welcome to The Artists of Data Science podcast, the only self-development podcast for data scientists. You're going to learn from and be inspired by the people, ideas and conversations that'll encourage creativity and innovation in yourself so that you can do the same for others. I also host open office hours. You can register to attend by going to bitly.com/adsoh. I look forward to seeing you all there. Let's ride this beat out into another awesome episode. And don't forget to subscribe to the show and leave a five-star review. Harpreet Sahota: [00:01:44] Our guest today is an experienced and goal-oriented leader with wide expertise in the management of artificial intelligence, machine learning, deep learning and data science. Harpreet Sahota: [00:01:55] His experience spans projects for a wide variety of industries, including health care, B2C, military industries and Fortune 500 firms. His main interests include machine learning and deep learning, data science strategy and digital innovation in the health care industry. He's currently leveraging his interests and expertise as the head of data science in a pharmaceutical corporation. However, you may recognize him from the many best-selling machine learning books he's published, including Python: Advanced Guide to Artificial Intelligence, Fundamentals of Machine Learning with scikit-learn, and Hands-On Unsupervised Learning with Python.
Today, he's here to talk about his most recent book and share his insights into best practices for data science and machine learning. So please help me in welcoming our guest today, author of Mastering Machine Learning Algorithms, Giuseppe Bonaccorso. Giuseppe, thank you so much for taking time out of your schedule to be here today. I really, really appreciate you coming on to the show. Giuseppe Bonaccorso: [00:03:02] My pleasure, Harpreet. I'm really glad to be here. Harpreet Sahota: [00:03:06] And I'm so, so excited to get into our conversation today. We've got a lot to cover. But first, let's start kind of at the beginning here. If you could just talk to us about how you first heard about data science and what drew you to the field. Giuseppe Bonaccorso: [00:03:19] This is a very interesting question, because I never heard of data science as such. When I heard about what we now call data science, the names were quite different. Before starting my university master's, I remember I found two books in a bookstore, one about fuzzy logic and another one about neural networks. And I bought them both and started reading them, because I was very attracted by the field of the mind and the connections between mind and brain. I didn't have the means for working with biological experiments, and I loved computer science, so I immediately tried to replicate the experiments, and I found them so interesting, so stunning, that I decided that that would be my career. In that period, there were just a few resources, so it was sometimes very difficult to find the right publications. But after reading these books, I started contacting other people, buying other books, and moving in the direction of learning how neural networks could solve many problems. But I want to be honest: in that period, neural networks were not considered like they are today. So it was a real new world, but without so many people supporting it.
Giuseppe Bonaccorso: [00:04:34] So it was a very interesting challenge for me. Harpreet Sahota: [00:04:37] That's an interesting point to bring up, because my next question for you is going to be: how much more hyped has machine learning become since you first broke into the field? Giuseppe Bonaccorso: [00:04:47] In that period there was no hype at all. Consider that I remember a conversation when choosing some subjects at university. There was artificial intelligence, and some people told me, why are you picking artificial intelligence? It's not so interesting. The reality is that in that period the research was more on other topics, and the computational power was not enough to run very big models. So some people just decided not to waste their time on something that was seen as more theoretical. Nowadays, the situation is completely different. It's a natural process, because now there is also an inflation. Giuseppe Bonaccorso: [00:05:25] There are so many applications that almost everybody can talk about AI applications. So it's exactly the opposite. I honestly remember that I had some conversations with professors who didn't believe that neural networks could one day be so powerful; they considered them quite limited. In fact, what happened was that after a few years, there were the first examples of convolutional neural networks applied to more complex problems. And that was a real revolution. It completely changed the mindset. In that moment, the hype started, and clearly it was a progressive process. Nowadays we are probably around the peak of this process. Harpreet Sahota: [00:06:10] And where do you see the field of machine learning headed in the next two to five years? Giuseppe Bonaccorso: [00:06:17] Well, I clearly see diffusion. Nowadays, more and more companies are becoming interested in machine learning, data science, AI.
They want machine learning, sometimes with a real awareness, because they can identify problems that can be solved using machine learning. Sometimes they want to find ways to apply machine learning because it's also very trendy. So I think that more and more sectors will be involved in the field. I think that nowadays, if we just try to compile a list of companies, I don't know which companies could be easily excluded: the majority of big companies, and also medium-sized companies, are adopting it. And the reason is also that the costs are going down; these technologies are cheaper and cheaper. There is also more availability, for example, of pre-trained models and automatic tools. So the need for very highly specialized people, which also represents a cost for the company, is sometimes considered not necessary. I don't support this opinion, but this is also the reality. So I think that in the next five years, let's say, there will probably be data scientists in each company, even the smallest ones. Harpreet Sahota: [00:07:33] And in what ways do you think that data science, machine learning and AI will have the biggest positive impact on society in the next two to five years? Giuseppe Bonaccorso: [00:07:45] Well, this is indeed very difficult to say, because I think the impact can be either positive or negative, simply because data science, AI and machine learning are technologies, are tools. For every technology, you can apply it to positive things and negative things. It's very difficult to say that something will never happen while other things will happen for sure. I belong to the health care sector, for example, and I think that AI is already improving the efficiency of many health care processes. We have read in the news, for example, about the speed of drug discovery. You know, drug discovery is normally a very complex process where it's necessary to analyze lots of sequences.
Giuseppe Bonaccorso: [00:08:30] So AI helped to solve this problem in a very quick way, avoiding very high costs. So I think that the positive impact is in the application of models, for example, to solve more and more complex tasks. But on the other side, I think also that this extra complexity can represent a threat, a threat because of the possibility of negative use. I always make the same example: it's just like Nobel with dynamite. When he discovered it, it was considered a revolutionary tool, but at the same time, we understood that it could be used to kill people. The same is true for a blade, which can be either a scalpel for surgery or a knife to kill another human being. So clearly there are possibilities which are strictly related to the complexity that AI is reaching. In some cases - NLP, for example - it is becoming more and more precise, and it's possible to substitute human beings in many NLP tasks. And this can be considered a threat for many jobs. The same goes for surveillance: nowadays, it's possible to have very good surveillance systems which are completely based on AI. This is also a negative side we have to consider. Automation will reduce the need for human beings for repetitive tasks. This is probably a negative side. On the other side, there are many positive aspects related to the fact that now it's easier to use these tools and it's easier to change the professional profile of many employees, giving them the opportunity to focus on more creative activities, for example. Harpreet Sahota: [00:10:13] So as practitioners of data science and machine learning, as we kind of move to this vision of the future that we have, what are some things that we should be concerned with in the way that we practice data science, so that we could mitigate the negative effects that you're speaking about? Giuseppe Bonaccorso: [00:10:29] One thing that I consider absolutely essential is the real knowledge behind data science.
Data science is not something that can be learned in a week or even in a month. It's a real topic with a lot of theory behind it. And it's very important for practitioners to have clear ideas about what they do. This is also very important considering, for example, the diffusion of automatic tools. Whenever you use an automatic tool and you have no idea of what you're using, you are just pressing buttons - for example, training and then deploying a model - and you repeat this operation mechanically. The result is clearly a failure, because you can never manage to solve complex problems this way. So I think that a fundamental step for any company is to improve the knowledge of the practitioners, and the practitioners themselves have to demand to study, to learn more and more, to never stop, because that is the only way they can remain necessary. And this is the only way to create a progression in this field, avoiding that the field defaults into just applying the same techniques whatever the result. Another thing is the fact that sometimes unaware data scientists limit themselves to working on just data science. Giuseppe Bonaccorso: [00:11:55] They have no idea about the fact that a company is made up of layers, and sometimes the C-level or the top management is completely far from their world. The responsibility for creating a connection, the responsibility for helping these people create a more advanced environment, lies with data scientists and, clearly, with data science managers. It's very, very important to create this awareness and to propagate this awareness, to increase the commitment, because without this commitment from the C-level and from the top management, in particular in the largest companies, all the applications will always be limited and many possibilities will be lost forever. So this is an area of concern.
Consider, for example, companies that sometimes want to go in this direction just because they want to be trendy, because this is something that nowadays you can read everywhere. So they want to say, we use data, but they are not really using it. Giuseppe Bonaccorso: [00:13:01] They are just using some methods, but they are not creating the culture. The culture needs a lot of work, and the only people that can start working on this culture are the practitioners. So I always talk to my colleagues, sometimes the people I manage, saying to them: you have to expose yourself in discussions with higher-level managers, explaining and becoming more and more involved, because only in this way can we avoid the problem of finding a situation where data science is just considered a sort of commodity, which it absolutely is not; it's not a commodity. Harpreet Sahota: [00:13:39] It's a very interesting point you made there, specifically regarding culture. So I was wondering if you could kind of break that down for us. In your opinion, what makes for a healthy data science culture in an organization, and what makes for an unhealthy data science culture? Giuseppe Bonaccorso: [00:13:56] The culture in an organization can start from the bottom, but it's absolutely necessary that the management is involved, because at a certain point it acts like a filter. If you start a culture and the culture never reaches the management, many strategic decisions will exclude, for example, data science. Including data science in the strategic decisions is the only way to start creating culture. Giuseppe Bonaccorso: [00:14:25] Because in this way, first of all, the whole company, including the people who are not involved, can understand that for the company data science is an asset. It's not a liability.
When I say liability, I refer to the fact that sometimes it happens that data scientists are just considered useless, because some tasks are not considered very valuable if we measure the value compared, for example, to other activities. But they are an asset, also in terms of potential. And the only way to show that they are a real asset that can grow and can multiply its value is to involve the C-levels. And to try, for example, to organize meetings - nowadays there are many possibilities, digital channels; unfortunately, with COVID there were many limitations, but we know that there are many possibilities to organize, for example, Zoom meetings or channels in particular applications - and to start discussing, presenting the results, involving other people, asking and answering questions, asking questions to the stakeholders. Because the stakeholders sometimes are not aware, and it's necessary to go knock on the door and say: yes, I am a data scientist. Clearly, when I say this, I mean that sometimes there are different layers; maybe this is more for a manager, but the manager can start saying, I want to introduce myself and my team. We can do this, we can contribute to this. Do you want us to help you? In my experience, whenever these activities are done correctly, the result is always positive. Giuseppe Bonaccorso: [00:16:08] In some cases, instead, the situation is a segregation of data scientists. Segregation means that there is a group of people - the nerds, let's call them this way - who are closed in a place. They work on specific projects. Nobody tries to understand, because it's too complex. They are seen as magicians. Nobody is interested. They don't talk with other people; they talk only among themselves. This creates pure segregation, and culture is exactly the opposite: it is diffusion. This is extremely problematic in some cases.
So I always encourage data scientists to talk to everybody, to try to be available, to speak the language of your stakeholder, not your own language. You don't have to pretend to be the one who talks and the one who must be understood. You must demand from yourself that the other people are able to understand you. So you have to do whatever is needed to make them understand you. That's your success. Harpreet Sahota: [00:17:18] I really like that advice you gave there. It's all about your audience, and it's all about making sure that they're able to understand you, and having that kind of respect for your audience in the sense that, yeah, we're doing complex stuff, but I'm going to explain it to you in such a way that you can understand it, and that you will be able to understand what I'm doing in turn. So another kind of follow-up question to that. We talked about some qualities of a good data science culture. Harpreet Sahota: [00:17:45] But what would you say are some defining qualities or characteristics of a great data scientist that separate them from the merely good data scientists? Giuseppe Bonaccorso: [00:17:56] This is a very interesting question. Thank you. I can summarize the answer by saying that a great data scientist will always try to innovate and not to imitate other solutions. But when I talk about innovation, I don't want to be misunderstood: we don't have to invent every day. Innovation sometimes means, for example, finding new ways to solve problems, finding different approaches to the same problems, finding the right way to talk to people. That's what I was saying before. This can be connected, for example, with talking to the stakeholders, because it's necessary to understand the business context and to find the right way to solve the problem.
A great data scientist is the one who tries to wear the right hat; a merely good data scientist is just a passive listener who takes some notes and tries to do exactly what he or she understood, in their own reality. Giuseppe Bonaccorso: [00:18:56] If you don't change your hat and you don't think with the other person's mind, you can never find a good solution. Sometimes I found situations where the requirements were very difficult to collect because they were unclear. So we don't have to think that a great data scientist is like a genius. I believe that a data scientist in a company is a person who thinks that he or she has to interact in order to maximize the efficiency and effectiveness of their work and to try to meet all possible requirements. So, of course, as I also said before, learning is extremely important. Great data scientists never stop learning. If you think that you are done after 30 years, 40 years - and some people think that they are done after just a few years - well, you are absolutely not a great data scientist, probably a very poor one, because it's impossible even for a genius to have a complete picture, to complete a career path, in just a few years. There are so many things to know, so many possibilities to expand the boundaries and to increase lateral thinking, to become more and more involved in the business, to be up to date in terms of technologies, to know more algorithms, to understand how it's possible to improve, that learning is not just necessary - it's mandatory. So if you don't consider learning important for your work, and if your employer doesn't consider learning important for your work, probably there's something wrong, and success can never be reached. Giuseppe Bonaccorso: [00:20:32] And in this case, a data scientist who complains because, for example, he cannot improve his knowledge can never become a great data scientist; he should probably change his workplace.
I know that all the best workplaces encourage people to study, to improve their knowledge, and also to improve soft skills. This can make the difference between good and great. Harpreet Sahota: [00:20:58] I 100 percent agree with you on that. I believe that if you sign up for a career as a data scientist, you have signed up for a career as a lifelong learner. And I think that really is the beautiful thing about our field; it's different from other quantitatively rigorous fields. Let's just take, superficially, for example, an accountant. I feel like accountants can kind of go to school, they learn the tricks, and they might need to keep up to date on tax code and things like that. I worked as a biostatistician and an actuary as well, and I felt like in those scenarios I never was growing and learning as much as I wanted to. And I feel like this field, because there are so many new advances and so many new ways to solve problems, is just a very fresh and innovative field to be in. It really is conducive to a continuous, lifelong-learning environment. Harpreet Sahota: [00:21:50] What's up, artists? I would love to hear from you. Harpreet Sahota: [00:21:53] Feel free to send me an email to theartistsofdatascience@gmail.com. Let me know what you love about the show. Let me know what you don't love about the show. And let me know what you would like to see in the future. I absolutely would love to hear from you. I've also got open office hours that I will be hosting, and you can register by going to Bitly.com/adsoh. I look forward to hearing from you all and look forward to seeing you in the office hours. Let's get back to the episode. Harpreet Sahota: [00:22:33] So I wanted to get into your book a little bit here, starting at the top. Real simple question here: what is a model, and why do we build them in the first place? Giuseppe Bonaccorso: [00:22:45] Yes, here we are talking about models for machine learning and data science, because the term model is very flexible.
So there are different possible definitions. But in reality, a model for us is a way to represent a part of the world. Imagine that we have a process. This process is outside our boundaries, and the only way to interact with it is to transform it into something manageable. The language for doing this is mathematics. So a model is generally a way to transform reality into a mathematical representation, trying - because in machine learning this is very common - to avoid too many limitations, to avoid simplifying too much, but also, on the other side, to avoid creating other complex problems. Another very important thing when defining a model is that our goal is not necessarily to describe what we already know, but to make predictions. So our model must become a sort of container of future possibilities. This is very easy to understand if we think about physical laws. It's a little bit more complex when we think about problems where, for example, we have to classify some outcomes according to some criteria, because there are no specific laws behind them. If there were physical laws, we could use them. Whenever no such laws exist, Giuseppe Bonaccorso: [00:24:23] we have to create a model. And what we do is make, first of all, an assumption. The first assumption is that there is what is called a data generating process. So the data that we are collecting is not coming from a limited dataset - that is sometimes just a wrong assumption - but from a process that can theoretically generate all possible data belonging to a specific problem. And we must be sure that the data we are collecting belongs to that process, because if not, we risk creating a model that is limited, because it can only cover a specific subset of the process.
For example, imagine that you have to classify cars and you just have pictures of utility cars; then you take a picture of a Formula One car and your model fails to classify it correctly. This is probably not because the model is wrong; it's because your dataset is wrong, because you haven't represented the data generating process correctly. That is the way we look at reality. Another concept that is extremely important is called grounding. Giuseppe Bonaccorso: [00:25:40] This concept can be summarized by saying that, whatever connection the model establishes between inputs and outputs, it doesn't matter what happens in between: if something is connected to an outcome in the model, it must also be connected to the same outcome in reality. This is very easy to understand with very simple models; it becomes extremely complex for complex models, but it's extremely important. A model has no fantasy - don't misunderstand me when I say this - but a model must find a law that correctly associates, for example in classification, something that in reality is normally associated with a specific class. And this process must always be grounded. When the model starts producing wrong outputs - not autonomously, of course, but because it is wrongly trained - it's because the grounding is broken, so the model is not able to make the right connections and we obtain very wrong results. So a model can be seen like an arc: it starts from the ground, goes up to an abstract level where all the abstractions happen - and these abstractions can be extremely complex - and then it must return to the ground, where the ground truths lie. Harpreet Sahota: [00:27:06] Thank you very much for sharing that. I really enjoyed that part of the book, especially how you opened with that: the fact that what we're trying to model is really just some data generating process.
And the way you came at it from first principles, I thought, was just very intuitive. So I really enjoyed how you broke that down in your book. Now I want to move on to another topic that you cover and really just kind of pick your brain to see if you have any heuristics to share with our audience as to when to use certain scaling techniques. Harpreet Sahota: [00:27:40] Like, for example, when do we use StandardScaler versus RobustScaler? When do we use min-max over standard scaling? What heuristics do you use when you are trying to find the appropriate scaling technique for your pipeline? Giuseppe Bonaccorso: [00:27:58] Sure. First of all, I have to say that StandardScaler, RobustScaler and so on are the names of the classes used in scikit-learn, because the book uses scikit-learn. It's helpful to understand the logic behind these techniques, because when using other tools the names can change. We have to say that data is normally what we collect in a dataset that is not, for example, made up of images, but of different kinds of data points, and the scales of the data can have completely different values. We need to scale data for many reasons. I can make an introductory example to make everybody understand why it's so important. If, for example, you have one measure that is scaled between zero and a thousand and another one which is between zero and 10, but from a physical viewpoint the minimum and the maximum are exactly the same, and you train just a very simple model, like a linear model, what you obtain is that the feature with the higher variance will dominate the model, and you will see the coefficient of the other feature become smaller and smaller, because the other feature is almost constant in comparison: when predicting the outputs, the dominant factor is the other one.
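The scale effect Giuseppe describes can be sketched in a few lines of scikit-learn. This is a minimal illustration of mine, not code from the book: the two synthetic features and their coefficients are invented for the demo. Both features carry the same real influence on the target, but live on scales that differ by a factor of 100, so the raw coefficients look wildly different until the features are standardized.

```python
# Minimal sketch (not from the book): two features with equal real influence
# but very different scales, fit with and without standardization.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(0.0, 1000.0, n)   # same physical quantity, 0-1000 scale
x2 = rng.uniform(0.0, 10.0, n)     # 0-10 scale
# Both features contribute equally to y in physical terms.
y = 0.001 * x1 + 0.1 * x2 + rng.normal(0.0, 0.1, n)

X = np.column_stack([x1, x2])
raw = LinearRegression().fit(X, y)
std = LinearRegression().fit(StandardScaler().fit_transform(X), y)

# On raw data the coefficient magnitudes differ by roughly 100x, hiding the
# fact that the features matter equally; after scaling they are comparable.
print(raw.coef_)
print(std.coef_)
```

On raw data the low-scale feature looks negligible next to the other coefficient; after StandardScaler both coefficients come out nearly equal, reflecting the equal physical influence.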
So we need to scale in order to work with many models, to create datasets which are compatible with the models that assume, for example - like logistic regression or support vector machines - roughly Gaussian data. Giuseppe Bonaccorso: [00:29:30] StandardScaler is a very simple approach to achieve these goals. This scaler, also called the z-score scaler, is a way to have a dataset with null mean and a standard deviation equal to one. And I want to be clear: this operation is feature-wise, so each feature is scaled autonomously. In the end we have a covariance matrix which is not diagonal, but each feature has a unit standard deviation. Giuseppe Bonaccorso: [00:29:59] In this way, the contribution of each feature is proportional to the real information it contains, and it is not dominated by the fact that one feature has, let's say, apparently more information content. Just a parenthesis: in information theory, if you check the entropy of the majority of stochastic variables, the entropy grows with the variance. So clearly, if you have a larger variance, you apparently have more information. What we are doing is just saying: we want to avoid that an apparently dominating factor prevents the other features from contributing their own information content. RobustScaler is a consequence - exactly, a consequence - of a shortcoming of standard scaling. Unfortunately, when a dataset has outliers - and you know that the variance is a quadratic measure - the variance is heavily influenced by the outliers, in particular when they are very far from the majority of the other data points. Giuseppe Bonaccorso: [00:31:05] So the result can be a wrong scaling. In order to avoid the effect of the outliers on the scaling procedure, it's possible to use RobustScaler. I'm not describing all the details here, but it's not based on computing the standard deviation; it's based on the interquartile range.
So it considers the interquartile range, between the 25th and the 75th percentiles, which will contain the majority of the samples. In other words, this kind of approach tries to remove the outliers before scaling, and the effect is that the dataset is scaled in a more robust way when there are outliers. In general, if there are no outliers, the result is very similar to standard scaling. MinMaxScaler, instead, is a different approach. Min-max is for when the problem requires scaling the data into a range. The variance itself doesn't matter, because the result is generally that the standard deviation is also scaled: if you are reducing the range, the standard deviation and the variance are proportionally scaled down, or vice versa, they can be scaled up. Min-max is generally helpful only in those tasks where we want, for example, to feed other models or other elements of our pipeline that expect values in a specific range and cannot accept any value outside of that range. For example, there may be a filtering function that accepts values only between minus one and one, or between zero and one. StandardScaler will normally produce values which are very close to zero, but you can also have values that fall outside the range minus one to one. So that kind of scaling is driven, let's say, more by structural requirements than by statistics. It must be adopted only when necessary, and it can be adopted as a further stage in the process. I mean, it's not necessarily the only scaling applied; it can be a way to standardize the data again before feeding it into another system, just to complete the discussion. Giuseppe Bonaccorso: [00:33:22] When I said that standard scaling operates feature-wise, it means that each feature is considered autonomous. But there is another technique, sometimes not very well known, which is called whitening.
In whitening, instead, what we do is perform a standardization of the whole dataset. What we want to obtain is a diagonal unit covariance matrix. Giuseppe Bonaccorso: [00:33:50] It means that, in other words, the covariance matrix, which has a number of parameters proportional to the square of the number of features, keeps only the values on the diagonal. So we have only variances. This has a very positive effect on all those models where the computation is proportional, for example, to the number of values in the covariance matrix. Whitening is very helpful in many deep learning tasks and can really improve the performance in those tasks. The difference, clearly, is that whitening requires operating on the whole dataset, while standard scaling can operate on the single features. It's important that the reader - I mean the student - understands the difference between these techniques and picks the right one. And whenever something is wrong, it's important to understand how to debug the model, to understand why the performances are not as expected in some cases. Giuseppe Bonaccorso: [00:34:58] For example, it can be because a standard scaler is used in situations where there are many outliers and they should be filtered out. And if you don't use a robust scaler or perform a filtering, the result is never the optimal one. Harpreet Sahota: [00:35:16] That's something I really enjoyed about your book: you present things from the theoretical standpoint, you show us the formula, then you show us the implementation using scikit-learn. And then you also have some really well-crafted diagrams that help us visually understand how these different scaling techniques affect our dataset.
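As a rough sketch of the trade-offs discussed above - again my own illustration with invented synthetic data and outlier values, not code from the book - the snippet below compares how StandardScaler, RobustScaler and MinMaxScaler treat the same data once a couple of extreme outliers are added, and uses PCA with whiten=True as one common way to obtain the diagonal unit covariance matrix that whitening targets.

```python
# Illustrative comparison (not from the book) of scikit-learn scalers
# on data with outliers, plus PCA-based whitening.
import numpy as np
from sklearn.preprocessing import StandardScaler, RobustScaler, MinMaxScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(200, 1))
X_out = np.vstack([X, [[50.0], [60.0]]])  # two extreme outliers

std = StandardScaler().fit_transform(X_out)
rob = RobustScaler().fit_transform(X_out)   # median + interquartile range
mm = MinMaxScaler().fit_transform(X_out)    # forces everything into [0, 1]

# The outliers inflate the standard deviation, so StandardScaler squashes
# the bulk of the data toward zero; RobustScaler keeps a sensible spread.
print(np.median(np.abs(std[:200])), np.median(np.abs(rob[:200])))
# MinMaxScaler respects the range constraint but compresses the bulk near 0.
print(float(mm.min()), float(mm.max()))

# Whitening: standardize the dataset as a whole so that the covariance
# matrix becomes (approximately) the identity, not just unit-variance columns.
A = np.array([[2.0, 0.0], [1.5, 0.5]])
Xc = rng.normal(size=(1000, 2)) @ A.T       # correlated 2-D data
white = PCA(whiten=True).fit_transform(Xc)
print(np.round(np.cov(white, rowvar=False), 2))
```

The whitened covariance comes out close to the identity, whereas per-feature standardization of the same data would leave the off-diagonal correlation in place.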
So I really liked the way you laid that out in your book. Another topic you cover in your book that I think would be interesting as well, kind of in line with the previous question: I wanted to pick your brain in terms of cross-validation. Harpreet Sahota: [00:35:50] So with so many methods of cross-validation out there, how can we know which one to utilize for any given scenario? Giuseppe Bonaccorso: [00:36:00] Yes, I will try to remain very compact, because this topic is very large. First of all, the reasons behind cross-validation are important to understand. Cross-validation has two main purposes. One is the fact that sometimes the training set is very small, so if we split it into training and test, or training, test, and validation, we get very small datasets that are not enough to train the models. Giuseppe Bonaccorso: [00:36:30] Another reason is to obtain an unbiased measure of the performance of the model, and this is probably the most important reason nowadays. Let's stop one second and think about the problem: we want to validate the model by considering all possible combinations, where the test set is not just selected randomly once, but can be every part, let's say, of the dataset. This problem can be solved in different ways. Giuseppe Bonaccorso: [00:37:00] Clearly, as we are going to see, there are ways which are very drastic. But a very simple way that is very effective is called k-fold. It simply means that we split the dataset into K blocks, and we train the model using K minus one blocks, evaluate on the remaining one, and repeat this procedure K times. It is also important to understand that cross-validation has a computational cost, and in fact it's normally not used in deep learning, for example, because it requires in general retraining the model K times. And if K is between five and ten, or sometimes even more, this can be unacceptable. Giuseppe Bonaccorso: [00:37:39] In many models, however, it is not a problem.
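The k-fold procedure described here (K fits, one held-out block per fit) can be sketched with scikit-learn; the iris dataset and logistic regression are placeholders, purely illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# 5 folds -> the model is trained 5 times, each time on 4/5 of the data.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

# One score per fold: the mean is the performance estimate, while the
# standard deviation tells us whether all folds behave the same way.
print(scores.mean(), scores.std())
```

The computational cost Giuseppe mentions is visible directly: one full fit per fold.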
This process can be extremely fast, and what we get is an array of measures where we can immediately understand if the model is performing well or not. Giuseppe Bonaccorso: [00:37:51] And in particular, what we can understand is if the standard deviation of the values that we obtain is small, so more or less all the folds behave in the same way, or if there is a very huge difference in some cases. As I said before, with the problem of the data generating process, sometimes we have just a few data points belonging to a region. And if those points are in a fold, and this fold is excluded from the training process, clearly the model will have very poor performance when evaluated on that remaining fold. So we can observe sometimes that a model has very high performance in a lot of folds, Giuseppe Bonaccorso: [00:38:32] but there is one where the model has very poor performance, close to zero point five. And this means that clearly there is a problem. In one case, the problem can be, as I said, related to the nature of the training set. But another reason, which is solved using a modified version called stratified k-fold, is the fact that the classes are unbalanced and they are not well represented in each fold. This will happen many times, because we cannot expect always to have perfect datasets. In general, the rule is that datasets are very, very dirty, and we need to be prepared to manage this kind of complexity. So stratified k-fold will try to create folds where the distribution of the classes is kept as in the original dataset. With stratified k-fold, we can immediately check the performances in a very unbiased way, and we can see whether the model is performing well or less well in each of the folds, or again, if the model is performing better in some folds. And in that case, probably it means that the training set must be integrated. Or, if the performances are poor, or sharply under a certain threshold,
in the majority of folds, it probably means that the model has a low capacity, that it is not enough to capture the dynamics that are necessary. Other approaches, which are much more drastic, are the so-called leave-one-out and leave-p-out. These two approaches are very dangerous, so I always suggest being careful with these two approaches. Giuseppe Bonaccorso: [00:40:08] I mean, nothing bad will happen, but your computer can remain stuck for a very long time, in particular with leave-p-out. Leave-one-out simply says: let's try to understand if a model is enough to classify, for example, a certain dataset. So what we do is exclude a single data point from the training set, and if there are N values, we perform N trainings and validations. So we have N measures where the training set is made up of all points but one. This will immediately help understand if your model is underfitting, for example, because if your model has poor performance in leave-one-out, it means that it is underfitting; the model can easily overfit, but leave-one-out can never show the opposite. It can never happen that, when leave-one-out performance is very low, there is something else wrong: in general, it's because the model is not able to capture the correct dynamics. So there are, in general, not for all the values, but for a lot of the data points used for the evaluation, cases where the model produces the wrong classification. So it means that the model, even using the majority of the training set, almost all the training set, is not able to output the right class. To avoid this kind of problem, it's possible to use another approach, which is LPO, leave-p-out. But, as always, I invite the students to be very careful, because LPO is based on non-disjoint datasets. So it's based on computing the binomial coefficient of all possible subsets, which can grow combinatorially with the number of samples.
Giuseppe Bonaccorso: [00:42:02] It's just like leave-one-out, but instead of using a single sample for evaluation, there are P of them. And when you compute N over P, the binomial coefficient, this can become really, really large. So in some cases it can explode, it can reach millions. So it's a very dangerous method that must be considered only if necessary. And hopefully all the readers know that, for example, the majority of implementations can compute this factor before starting the process, so they can check whether the number is reasonable. And if the number is too large, they can reduce or increase the number P. Clearly, when P is reduced, we go towards leave-one-out. If we increase it, we have something that is probably useless, because in that case we are training the model with just a few points and we are evaluating on a very large set. So, just to conclude, I would say that stratified k-fold is probably the best choice in the majority of cases. Leave-one-out can be used to evaluate whether the capacity of a model is enough. But in general, a good data scientist can immediately understand if a model is working properly using stratified k-fold, and can make the right decisions to go on, Giuseppe Bonaccorso: [00:43:25] thanks to the results provided by this method. Harpreet Sahota: [00:43:27] You did a really good job in your book; I enjoyed how you presented it, again from first principles, as well as some compare and contrast, and how to implement it in scikit-learn. I think it's a really good treatment of cross-validation. I want to jump in now to a couple of other topics. Harpreet Sahota: [00:43:43] Now, I was wondering if you could share some tips with our audience so that we can be more thoughtful with our feature engineering. Giuseppe Bonaccorso: [00:43:53] Feature engineering is an extremely important part of the process. We need to think about the problem in this way.
First of all, in general, when we have a dataset, the dataset is provided, for example, by a certain source. Not all the features are generally necessary for predicting a certain outcome. This is important because only really in some cases, when we have very large datasets, do we need to have all the features. So one important step is feature selection. Feature selection is a fundamental step, also to improve the explainability of the model. And nowadays, for example, thanks to techniques like explainable AI, it's possible to find explanations also for models which are not easily interpretable. So it's very important to limit ourselves to the features with a very high variance, and to try to use these methods to discuss with business stakeholders, so they can immediately understand which are the dominant factors, and to consider the remaining features, which sometimes have a very limited impact, as, let's say, noisy features which are not really important. On the other side, feature engineering can also be extremely helpful to create new features. One case is, for example, when we start using linear models, which is generally a good starting point in many cases. Giuseppe Bonaccorso: [00:45:20] But when we know, when we observe, that it's impossible to achieve certain performances, it's possible to use feature engineering, for example, to try to create new features by combining the existing features. Polynomial regressions are based on this idea. In some cases, when we consider the features as independent, we make a mistake, because in reality it is very difficult for all the features to be independent. So having the ability to model also the interactions of the features, and then performing a feature selection, can be extremely helpful for the results.
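The combine-then-select idea described here can be sketched with scikit-learn; the diabetes dataset and the choice of k=15 are placeholders for illustration:

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.preprocessing import PolynomialFeatures

X, y = load_diabetes(return_X_y=True)

# Model feature interactions explicitly: each pair (x_i * x_j) becomes a column,
# dropping the assumption that features act independently.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_poly = poly.fit_transform(X)
print(X.shape[1], "features ->", X_poly.shape[1], "with pairwise interactions")

# ...then perform feature selection to keep only the informative columns.
X_sel = SelectKBest(f_regression, k=15).fit_transform(X_poly, y)
print("kept:", X_sel.shape[1])
```

Ten original features become 55 once all pairwise interactions are added, which is exactly why a selection step afterwards matters.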
And again, I always invite people to study and to apply methods of feature importance and explainable AI, if possible, to understand the real importance of the features, because these tools are very helpful when discussing with experts who are not familiar with data science. They can immediately understand whether a feature is something surprising, for example, or whether it's just something that everybody is expecting. This is very important, for example, in health care, where sometimes datasets can help discover the interaction of different features, or a particular feature, in driving a result. And this will help move to the next analysis, to understand why some features are particularly important in some cases. Harpreet Sahota: [00:46:50] And how about when it comes to tuning our hyperparameters? What can we do, or what tips can you share with the audience, so that we can be more thoughtful when we're performing hyperparameter tuning? Giuseppe Bonaccorso: [00:47:03] Hyperparameters are probably one of the most important topics, and unfortunately there are no silver bullets. Normally, when talking about hyperparameters, we first need to understand the difference between the different classes of hyperparameters. Giuseppe Bonaccorso: [00:47:20] And then, of course, we need to try to evaluate how to tune them. For example, there are some hyperparameters, like the learning rate, that we know in general are relatively small. But there are some techniques, for example batch normalization, that allow having larger values. So these kinds of hyperparameters can be compensated in some way. Other hyperparameters instead, for example the strength of regularization, act more like capping parameters, and they can really change the results completely.
So I normally suggest, first of all, understanding how the algorithm works, understanding the role of the hyperparameters. In, let's say, classical machine learning this is easier; it's a little bit more complex in deep learning, because the number of hyperparameters can be very large. But it's important to understand how each hyperparameter works, so the contribution of the hyperparameter. And once we understand, and we have an idea of the possible values, the best-known way is grid search, which is, let's say, a brute-force search. It can be optimized, but it's a way to look for different combinations, because one thing to remember is that hyperparameters, just like features, are never completely independent. Giuseppe Bonaccorso: [00:48:45] So when you consider different values, sometimes you find a good value for one hyperparameter, but this will completely change the effect of the others. So grid search allows evaluating different combinations. It can be very, very expensive if the number of combinations is very high. That's why there is an approach called random search, which is very used in deep learning, that consists in first doing a coarse survey. Then, when there is a good region where some parameters seem more promising, it's possible to zoom in and continue this process in a more random way, zooming in every time into the region where the best hyperparameters normally lie. This clearly cannot remove the computational burden completely, but it can reduce it; I mean, at least it can avoid a complete search over all combinations. I normally suggest considering the default values as the values which are generally valid for the majority of tasks. But, for example, let me make an example based on regularization: it's helpful not to start directly by assuming the necessity of heavy regularization.
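A minimal grid-search sketch with scikit-learn; the SVC model, the dataset, and the parameter grid are illustrative choices, not recommendations:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Grid search evaluates every combination, precisely because hyperparameters
# interact: the best C can change depending on gamma, and vice versa.
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": ["scale", "auto"]}
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

For the coarse-then-zoom strategy described above, scikit-learn's RandomizedSearchCV follows the same interface but samples the grid instead of enumerating it.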
So, for example, in scikit-learn, logistic regression has a regularization constant C equal to one with the penalty L2. Giuseppe Bonaccorso: [00:50:10] But this is not necessarily right. I mean, in some cases it's not necessary. So I always suggest starting with a very small amount of regularization, even zero, and then checking the results. And if we observe overfitting, for example, in that case we can increase that value, and we can find the optimal one. So the default values must be considered as good choices, because they are normally based on different analyses performed on many datasets, but they are not necessarily the right choice. So if you default every time, if you always use the default values and never change them, in some cases you can never reach some results. We have seen some results obtained thanks to the usage of slightly different values. Giuseppe Bonaccorso: [00:50:58] Clearly, this is a very long topic, so it's very difficult to synthesize everything. But I think that it's very important to understand the nature of hyperparameters, and to understand also that they are not independent. So, when tuning one hyperparameter, it's important to tune also the correlated ones. Harpreet Sahota: [00:51:16] Yeah, definitely. I think that's a great point. It is a really deep topic, and I think it gives our listeners a lot to think about, and hopefully it can help them become more mindful and more thoughtful in the way they perform their hyperparameter tuning. So towards the end there, you mentioned regularization. Harpreet Sahota: [00:51:33] So I was wondering if you could offer us some heuristics for determining whether we should use regularization. And if we decide to use some regularization technique, how can we ensure that we're using the correct one? Giuseppe Bonaccorso: [00:51:47] Yes, regularization in general is a technique where, at least in the classical way, we add a penalty term to the cost function.
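One detail worth showing in code: in scikit-learn, C is the *inverse* regularization strength, so "start with very little regularization" means starting with a large C. A sketch (dataset and C values are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# C=100 -> almost no penalty; C=1.0 -> the scikit-learn default (L2 penalty);
# C=0.01 -> strong regularization. Sweep and compare before trusting defaults.
for C in (100.0, 1.0, 0.01):
    model = make_pipeline(StandardScaler(), LogisticRegression(C=C, max_iter=5000))
    scores = cross_val_score(model, X, y, cv=5)
    print(f"C={C}: mean accuracy {scores.mean():.3f}")
```

If the weakly regularized model overfits (train score high, cross-validated score lower), that is the signal to decrease C, exactly the workflow Giuseppe describes.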
And the reason for some techniques, like, for example, the famous L2, or ridge, or Tikhonov regularization, is to avoid overfitting, for example, and to reduce the effect of collinearities in linear regression. The reason for reducing the effect of overfitting is that there is a problem called the bias-variance tradeoff. So in some cases the model can become almost unbiased, or can reduce the bias, but this can increase the variance. In other words, this is equivalent to saying that the model is overfitting, that it is not able to generalize. Giuseppe Bonaccorso: [00:52:31] So using the L2 penalty, we are just imposing a constraint that increases the bias of the model, Giuseppe Bonaccorso: [00:52:43] but the price we pay is repaid by reducing the variance. So the solution is suboptimal, because clearly when we add this term to the cost function, the minimum that we obtain is not the optimal minimum, but the result is a function which is more able to generalize. So L2 is in general the best choice in the majority of machine learning models. And it's helpful to tune, of course, the strength of the regularization, because if it's too strong, there's a risk of biasing too much; if it's too small, the added bias is too small, I mean, the effect is negligible. In linear regression, when ridge regression was proposed, that kind of regularization also had the effect of improving numerical quality. The reason for this is purely mathematical, and it's related to the fact that if you solve the problem of linear regression, which can be solved in closed form, there is an inversion of a matrix that can become singular when there are collinearities. And the effect of the regularization term is to add a small constant to the diagonal of this matrix, so this matrix is not singular anymore and can be inverted. So this can increase the numerical stability of the system. Another way to perform regularization is to use, for example, the L1 norm, or, I need to be precise, other norms.
All norms, at the end of the day, behave more or less in the same way, but the effect is proportional to the p value of the norm. Giuseppe Bonaccorso: [00:54:09] We are not normally interested in the zero norm or in higher-order norms; generally we use L1 and L2. And L1 is the most common method, called the lasso, to perform automatic feature selection. The reason is very similar to L2, but the difference is that with L1 the coefficients which are very small are pushed to zero. So the L1 norm will force these coefficients to become effectively zero. So in this case, the result of this regularization is to remove all those features whose contribution is not important for the prediction. Clearly, in some cases it's possible to use both norms, and the combination is called elastic net. This is very common when we want to perform feature selection and also want to prevent overfitting, and this is very common in machine learning. When we work instead with deep learning, there are also other techniques. In deep learning it's possible to use L2, but the effects are sometimes a little bit more difficult to control. There are deep learning techniques like dropout, for example, which is an extremely interesting technique that can avoid overfitting by limiting the capacity of the model, but by creating a lot of sub-models. It's like randomly splitting a model, which is a deep network with sometimes millions of parameters, into many sub-models that will be trained generally on specific regions. Giuseppe Bonaccorso: [00:55:42] So the result is that each sub-model will become more and more expert on a specific region, and the overall model will never overfit. Obviously, it's important to control the parameters, because dropout is based on a random selection of the inputs of a layer and setting them to zero. So a small value has a very limited effect, while a very large value, of course, can create a model that is clearly underfitting for sure.
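The L1-versus-L2 difference described here (lasso pushing weak coefficients to exactly zero) is easy to verify; dataset and alpha values below are illustrative:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2: shrinks coefficients toward zero
lasso = Lasso(alpha=5.0).fit(X, y)   # L1: drives weak coefficients to exactly zero

print("ridge zero coefficients:", int(np.sum(ridge.coef_ == 0.0)))
print("lasso zero coefficients:", int(np.sum(lasso.coef_ == 0.0)))
```

The zeroed lasso coefficients are the automatic feature selection Giuseppe mentions; scikit-learn's ElasticNet combines both penalties via its `l1_ratio` parameter.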
One thing I think is important is that sometimes I read about early stopping presented also as a regularization technique, because the effect of regularization sometimes is to avoid overfitting. So you train the model until the model is performing well, and then you stop when the model starts performing badly. Because the easiest way to check if a model is overfitting is to observe the training curves. And if you observe them, normally you see that an overfitting model has a training curve that goes down and approaches almost zero, while the validation curve, after reaching a minimum, starts increasing again, and this curve is called the U curve. Giuseppe Bonaccorso: [00:56:51] So the idea of early stopping is to stop the training just before the bottom of the U curve. Honestly, I listened to this suggestion from Andrew, and I absolutely second it: early stopping is a way to cap the model, and it's a way to prevent any possibility to perform better. So it should be considered like a last resort, not as the best choice. In general, it's preferable to try alternatives first, and only if there are no other options to use early stopping as regularization. Harpreet Sahota: [00:57:25] Thank you for that. And I know we spent a lot of time getting into the foundational concepts of machine learning, so I really appreciate you sharing your insights into that. But your book covers so much, and there's so much great content, and you present everything from a very first principles approach. In your book you cover everything from the fundamentals, which we spoke about today; you even go into semi-supervised learning, time series, generalized linear models. You go into neural networks, deep convolutional neural networks, reinforcement learning, deep belief networks. It's a really comprehensive book. And I really, really appreciated the effort that you put into presenting these things from a first principles perspective.
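The early-stopping mechanism discussed above is built into scikit-learn's MLP; a sketch (dataset, layer size, and patience value are illustrative, and scaling the full set before splitting is a shortcut acceptable only for a demo):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# early_stopping=True holds out a validation fraction and halts training when
# the validation score stops improving: stopping near the bottom of the U curve.
clf = MLPClassifier(hidden_layer_sizes=(32,), early_stopping=True,
                    n_iter_no_change=10, max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("stopped after", clf.n_iter_, "epochs (max was 500)")
print("test accuracy:", round(clf.score(X_te, y_te), 3))
```

Consistent with the caution above: the run usually halts well before the epoch budget, which caps the model, so it is a last resort rather than a first choice.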
Harpreet Sahota: [00:58:01] But one thing that I don't think really gets coverage in many books, or in much of the literature, blog posts, or what have you out there, is what to do once the model is shipped into production. So once we fit a model and ship it, does the work of a data scientist stop there? Giuseppe Bonaccorso: [00:58:18] You're right. This is a topic that's not very well covered; it deserves more space. The answer is absolutely no. Nowadays we are moving in the direction of applying the DevOps approach also to machine learning, so the data scientist or the machine learning engineer can never consider his or her work ended after the model is in production. Only when the model in production is tested continuously is it possible to understand if the model is working properly. Giuseppe Bonaccorso: [00:58:50] And in that case, the goal of the data scientist is to check if the performances that were initially evaluated using a dataset are still the same using real data. And sometimes it happens that these performances can change. Some models need retraining, first of all. And retraining a model is not something that can be done completely automatically. I mean, it can be done automatically, but it's necessary to observe the results of this retraining. Giuseppe Bonaccorso: [00:59:21] Sometimes there are surprises when retraining some models. For example, it's necessary to increase the capacity of a model; for example, a model, when retrained, can show bad performances on other samples which were excluded, or the training set is becoming too large. So there are many problems where the data scientist and the machine learning engineer must be involved, and they can never stop, thinking that their work is done. Another problem, for example, is the fact that the data generating process can change. This happens very often.
A model is trained starting from some data, but after a while the data changes because of external factors, and thinking that the model must continue working properly is absolutely ridiculous. So it's necessary to involve these people in understanding whether a change is necessary. I also invite data scientists to monitor the process, for example, to understand if it's possible to improve the performances. Data scientists must never forget that the final product is a piece of software, and optimizing a piece of software is a never-ending process. So the data scientist must observe if the results are really good, if the response time is good, if retraining is necessary, if it's necessary to tune or change something to optimize, or, for example, to work again on different versions of the models to find better results. And in my experience, for example, working with different countries, we can observe very huge differences. So without a continuous approach, which is the approach of ModelOps, of DevOps applied to machine learning, it's impossible to guarantee a service that is reliable over time. So you are absolutely right: this deserves much more attention. Harpreet Sahota: [01:01:24] So what are some things that we need to monitor and track once the model is deployed? Let's start from the perspective of the data generating process. What are some indications that we can look for to signal whether or not this underlying Harpreet Sahota: [01:01:40] data generating process has changed from what we had initially modeled? Giuseppe Bonaccorso: [01:01:45] Well, we clearly observe some bad performances, and there should be a sort of debugging, checking which samples are misclassified. For example, let's suppose that we are working on face recognition.
Clearly, it can happen that we have excluded some people, or we have excluded people with hats because the pictures were without hats and now they are wearing hats, or, for example, thinking about COVID, we have excluded people with masks. Giuseppe Bonaccorso: [01:02:16] So we have a model that works perfectly, but now all the people wear a mask and it's impossible to recognize them anymore. So what we observe is that the model is performing poorly with respect to a specific population. At that point, the process is extremely important. It's necessary to understand the population that we need to sample from, and it's necessary to understand the weight of this population with respect to the existing one. And if this process is done correctly, we can find a proportional number of samples, we can enrich our process, we can retrain the model. Because one thing to remember is that many models, like neural networks, are very powerful in learning, but they are also very fast in forgetting. So it's important to create a more comprehensive process where there is a representation also of this percentage of new samples of this population, and then to perform the retraining and observe how the performances are affected. Giuseppe Bonaccorso: [01:03:23] So it's very easy to see that the model is not performing correctly. Sometimes we can just skip this step, saying the model is performing badly, but in a real debugging we need to check where and when. So we have to take the samples where the classification is wrong. And after checking a certain number of samples, or using automatic techniques like clustering or something like this, we can immediately understand that there is a group of particular samples with some specific features which are not classified correctly. And this means that they are not represented in the training dataset.
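One simple way to monitor the data generating process Giuseppe describes is a two-sample test on a feature's training-time versus production distribution; the synthetic "shifted" data below is invented for illustration:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, size=5000)  # distribution seen at training time
live_feature = rng.normal(0.5, 1.0, size=5000)   # production data, shifted by 0.5

# The Kolmogorov-Smirnov test compares the two empirical distributions;
# a tiny p-value is evidence the data generating process has changed.
stat, p_value = ks_2samp(train_feature, live_feature)
print(f"KS statistic={stat:.3f}, p={p_value:.2e}")
if p_value < 0.01:
    print("drift detected: inspect misclassified segments and consider retraining")
```

This catches input drift per feature; the misclassification clustering described above is the complementary check on the output side.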
Harpreet Sahota: [01:04:00] And do you have any resources that the interested listener can go check out, any keywords that they could use in their searches for getting up to speed on this aspect of the pipeline? Giuseppe Bonaccorso: [01:04:12] Well, in this case, I think it is helpful to look at ModelOps. Yes, exactly, ModelOps is the new way to look at this process. But it's important for the team managers, I think, in this case, to create an integration of different roles. So data engineers, data scientists, machine learning engineers, business analysts, all these people must come together into a team. They must work together, and each of them has a responsibility, and some responsibilities are peculiar. Giuseppe Bonaccorso: [01:04:46] So it's important that there is a culture of creating a process that is complete, let's say self-contained, but at the same time every part of this process has specific responsibilities, like a football team or any kind of team where everybody has a specific responsibility. That is fundamental. So I suggest studying DevOps in general, because having some knowledge about standard software development is fundamental. Even if sometimes data scientists don't think they are developing software, this is absolutely wrong: they are developing software, and they have to understand that software can become valuable only when it goes into production. And so all the techniques that have been developed to create pipelines and to automate these processes, continuous integration, continuous deployment, automating the tests and checking the results, creating real-time alert systems, all of these elements are necessary. Even if you are not directly hands-on, if you are not working directly with them, you have to understand their role, and you have to be ready to accept also the presence of people with this kind of background in your team, and to leverage their knowledge to guarantee the results.
Harpreet Sahota: [01:06:01] Thank you very much for that. So I want to shift gears here now. I'm wondering, do you consider data science and machine learning to be an art or purely a hard science? Giuseppe Bonaccorso: [01:06:11] Thanks for this question, I really love this question. Data science is a science for sure. There is mathematics behind it, and we should never forget this. But I consider also mathematics a mix of science and art. Giuseppe Bonaccorso: [01:06:25] When I say art, I don't mean pure fantasy. I mean that sometimes the way we solve some problems, the way we address the problems, has a lot of creativity behind it, and this creativity can make the difference between repeating the same task or trying to do something different and finding real innovation. So I definitely consider it a mix of the two, and that is the ability of the data scientist. Sometimes I am considered crazy for this, but I often repeat that data scientists must think like an artist when finding a solution, when creating a piece of code. But of course, also an artist, imagine an architect, for example, has to know physical laws, because otherwise the dome will collapse. Giuseppe Bonaccorso: [01:07:11] So it's clearly hard science, but it's not purely hard science. And there is room for creativity, a lot of room for creativity. Harpreet Sahota: [01:07:21] So talk to us about creativity and curiosity. What roles do creativity and curiosity play in being successful as a data scientist? And how can somebody who doesn't see themselves as creative understand that they can be creative, and that the work they're doing is creative? Giuseppe Bonaccorso: [01:07:41] Everybody can be creative. I think that creativity is not something elitist. The real problem here is that sometimes, when you start working, you start working using routines, which are helpful to avoid the problem of uncertainty.
So you repeat the same things because you are sure about the results. But on the other side, this approach can lead to boring activities, which can drive you to lose interest. So creativity means finding new ways, sometimes, to solve the same problems. And curiosity is a complement, because when you think about the way you solve a problem, consider that, not only in data science but in general, many solutions have been found by considering analogies. Giuseppe Bonaccorso: [01:08:31] So I always invite people to think by analogies: a problem may have been solved in another scenario, in a different situation, a different structure, a different environment, but there can be an analogy. Curiosity is the way to see this analogy as a possibility, and creativity is the way we can transform that analogy into something new. So even the most repetitive task can become very interesting if you think about the way to change, to transform the boring activities into something that, for example, can be automated, so you can work on other tasks; or, for example, when you can find new ways to reach solutions, or you can find ways to improve the performance of existing systems. So creativity and curiosity are the key for everything, not only for data science, in my opinion. And clearly, as we are talking about data scientists, I always repeat that curiosity must be directed towards everything. You must not be segregated: being segregated, thinking that you have to work only with your tools, only with your topics, is a way to halt your career immediately. Giuseppe Bonaccorso: [01:09:48] The only way you can really expand yourself is to be curious, to learn new processes, to learn how other people work, to talk to other people, to understand how your business works. Even if you don't have all the knowledge, you can acquire some basic knowledge, and at that point you can discover also new possibilities where you can apply data science, for example.
Harpreet Sahota: [01:10:07] Thank you very much for that. I think you and I are a couple of peas in a pod. I share the exact same perspective as you do with respect to that. So I think one thing that's important for data scientists to realize is that we play an integral role in the organization, in the business that we are a part of. So it's not enough just to have the hard technical skills with respect to mathematics and coding. We also need to have strong business acumen and a product sense. Harpreet Sahota: [01:10:38] So in your opinion, how can data scientists develop their business acumen and cultivate a product sense? Giuseppe Bonaccorso: [01:10:46] I totally agree, and this is a real challenge. It's very important for aspiring data scientists, and sometimes also for senior data scientists, to become more business oriented. That doesn't mean that you have to forget your background. It simply means thinking that what you are doing must have business value. Giuseppe Bonaccorso: [01:11:11] Otherwise, it's just an exercise, and nobody is willing to pay for an exercise. And the way to develop this ability is related to the culture of the company. For example, a good way is to participate in meetings. As I said before, in some meetings you always have the chance to talk, but you have to understand the other people in the meeting, and you have to understand that your contribution must be fully comprehensible to them. I think that in a meeting where there are, for example, marketing people or management people, and you start talking about the capacity of a model or regularization, they just smile and probably forget everything after a few seconds. Giuseppe Bonaccorso: [01:11:54] So the only way to develop this ability is to ask these people for feedback about everything, to ask questions, and to remember that your target language as a translator is always the stakeholders' language.
So you must become a translator every time you have a concept in mind. If you are able to express that concept in a language that is understandable by the business stakeholders, not only do you increase the chance to be successful, but in some cases this is the only way to be successful. Because when you have, for example, to justify the need for budget and you are not able to define the business value of a solution, you don't receive the budget. So you fail. And at that point, there's nothing. I mean, you can talk about the beauty of a model, you can talk about the art of data science and whatever, but you don't have the budget, and so you cannot go on. It's a very pragmatic viewpoint, but being pragmatic in data science is normally a winning element. So I always invite, in particular, junior data scientists to start as soon as possible to discuss with senior stakeholders, to present to them, and to become more and more confident when discussing, and also when answering questions which are not strictly related to their field. And this is something that requires a sort of ongoing training. It's not something that can be learned in one day or in one week. But it's important that managers allow their junior employees to expose themselves to these situations. Otherwise, they never have the possibility, and sometimes they remain stuck in a position which is purely technical. Harpreet Sahota: [01:13:50] Thank you very much for sharing that insight. Speaking of juniors, and maybe up-and-coming aspiring data scientists: what advice or insight can you share with the people who are breaking into the field, looking through job postings where some of them seemingly want the abilities of an entire team wrapped up into one person, and who end up feeling dejected or discouraged from applying? Do you have any words of encouragement or insight that you can share with these people?
Giuseppe Bonaccorso: [01:14:22] Well, I normally address this question during interviews, because I understand that this is quite common in some cases. Giuseppe Bonaccorso: [01:14:30] But what I try to explain is, first of all, that a team is made up of people, and each person has peculiarities. So belonging to a team doesn't mean that you disappear as an individual contributor; you bring your peculiarities to your team. What I normally say is that you always have the possibility to become unique, which is the next step. So if you increase your domain knowledge in some cases, if you become more and more reliable, you can really become unique in the team. Being part of a team is absolutely necessary, because it's impossible to think about large projects without teams. But at the same time, it's important to say that every single person has to develop as a component, and this component is, let's say, a small team inside. This small team inside a single person must be developed, and at that point that person can really become a fundamental pillar. When this happens, the team becomes an absolutely winning team, and there's no way to stop such a team. When this doesn't happen, it's exactly because in that team a person doesn't feel like part of the team, or feels too much a part of the team, without an identity. So I always try to say: you have an identity. Giuseppe Bonaccorso: [01:15:57] I normally talk to everybody, and I try to understand the peculiarities of each single person and to emphasize these peculiarities, so as to help other people understand how new members can be really helpful, when they can trust a new member, and when they have to help them, for example. By doing this, which really has to be a continuous activity, it's possible to avoid this kind of problem. Clearly, in some cases it's impossible, because some people will never arrive at a certain stage of the interviews because they
don't show any ability to work in teams. And if you want to be a pure freelancer, you can try, but being a freelancer is also a risk, because even freelancers have to work in teams in some cases. So unfortunately, some people are filtered out simply because, when they discuss themselves, they don't emphasize their ability to interact with different people. So I try, when I can, to underline this concept, to drive the discussion in this direction, and to help new hires understand immediately that interacting is the key. Harpreet Sahota: [01:17:19] So, last question here before we jump into a quick lightning round: what's the one thing you want people to learn from your story? Giuseppe Bonaccorso: [01:17:26] I think that there are for sure more worthy people to learn from. But one thing I always say is that I never limited myself. I always tried to expand myself, and I always accepted challenges. So my character is that I'm not scared of new challenges. Clearly, after many years, I learned that over-promising is dangerous. So I never overpromise, or I try to never overpromise, but I never tried to limit myself, and I never said, I don't want to do this because I'm scared. I always loved learning whenever it was necessary. Every new experience for me was an opportunity. Some experiences were short, some were longer. But if I look at my past, there are no bad experiences - even the bad ones, because it's normal; I consider them part of my evolution. And when I received some proposals, honestly, I sometimes felt a little bit scared about the risks, but I said, OK, I want to take the risk, because only if you take some risks - of course, with all the possible precautions - can you reach some results, gain the right confidence, and also obtain better results.
But of course, I don't consider myself an example for other people. So I prefer to invite people to read the biographies of truly worthy people. Harpreet Sahota: [01:19:08] And I definitely think you are a worthy person to learn from. Thank you so much for that answer; it was very beautifully put. One hundred percent, I agree with you. You know, sometimes you have to do the things that are frightening, that are hard, that are challenging, that push you out of your comfort zone, so to speak. Because when you pursue those kinds of activities and tasks, that really is where the most growth happens, both professionally and personally. So thank you so much for sharing that. Let's jump into a quick lightning round here, starting off with the first question. Harpreet Sahota: [01:19:42] What do you believe that other people think is crazy? Giuseppe Bonaccorso: [01:19:46] Sometimes people think that what I did in my past - many decisions, the time that I dedicated to my passions - was a crazy thing. And I would say that sometimes I also consider it a crazy thing. I dedicated a lot of time to things that, at certain moments of life, are probably not considered extremely important, like reading, studying, working. When you are very young, that can be considered crazy. Harpreet Sahota: [01:20:19] If you could have a billboard put up anywhere, what would you put on it and why? Giuseppe Bonaccorso: [01:20:26] Well, I take a lot of notes, so I would like to have a billboard to put notes everywhere. But on the other side, I prefer to remember interesting quotes by heart. I never write down the quotes; I repeat them. I learned some quotes in my life, and sometimes when I go to bed or when I wake up in the morning, I repeat them. It's like an automatic procedure, like a mantra. Giuseppe Bonaccorso: [01:20:52] And this helps me to go on.
And sometimes it gives me new ways to look at the world through interesting affirmations for yourself. Harpreet Sahota: [01:21:03] So what is your favorite quote? Or rather, what's the quote that you woke up this morning and told yourself? Giuseppe Bonaccorso: [01:21:10] Well, I have a quote that I use for myself. And the quote is: the road is always ahead. I repeat this quote many, many times. Harpreet Sahota: [01:21:18] I like that one. Whose idea is that? Is it by anybody in particular? Giuseppe Bonaccorso: [01:21:21] Honestly, I think probably by myself. But it's a thought that can be found in many, many books and many lessons. So I consider it the result of all my studies, all my readings. I don't consider it mine, but I never actually read it anywhere. In some cases I read about never stopping, always going on, and I translated this into this quote. Harpreet Sahota: [01:21:47] I like that. There's the Stoic philosopher Seneca. Harpreet Sahota: [01:21:50] He said something to the effect of: it doesn't matter who the author is, as long as the quote is good, and anything that is true becomes mine. So, a very similar viewpoint. I enjoy that. So what's an academic topic outside of data science that you think every data scientist should spend some time researching? Giuseppe Bonaccorso: [01:22:08] Well, for sure, mathematics. I would add neuroscience, for example, and psychology - cognitive psychology. These subjects are extremely important. They are not only important, they are fascinating, and they can really change the way we look at the world. And they pose a lot of questions that can help us understand the power of what we have right now and what we still have to achieve if we want to reach the level of animals and human beings in terms of intelligence. Harpreet Sahota: [01:22:40] Of course, that's absolutely fascinating.
And I'm glad you said that, because if anybody were to ask me that same question, I would say pretty much the same thing. I'm really big into neuroscience and cognitive psychology, and just understanding the nature of the human mind, both biologically, physically, and in the more abstract sense. Just understanding the brain, understanding its structure, and understanding how the different pieces work together to help you better understand your thought processes - once you have a grasp on that, even just a little grasp, your life can become so much more enjoyable, I would say. So what would be the number one book, fiction or nonfiction, that you would recommend our audience read? And what was your most impactful takeaway from it? Giuseppe Bonaccorso: [01:23:23] I would recommend two books, not one - please allow me. One book that I consider extremely interesting is Siddhartha by Hermann Hesse. It's a real lesson on how to go on and how it's possible to change one's life despite the difficulties. And it's a very, very small book that can be read in a few hours, but there are some pages that will remain stuck in your mind forever. There is the power of the mind, the power of the will, and this is a fantastic example. Another book that I really love is a book written by a cognitive scientist, I would say - though not necessarily; I mean, he's more a philosopher. Giuseppe Bonaccorso: [01:24:07] A philosopher of mathematics. It's Gödel, Escher, Bach by Douglas Hofstadter. I read that book a long time ago. Giuseppe Bonaccorso: [01:24:14] I consider it a real masterpiece that puts together art, logic, and what was artificial intelligence at that time, because the book is quite old. But if you read it, you can never stop reading it, because it's so interesting, creating these connections and helping you understand their power, that you will continue reading it, probably for your whole life.
Harpreet Sahota: [01:24:40] I'll definitely be adding those to the show notes, and I'll check those out myself. So if we could somehow get a magic telephone that allowed you to get in contact with the 18-year-old Giuseppe, what would you say to him? Giuseppe Bonaccorso: [01:24:54] That's a very hard question. I would say: go on. Whatever happens, never stop. Don't be scared by the unknown. And for sure you are going to fall down, but just take your time and then restart. I wouldn't say to change anything. Harpreet Sahota: [01:25:11] What song do you currently have on repeat? Giuseppe Bonaccorso: [01:25:14] In this case, I don't have a single song, but many songs that are sometimes associated with my mood. Sometimes I just repeat them, so I don't want to name a single one. But I would like to say that there are some songs which are associated with specific situations that happened in my life. So when I listen to a song, it's because maybe I want to feel that way. I love many different kinds of music, so it depends. I don't want to say a name, because honestly, if I name something, I would exclude other things which have the same value for me. Giuseppe Bonaccorso: [01:25:54] But yeah, I listen according to my mood and according to what I want to achieve in that moment. You might ask me for an example, so I will tell you one. I had a fantastic experience in my life where there was a karaoke. I didn't participate, because I'm terrible. But there was a song, a very old song, American Pie by Don McLean, that I really love. It's a very old song, but that song is so intense in certain moments that it opened my imagination. So whenever I listen to that song, it's like moving back to that period, to that moment. It's a sort of flashback for me. Harpreet Sahota: [01:26:39] I love it. I'm very much the same way. So where can people find your book, and how can people connect with you online?
Giuseppe Bonaccorso: [01:26:46] The books are all available on Amazon and in almost any bookstore - in some cases also physical bookstores, but in general all digital bookstores. There are also Chinese and Polish translations. People generally connect with me through digital channels. LinkedIn, for example, is one of the main channels. I also use Twitter, but a bit less, because I don't like the limitations; I prefer to write longer posts. I also love going to conferences and events where I meet many people, but unfortunately in this period I've had to limit all these activities. Thanks to this limitation, though, I've had the possibility to get in touch with a lot of people through digital channels, and this was extremely interesting for me because I met many new people. I'm creating new relationships; even if they are more virtual, in some cases they have the same intensity as real relationships. Harpreet Sahota: [01:27:53] Giuseppe, thank you so, so much for taking time out of your schedule to talk to us about your book and to share your insights with us. I encourage everyone listening to go out and get a copy of Giuseppe's book, Mastering Machine Learning Algorithms. It is super comprehensive and presents a lot of the topics from a ground-up approach. Highly recommended. Again, thank you so much for coming on the show and taking time out of your schedule to be here. I really appreciate it. Giuseppe Bonaccorso: [01:28:18] Thank you. Likewise. It was a pleasure.