On this second Monday episode we do a first dive into the differences between AI, Machine Learning, and Predictive Analytics!
We’ll also apply each one of these technologies to a single problem! Get psyched for the new technical face of Data Couture!
To keep up with the podcast, be sure to visit our website at datacouture.org, follow us on Twitter @datacouturepod, and on Instagram @datacouturepodcast. And if you’d like to help support future episodes, then consider becoming a patron at patreon.com/datacouture!
Welcome to Data Couture, the podcast covering data culture at work, at home, and on the go. I’m your host, Jordan Bohall. If you’d like to stay up to date with all things data and Data Couture, then head over to our social media pages. And if you’d like to help support the show, then check out our Patreon at patreon.com/datacouture. Now, on to the show.
Welcome to Data Couture. I’m your host, Jordan, and on today’s episode we’re going to be talking about the differences between artificial intelligence, machine learning, and predictive analytics. However, the careful listeners among you might have noticed that the introduction has changed just a little bit. And in case you didn’t catch the first episode that dropped today alongside this one, the one on Data Couture’s new scope: I am shifting paths a little bit on what I’m doing with this podcast. So far we’ve covered a ton of topics in the data realm, everything from self-driving cars to, I don’t know, ticket price prediction, all sorts of stuff. And while that’s very, very important, it’s more or less just groundwork for what I find truly interesting, namely the technical stuff, the technical parts of all these cool new technology topics.
So what I am doing is, for the Monday episode, presenting a new topic. For example, this week it’s the differences between artificial intelligence, machine learning, and predictive analytics. That’ll be the first half of the show. The second half is more or less getting into the technical aspects of each of those. Then on the Wednesday episode we’re going to be doing a true technological deep dive, where I get into the sticky bits that I find most interesting. That’ll be closer to the Monday episode length, maybe 20 minutes or so. And then on Friday, the third episode, we’re going to be doing an implementation episode: how do you take these technical resources, and how do you implement them in your organization? I think I’m going to be calling that one Implement This. And the final bit of bureaucratic messaging is that we will also be producing two YouTube episodes, on Tuesdays and Thursdays. I think we’ll be calling it Data Couture: Data Driven. 
And it’s where you come along with me on my drive in to work, and I talk about what’s affecting me in my industry and in my organization in the data capacity. So it’s literally going to be me driving and talking to you guys. Which means that you’re going to get Data Couture content five days a week: three podcasts and two vlogs. So be looking forward to that. Now, let’s just get on with this episode, AI versus ML versus PA. That’s the interesting stuff, right? So here we go.
Okay, so let’s get into it. What is the difference between artificial intelligence, machine learning, and predictive analytics? Well, the way I like to describe it is with Venn diagrams. You know, those big circles surrounded by a massive rectangle that you probably used in grade school? I absolutely love Venn diagrams, and I’d argue that you can solve almost any logical puzzle by the correct application of Venn diagrams. So, in your mind’s eye, draw a very large circle. Within that circle draw another circle, maybe, I don’t know, a third to a quarter of the size of the main circle. And then inside of that second circle draw a third one, maybe a quarter to a fifth of the size of the second.
Now, the big main circle that we drew in our mind, that’s artificial intelligence, and it encompasses machine learning and predictive analytics. The second circle, the one that’s a third to a quarter of the size of artificial intelligence, that’s machine learning. And the third circle we drew, the one that’s a quarter to a fifth of the size of machine learning, is predictive analytics. So we can see that predictive analytics is a subset of machine learning, and machine learning is a subset of artificial intelligence. And if we remember any sort of set theory, or, I don’t know, transitivity, then we’ll know that predictive analytics is also a subset of artificial intelligence, and a much smaller subset than machine learning is. Right? So that’s one way to think of artificial intelligence.
Okay, let’s look at that first bubble, the largest circle we drew: artificial intelligence. What is artificial intelligence? Well, artificial intelligence is more or less just the notion of making machines intelligent. We want to make them at least as intelligent as we humans are, if not more so. 
And the idea is that these machines are then able to take in information, take in data, and on their own make decisions according to the situations they’re faced with. Think of a robot learning to navigate a busy intersection, right? Sure, we can preload it with some data. But at the end of the day, it’s just going to have to try all sorts of different things until it learns how to navigate that busy intersection.
Now, let’s look at the second-size bubble, the one that’s a third to a quarter of the size of artificial intelligence, namely machine learning. Machine learning is, colloquially, the way to make those machines intelligent. Machine learning is a subset of AI, and it focuses on a very narrow range of activities. In machine learning, we supply the algorithm with a bunch of historical data, and then, using that historical data, the algorithm that we write predicts what’s going to happen in the future. And it’s only predicting that future based on what it has seen in the past, namely all the historical data that we’ve given it.
Finally, let’s look at the third subset, that of predictive analytics. Predictive analytics, like I said, is a subset of machine learning. And there are some algorithms, which we’ll get into soon, that are in the realm of machine learning and AI but also exist in predictive analytics. So logistic regression or linear regression would both come under machine learning and predictive analytics, right? Whereas algorithms like your decision trees, your random forests, your decision stumps, those are more advanced pieces that would naturally come under machine learning. So where does predictive analytics fall? The difference lies mostly in the tool set and, I would say, the orientation that the user has. Predictive analytics is one that doesn’t necessarily rely on heavy code. We can use Excel, which I absolutely love doing, for linear and logistic regression. 
You know, it’s one of those things where you don’t want to tell people in your organization that you can do this kind of thing, because then you become the Excel guru. And now you have 30 emails or phone calls a day saying, “How do I do a VLOOKUP, Jordan?” That can really mess up your workflow. Nevertheless, predictive analytics is typically limited to these types of methods, and you use tools that aren’t heavily code-based, even though you might use things like VBA and write macros or anything like that. The machine learning engineer, on the other hand, is going to spend the majority of their time writing lots of code. They use languages like R, Python, SAS, MATLAB, or anything like that. Whereas the predictive analyst is going to be more concerned with the business side of the house. So we see a distinction there, or a division there, right?
And then the AI engineer will rely on machine learning, of course, and maybe even parts of predictive analytics. But the artificial intelligence professional will have at their disposal far more tools than the machine learning engineer will have. This will take into account things like mechanical engineering, or robotics engineering, or any number of different technical areas beyond the machine learning area.
So that’s how we draw a distinction between artificial intelligence, machine learning, and predictive analytics. Again, artificial intelligence is the widest circle, because it involves more than just coding or, you know, fairly simple stats. Machine learning is heavily code-based, using R and SQL and the other languages I’ve mentioned. And predictive analytics is perhaps the most basic, or the most fundamental, of all three, because it’s focused on just simple regressions using simple tool sets, without too much, I don’t know, let’s call it technical danger. So yeah, that’s a good way to discern the difference between the three. 
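For anyone who wants to see what one of those simple regressions looks like outside of Excel, here is a minimal sketch of an ordinary least-squares line fit in plain Python. The month and sales numbers are made up for illustration, and the function name fit_line is hypothetical. The slope and intercept it computes are the same quantities that Excel’s SLOPE and INTERCEPT functions report for the same data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style sum of x and y, and variance-style sum of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented historical data: sales grow by 2 units per month from a base of 10.
months = [1, 2, 3, 4, 5]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(months, sales)

# Forecast the next month from the fitted line.
forecast_month_6 = slope * 6 + intercept
print(f"slope={slope}, intercept={intercept}, forecast={forecast_month_6}")
```

The method itself is only a few lines of arithmetic, which is why it sits comfortably in a spreadsheet. The historical data goes in, a line comes out, and the forecast is read off that line.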
So now, for the second section, let’s look at a problem that’s pretty common in AI, and then one that’s common in machine learning, and one that’s common in predictive analytics. And on Wednesday, we’ll talk about a technical solution to at least one of these, using all three methods. So stay tuned.
Okay, welcome back. For this section of the podcast, I want to talk about a specific example that we can use to truly show the differences between AI, machine learning, and predictive analytics. Namely, say we have a problem where we want to identify the difference between two pictures. Say in one picture there is a dog, and in the other picture there is a cat. The general problem is: can my algorithm, can my machine, tell when presented with pictures whether there are cats in the picture or whether there are dogs in the picture?
To do something like this, I would employ something on the AI side called deep learning. Now, deep learning is very similar to machine learning, in that algorithms and functions are created such that they can classify whether the picture contains an image of a dog or a representation of a cat, right? But instead of having maybe one or two algorithms, like you would in machine learning, deep learning involves numerous layers of algorithms, and each one of the algorithms provides a different interpretation of the data being fed in. This sort of network of algorithms we call an artificial neural network, because its functioning is similar to, or intended to imitate, the functioning of the human neural networks present in the brain. Now, this particular classification problem is pretty simple compared to the sort of problems that we usually use deep learning methods for, because deep learning can solve very complex problems that require very deep calculation. But nevertheless, the difference between the deep learning and the machine learning approaches to this problem is that with deep learning you can work from lots of unstructured, often unlabeled data, whereas in machine learning, and predictive analytics for that matter, you have highly structured, labeled data.
So what do I mean by that? Well, machine learning is very good at predicting the future, like I said, based on a bunch of historical data that you have. When you write a machine learning algorithm, you typically have a field or variable for the label. Thinking of it in an Excel format, you have a column where you label whether this picture is a dog or this picture is a cat. And when you build up a classification algorithm in machine learning, you have what are called the test set and the training set. In the training set you have all of those examples nicely labeled: this picture contains a dog, this picture contains a cat. The algorithm is trained with all those pictures of cats and dogs. Then you let the algorithm loose on your test set, and you see how accurate it is at labeling a picture as either a cat or a dog, based on that training data set.
Now, in artificial intelligence, in this case deep learning, that’s not necessarily the case. You have lots of unstructured data. The difference between structured data and unstructured data is that structured data is easily searchable by basic algorithms. So think of spreadsheets, or think of any sort of data that you might get from a sensor or a log file. Unstructured data, on the other hand, is much more like human language. It doesn’t fit nicely into relational databases, which we’ll get into in future weeks, and searching it more or less ranges from difficult to completely impossible. So think of, for example, any sort of chat program. Think of Slack, or Microsoft Teams, or think of Twitter, think of Facebook, right? The way people write doesn’t necessarily follow a definable pattern. 
And it’s extraordinarily difficult to parse if you were to just look up words based on how they’re spelled, right? Were the words typed, or spelled, rigidly, and everybody used the same format for those words, it would be much easier to search, and you could more or less parse that into a structured database. But in the case of free chat, it’s all over the place, and it’s difficult to understand.
So in the case of deep learning, it receives all sorts of information. It receives picture after picture, in this case with the dog versus cat example. Then, using the multi-tiered algorithms that it’s been programmed with, it looks at all these cases and all the ways that these different algorithms identify the pictures of cats and dogs. And then it looks at those cases where all of the algorithms, or at least several of the algorithms in the net overall, identify a particular picture as a dog or a particular picture as a cat. From that it gets a sense of how to label whether something is a picture of a dog or a picture of a cat.
Now you can see the problem here. In the case of artificial intelligence, you need a ton of data, way more data than you need for machine learning. In machine learning, you have lots and lots of structured data that has been neatly labeled by a human, and the algorithms that have been built are trained on it. In artificial intelligence, specifically in this example deep learning, you don’t have that. You have kind of hard-to-parse data. Sometimes it’s structured, but most of the time it’s unstructured. And the net of algorithms, the neural network, has to go through lots and lots of examples before it can start to label data points accurately and then be able to produce an answer on whether a picture contains an image of a cat or an image of a dog. Right? 
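The labeled train-and-test workflow described for the machine learning approach can be sketched in plain Python. This is only an illustration under stated assumptions: each picture is stood in for by two made-up numeric features (body weight and ear length), the rows and all function names are hypothetical, and the nearest-centroid classifier is a deliberately simple stand-in rather than the method the episode has in mind. Real cat-versus-dog image classification would work from pixels.

```python
import random

# Hypothetical labeled rows: ((weight_kg, ear_length_cm), label).
# The numbers are invented for illustration only.
rows = [
    ((3.5, 6.0), "cat"), ((4.0, 6.5), "cat"), ((4.5, 5.5), "cat"),
    ((5.0, 6.0), "cat"), ((3.8, 6.2), "cat"),
    ((20.0, 10.0), "dog"), ((25.0, 11.0), "dog"), ((30.0, 12.0), "dog"),
    ((22.0, 9.5), "dog"), ((28.0, 10.5), "dog"),
]


def split_rows(all_rows, n_test, seed):
    """Shuffle a copy of the rows and hold out n_test of them for testing."""
    shuffled = list(all_rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def train_centroids(train_rows):
    """Learn one average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in train_rows:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in totals]
            for label, totals in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=squared_distance)


def accuracy(centroids, labeled_rows):
    """Fraction of rows whose predicted label matches the human label."""
    hits = sum(1 for features, label in labeled_rows
               if predict(centroids, features) == label)
    return hits / len(labeled_rows)


# Train on the labeled training rows, then score on the held-out test rows.
train_rows, test_rows = split_rows(rows, n_test=3, seed=0)
centroids = train_centroids(train_rows)
test_accuracy = accuracy(centroids, test_rows)
print(f"trained on {len(train_rows)} rows, tested on {len(test_rows)} rows")
print(f"accuracy on the test set: {test_accuracy:.2f}")
```

The key design point is that the held-out rows are never seen during training, so the accuracy figure says something about new pictures rather than memorized ones. Because the two invented groups are far apart, every held-out row is labeled correctly here; real data is rarely that tidy.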
And so that’s a very nice way to determine whether you’re doing artificial intelligence or you’re doing machine learning. Now, let’s take it down a level further and look at predictive analytics. Again, it’s confined mostly to logistic regression or other regressions of that nature, and it cannot really handle this kind of problem. Instead, it’s better off determining whether an item should be in one segment or another. So it does very simple classification problems, and it requires highly structured data. It has to have clean data, which both machine learning and the deep neural net version of AI that we’re talking about also require. But it wouldn’t really be able to handle this kind of computer vision problem, right? It’s better for any sort of low-level classification problem where you want to put people or groups or whatever into one, two, three, four, however many buckets, right, to segment them.
Now, when should you use deep learning, and when should you use machine learning? Or, when should you use AI, and when should you use machine learning? Well, use deep learning if, first, you happen to work at a company that has a boatload of data. This is where we get into the big data realm, right? You’ve all heard of big data, where the three V’s come into play: you have a lot of data, data all over the place, and data coming in all the time, right? And from that big set of data, from true big data, you can derive interpretations using deep neural networks, or at least AI. Similarly, use it if you have a problem to solve that’s way too complex for machine learning. In the case of, say, self-driving cars, a machine learning algorithm by itself might be able to determine whether or not something is a light pole, or whether something is, I don’t know, a child or not. But in the case of self-driving cars, there are so many different things that the car has to be aware of. It even, I don’t know, has to have some sort of ethics: should I hit the old person or the young person when I don’t have any other choices? So this is where you might use deep learning. 
And then finally, use it if you have the architecture, if you have the technical resources, where you can spend a ton of computational resources and money on the hardware, as well as on the software, for training the deep learning networks. Again, it requires a ton of data, and it takes time for that particular deep learning algorithm to come to a solution.
Now, when should you use machine learning? Well, you should use machine learning if you have data that is structured and clean enough that you can use it to train the machine learning algorithm, right? As well as if you’re looking to leverage the benefits of AI, without necessarily doing the highly complex, true AI, true deep learning sort of things, to surge ahead of your competition. And, you know, the best machine learning solutions can help in lots of automation. They can identify, I don’t know, whether or not you need to verify that one of your customers is truly one of your customers. They can help in marketing. They can help in, I don’t know, leveraging that great next idea to drive your company into the future, right? That’s the case in which you’d use machine learning.
In the case of predictive analytics, well, why would you use that? Again, if you have structured data and you’re looking to forecast, say, the likelihood that somebody is going to go delinquent on their auto loan or their home loan or whatever it is, that’s where you can employ machine learn... sorry, predictive analytics.
So that is Monday’s episode. On Wednesday, we’re going to really dive into the algorithms that we use, maybe for deep learning, more so for machine learning, and talk about how they actually work. So until next time, keep getting down and dirty with data. That’s it for the show. Thank you for listening. 
Be sure to follow us on any of our many social media pages, and be sure to like and subscribe down below so that you get the latest from Data Couture. Finally, if you’d like to help support the show, then consider heading over to our Patreon at patreon.com/datacouture. Writing, editing, and production of this podcast is done by your host, Jordan Bohall. So until next time, keep getting down and dirty with data.