Episode 58: (Tech Talk) How to Apply Deep Learning Models to Image Classification

On this first episode of Tech Talk, we cover one of the classics of deep learning and artificial neural networks: Image Classification!

Check out this walkthrough of how to apply Convolutional Neural Networks to a real-life feeding problem!

To keep up with the podcast, be sure to visit our website at datacouture.org, follow us on Twitter @datacouturepod, and on Instagram @datacouturepodcast. And if you’d like to help support future episodes, consider becoming a patron at patreon.com/datacouture!

Transcripts:

Welcome to Data Couture, the podcast covering data culture at work, at home, and on the go.
I’m your host, Jordan Bohall.
If you'd like to stay up to date with all things data and Data Couture, then head over to our social media pages. And if you'd like to help support the show, check out our Patreon page at patreon.com/datacouture. Now, on to the show.
Welcome to Data Couture. I'm your host Jordan, and on today's Tech Talk we are going to be applying a deep neural network to an image classification problem. Like I said on Monday, with these double-header episodes we are transitioning the show into one that, one, talks about really cool stuff happening in the data scene, but, two, does more of a technical deep dive on some of the issues presented in the Monday episodes. So today is one of those first deep dives into technology. And since we talked about the difference between artificial intelligence, machine learning, and predictive analytics on Monday, namely that predictive analytics is a subset of machine learning, which is a subset of artificial intelligence, today we are going to give an example of deep learning: specifically, how we can classify different types of images using a particular kind of neural network that we will talk about shortly. So let's just get into it.
Okay, welcome to the first segment.
So here's my problem. I am a very awesome person. And don't chuckle, those of you who know me in real life. I'm an awesome person because I love animals. And, you know, I love animals so much that I have 100,000 animals. Don't worry, these 100,000 animals live on a mountain, and the mountain has all sorts of different areas and zones that make it habitable for the many animals I happen to own. As for the types of animals: I have birds, I have cats, I have deer, I have dogs, I have frogs, and I have horses. Now, how did I get to 100,000 animals? Well, I have a very kind heart, thank you very much, and I happen to have rescued all 100,000 of them. But I've got a problem, woe is me, right? I have to feed all these animals and I have to ensure their welfare. And you know, it's not cheap feeding animals. I don't know if you have ever had a high-needs dog before; I've had two, and I guess I currently have one. It's expensive: it takes medication and the right sort of food and the right sort of care and so forth. Now I've got 100,000 animals, so what can I do to ensure their safety, to ensure that they're doing okay? Well, naturally, I've had to employ an army of people to go out to these many zones to feed the animals, take care of the animals, and make sure the animals are healthy, happy, et cetera. Well, I hate to say it, but it's starting to take a bad turn on the old pocketbook. Do people still carry pocketbooks? I don't know. In any case, it's really starting to hurt my bank account, and my credit cards, and my refinanced mortgages, and all the other ways that I've attempted to pay for these hundred thousand animals. So I've recently come across a solution: there's a machine that can dole out a specific amount of food at specific times of day. Well, that sounds awesome, right? I just need one machine that can feed my animals. Well, here's the problem. It'd be great if I only had dogs, or only had horses, or only had birds, right? But no, I have birds, cats, deer, dogs, frogs, and horses. So do I need to buy a machine for each of those? It turns out these machines are expensive, and I can really only afford one. So what I want to do is classify my animals in such a way that when they come up to the feeder, because of course I have my animals very well trained and they all line up for food, a picture is taken of them, and the machine then knows which type of food, and how much of it, should be given to that animal. Right. So that's my problem. I think solving it will fix a lot of my issues: I'll be able to reduce my workforce, because frankly they're getting tired of trudging through the mountain, and I'll be able to make sure that my animals are well cared for.

So how do you do that? Now that we have this ridiculous scenario in place, it has all been used to introduce the sort of image classification problem that I want to solve using a deep neural network. To remind all of my listeners: deep learning is a type of machine learning, which is itself a subset of artificial intelligence, and it's very good at recognizing patterns. However, it takes a ton of data in order for it to recognize those patterns accurately. And because I'm using a deep neural network, let me remind you what that is. A deep neural network is a type of computational model that works in a way similar to how neurons in the human brain work. Each neuron takes an input, performs some kind of operation, and then passes the output along to the next layer. When all of these different neurons together identify the patterns in an image correctly, the output is the classification: whether it's a dog, horse, frog, cat, deer, et cetera.
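To make that "takes an input, performs an operation, passes it along" idea concrete, here is a minimal sketch of a single artificial neuron in Python. It is purely illustrative and not code from the episode; the weights, bias, and sigmoid activation are assumptions for the example.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs plus a bias,
    squashed through an activation function (here, a sigmoid)."""
    z = np.dot(weights, inputs) + bias        # the "operation"
    return 1.0 / (1.0 + np.exp(-z))           # the output passed to the next layer

# Toy example: three input values, arbitrary weights and bias.
x = np.array([0.2, 0.7, 0.1])
w = np.array([0.5, -0.3, 0.8])
print(neuron(x, w, bias=0.1))
```

A deep network is just many of these stacked in layers, with the weights learned from data rather than chosen by hand.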
And so my plan is to teach my computer to recognize images and classify them into one of a bird, cat, deer, dog, frog, or horse category, right? And the idea is, once I do that, I can have my feeder machine set up with the bins of the relevant food ready to go in one place. My hundred thousand animals line up, they get a picture taken of them before proceeding to the feeding station, the feeding station drops the exact amount, and off they go after eating, happy as can be. So that's the problem I'm solving using image classification with deep neural networks. In the next section, we'll talk about how we actually do that. So stay tuned.
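For a sense of how the pieces would fit together once a classifier exists, here is a hypothetical sketch of the feeder logic. The function names, food types, and portion sizes are all placeholders assumed for illustration; none of them come from a real feeder API or from the episode.

```python
# Hypothetical glue code: the classifier and dispenser are stand-ins,
# and the food types/portions are made-up illustrative values.
FOOD_PLAN = {
    "bird":  ("seed mix", 30),     # grams per visit
    "cat":   ("cat kibble", 60),
    "deer":  ("pellets", 500),
    "dog":   ("dog kibble", 150),
    "frog":  ("insect feed", 10),
    "horse": ("grain", 2000),
}

def classify_image(snapshot):
    """Placeholder for the deep-learning classifier built later in the episode."""
    return "dog"

def dispense(food, grams):
    """Placeholder for whatever interface the real feeder hardware exposes."""
    print(f"Dispensing {grams} g of {food}")

def feed_animal(snapshot):
    species = classify_image(snapshot)   # which of the six animal classes is this?
    food, grams = FOOD_PLAN[species]
    dispense(food, grams)

feed_animal(snapshot="photo-from-the-feeder-camera.jpg")
```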
Okay, welcome back. Let's get into solving my problem of feeding all 100,000 animals that I happen to have on my property. It's a ridiculous scenario, but, you know, just think about self-driving cars, or those captchas you get whenever you try to log into certain websites: spot the crosswalk, spot the stop sign, spot the person. What you're doing there is one of the steps we're about to do when it comes to teaching algorithms to detect images: teaching the computer what a dog, a cat, a horse, a bird, whatever, looks like before it can recognize any new objects. And the idea is, the more times the computer sees stop signs or crosswalks or people, or in our case cats or horses or deer, the better it gets at recognizing those objects. This is called supervised learning, and we do it by labeling images. It's a common machine learning tactic, and machine learning is a subset of AI; deep neural nets sit within that overall architecture. So we start by labeling our images: cats, dogs, horses, et cetera. The computer starts recognizing the patterns present in the cat pictures, because again, each of our animals has a snapshot taken before it moves into the feeding station. And so the computer starts recognizing the patterns that represent the deer, or the horse, or whatever, and learning that those patterns are absent from the other pictures of animals it is shown. It starts building its own kind of cognition, so to speak.

In this case, what I do is use Python. Python is one of the many languages that data scientists use; in an upcoming episode I think I'm going to explain use cases for when you should use Python, when you should use R, when you should use MATLAB, or Excel, et cetera. For now, just know that I'm going to be using Python, as well as something called TensorFlow, to write the program. For those not familiar, TensorFlow is an open-source deep learning framework created by Google. It gives data scientists and developers very fine-grained, granular control over every single machine learning algorithm, or every single node, as we call them, in the flow. You can adjust the weights of the algorithms and what they're producing, thereby achieving optimal performance. TensorFlow is very nice because it has quite a few libraries already built, and we'll be using some of them, as well as a great community, just like the R community, so you'll be able to ask questions and get answers as they come up.

So let's get into the classification, shall we? As you know, computers love ones and zeros; they can't necessarily see an image the same way that humans do. So we have to convert all these pictures of all of my pets from the mountain into numbers that the computer can understand. There are two common ways to do that. One, we can use something called grayscale. Grayscale is exactly what it sounds like: a range of gray shades from white to black. The computer assigns each pixel a value based on how light or dark that particular box, that single pixel, is in the image. All the numbers the computer has assigned are then put into an array, a big grid more or less, and the computer can do computations on the array it has created. If you look at that encoding, it looks like a very, very long string of numbers, but the computer can interpret it. The other way is to use RGB values. RGB stands for red, green, and blue, and those color values range from zero to 255. The computer extracts the RGB values from each pixel, each square in the grid, and puts the result into an array for further interpretation. So when the computer sees the new picture that each of my animals gives it, it converts that image internally into that big grid, that big string of values, using either the RGB technique or the grayscale technique, and then compares the patterns in the numbers against the known objects. The final step, of course, is that the computer assigns a kind of confidence score for what it thinks each image is, and the class with the highest confidence score is usually the one the computer goes with: oh yeah, this is one of Jordan's frogs, or one of Jordan's horses, or one of Jordan's dogs, or deer, what have you.

Finally, before actually building our model, I want to talk about how I can improve the accuracy of my image classification. Because let's face it, I don't want my horses to get my frog food, and I don't want my deer to get my dog food, right? A popular technique for improving the accuracy of image classification is something called a convolutional neural network. A convolutional neural network is a kind of neural network that works the same way as your normal neural network, except that it has a convolutional layer at the beginning. What does that mean? Instead of feeding in the entire image as one array of numbers, i.e. the grayscale or RGB values, the image is broken up into multiple tiles, and the machine tries to predict what each tile is. Then, at the end, the computer predicts what's in the picture based on the predictions for all the tiles individually. This allows the computer to do what's known as parallelization of the operation, so it can process all these tiles at once, and it can detect the object regardless of where it is located in the image.
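Before moving on, here is a minimal sketch of that image-to-numbers conversion described above, assuming the Pillow and NumPy libraries; the filename is just a placeholder and this is not code from the episode itself.

```python
import numpy as np
from PIL import Image

# Placeholder filename: stand-in for any snapshot from the feeder camera.
img = Image.open("deer_0001.jpg")

# Option 1: grayscale, one value per pixel from dark to light.
gray = np.array(img.convert("L"))       # shape: (height, width), values 0-255

# Option 2: RGB, three values (red, green, blue) per pixel, each 0-255.
rgb = np.array(img.convert("RGB"))      # shape: (height, width, 3)

print(gray.shape, rgb.shape)
```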
Why is that tiling important? Well, I don't know if you know about dogs and cats and deer, but my dog specifically likes to cock his head in a weird way, or likes to, I don't know, sit awkwardly; he likes to sit on his butt sometimes. And so that puts his head, or the arrangement of his body, in a different position than if he were just face-on to the camera. And because he's such a weird dog, the image isn't always perfect, right? The same goes for the deer; they're just all over the place. They're a bunch of idiots, but I love them. Same goes for any of the other animals. And so using a CNN, I'm able to make the image recognition of the deep neural network far more sophisticated and far more accurate, because it grids up all of the pieces of the single image and then arrives at a final verdict based on what it predicts each one of those grid pieces to be. So now let's move to the next section, where we'll talk about the preprocessing of the data. Granted, I've already taken pictures of all of my animals, all 100,000 of them, but we have to do a little bit of preprocessing. We'll talk about that next. So stay tuned.
Okay, now we're on to the preprocessing section. The reason we have to do a bit of preprocessing of the images is because, frankly, I've got a very good camera, and it takes images that contain very, very little noise. But because, you know, maybe it's wet outside, maybe it's raining, or maybe it's foggy, or maybe, like I said, my deer are being idiots and giving the camera a weird kind of look, we're going to have to add a bit of artificial noise to those images. We can do this in Python using a library called imgaug. We'll apply random combinations of cropping parts of the image, flipping images, and different rotations or perspectives, and we can also adjust things like hue, contrast, saturation, that kind of stuff. And if you guys are interested, I can throw up my Python code for another example of this on my GitHub page. In any case, that's the next step. That way we can train our deep learning algorithm, our neural network, to predict something regardless of what the conditions of the incoming data are. Now comes the next step, namely splitting our data set. Like I said, machine learning is a subset of artificial intelligence, and specifically with our deep neural network, or the way we're using it, the convolutional neural network, the CNN, we need to both train and test our models. The way we do that is to split our data set. There is lots of theory about how to do this: you can do a 40/60 split, a 25/75 split, a 50/50 split. It's really just kind of up to you, and you can try different splits to see how accurate you can get your model. In this case, I'm going to be doing a 60/40 split, since we have quite a big data set.
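Here is a minimal sketch of what that augmentation and splitting step might look like, assuming the imgaug and scikit-learn libraries. The specific augmenters, parameter ranges, image sizes, and variable names are my own assumptions for illustration, not the exact code used in the episode.

```python
import numpy as np
import imgaug.augmenters as iaa
from sklearn.model_selection import train_test_split

# `images` is assumed to be a NumPy array of snapshots with shape (N, height, width, 3),
# and `labels` an array of N class indices (0=bird, 1=cat, 2=deer, 3=dog, 4=frog, 5=horse).
images = np.random.randint(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)  # stand-in data
labels = np.random.randint(0, 6, size=100)                                 # stand-in labels

# Random combinations of crops, flips, rotations, noise, and color jitter.
augmenter = iaa.Sequential([
    iaa.Crop(percent=(0, 0.1)),                       # crop away up to 10% of each side
    iaa.Fliplr(0.5),                                  # horizontal flip half the time
    iaa.Affine(rotate=(-20, 20)),                     # small random rotations
    iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255)), # a bit of artificial noise
    iaa.AddToHueAndSaturation((-20, 20)),             # hue / saturation jitter
    iaa.LinearContrast((0.8, 1.2)),                   # contrast jitter
], random_order=True)

augmented = augmenter(images=images)

# Split the data set. The episode describes a 60/40 split with 40,000 training
# and 60,000 test images, so the test share is set to 0.6 here.
X_train, X_test, y_train, y_test = train_test_split(
    augmented, labels, test_size=0.6, random_state=42)
```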
And so we have a training set that contains, you know, 40,000 images of all of my pets, and another 60,000 for the test set. Right. So now it's time to actually build the convolutional neural network, the CNN. So far we've done the preprocessing and we've split our data set, so we can start actually implementing the neural network. In this case, I'm going to use three convolutional layers, each with something called two-by-two max pooling. Max pooling is a technique you use to reduce the dimensions of an image by keeping only the maximum pixel value in each small grid of pixels, which helps reduce overfitting and makes the model more generic. So consider a grid that is eight columns wide by eight rows deep. A two-by-two max pooling would take that eight-by-eight grid and shrink it to a four-by-four grid, keeping the largest value in each two-by-two block. After that, we add the two fully connected layers. Fully connected layers expect two-dimensional input, but the output of the convolutional layers is four-dimensional, so we need to flatten the output in between them.
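As a rough sketch of what that architecture might look like in TensorFlow's Keras API: three convolutional layers, each followed by two-by-two max pooling, a flatten step, and two fully connected layers ending in six outputs (bird, cat, deer, dog, frog, horse). The filter counts, layer sizes, input shape, and training settings are assumptions for illustration rather than the exact model from the episode.

```python
import tensorflow as tf

NUM_CLASSES = 6  # bird, cat, deer, dog, frog, horse

model = tf.keras.Sequential([
    # Three convolutional layers, each followed by 2x2 max pooling.
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Flatten the 4-D convolutional output so the dense layers can use it.
    tf.keras.layers.Flatten(),
    # Two fully connected layers; the last one scores the six animal classes.
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
```

The commented-out fit call assumes the X_train and y_train arrays from the splitting sketch earlier.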
So, guess how accurate my algorithm happened to be, my deep neural network, my CNN, my convolutional neural network? Well, I got an accuracy of 78.4%. That's okay, you know. Roughly three out of four of my animals will be fed correctly; the fourth one will get who knows what. That's challenging. What do you guys think the solution is? Well, there are lots of solutions. I could change my training and test sets. I could try a different model. I could do all sorts of things, right? But one way, and the way that I think I prefer, since I am such an awesome dude, is to get more animals. I think I need more data. I think I need a full order of magnitude more pets. I think I need one million pets, right? I need to have a million animals roaming around my mountain so that I can perfect my algorithm for feeding them and caring for them appropriately. That's the only way to do it. In any case, this has been Tech Talk on how to use a deep neural network for image classification. So until next time, keep on getting down and dirty with data.

That's it for the show. Thank you for listening. Be sure to follow us on any of our many social media pages, and be sure to like and subscribe down below so that you get the latest from Data Couture. Finally, if you'd like to help support the show, then consider heading over to our Patreon page at patreon.com/datacouture. Writing, editing, and production of this podcast is done by your host, Jordan Bohall. So until next time, keep getting down and dirty with data.
