You may have seen some pretty weird and crazy images floating around the interwebs recently. You may have even heard about an emerging new artist who is producing them. You see something like this:
… and you go and think about someone having a go at some new tools in Photoshop, and you look at it more closely and you start thinking about the meaning of life and what must be going on in this person’s head.
And then … through a bit of Googling – you realize that actually… it was Google that created them. And by this I don’t mean its founders or some technician on acid, I mean some algorithm came up with this.
Now what is actually going on here is that some Google employees are working on machine-learning algorithms, and the “emerging art” that you see is a part of their testing of these algorithms. I am not going to go into the science of machine learning (Google it if you like!), but what it comes down to is letting a computer “teach” itself or “learn” how to perform certain tasks in a way loosely inspired by the human brain. This is done using artificial neural networks. I will try and explain how this works in simple terms. Check out this discussion for more info. In fact, I learned a lot from it and draw some examples from comments in that thread.
Those of you out there who have ever tried to program anything (or have first done so at the Python course) know how hard it can be to instruct a computer to do something. Or maybe a better way of putting it is you know you have to give it a lot of information in order to perform a very simple task, because all this machine understands is numbers. Now imagine trying to teach this machine to recognize a picture of a dog, and KNOW that this is a dog. This is an extremely hard thing to do. To you and me this is a no-brainer – we all know what a dog looks like, but a computer doesn’t. The computer doesn’t even have a concept of a dog. It doesn’t have a concept of a concept.
Imagine for a second that you have met a kid who has never seen a cake in his life (I know, this is cruel, but for the sake of the kid imagine this for me). So this kid doesn’t know what cake is, what it is for, what you do with it, what it smells, tastes or feels like, and he has never seen any variant of a cake. Now you have the task of teaching this kid what a cake IS. So you sit this kid at a table and you present him with a few cakes – a brownie, a birthday cake with a candle on top, a pavlova. And you tell this kid that these are all cakes. Mind you, these are all very different cakes. You also point to a birthday cake and you tell this kid “This is a birthday cake”. And the kid goes “ok, I get it”. But you are not so sure – you want to see if this kid really understands you. So you give him a blank piece of paper and a pencil and you ask him to draw you a cake.
Now the kid is at a bit of a loss: he has just seen a few cakes, he is still kind of wrestling with this concept, and he can’t really just draw you a cake. He doesn’t know where to start. So you make it a tad easier for the kid and you give him one of those connect-the-dots type puzzles (never mind what this should actually show once you connect the dots!) and you ask the kid to recognize a cake within those dots and outline it for you. And he sets off on the task, identifies 4 or 5 dots that in his mind kind of resemble a cake and he draws a very vague outline around those dots. Now you look at it, you don’t quite see it yet but you want to keep going. So you hide that piece of paper behind your back and pretend to get a new piece of paper but you actually give the kid the same one he just scribbled on. You ask the kid to outline the cake again on this paper. The kid has not only never experienced cake but also doesn’t recognize (for the sake of the argument) the piece of paper in front of him. So now he sees his own scribbles in this puzzle and they vaguely remind him of a cake, so now he adds more lines, more scribbles to the existing ones to make them more cake-like. Now you, being a patient teacher, repeat this several dozen times, the kid keeps drawing over his own drawings and finally in front of you there is indeed a drawing of something that you might recognize as a cake.
Now you repeat this experiment and ask the kid to draw you a birthday cake. After some repetition (of the above-described kind) you have a drawing of a cake with a candle on top! But hold on, you see something weird in this picture. You see that right next to the birthday cake there is this long, slender shape. And then you look at the birthday cake on the table and you see that there is a cutting knife right next to it. Your kid has successfully put an image of a candle into his mind when picturing birthday cakes, but he also thinks that a knife is a part of this cake. If you ask the kid to identify a birthday cake among many other cakes he will always choose the cake with a candle and a knife! And the candle, as far as he is concerned, is a part of what constitutes a “birthday cake”. And so is the knife.
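If you like to see things in code, here is a minimal sketch of the knife problem in Python with NumPy. Everything in it is made up for illustration: the three features (candle, knife, frosting), the four example cakes, and the use of a plain least-squares fit as a stand-in for learning. It is not how a real neural network is trained; it only shows that when two clues always show up together, the learner has no way to tell which one matters.

```python
import numpy as np

# Hypothetical training set: each row is one cake, columns are
# [has_candle, has_knife_next_to_it, has_frosting].
# In every birthday-cake example the candle and the knife appear together.
cakes = np.array([
    [1, 1, 1],  # birthday cake with frosting
    [1, 1, 0],  # birthday cake without frosting
    [0, 0, 1],  # some other cake with frosting
    [0, 0, 0],  # some other cake without frosting
], dtype=float)
is_birthday = np.array([1, 1, 0, 0], dtype=float)

# Least-squares fit as a toy stand-in for "learning".
# Candle and knife are indistinguishable in this data, so the
# minimum-norm solution splits the credit evenly between them.
weights, *_ = np.linalg.lstsq(cakes, is_birthday, rcond=None)
w_candle, w_knife, w_frosting = weights


def birthday_score(has_candle, has_knife, has_frosting):
    """How 'birthday-cake-like' the toy model finds a cake."""
    return float(np.dot(weights, [has_candle, has_knife, has_frosting]))


score_with_knife = birthday_score(1, 1, 0)
score_without_knife = birthday_score(1, 0, 0)
print(weights.round(2), score_with_knife, score_without_knife)
```

In this toy, a cake with a candle but no knife scores lower than one with both, just like the kid who insists that a birthday cake needs a knife.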
This process essentially describes how an algorithm learns and comes up with those trippy images you have seen around the web lately.
The engineers at Google have been playing with algorithms over the past few years so that these algorithms can recognize a picture of, say, a dog. The way they do it is they pass pictures of dogs to a complex algorithm and they tell it that each picture shows a dog. They have “trained” their algorithm to recognize objects this way. In fact you can test this yourself. Upload an image, any simple image of a simple object (a dog, a house, a bird, a tree…) to Google Images (just literally drag and drop your image into its search box) and hit “search”. Go ahead, do it. Google will come up with images similar to that one. If you grabbed this photo off the internet, chances are it will find the same one. The algorithm has “learned” to recognize objects, in much the same way people do. This is why, if you go and read more about this topic, you will find expressions like machine learning, artificial neural networks or deep neural networks.
So once these engineers had trained their algorithms to recognize stuff and successfully employed them in their search engines, they wanted to see what this algorithm actually “thinks” when it thinks of, say, bananas. (Careful now, it doesn’t actually think. You will have noticed I’m using a lot of terms in this post loosely for explanation purposes, which is why they are in quotes. I’m sure there are heaps of people online who would criticize me for being incorrect. If you notice an error in reasoning, or have questions, please post them in the comments.) So they gave it a picture of random noise (this is our connect-the-dots puzzle in the above example), which looks like this:
That’s right – it’s essentially your dead TV channel. And now they told it to find a banana in this image. After several thousand iterations (working always on its own output, like the kid that keeps drawing over his own drawings), this is what it came up with:
Yeah, you can vaguely see a banana there, right? Several of them.
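The “start from noise and keep nudging the picture” loop can be sketched in a few lines of Python with NumPy. This is a toy under loud assumptions: the “banana detector” here is a single hand-made 8x8 template (a diagonal stripe) rather than a trained neural network, and the step size and iteration count are arbitrary. The real thing does the same kind of nudging, but through a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "banana detector": an 8x8 template with a diagonal stripe.
# A real network would have learned something far richer from photos.
template = np.eye(8)
template -= template.mean()  # centre it so flat images score zero


def banana_score(image):
    """Toy detector: how strongly the image matches the template."""
    return float(np.sum(image * template))


def similarity(image):
    """Correlation between the image and the template, between -1 and 1."""
    return float(np.corrcoef(image.ravel(), template.ravel())[0, 1])


# Start from random noise (the dead TV channel).
image = rng.standard_normal((8, 8))
similarity_before = similarity(image)
score_before = banana_score(image)

step_size = 0.1
for _ in range(200):
    # For this linear detector, the gradient of the score with respect
    # to the image is just the template itself.
    gradient = template
    # Nudge the image in the direction that raises the score,
    # always working on the previous iteration's output.
    image = image + step_size * gradient

similarity_after = similarity(image)
score_after = banana_score(image)
print(similarity_before, similarity_after)
```

After enough nudges the noise is still in there, but the stripe dominates, which is the toy version of bananas emerging from static.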
Here are some other examples of feeding the random noise to this algorithm and telling it to find specific things.
Look what happens if they ask for a dumbbell:
There is a hand attached to it! Much like the knife in that birthday cake scenario. The algorithm has taught itself that a dumbbell also includes a hand, probably because most of the images of dumbbells it has “seen” were of a muscly hand holding one.
These brilliant people have also come up with one final test. They fed random noise into the algorithm and basically said “tell me what you see”. So they didn’t say that there is a banana in there, or a dog or a parachute. They just instructed it to outline anything it sees there. And THAT’S how you get those trippy images. Through thousands and thousands of iterations and many different layers of image enhancement (there are layers and there are iterations in this process – separate things – there are a number of iterations on each layer) the algorithm takes some dots and to it these dots look more like a dog, those dots look more like a tree and so on… and you end up with something resembling that starting image. Why is it always dogs and pagodas and eyes, you might ask?
This tells you what type of images the algorithm has been trained on. It has been trained on images of dogs and pagodas, for example. It has also learned that pagodas appear on the horizon (because the pictures it has “seen” showed pagodas mostly on horizons), so if you pass it an image of a tree, chances are it will turn into a pagoda somewhere on the horizon. Eyes? Dogs and all living creatures have eyes. So if it “sees” any creature in random noise it will most likely have eyes.
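Here is a hedged Python/NumPy sketch of the “tell me what you see” idea. The assumptions are mine: two made-up detectors (“dog” and “pagoda”) that are just random patterns, and a rule that each iteration strengthens whatever the detectors already respond to. In this toy, whichever detector reacts most strongly to the noise at the start keeps getting amplified, and nothing the detectors don't know about can ever appear.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels = 64
labels = ["dog", "pagoda"]

# Hypothetical detectors: two random patterns made orthonormal via QR,
# so neither is favoured by construction. Each row is one detector.
q, _ = np.linalg.qr(rng.standard_normal((n_pixels, len(labels))))
detectors = q.T

# Start from random noise, flattened to a vector of pixels.
image = rng.standard_normal(n_pixels)
initial_response = detectors @ image
initial_favourite = labels[int(np.argmax(np.abs(initial_response)))]

step_size = 0.05
n_iterations = 100
for _ in range(n_iterations):
    response = detectors @ image
    # Gradient of 0.5 * sum(response**2) with respect to the image:
    # it pushes the image towards whatever is already detected.
    gradient = detectors.T @ response
    image = image + step_size * gradient

final_response = detectors @ image
final_favourite = labels[int(np.argmax(np.abs(final_response)))]
print(initial_favourite, final_favourite)
```

The detectors only ever “see” dogs and pagodas, so dogs and pagodas are all that can grow out of the noise. That is the toy version of why the real images are full of whatever the network was trained on.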
The swirly images of regular things and/or nature that you can find online like this one:
are created using the same process but at a lower layer – meaning the algorithm is at the stage of outlining some edges and geometrical shapes that it can recognize in the image. It hasn’t gone any further (it’s not looking for dogs, cakes or pagodas).
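To get a feel for what “outlining edges” means at a low layer, here is a small Python/NumPy sketch. It is an assumption-laden stand-in: a hand-written difference filter instead of a learned one, applied to a made-up 8x8 image of a bright square. It responds only where the brightness changes, that is, at the edges.

```python
import numpy as np

# Made-up test image: a bright 4x4 square on a dark background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0

# Hand-written edge detectors: differences between neighbouring pixels.
# Real low layers learn filters of this flavour rather than being given them.
horizontal_change = np.abs(np.diff(image, axis=1))  # left/right edges
vertical_change = np.abs(np.diff(image, axis=0))    # top/bottom edges

# Look at one row that crosses the square.
row_response = horizontal_change[3]
edge_columns = np.flatnonzero(row_response).tolist()
print(edge_columns)  # prints [1, 5]: the square's left and right borders
```

Inside the square and out in the background the response is zero. Amplifying only this kind of response is what gives the swirly, outline-heavy look instead of dogs and pagodas.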
So if you now Google “deep dream” you will see all sorts of these images. You will recognize Google’s dream algorithm style pretty quickly. As to why it’s called “dreaming” – there are ongoing debates online about whether this is an appropriate name or not, but this post is already too long. Suffice it to say that the algorithm builds an image on its previous “experience” of certain objects – which is similar to what our brains do when we are dreaming.
You can have some fun too and create some of your own images like that. In fact a lot of people are doing it and the results range from meh, to beautiful to thanks-I-didn’t-want-to-sleep-ever-again creepy. I have personally uploaded three of my photos to this site and will be waiting for about a week for them to get processed. Once the results are out – I will show them in a new post, so we can all have a good laugh. Or no sleep.
And just to top it off here’s this process animated: