NVIDIA - Godfather of GANs

By Dhnesh on Oct. 25, 2021, 10:27 p.m.

I am going to try to tell you the glorious tale of AI-based human face generation and showcase an unbelievable new paper in this area. The story starts with a paper that showcased a system that could not only classify an image, but also write a proper sentence about what is going on in it, and it could cover even highly non-trivial cases. You may be surprised, but this thing is not recent at all. This is 4-year-old news.

Later, researchers turned this whole problem around and performed something that was previously thought to be impossible: they started using these networks to generate photorealistic images from a written text description. We could create new bird species by specifying that they should have orange legs and a short, yellow bill. Then researchers at NVIDIA recognized two shortcomings. One was that the images were not that detailed, and the other was that even though we could input text, we couldn't exert too much artistic control over the results. Over the years, NVIDIA again came to the rescue with a method that was able to perform both of these difficult tasks easily.

Furthermore, some features are highly localized as we exert control over these images. We now have much more intuitive artistic control over them: we can add or remove a beard, make the subject younger or older, change the hairstyle, make the hairline recede, put a smile on their face, or even make their nose pointier. Absolute witchcraft.

So, why can we do all this with this new method? The key idea is that it is not using a standard Generative Adversarial Network. GANs dominated this field for a long time because of their powerful generation capabilities, but on the other hand, they are quite difficult to train, and we only have limited control over their output. Among other changes, this work disassembles the generator network into F and G, and the discriminator network into E and D, where E works as an encoder. The encoder compresses the image down into a representation that we can edit more easily; in other words, all of these intuitive features that we can edit live here. And when we are done, we can decompress the output with a decoder network and produce these beautiful images. (A minimal code sketch of this encode-edit-decode loop appears at the end of this post.)

This is already incredible, but what else can we do with this new architecture? A lot more. For instance, if we have a source and a destination subject, their coarse, middle, or fine styles can also be mixed. What does that mean exactly? The coarse part means the high-level attributes, like the pose, hairstyle, and face shape of the destination subject. Interestingly, it also changes the background of the subject. (A short style-mixing sketch also appears at the end of this post.)

You can also perform image interpolation. This means that we take two images as starting points and compute intermediate images between them. So what makes a good interpolation process? Well, we are talking about a good interpolation when each of the intermediate images makes sense and can stand on its own. I think this technique does amazingly well here. (An interpolation sketch closes out this post.) This is so much progress in so little time that it truly makes my head spin. What a time to be alive!

A report on the paper we have talked about was made with Weights & Biases, and I think organizing these experiments showcases the usability of their system. Weights & Biases provides tools to track the experiments in your deep learning projects. The system is designed to save you a ton of time and money, and it is actively used in projects at prestigious labs and at GitHub. And the best part is that if you have an open-source, academic, or personal project, you can use their tools for free.
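As promised, here is a minimal sketch of the encode-edit-decode loop described above, in PyTorch. Everything in it is a hypothetical stand-in rather than the paper's actual implementation: the real encoder and decoder are deep networks trained adversarially, and `smile_direction` assumes we have already discovered a "smile" direction in the latent space.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the trained encoder (E) and decoder/generator.
# The real networks are deep convolutional models; single linear layers are
# used here only to make the data flow concrete.
class Encoder(nn.Module):
    def __init__(self, image_dim=3 * 64 * 64, latent_dim=512):
        super().__init__()
        self.net = nn.Linear(image_dim, latent_dim)

    def forward(self, image):
        # Compress the image down into an editable latent representation.
        return self.net(image.flatten(start_dim=1))

class Decoder(nn.Module):
    def __init__(self, latent_dim=512, image_dim=3 * 64 * 64):
        super().__init__()
        self.net = nn.Linear(latent_dim, image_dim)

    def forward(self, latent):
        # Decompress the latent code back into an image.
        return self.net(latent).view(-1, 3, 64, 64)

encoder, decoder = Encoder(), Decoder()

image = torch.rand(1, 3, 64, 64)     # stand-in for a real photograph
smile_direction = torch.randn(512)   # assumed: a pre-discovered "smile" direction

latent = encoder(image)                  # compress
edited = latent + 1.5 * smile_direction  # nudge the code along the attribute
result = decoder(edited)                 # decompress into the edited image
```

The point of the detour through the latent space is that a single vector addition there corresponds to a coherent, global change in the decoded image, such as a smile or a beard.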
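The style mixing mentioned above can be sketched as swapping bands of per-layer style vectors between two subjects. The layer count and the coarse/middle/fine band boundaries below follow the common StyleGAN convention and are my assumption; the article does not specify them.

```python
import torch

LAYERS = 14  # assumed number of style inputs, one per generator layer

def mix_styles(source, destination, level="coarse"):
    # Copy one band of per-layer style vectors from the source subject into
    # the destination subject's styles; both are (LAYERS, 512) tensors.
    # The band boundaries are an assumption, not taken from the article.
    bands = {"coarse": range(0, 4), "middle": range(4, 8), "fine": range(8, LAYERS)}
    mixed = destination.clone()
    for i in bands[level]:
        mixed[i] = source[i]
    return mixed

src = torch.randn(LAYERS, 512)  # stand-in styles for the source subject
dst = torch.randn(LAYERS, 512)  # stand-in styles for the destination subject
mixed = mix_styles(src, dst, level="coarse")
```

Copying only the coarse band is what transfers high-level attributes like pose, hairstyle, and face shape while leaving the finer details of the other subject alone.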
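Finally, the interpolation trick boils down to walking between two latent codes and decoding every intermediate point. A plain linear blend is assumed here for illustration; the paper may take a different path through the latent space. An interpolation counts as good when each decoded frame can stand on its own as a plausible image.

```python
import torch
import torch.nn as nn

decoder = nn.Linear(512, 3 * 64 * 64)  # stand-in for the trained decoder network

def interpolate_latents(z_a, z_b, steps=8):
    # Blend linearly from z_a (t=0) to z_b (t=1) in `steps` increments.
    ts = torch.linspace(0.0, 1.0, steps)
    return [(1 - t) * z_a + t * z_b for t in ts]

z_start, z_end = torch.randn(512), torch.randn(512)

# Decoding each intermediate code gives one frame of the morph sequence.
frames = [decoder(z).view(3, 64, 64) for z in interpolate_latents(z_start, z_end)]
```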