T2F: GAN Infused AI Which Can Draw Faces From Text

Overview :

AI to draw faces from textual descriptions.

A GAN (generative adversarial network) is used to synthesize the results.

How likely would you be to picture Daniel Radcliffe as Harry Potter if there were no movie franchise and just the novels, or Tobey Maguire as Peter Parker from Spider-Man, or Christian Bale as Bruce Wayne from the Batman comics?

Let us Begin :

Facial recognition has been the talk of the town for a while now. From the mobile phone segment to the defense industry, facial recognition has secured a prominent position for itself. If you are already amazed by facial recognition, brace yourself for more: facial description is here!

Researcher Animesh Karnewar has come up with an AI, which he calls T2F, that can draw faces from textual descriptions.

Yes! Apparently this is possible and not a figment of someone’s overactive imagination!

How It Works :

T2F is a research project which uses a GAN (generative adversarial network) to make the magic happen.

A GAN basically consists of two networks: the Generator network and the Discriminator network. When we feed a latent sample (a vector of random noise) to the GAN, the generator produces an image, which is then passed to the discriminator for classification.

If the generator does a good job, the discriminator returns a value close to 1 (high probability of the image being real).
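To make that forward pass concrete, here is a minimal sketch in plain Python. It is not T2F's code: the one-layer "networks", the four-value "image", and every name in it (`generator`, `discriminator`, `g_weights`, and so on) are assumptions chosen purely for illustration. It only shows the flow described above: latent sample in, image out, and a probability of "real" from the discriminator, along with a commonly used generator loss that shrinks as that probability approaches 1.

```python
import math
import random


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def generator(z, weights):
    """Toy generator: map a latent vector z to a tiny 4-value 'image'.

    A real generator is a deep network; one linear layer followed by
    tanh is an assumption made here only to keep the sketch short.
    """
    return [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in weights]


def discriminator(image, weights, bias):
    """Toy discriminator: estimated probability that the image is real."""
    return sigmoid(sum(w * p for w, p in zip(weights, image)) + bias)


rng = random.Random(0)
latent_dim, image_dim = 3, 4

# Randomly initialised (untrained) parameters, hypothetical shapes.
g_weights = [[rng.uniform(-1, 1) for _ in range(latent_dim)]
             for _ in range(image_dim)]
d_weights = [rng.uniform(-1, 1) for _ in range(image_dim)]
d_bias = 0.0

z = [rng.gauss(0, 1) for _ in range(latent_dim)]        # latent sample
fake_image = generator(z, g_weights)                    # generator output
p_real = discriminator(fake_image, d_weights, d_bias)   # discriminator verdict

# Generator loss -log D(G(z)): small when the discriminator is fooled,
# i.e. when p_real is close to 1.
generator_loss = -math.log(p_real)
print(f"D(G(z)) = {p_real:.3f}, generator loss = {generator_loss:.3f}")
```

With untrained weights the probability is essentially arbitrary; training is what pushes it up or down.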

In short, a GAN pits two networks against each other: the generator keeps trying to fool the discriminator by rendering an image from noise.

The discriminator, in turn, leaves no stone unturned in trying to flag that image as fake. This cyclic contest is the heart of a GAN: with each round both networks are updated, so the generator gradually learns to turn noise samples into more convincing images.
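The tug-of-war can be sketched as an alternating update loop. The example below is a deliberately tiny, one-dimensional toy and not T2F's training code: the "images" are single numbers, the generator is the line g(z) = a*z + b, the discriminator is a single sigmoid unit, and the function name `train_toy_gan`, the learning rate, and the target distribution are all assumptions made for illustration. Gradients are written out by hand so that no deep learning library is needed.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate one discriminator step and one generator step per round."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters: g(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator parameters: D(x) = sigmoid(w*x + c)
    history = []

    for _ in range(steps):
        x_real = rng.gauss(3.0, 0.5)   # a "real" sample (assumed distribution)
        z = rng.gauss(0.0, 1.0)        # latent noise
        x_fake = a * z + b             # generator's attempt

        # Discriminator step: minimise -log D(x_real) - log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w -= lr * ((d_real - 1.0) * x_real + d_fake * x_fake)
        c -= lr * ((d_real - 1.0) + d_fake)

        # Generator step: minimise -log D(x_fake), i.e. try to be called real.
        d_fake = sigmoid(w * x_fake + c)
        a -= lr * (d_fake - 1.0) * w * z
        b -= lr * (d_fake - 1.0) * w

        history.append((d_real, d_fake))

    return (a, b), (w, c), history


(a, b), (w, c), history = train_toy_gan()
print(f"generator after training: g(z) = {a:.2f}*z + {b:.2f}")
print(f"last round: D(real) = {history[-1][0]:.2f}, D(fake) = {history[-1][1]:.2f}")
```

Note that the noise sample itself is never adjusted; only the parameters of the two networks change. Toy setups like this one can oscillate rather than settle, which is a known difficulty of adversarial training.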



The image shows a few examples of T2F at work. It generates a face from descriptions such as that of a woman in her late 20s with gentle facial hair, among many others.

Karnewar’s research experiment builds on Face2Text, a dataset provided by researchers at the University of Copenhagen that contains textual descriptions for 400 random images. His Medium blog post about the project is a must-read! He has also been kind enough to share the code in his GitHub repository.

So Let’s Summarize :

T2F is a one-of-its-kind model. While the results are not yet completely photo-realistic, this is just the beginning: as the dataset grows and the network becomes smarter, the results are expected to become all the more astounding.

“From the preliminary results, I can assert that T2F is a viable project with some very interesting applications. For instance, T2F can help in identifying certain perpetrators / victims for law agencies from their description. Basically, for any application where we need some head-start to jog our imagination. I will be working on scaling this project and benchmarking it on Flickr8K dataset, COCO captions dataset, etc. Any suggestions, contributions are most welcome” – Animesh
