RaceGAN: A Framework for Preserving Individuality while Converting Racial Information for Image-to-Image Translation

Generative AI & LLMs
Published: arXiv: 2509.15391v1
Authors

Mst Tasnim Pervin, George Bebis, Fang Jiang, Alireza Tavakkoli

Abstract

Generative adversarial networks (GANs) have demonstrated significant progress in unpaired image-to-image translation in recent years for several applications. CycleGAN was the first to lead the way, although it was restricted to a pair of domains. StarGAN overcame this constraint by tackling image-to-image translation across various domains, although it was not able to map in-depth low-level style changes for these domains. Style mapping via reference-guided image synthesis has been made possible by the innovations of StarGANv2 and StyleGAN. However, these models do not maintain individuality and need an extra reference image in addition to the input. Our study aims to translate racial traits by means of multi-domain image-to-image translation. We present RaceGAN, a novel framework capable of mapping style codes over several domains during racial attribute translation while maintaining individuality and high-level semantics without relying on a reference image. RaceGAN outperforms other models in translating racial features (i.e., Asian, White, and Black) when tested on the Chicago Face Dataset. We also give quantitative findings utilizing InceptionResNetv2-based classification to demonstrate the effectiveness of our racial translation. Moreover, we investigate how well the model partitions the latent space into distinct clusters of faces for each ethnic group.
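To make the evaluation protocol concrete: the quantitative results described above use an InceptionResNetv2 classifier that labels translated faces by ethnic group, so a translation counts as successful when the classifier assigns the output to the intended target domain. Below is a minimal sketch of that idea in PyTorch using timm's pretrained InceptionResNetV2; the three-class ordering and file path are illustrative assumptions, not the authors' released code, and the fresh classification head would first need fine-tuning on labeled face data.

```python
# Minimal sketch: score racial translation with an InceptionResNetV2 classifier.
# Assumptions: timm's "inception_resnet_v2" backbone with a 3-way head
# (0=Asian, 1=White, 2=Black -- hypothetical ordering) that must be
# fine-tuned on labeled face crops before use; the file path is hypothetical.
import torch
import timm
from torchvision import transforms
from PIL import Image

model = timm.create_model("inception_resnet_v2", pretrained=True, num_classes=3)
model.eval()  # assumes the 3-way head has already been fine-tuned

preprocess = transforms.Compose([
    transforms.Resize((299, 299)),          # InceptionResNetV2 input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),
])

@torch.no_grad()
def predicted_domain(path: str) -> int:
    """Return the classifier's predicted ethnic-group index for one image."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return model(x).argmax(dim=1).item()

# A translation toward domain 0 succeeds if the classifier agrees:
# success = predicted_domain("translated_to_asian.png") == 0  # hypothetical file
```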

Paper Summary

Problem
The main problem addressed by this paper is translating racial traits in face images while preserving the subject's individuality and high-level semantics. Existing image-to-image translation models fall short of this balance: CycleGAN handles only a single pair of domains, StarGAN cannot map in-depth low-level style changes across domains, and reference-guided models such as StarGANv2 require an extra reference image and do not maintain individuality.
Key Innovation
The key innovation of this paper is the introduction of RaceGAN, a novel framework that maps style codes over multiple domains during racial attribute translation. Unlike previous models, RaceGAN does not rely on a reference image and is able to maintain individuality and high-level semantics in the translated images. This is achieved through the use of a style extractor module, which extracts domain-specific low-level style codes from the latent space of multiple domains.
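As a rough illustration of what such a style extractor could look like, the sketch below (PyTorch) uses a shared convolutional backbone followed by one linear head per racial domain and returns the style code for the requested domain. Every layer size and name here is an assumption made for illustration; the paper's actual architecture may differ.

```python
# Minimal sketch of a multi-domain style extractor; all sizes are illustrative.
import torch
import torch.nn as nn

class StyleExtractor(nn.Module):
    """Maps a face image to a domain-specific low-level style code."""
    def __init__(self, num_domains: int = 3, style_dim: int = 64):
        super().__init__()
        # Shared backbone encodes the input face once.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One head per domain yields that domain's style code.
        self.heads = nn.ModuleList(
            [nn.Linear(128, style_dim) for _ in range(num_domains)]
        )

    def forward(self, x: torch.Tensor, domain: torch.Tensor) -> torch.Tensor:
        h = self.backbone(x)                                          # (B, 128)
        codes = torch.stack([head(h) for head in self.heads], dim=1)  # (B, D, S)
        idx = domain.view(-1, 1, 1).expand(-1, 1, codes.size(-1))
        return codes.gather(1, idx).squeeze(1)                        # (B, S)

# Usage: style codes for target domain 0 on a batch of 128x128 face crops.
x = torch.randn(4, 3, 128, 128)
style = StyleExtractor()(x, torch.zeros(4, dtype=torch.long))
print(style.shape)  # torch.Size([4, 64])
```

A generator can then inject this code into its decoding path (for example via adaptive instance normalization, as in StarGANv2-style models) so that the low-level style changes while the content path preserves identity.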
Practical Impact
This research could improve the fairness of facial recognition systems, which are often biased toward certain racial groups. By translating racial traits in images, RaceGAN could be used to build more diverse and balanced training datasets, improving performance and reducing bias in downstream models. The approach may also find applications in fields such as entertainment, education, and marketing, where controlled manipulation of facial attributes is useful.
Analogy / Intuitive Explanation
Imagine you have a photo of a person's face and want to change their racial features so they appear to belong to a different ethnic group. Current image-to-image translation models are like a paint-by-numbers kit: you must follow a fixed set of rules to reach the result. RaceGAN is more like a skilled artist who can take the original photo and render the same person with the traits of a different racial group, while still preserving their individuality and high-level features.
Paper Information
Categories: cs.CV
Published Date: September 2025
arXiv ID: 2509.15391v1
