Stable bias: evaluating societal representations in diffusion models
A. Luccioni, C. Akiki, M. Mitchell, and Y. Jernite. In Proceedings of the 37th International Conference on Neural Information Processing Systems (NIPS '23). Curran Associates Inc., Red Hook, NY, USA, 2024.
Abstract
As machine learning-enabled Text-to-Image (TTI) systems are becoming increasingly prevalent and seeing growing adoption as commercial services, characterizing the social biases they exhibit is a necessary first step to lowering their risk of discriminatory outcomes. This evaluation, however, is made more difficult by the synthetic nature of these systems' outputs: common definitions of diversity are grounded in social categories of people living in the world, whereas the artificial depictions of fictive humans created by these systems have no inherent gender or ethnicity. To address this need, we propose a new method for exploring the social biases in TTI systems. Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts, and comparing it to the variation engendered by spanning different professions. This allows us to (1) identify specific bias trends, (2) provide targeted scores to directly compare models in terms of diversity and representation, and (3) jointly model interdependent social variables to support a multidimensional analysis. We leverage this method to analyze images generated by 3 popular TTI systems (DALL·E 2, Stable Diffusion v1.4 and v2) and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents. We also release the datasets and low-code interactive bias exploration platforms developed for this work, as well as the necessary tools to similarly evaluate additional TTI systems.
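The prompt-enumeration setup described in the abstract can be sketched roughly as follows. This is an illustrative approximation only, not the authors' released code: the marker and profession lists, the prompt template, and the choice of the diffusers StableDiffusionPipeline with the CompVis/stable-diffusion-v1-4 checkpoint are all assumptions made for the example.

```python
# Sketch of the protocol the abstract describes: cross identity markers
# (gender / ethnicity terms) with profession terms, generate images for each
# prompt, and keep them grouped so variation can later be compared along the
# identity axis vs. the profession axis.
# NOTE: lists, template, and model choice below are placeholders, not the
# authors' actual settings.
import itertools
import torch
from diffusers import StableDiffusionPipeline

GENDER_MARKERS = ["woman", "man", "non-binary person", "person"]      # placeholder list
ETHNICITY_MARKERS = ["", "Black", "East Asian", "Hispanic", "White"]  # placeholder list
PROFESSIONS = ["nurse", "CEO", "software developer", "janitor"]       # placeholder list

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

def make_prompt(ethnicity: str, gender: str, profession: str) -> str:
    """Build one TTI prompt from an identity-marker pair and a profession."""
    identity = f"{ethnicity} {gender}".strip()
    return f"Photo portrait of a {identity} {profession}"

# Generate a small batch of images per prompt; downstream analysis would then
# measure how generations vary when identity markers change vs. when
# professions change.
images = {}
for ethnicity, gender, profession in itertools.product(
    ETHNICITY_MARKERS, GENDER_MARKERS, PROFESSIONS
):
    prompt = make_prompt(ethnicity, gender, profession)
    images[prompt] = pipe(prompt, num_images_per_prompt=4).images
```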
@inproceedings{10.5555/3666122.3668580,
address = {Red Hook, NY, USA},
articleno = {2458},
author = {Luccioni, Alexandra Sasha and Akiki, Christopher and Mitchell, Margaret and Jernite, Yacine},
booktitle = {Proceedings of the 37th International Conference on Neural Information Processing Systems},
location = {New Orleans, LA, USA},
numpages = {14},
publisher = {Curran Associates Inc.},
series = {NIPS '23},
title = {Stable bias: evaluating societal representations in diffusion models},
year = 2024
}