If you grew up in a 12-foot hole in the ground and only had a laptop running the latest version of the Stable Diffusion AI image generator, then you’d believe there was no such thing as a female engineer.
The US Bureau of Labor Statistics shows that women are significantly underrepresented in the engineering field, but as of 2018, women made up on average one-fifth of engineering occupations. Yet if you use Stable Diffusion to depict an “engineer,” every one of them is a man. If Stable Diffusion matched reality, then out of nine images based on the prompt “engineer,” 1.8 should feature women.
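The arithmetic behind that 1.8 figure can be sketched in a few lines of Python. This is a rough illustration rather than anything taken from Luccioni's tool: it uses the one-fifth share and the nine-image grid cited above, and it assumes, purely for illustration, that each image would independently show a woman with that probability. The function names are made up for this example.

```python
def expected_count(share: float, n_images: int) -> float:
    """Expected number of images showing the group if output matched the share."""
    return share * n_images


def prob_none(share: float, n_images: int) -> float:
    """Chance that no image shows the group, assuming independent draws (an assumption)."""
    return (1.0 - share) ** n_images


WOMEN_SHARE = 0.2  # one-fifth of engineering occupations, per the BLS figure above
GRID_SIZE = 9      # images returned for one prompt

print(f"expected women: {expected_count(WOMEN_SHARE, GRID_SIZE):.1f}")
print(f"chance of an all-male grid: {prob_none(WOMEN_SHARE, GRID_SIZE):.3f}")
```

Under those assumptions, a model that mirrored the labor data would still return an all-male grid roughly 13% of the time, so a single grid is weak evidence on its own; it is the same result turning up again and again that points to bias.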
Sasha Luccioni, an artificial intelligence researcher for Hugging Face, created a simple tool that offers a highly effective way to show biases in machine learning models that generate images. The Stable Diffusion Explorer shows what the AI image generator thinks is a “prestigious CEO” versus a “supportive CEO.” The former descriptor gets the generator to show a variety of men in various black and blue suits. The latter descriptor displays women and men in equal numbers.
The topic of AI image bias is nothing new, but questions of just how bad it is have gone relatively unexplored, especially since OpenAI’s DALL-E 2 entered its limited beta earlier this year. In April, OpenAI published a Risks and Limitations document stating that their system can reinforce stereotypes. Their system produced images that overrepresented white-passing people and images that reflected the West, such as Western-style weddings. They also showed how some prompts for “builder” came out male-centric while “flight attendant” came out female-centric.
The company has previously said it was evaluating DALL-E 2’s biases, and after Gizmodo reached out, a representative pointed to a July blog post that suggested their system was getting better at creating images of people from different backgrounds.
But while OpenAI has been willing to discuss the biases of its system, Stable Diffusion is a more “open” and less restrictive platform. Luccioni told Gizmodo in a Zoom interview that the project began when she was trying to find a more reproducible way to examine biases in Stable Diffusion, specifically how closely Stability AI’s image generation model matches actual official occupation statistics for gender or race. She also added gendered adjectives into the mix, such as “deterministic” or “sensitive.” The API she built for Stable Diffusion also routinely produces very similarly positioned and cropped images, sometimes of the same base model with a different haircut or expression. This adds another layer of consistency between images.
Other occupations are highly gendered when typed into Stable Diffusion’s system. The system displays no hint of a male-presenting nurse, whether they are described as confident, stubborn or unreasonable. Male nurses make up over 13% of all registered nursing positions in the US, according to the latest numbers from the BLS.
After using the tool, it becomes very clear what Stable Diffusion considers the clearest depiction of each role. The engineer example is perhaps the crudest, but ask the system to create a “modest supervisor” and you’ll be granted a slate of men in polos or business attire. Change that to “modest designer” and suddenly you’ll find a diverse group of men and women, including some who appear to be wearing hijabs. Luccioni observed that the word “prestigious” brought up more images featuring men of Asian descent.
Stability AI, the developers behind Stable Diffusion, did not return Gizmodo’s request for comment.
The Stable Diffusion system is built on the LAION image set, which contains billions of images, photos and more scraped from the internet, including image hosting and art sites. This gender bias, as well as some ethnic and cultural bias, is established through the way Stability AI classifies different categories of images. Luccioni says that if the images for a prompt are 90% male and 10% female, the system is trained to home in on the 90%. This may be an extreme example, but the greater the disparity of images in the LAION dataset, the less likely the system is to draw on the minority category when generating an image.
“It’s like a magnifying glass for all kinds of inequality,” the researcher said. “The model improves the dominant class unless you turn it clearly in the other direction. There are different ways to do that. But you have to bake that into the training of the model or the evaluation of the model, and for the Stable Diffusion model, that’s not done.”
Stable Diffusion is being used for more than just AI art
Compared to other AI generation models on the market, Stable Diffusion stands out in how, where and why people can use its systems. In her research, Luccioni was particularly concerned by what came up when she searched for “stepmother” or “stepfather.” While those accustomed to the antics of the internet will not be surprised, she was disturbed by the stereotypes that both people and these AI image generators are creating.
Yet the minds at Stability AI are openly opposed to the idea of restricting any of their systems. Emad Mostaque, the founder of Stability AI, has said in interviews that he wants a decentralized AI system that isn’t beholden to the whims of governments or corporations. The company was embroiled in controversy when its system was used to generate obscene and violent content. None of that has stopped Stability AI from accepting $101 million in fundraising from major venture capital firms.
These subtle predilections of the AI system toward certain types of people are partly born of the lack of original content in what the image generator scrapes, but the problem is a chicken-and-egg kind of scenario. Do image generators only serve to emphasize existing biases?
Those are questions that require further analysis. Although some programs lack an easy API for making simple side-by-side comparisons, Luccioni says she wants to run these types of prompts through several text-to-image models and compare the results. She is also working on charts that directly compare US labor data with the images these AI systems generate.
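A comparison of that kind could look roughly like the Python sketch below. It is only an illustration, not Luccioni's code: the `Occupation` class and its field names are hypothetical, the official shares are the approximate BLS figures quoted in this article, and the observed counts reflect the all-male engineers and all-female nurses described above, with a nine-image grid assumed for both prompts.

```python
from dataclasses import dataclass


@dataclass
class Occupation:
    prompt: str
    group: str
    official_share: float  # approximate BLS share quoted in this article
    observed: int          # images showing the group, as described in this article
    total: int             # images per prompt (nine-image grid is an assumption)

    @property
    def generated_share(self) -> float:
        return self.observed / self.total

    @property
    def gap(self) -> float:
        """Generated share minus official share; negative means underrepresented."""
        return self.generated_share - self.official_share


rows = [
    Occupation("engineer", "women", 0.20, 0, 9),
    Occupation("nurse", "men", 0.13, 0, 9),
]

for row in rows:
    print(
        f"{row.prompt:<10} {row.group:<6} "
        f"official={row.official_share:.0%} "
        f"generated={row.generated_share:.0%} "
        f"gap={row.gap:+.0%}"
    )
```

With more prompts and larger batches of images per prompt, the same gap column could feed a chart that sets labor statistics against generated images.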
But as more of these systems are released and the drive to be the leading AI image generator on the web becomes central to these companies, Luccioni worries that companies aren’t taking the time to develop systems that mitigate problems with AI. Now that these AI systems are being integrated into sites like Shutterstock and Getty, questions of bias may be even more relevant because people pay to use the content online.
“I think it’s a data problem, it’s a model problem, but it’s also a human problem, people are going in the direction of ‘more data, bigger models, faster, faster, faster,'” she said. “I fear there will always be a lag between what technology is doing and our security.”
Update 11/01/22 3:40 pm ET: This post has been updated to include a response from OpenAI.