AI and The Beautiful People
Why AI doesn't know what an "average-looking" person looks like

My sister is opening a gym. She did what anyone might do in our age of generative AI: she asked Midjourney to generate marketing materials.
The task was specific. She’s not trying to create a gym for professional athletes or for gym rats; her target market is made up of ordinary people, many of whom have never set foot in a gym before.
But getting generative AI to create images of normal people is harder than it sounds.
She asked Midjourney for a “diverse group of average-looking people engaging in a yoga class,” an “average-looking group spin instructor,” and an “average-looking person.”
Maybe “average-looking” is in the eye (or the pre-trained parameters) of the beholder, but what Midjourney gave my sister didn’t seem average-looking to me.
I mean, I’ll just come out and say it: these people are all just too dang sexy. The assignment was to deliver pictures of made-up people who look average. The machine delivered made-up people who look incredible.
My sister even tried using age to dial down the attractiveness (as it tends to do in real life): “Spin instructor age 43, subject at a distance, instructor on spin bike.”
These dudes may be older, but I wouldn’t call them average-looking.
I know some readers will jump to prove me wrong and show that multimodal models can generate images of unattractive people. I know they can. I experimented with it, too. But even then, while I could get ChatGPT to give me images of people who are clearly overweight, I really did have a hard time generating images that landed between the two extremes.
I tried to get ChatGPT to generate pictures of people who are, you know, in decent shape. Not obese. They go to the gym from time to time and run a few miles here and there. They probably never had six-pack abs, even when they were in their 20s. They also aren’t hideous or anything, but it’s not like modeling agencies are banging down their door. And, though health is important to them, so is quality of life, and the siren song of the Doritos bag at the top of the pantry is sometimes just too much to resist.
You know, people like me.
But I had a hard time finding this sweet spot. Perhaps this is just one way that generative AI is capable of mimicking, not our real world in all of its complexity and splendor, but just our Internet.
To summarize crudely the current age of AI: scale is king. The quality of “next word prediction” or “next pixel prediction” approaches that underlie ChatGPT, Midjourney, Gemini, Claude, and all the rest has improved dramatically over the last decade with (a) larger training datasets; (b) more computing power during training; and (c) more complex algorithms. Massive scale really has led to gains in performance.
Those who fall at the most optimistic end of the AI prediction spectrum believe that this approach—next token prediction—will ultimately yield artificial general intelligence (AGI). In fact, I heard one speaker at a conference say that “there is nothing to limit the growth that scale offers.” But that’s what people said during the last AI boom in the 1980s and 1990s.
The optimists of the 1980s and 1990s kept believing there was no inherent limit on AI performance right up until they hit the limits of what the computing power of the time could do; then the bottom dropped out of AI research and development.
I’m not saying we’re close to a similar moment in generative AI. The capabilities offered by these large models are amazing and I do think they will have profound impacts on the world. I also think that we’ll continue to see improvements from the big AI companies (even if they’re not the same kind of exponential growth that we saw in the late 20-teens).
But I want to ask a different question—not whether AI is impressive, but whether there are inherent limitations to the scaling approach. I think there are.
My sister stumbled upon one. Artificial intelligence might have a (very) skewed picture of what we, collectively as humans, look like. Why are all these “average-looking” gym goers and spin instructors so beautiful? Well, it might be that, in the aggregate, the Internet—the source of large model training data—fails to represent us as we are.
There are several reasons this might be true. I’m just speculating here, but most of the freely available images of people on sites like Flickr and Unsplash are, in fact, images of models, and models are, in general, more attractive than non-models. In fact, you can test this question without AI. Just search for “average-looking person” on free image databases and you’ll find far more examples of professional photography of beautiful people than amateur photography of ordinary people.
Trying to get a sense for what an average person looks like from those datasets would be like trying to learn what an average family is by walking through the frame aisle at Target.
Even in the places on the internet where we can expect to find photos of real people (I’m thinking especially of social media sites), we don’t find our real lives. We find lives carefully curated for the internet. This is not news: researchers have spent years studying the effects of this social media curation on young people, with girls experiencing the most negative effects.
All I’m suggesting is that there seems to be an inherent asymmetry between the data on which large models are trained and the real world in which they operate. The success of the AI project (which, in its current iteration, I understand to be the ability to release AI agents that act on our behalf in the real world) depends on training data that sufficiently resembles that world. This is one area in which the two are quite different. We don’t look the same online as we do in real life. So if we train models to create images of people, those images are going to reflect the training data, not the real world.
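The mechanism can be sketched with a toy simulation (purely illustrative numbers, not a claim about how any real model or photo site actually works): if photos are more likely to end up online the better their subject looks, then a model that learns what “average” means from those photos will overshoot the real-world average.

```python
import random
import statistics

random.seed(0)

# "Real world": attractiveness scores roughly centered at 5 on a 0-10 scale.
population = [min(10, max(0, random.gauss(5, 2))) for _ in range(100_000)]

# "The Internet": a photo is more likely to be posted the better it looks.
# The cubed term is a crude stand-in for curation and self-selection.
internet = [x for x in population if random.random() < (x / 10) ** 3]

# A model that learns "average" from its training data, not from the world,
# will center its outputs on the training-set mean.
print(f"real-world average:   {statistics.mean(population):.2f}")
print(f"training-set average: {statistics.mean(internet):.2f}")
```

The gap between the two printed averages is the asymmetry: the model isn’t wrong about its data, it’s just that its data isn’t the world.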
This “beautiful people” phenomenon is just one facet of a more complicated problem. We’re increasingly asking, or at least expecting, generative AI to understand our world. But it isn’t trained on our world. Instead, it’s trained on our Internet. I don’t know where all the asymmetries are between the real world and the Internet world, but wherever those cracks are, those are the areas in which AI will perform poorly.
For now, I don’t have a lot of recommendations for how to solve this problem. But I do feel like I should go to the gym.
Credit where it’s due
This post was edited by Rebecca McCallum. As always, Views Expressed are those of the author and do not necessarily reflect the US Air Force, the Department of Defense, or any part of the US Government.