Mar 24, 2024

A Glimpse of AI Generated Art

When preparing materials for my book review post "Rebuilding, When Relationships End", I had the idea of using the OpenAI API to generate images based on the quotes I would use in the post. I requested 13 images from the 13 quotes/paragraphs (one image each) using OpenAI's DALL・E3, and 12 were successfully generated, although I ended up using only 4 of them in my book review post. The other 8 are also very interesting but not suitable for the blog (you'll see why later).

Through this simple, fun exercise, I had the opportunity to glimpse into AI-generated art, and here is what I found:

1. Easy to Get Started

It only takes a few steps to install the libraries for the OpenAI APIs. The OpenAI tutorials are easy to follow, and you don't need programming knowledge to use them, as OpenAI provides example code to start from.

To use the OpenAI APIs, you pay a small fee each time you trigger a request. The DALL・E3 image model cost me $0.08 per request.
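For readers who want to see what such a request looks like, here is a minimal sketch using the official `openai` Python library. It is not the exact script used for this post: the image size is an assumption (the post does not say which one was used), the helper names `build_request`, `generate_image_url`, `estimate_cost` and `main` are made up for illustration, and the price constant is simply the $0.08 figure quoted above, which may not match current pricing.

```python
PRICE_PER_REQUEST_USD = 0.08  # figure quoted in this post; actual pricing may differ


def build_request(prompt: str) -> dict:
    """Return the keyword arguments for a single image request."""
    return {
        "model": "dall-e-3",   # API identifier for DALL-E 3
        "prompt": prompt,      # the quote or paragraph, passed as-is
        "n": 1,                # one image per prompt
        "size": "1024x1024",   # assumed size, not taken from the post
    }


def generate_image_url(client, prompt: str) -> str:
    """Send one request and return the URL of the generated image."""
    response = client.images.generate(**build_request(prompt))
    return response.data[0].url


def estimate_cost(num_requests: int, price: float = PRICE_PER_REQUEST_USD) -> float:
    """Estimated spend in USD, rounded to cents."""
    return round(num_requests * price, 2)


def main() -> None:
    """Call this to run a real (billed) request; needs `pip install openai`."""
    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable
    print(generate_image_url(client, "A quote from the book goes here"))
```

At that price, `estimate_cost(12)` comes to 0.96, so the 12 generated images would cost roughly a dollar.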

2. More Literal than Abstract

The 13 quotes/paragraphs serve as "prompts" for OpenAI DALL・E3. As you may have read in "Rebuilding, When Relationships End", most of them are about feelings, mentalities and spiritualities, which are rather abstract. As of March 2024, DALL・E3 doesn't seem to respond well to sentences and paragraphs that don't describe anything concrete, nor does it seem able to integrate several related or progressive sentences into one abstract idea. Most of the generated images read like literal translations of the paragraphs' keywords pieced together. For example:

OpenAI DALL・E3 generated this comic strip based on the texts: "The power struggle in the relationship is diminished when: Each person learns to talk about feelings. Each person starts using I-messages instead of you-messages. Each person takes ownership of unresolved problems. Each person looks at the other person as a relationship teacher. Each person works at learning more about herself or himself, instead of projecting the hurt and blame upon the other person."

Some images hilariously express quite the opposite of the text, because the AI focuses only on illustrating the keywords, not the whole passage, as you can see in the following examples:

OpenAI DALL・E3 generated this comic strip based on the texts: "Many people marry for the wrong reasons, among them (1) to overcome loneliness; (2) to escape an unhappy parental home; (3) because they think that everybody is expected to marry; (4) because only “losers” who can’t find someone to marry stay single; (5) out of a need to parent, or be parented by, another person; (6) because they got pregnant; and (7) because “we fell in love.”"

OpenAI DALL・E3 generated this comic strip based on the texts: "Your inner critic is SMALLER than you are, and you can be BIGGER than it is. Consciously make a decision to start listening to that part. Acknowledge the voice, and it will eventually start softening the words it uses. When your critic has finished speaking each time, you may respond with a simple “Thank you.”"

3. Concerning Gender and Race Ratios

Despite the gender-neutral tone of the prompt texts, 6 of the 12 generated images have a male central figure, whereas only 4 feature a female central figure. The other 2 depict heterosexual couples. As a female blogger looking to use AI-generated images to enhance my expression, I certainly didn't feel these heavily male-centred images represented my expression or the gender makeup of my audience. This was one of the reasons why I couldn't use many of the images I paid for in the post. To give OpenAI the benefit of the doubt, the gender ratio bias I'm seeing may be due to the small sample size of my experiment. Nevertheless, I hope that the world-leading AI service can live up to its customers' expectations of gender equality.

6 out of the 12 images OpenAI generated based on gender-neutral texts are male-centred.
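The sample-size caveat can be made concrete with a little arithmetic. The sketch below tallies the shares from the counts above and then asks how often a split at least this male-heavy would appear by chance. It rests on two simplifying assumptions that are not from the experiment itself: only the 10 single-figure images are counted, and each is treated as an independent 50/50 draw. The helper name `prob_at_least` is made up for illustration.

```python
from math import comb

# Counts reported in this post for the 12 generated images.
counts = {"male": 6, "female": 4, "couple": 2}
total = sum(counts.values())

# Share of each kind of central figure among all 12 images.
shares = {label: n / total for label, n in counts.items()}
for label, share in shares.items():
    print(f"{label}: {share:.0%}")


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: ignore the 2 couple images and treat the rest as fair coin flips.
single = counts["male"] + counts["female"]
p_value = prob_at_least(counts["male"], single)
print(f"P(at least {counts['male']} male out of {single}) = {p_value:.3f}")
```

The shares come out at 50% male, 33% female and 17% couples, and the chance of seeing 6 or more male figures out of 10 under a fair split is about 0.377. In other words, a sample this small cannot confirm or rule out a bias on its own; it would take many more images to tell.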

It becomes more concerning when you start to count the racial ratio of the figures in the OpenAI-generated images. The AI technologies of 2024 still seem to struggle to find the sweet spot between failing to represent all racial groups and overcompensating for an underrepresented racial group. 😅

As these technologies are trained on the gender-biased and race-biased data we already have, if the engineers do not get the representations right, the biased trends will be amplified as the AI boom continues.

4. Narrow Representation of Spirituality

Interestingly, OpenAI DALL・E3 seems to narrowly link text about feelings, mentalities and spiritualities to religion, especially in female-centred images, even though the texts make no mention of religion. Churches, crosses, and saris are recurring elements in those images.

OpenAI DALL・E3 narrowly links spirituality with churches, crosses, and saris in the female-centred images.

5. Oops, Safety System

I fed 13 paragraphs of text to OpenAI DALL・E3, of which only 12 produced image results. The one that got rejected, according to OpenAI, may contain text that is not allowed by their safety system. Here is the text:

Ask yourself: Were you and your partner friends? Did you confide in each other? What interests did you share? Hobbies? Attitudes toward life? Politics? Religion? Children? Were your goals for yourself, for each other, and for the relationship similar/compatible? Did you agree on methods for solving problems between you (not necessarily the solutions, but the methods)? When you got angry with each other, did you deal with it directly, hide it, or try to hurt each other? Did you share friendships? Did you go out together socially? Did you share responsibilities for earning money and household chores in a mutually agreed upon way? Did you make at least major decisions jointly? Did you allow each other time alone? Did you trust each other? Was the relationship important enough for each of you to make some personal sacrifices for it when necessary?

Maybe the "safety system" is just a cover for the fact that DALL・E3 couldn't deal with long and complex prompt texts? Lol.
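If you run a batch like this yourself, it helps to keep going when one prompt is refused and to note which one it was. The sketch below is one way to do that, not the script used for this post. It assumes a rejection arrives as an exception whose `code` attribute is `content_policy_violation`, which is my understanding of how the `openai` Python library reports it; treat that as an assumption and check the error you actually receive. The function name `generate_all` and the `generate` callable are hypothetical.

```python
def generate_all(prompts, generate):
    """Run `generate` on each prompt; collect results and any rejected prompts."""
    images, rejected = {}, []
    for index, prompt in enumerate(prompts, start=1):
        try:
            images[index] = generate(prompt)
        except Exception as error:
            # Assumption: a safety-system rejection carries this error code.
            if getattr(error, "code", None) == "content_policy_violation":
                # Keep the prompt number and its length in characters.
                rejected.append((index, len(prompt)))
            else:
                raise  # anything else (network, auth, quota) is a real failure
    return images, rejected
```

Recording the length of each rejected prompt makes it possible to check the guess above: if refusals cluster around the longest prompts, length is a plausible culprit; if not, the wording is more likely to blame.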

6. And the Award Goes to...

Despite some awkwardness in most of OpenAI DALL・E3's image generation outputs, I was very impressed by one image. It keeps the content at an abstract level. By using bold, rich, high-contrast colours and smooth, curving flows, it embeds the keywords "creative selves" and "spiritual well-being" in the generated art very well.