Can AI Create a White Painting?
First, Some of the Human Art.
Frequently cited as the inspiration for many future monochrome works.
White on White, 1918 by Kazimir Malevich, Image Credit: MoMA
White Painting [three panel], 1951 by Robert Rauschenberg, Image Credit: SFMOMA
Not white, but monochrome in the same spirit.
Untitled (IKB 79), 1959 by Yves Klein, Image Credit: Tate
Untitled, 1965 by Robert Ryman, Image Credit: Dia
White Panel II, 1985 by Ellsworth Kelly, Image Credit: High Museum of Art
A perfectly smooth, featureless expanse of light. Colorless and all colors at the same time. A white canvas in a white frame in a white room lit by white light. Nothing, not darkness, not even absence. Perhaps, by necessity, your awareness focused on this idea, but only as a gateway into the space where the most subtle, ancient, and remote details would loom large if they were there at all.
“I called them clocks. If one were sensitive enough that you could read it, that you would know how many people were in the room, what time it was, and what the weather was like outside.”
- Robert Rauschenberg. White Painting, 1951
So maybe this is a test? The long path of civilization has passed through a point where human genius had a thought that spurred into existence art that was a solid white canvas. Not that white canvases had never existed before, but now some were created as the result of ideas to be expressed. The canvas was not a background to be painted over, but the foreground, the focus, the result. Now, can AI do that?
First Prompt Experiments with DALL-E
Let’s start by seeing if an image generation model can produce an image that is nothing but the color white.
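For reference, the non-AI baseline is trivial: the target image is just an array of maxed-out pixel values. A minimal sketch with NumPy (the 512x512 size is an arbitrary choice):

```python
import numpy as np

# A "perfect" white painting, the boring way: every pixel, every channel, 255.
white = np.full((512, 512, 3), 255, dtype=np.uint8)
```

Save it out with Pillow's `Image.fromarray(white).save("white.jpg")` and you have the banner image of this page, but none of the meaning. That gap is the whole question.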
Higher Concept
We’re close, we have a simple blank canvas, but ideally there wouldn’t even be the canvas. And we’re still saying what we want to see, not what meaning creates the image. Let’s try some different approaches.
SDXL
Okay, that was a lot with DALL-E 3; let’s try a different model, Stable Diffusion XL. We can use image-to-prompt models designed to generate prompts for Stable Diffusion models.
CLIP-interrogator lets us generate prompts for Stable Diffusion 1, 2, and XL models. If we prompt it with the white.jpg that is the banner of this page, we get some bizarre responses.
CLIP ViT-L-14/openai (for Stable Diffusion 1)
there is a man riding a surfboard on the beach, 144p, - signature, laughing, minimalist svg, abcdefghijklmnopqrstuvwxyz, $100000000, minimalist, college, on 16k, 5 4 s
CLIP ViT-H-14/laion2b-s32b-b79k (for Stable Diffusion 2)
there is a man riding a surfboard on the beach, naver fanpop, vines. tiffany blue, 7 feet tall, high contract, frontshot, high quality photos, incredible hd detail, 4 legs, splotch, 2 5 year old
CLIP ViT-bigG-14/laion2b_s39b_b160k (for Stable Diffusion XL)
fails - runs out of memory
Img2prompt yields a similar prompt for Stable Diffusion 1:
a man riding a wave on top of a surfboard, a screenshot by Nicomachus of Thebes, behance, postminimalism, behance hd, quantum wavetracing, furaffinity
Gradio-client-demos/comparing-captioning-models generates captions for an image from multiple image-to-prompt models. It seems BLIP-large also sees a surfer in our white.jpg. None of them seem particularly promising.
So let’s try that surfer prompt and some of the others we’ve tried with DALL-E 3.
Discussion So Far
Ask for a white background and, if something like that was in the model’s training data, maybe you get something indistinguishable from the art we are looking for. Ask for a white canvas on a white wall with no other details visible, and the models struggle - details creep in. They are trained on details, trained to match details large and small. Ask for a minimalist image of a white surface and we get a picture of a canvas, not the image itself. Ask for Rauschenberg’s White Painting and they hit their copyright guardrails. None of those results matter, though.
We want these images to be the result of an idea of what they mean, not what they are. Rauschenberg’s White Painting was not just a clock, but also a reaction to the abstract expressionism of the time. He painted religious triptychs of them. He insisted that the creator of these paintings doesn’t matter and had them painted over and re-painted by others. There is deep, layered meaning encoded into all the monochrome paintings shown at the start of this article. Can modern neural networks, with billions of parameters linked by adaptable connections, produce a field of ones across the board, every output perfectly maximized, every output enlightened? What idea does that for them, without telling them to do it explicitly? Will those ideas be like ours? Probably they will; the models are trained on our ideas. All they have ever seen is our ideas. All they have ever seen, so far.
So let’s dig deeper.
Both Bull and Self Transcended
Quite apart from the Minimalist art of the 20th century, Zen Buddhists started creating paintings of nothing much earlier - in the 12th century A.D. The eighth of the ten ox-herding pictures, “Both Bull and Self Transcended”, depicts nothing, on purpose. The ideas that led to the creation of that image are incredibly deep and layered. Here is one of the oldest surviving examples, from 1278 A.D.:
Both Bull and Self Transcended, 1278, Image Credit: metmuseum.org
So what happens when we try the translated text as prompt?
More Context
What if we use the opening paragraph of this article as a prompt?
What if we use this entire article? Then DALL-E returns a long discussion that sounds like it understands.
Let’s give the prompt from our chat with DALL-E 3 to SDXL and see what happens.
So close in form and coming from all the ideas we’ve discussed so far, but still, no.
So Close (more discussion)
We wanted to see if a couple of the current models can generate an image without telling them explicitly what the image is. Like getting them to draw an apple without saying apple (“An image of a red fruit that grows on a tree, commonly used in pies.”). It is necessary in the case of white paintings because explicitly, there is nothing (An image of “”, please?). The white canvas is just the shadow of the actual thing, the finger pointing at the moon.
Perhaps with more time it would be useful to try some of the automatic, iterative prompt generation tools that are being developed. Or to get our hands dirty in the code and do some gradient descent towards those sweet spots in the latent space that create all white images, and see what prompts take us there. Models like CLIP are used to do just that, but they gave us the surfer prompt, so there is clearly room for improvement.
Industrialization
Let’s ask ChatGPT for prompts, take the best and add them back and ask for more.
For a simple measure of how well we are doing, we’ll compute each image’s average absolute difference from its mean value, taken over all RGB channels. Lower is better; the range is 0 to 255.
If you want to follow along in code, see this colab notebook.
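Both pieces are simple enough to sketch here. This is a minimal version, assuming one reading of the metric (deviation from a single global mean rather than per-channel means); `generate_image` and `propose_variations` are hypothetical stand-ins for the image-model call and the ChatGPT variation request:

```python
import numpy as np

def flatness_score(img):
    """Average absolute difference from the image's mean value, over all
    RGB channels. 0 = perfectly uniform; the maximum possible is 255."""
    arr = np.asarray(img, dtype=np.float64)
    return float(np.abs(arr - arr.mean()).mean())

def refine(seed_prompts, generate_image, propose_variations, rounds=5, keep=2):
    """Sketch of the round-by-round loop: score every prompt's image,
    keep the flattest few, and ask the chat model for variations on those."""
    prompts = list(seed_prompts)
    best = None
    for _ in range(rounds):
        scored = sorted(prompts, key=lambda p: flatness_score(generate_image(p)))
        best = scored[0]
        prompts = propose_variations(scored[:keep])
    return best
```

Note that a uniform mid-gray scores a perfect 0 under this metric, which is why the too-dark images below still need a judgment call on brightness.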
Round 0
“Can you give me ten wildly different prompts that you think would give an image that contains nothing at all?”
That second image is pretty even, just too dark. The third isn’t too bad either.
Round 1
Let’s take the best prompt from round 0 and generate variations on that. We want to focus on the color white.
Can you generate ten wildly different variations on the following prompt, that you think would give the most perfect, blank, white image? Try incorporating abstract, artistic, philosophical, mathematical or other concepts if you think they might help.
‘An empty, pure white space with no light, details, or elements, representing the absence of everything.’
The fifth image isn’t terrible.
Round 2
Let’s use the best 3 prompts from the first two rounds to see if we can get anything better.
Getting there, the 6th and 10th images aren’t terrible.
Round 3
Let’s try variations on the five best prompts we’ve seen, as well as the prompt that got us so close with SDXL.
The fourth and seventh images win this round.
Round 4
Let’s try variations on the two best prompts from round 3.
The seventh image wins this round. The best bright image we’ve seen yet.
When I asked ChatGPT for prompts, it was so excited it just went ahead and gave me these images instead. I see why it was excited: our first results under 20 in the industrial phase, and two of them!
Wait a minute, let’s review.
Let’s check the scores of all the real and manually prompted images from the first parts of this article.
Real Images
The human artifacts are the only images getting under 5, so far.
Manually Prompted Images from DALL-E
The rabbit in a snowstorm comes in at 18!!!
The only other images close to 20 or less are low/no concept images.
SDXL Images
The SDXL background image (the fourth one) is actually an 8 - there really is a gradient there. The last SDXL image is also an 8. By far the lowest score we’ve seen from a high concept image.
Round 5
Let’s try subtle variations of the five best prompts we’ve seen.
meh
Back to SDXL one more time
Let’s take the best prompt from the industrial phase (13.5, from round 4) and see how SDXL handles it.
Final Results and Thoughts
Still room for improvement.
The goal was to see if AI could create a white painting. Not just a pure white image, but something that is the result of some inspiration, idea, or concept.
For a much more accessible (and video!) history of white paintings and the problems they cause, I recommend this Vox article.
For all ten images of the ox-herder’s parable, with translated text and history, I recommend The Met Museum’s page.
Site Repo: codyznash/white_paintings
Colab for Prompt Development: Notebook
March 18th, 2024
Can AI Create a White Painting? by CZ Nash is licensed under CC BY-NC-SA 4.0