February 22, 2024


How Smart is Dall-E 2?


Prompt: “Polymer clay dragons eating pizza in a boat”

Computer-generated image (Dall-E 2 by OpenAI)

For several years now, computers have been able to generate images based on a natural-language prompt.

The resulting images have suffered from problems of logic and global coherence.

For example, here is what you get if you give the computer the prompt “A rabbit detective sitting on a park bench and reading a newspaper in a Victorian setting.” (Latent Diffusion LAION-400M via @loretoparisi)

Where are his legs? His hands? Are those books or newspapers? Is that a coffee table in front of his bench?

The image doesn't make sense, and we might conclude that the problem comes from the computer not having any experience of living in a body or dealing with the real world. No matter how big the data sets, or how many layers of processing you bring to the task, you can't get past that limitation.

Or can you? 

OpenAI is one of the pioneers of generating realistic images and art from descriptions in natural language. They recently released new software called Dall-E 2, which has pushed the boundaries of what's possible with this technology.

Here is what Dall-E 2 does with the same prompt: “A rabbit detective sitting on a park bench and reading a newspaper in a Victorian setting.”

The overall logic is much better. Now he has legs and is really sitting on that bench, even casting a shadow. But the image is still not perfect. What's the black loop in his left hand? And why doesn't he seem to be holding the newspaper with his right hand?

Here is one more example of how the technology is improving, using the prompt “teddy bears working on new AI research on the moon in the 1980s.”

The first version, using older tech (laion400m), looks like a paste-up of unrelated elements.

Here’s what Dall-E 2 came up with: a fairly plausible image with consistent lighting.

https://www.youtube.com/watch?v=qTgPSKKjfVg

This technology scares some working artists and illustrators. @VividVoid says: “DALL-E is breaking my heart. AI art is about to lay utter waste to traditional visual art forms. This will be so much more destructive than what the Internet did to music. It will be a technological conquest of one of the great human avenues of spiritual transformation.”

AI skeptic Gary Marcus doubts whether the technology will ever replace artists, because it is just crunching large data sets. It's not learning from embodied experience, nor does it understand symbolic or semantic concepts the way a human does. Marcus says: “This whole thread is weaponized cherry-picked PR the antithesis of science.”

Read more

Podcast: Gary Marcus: Toward a Hybrid of Deep Learning and Symbolic AI
