Weekly Commentary: Working with AI Tools
The temptation to churn out what "works" rather than what challenges might be too great
Text-to-image AI systems are all the rage. A month ago, OpenAI announced the beta of Dall-E, its image-generating system. When you sign up for the beta, you'll get 50 credits initially and then 15 credits each month. Each credit allows you to pop in a prompt and receive four candidate images generated from that prompt. Additionally, you can buy 115 credits for $15.
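For a sense of scale, here is a quick back-of-the-envelope check on what those paid credits work out to per image; the numbers are the ones quoted above.

```python
# Back-of-the-envelope cost per candidate image under the Dall-E beta
# pricing described above: $15 buys 115 credits, and each credit
# returns four candidate images.
PACK_PRICE_USD = 15.00
CREDITS_PER_PACK = 115
IMAGES_PER_CREDIT = 4

cost_per_credit = PACK_PRICE_USD / CREDITS_PER_PACK   # about $0.130
cost_per_image = cost_per_credit / IMAGES_PER_CREDIT  # about $0.033

print(f"Cost per credit: ${cost_per_credit:.3f}")
print(f"Cost per candidate image: ${cost_per_image:.3f}")
```

That works out to roughly three cents per candidate image, before you account for the free monthly credits.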
More recently, systems built on "Stable Diffusion algorithms" have been released. Stable Diffusion is a big deal because it is compact enough to run on a laptop. And I've been playing with DreamStudio, a free-for-now service that uses Stable Diffusion. Like Dall-E, Stable Diffusion systems are triggered by giving them a text-based prompt.
Prompts are more or less structured recipes that can be fed into AI models like Dall-E or Stable Diffusion. These models, trained on oodles of data, can then spit out candidate images (or other output).
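If you would rather see the plumbing than a web form, here is a minimal sketch of feeding a prompt to Stable Diffusion with the open-source diffusers library. This is one common way to run the publicly released v1.4 checkpoint locally, not how DreamStudio or Dall-E are implemented behind the scenes.

```python
# A minimal sketch: a text "recipe" goes in, candidate images come out.
# Uses Hugging Face's diffusers library and the public v1.4 checkpoint.
import torch
from diffusers import StableDiffusionPipeline

# Downloads several gigabytes of weights on first run.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # use "cpu" without a GPU, much more slowly

prompt = "Utopian city, painted in the style of J. M. W. Turner"
image = pipe(prompt).images[0]  # the prompt is the whole interface
image.save("utopian_city.png")
```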
Here is the UK's Cyberforce painted in the style of Turner, using Dall-E.
Here is what Stable Diffusion generates from the prompt "Utopian city".
Pascal Finette, a member of Exponential Do, shared some of his Mini Dall-E experiments with the community, including this one, titled "A human going exponential".
In a recent essay, Kevin Kelly helped me understand how to think about these AI tools and how we make use of them. The interface is now a text-based prompt. Kelly points out that the new human skill will be "how you construct the prompt you give the AI".
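To make that concrete, here is a toy sketch of treating prompt construction as deliberate composition rather than ad hoc typing. The build_prompt helper and its field names are my own illustration, not anything Kelly or the tools prescribe.

```python
# A toy illustration of prompt construction as a skill: the prompt is
# assembled from deliberate parts (subject, medium, artist, modifiers).
# The helper and its fields are hypothetical, for illustration only.
def build_prompt(subject, medium=None, artist=None, modifiers=()):
    """Compose a text-to-image prompt from structured parts."""
    parts = [subject]
    if medium:
        parts.append(medium)
    if artist:
        parts.append(f"in the style of {artist}")
    parts.extend(modifiers)
    return ", ".join(parts)

print(build_prompt(
    "the UK's Cyberforce",
    medium="oil painting",
    artist="J. M. W. Turner",
    modifiers=("dramatic light", "highly detailed"),
))
# -> the UK's Cyberforce, oil painting, in the style of J. M. W. Turner,
#    dramatic light, highly detailed
```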
He drew my attention to the "Prompt book, a free PDF e-book on how to get the most out of Dall-E (or any AI image generator)". Kelly predicts "many prompt books in the future": different systems will work with different prompts, much as a Leica camera responds differently than a Fuji does. And underlying AI systems may perform different tasks. DreamStudio generates images. Other systems may output text or sound or video.
A prompt book is, in a sense, an instruction manual for these AI tools. To help me make sense of them, I put them into some historical context: every successive tool operates at a higher level of abstraction than the one before, and these text-to-image services are the latest step on that journey.
I thought back to the first graphics tools I used, mostly bitmap-based editors in the mid-1980s. You really had to build your image one pixel at a time (likely from an 8-bit or even 4-bit palette).
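For contrast with today's prompt-driven tools, here is roughly what that pixel-at-a-time workflow looks like when recreated with the modern Pillow library; the 16x16 canvas and tiny palette are my stand-ins for those 4-bit editors.

```python
# Echoing the mid-1980s workflow: build an image one pixel at a time
# from a small palette. Pillow's "P" mode is a palette-indexed image,
# loosely analogous to a 4-bit (16-color) editor.
from PIL import Image

img = Image.new("P", (16, 16), color=0)  # 16x16 canvas, palette index 0
# A minimal palette: black, white, red, blue (flat RGB triples).
img.putpalette([0, 0, 0, 255, 255, 255, 255, 0, 0, 0, 0, 255])

# Set pixels one by one, exactly the level of abstraction described above.
for x in range(16):
    img.putpixel((x, x), 2)       # red diagonal
    img.putpixel((x, 15 - x), 3)  # blue anti-diagonal

img.resize((256, 256), Image.NEAREST).save("pixel_art.png")
```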