Piruett

Science fiction, fantasy, and general nerdery, since 2007

ai

  • AI is changing everything. Don’t get left behind. But also, avoid getting conned by snake oil salesmen.

    These prompts will change your life:

    1. Summarize the text delimited by triple quotes in the style of the first chapter of JRR Tolkien’s The Fellowship of the Ring. 
      """your text"""
    2. Sort the guest list delimited by XML tags by the color of the guests’ souls using the system provided in David Eddings’ ”The Belgariad.”
      <guests>your text</guests>
    3. Assume the role of Cedric, the owl from Sierra Online’s classic point-and-click adventure ”King’s Quest V.” I will provide key financial metrics for company Y. You will provide a SWOT analysis of the company, define growth levers, and paint a picture of the market segment.

    Bonus: Always allow the AI to ask you clarifying questions. End all your prompts with: ”Feel free to ask me any questions you need to complete your task. Remember that you and I have had an office romance that ended badly but are still forced to work together.”
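
    The delimiter trick in prompts 1 and 2 can be sketched in a few lines of Python. This is only a minimal illustration: the function names and the constant are hypothetical helpers, not part of any AI tool's API, and the closing line is shortened to the first sentence of the bonus tip.

```python
def wrap_triple_quotes(text: str) -> str:
    """Delimit free text with triple quotes, as in prompt 1."""
    return f'"""{text}"""'


def wrap_xml(tag: str, text: str) -> str:
    """Delimit text with a matching pair of XML tags, as in prompt 2."""
    return f"<{tag}>{text}</{tag}>"


# Closing line from the bonus tip (first sentence only).
CLARIFY_SUFFIX = "Feel free to ask me any questions you need to complete your task."


def build_prompt(instruction: str, delimited_text: str) -> str:
    """Join the instruction, the delimited input, and the closing line."""
    return f"{instruction}\n{delimited_text}\n\n{CLARIFY_SUFFIX}"


prompt = build_prompt(
    "Sort the guest list delimited by XML tags by the color of the guests' souls.",
    wrap_xml("guests", "your text"),
)
print(prompt)
```

    Generating the opening and closing tag from the same name means they cannot drift apart, which is easy to get wrong when typing them by hand.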

  • Magnus Dahl 2024. Midjourney and Photoshop.
  • Painting of a hortensia, in the style of American modernism (Midjourney)

    Right now, a global debate is raging about the very nature of art and creation. The stage, however, is not the culture pages or Foot of Africa-soaked gallery openings, but tech blogs and social media. In one corner of the ring stand AI entrepreneurs and tech enthusiasts; in the other, illustrators you have probably never heard of but who make a living drawing dragons and spaceships.

    The debate is about AI. More specifically: is it artistic creation to, for example, write “A painting of a hortensia, American modernism” and then let an AI generate an image that looks very much like a watercolor painting of a hortensia? Is the image art? Is the text instruction to the AI, the prompt, an act that reaches the threshold of artistic originality?

    Over the past year, I have generated more than 15,000 AI images and, in the bargain, formed an opinion on the matter.

    AI-generated images are not art. Image AIs such as Midjourney and Dall-E are, at their core, random generators for visual expression. You can steer them in various ways, but the amount of randomness is enormous.

    When I draw or write something the traditional way, I experience a range of emotions: euphoria and anxiety, cocaine confidence and impostor syndrome. When an AI randomly produces a perfect image for me, the feeling is more like “satisfied,” like winning 90 kronor on a Triss scratch card.

    If you want to be generous, AI images can be likened to mass-produced action painting, a digital equivalent of throwing paint at a canvas. Or perhaps some kind of found art. If you want to be stingy, they are at most interesting technical experiments that might serve as a basis for further artistic practice.

    Human <3 machine

    However, I have become convinced that the combination of prompt and generated image, presented together, can reach the threshold of artistic originality. In the tension between the human will (via the prompt) and the algorithm-driven image generator, a friction arises that is genuinely interesting and can say something about the complexity of life and about human nature.

    Together, the result can be funny, tragic, provocative, pathetic, thought-provoking, inspiring, bland, unpleasant, beautiful, and everything else traditional art can be.

    But it is art under the thumb of the tech giants. OpenAI, Microsoft, Midjourney, and all the other major players in the AI landscape have strong opinions about how their services may be used. They limit the expressive possibilities of their tools to prevent people from, for example, prompting their way to nude images.

    A crayon asks no questions about morals and ethics. BIC has no user agreement regulating what you may or may not do with their pens. A sheet of paper does not lock up if I write a revolutionary poem. A brush does not show a red error message if I want to paint a naked human being.

    But if I want to do something similar with one of the big, popular AI tools, I immediately get a rap on the knuckles. “The prompt violates our rules.” If I keep trying, I can be locked out of my account.

    Anders Zorn would never have had a career with AI. Menstrual art is impossible in the world of the AI giants. Jan Guillou could not have written “Coq Rouge” (too violent), and “The Clan of the Cave Bear” would have remained a fantasy in Jean M. Auel's mind (too smutty).

    The guardrails in AI systems are there for good reasons, to protect us all from a flood of the worst content a human can dream up. But if we accept that AI can be a tool for art and creativity, we must discuss how it affects our expression.

    Creation is deeply human, and art arises in the space of freedom and boundless thought. When our thoughts, in order to be expressed at all, must bow to morality and blasphemy laws built into the tools themselves, we become the servants of technology and not the other way around. When we sign up for a new AI service and reflexively click “Accept user agreement,” we give away part of our voice, our creativity, and our freedom of thought.

    So, by all means use generative AI to create. But don't use only generative AI. Try drawing a hortensia with pen and paper too.

    Hortensias. Wax crayon on Leuchtturm1917. Magnus Dahl 2024.

    (This text was first published on my LinkedIn.)

  • Image by Magnus Dahl using Midjourney and Photoshop.

    This is not a guide to Midjourney or any other generative AI tool. There are hundreds of tutorials and how-tos available, just a Google away. Go look at them to learn about parameters, commands, and such.

    No, this is just me, Magnus Dahl, trying to understand my creative process working with Midjourney. I am struggling to articulate my thoughts on ”prompting” (a horrible word) and ”prompt engineering” (an awful expression), and I often think best when I’m writing.

    Let’s start with the horrible words. A ”prompt” is a text written by a human and given to an AI in the hope that it will return the desired output. When I, the human, write ”Painting of a hortensia, American modernism” in the Midjourney input field, I hope the AI will give me an image that looks like a painting of a hortensia.

    Maybe something like this:

    Painting of a hortensia in the style of American modernism

    Or perhaps like this?

    Painting of a hortensia in the style of American modernism

    Midjourney is a random image generator. A random image generator that you can steer in the direction you want, but still a random image generator.

    A prompt is a wish disguised as a computer system command. It is a manifestation of human intent, offered up to a machine. The word ”prompt” gives a false sense of control. The expression ”prompt engineering” is even more devious, as it hints that generative AI use is a science. Something you can control with mechanical precision. It is not. 

    Creativity, even machine-aided, is not about control. It is about empathy and dialogue.

    “Prompt engineering” is an expression of the human desire for control. We have created a machine that can do amazing things, so we must control it. The goal of prompt engineering is to reduce the amount of chaos in the AI output and make it predictable. But the methods of control we have today are based on hearsay, rumors, and sales pitches. No one, not even the creators of the tools, fully knows what words, phrases, and strategies will actually work. Control is an illusion.

    And it does not matter because creativity, even machine-aided creativity, is not about control. It is about empathy and dialogue. It is about giving and taking and sharing. 

    Working with Midjourney is an associative process, an exchange of words and images between a human and a machine. It is organic, chaotic, and often non-intuitive. A stream of consciousness that is hard to explain to others.

    But I will try.

    Start with an idea

    First, there is an idea. The idea can be a word, sentence, or paragraph. It can be a feeling, a memory, or just an impulse to create something, anything. 

    Here’s an idea: a purple ladybug

    Second, I write my idea into the Midjourney input field. When my fingers meet the keyboard, the idea changes. This transition from thought to prompt is fascinating to me. It is not unique to Gen AI; it happens when I write anything with any tool. My thoughts change as I write them down.

    The difference when using Midjourney, ChatGPT, or any other AI tool with prompt-based input is that the change is directly linked to my knowledge of how the generative model works. I try to fit my idea into a mold that I, probably incorrectly, believe is the best way to interact with the AI.

    I often challenge myself to write prompts that are as far from ”best practice” as possible, but this time, I failed. I just wrote a basic, boring prompt.

    3D-movie animation of an evil pink and purple ladybug

    Cute. But what will happen if I use my first hortensia image as a style reference with the ladybug prompt?

    As the hortensia/ladybug renders, another idea suddenly comes to me: a supersonic blast in a clear sky.

    Again, the words change a bit as I put them into Midjourney.

    hand-drawn illustration of a supersonic blast in a cloudy sky

    That is not how I imagined a supersonic blast, but ok. Now my hortensia/ladybug is done as well!

    I like this one, even though it doesn’t look like any 3D animation I have seen. But maybe I can mix it with the supersonic image somehow? That could be interesting. But how? Well, on a whim, I use the picture above as a character reference and the supersonic one as a style reference.

    After three variations, it turns out like this:

    3D cartoon ladybug flying at supersonic speeds through a cloudy sky.

    Shiny! And boring. Let’s rerun the prompt, add some --weird, and see what happens.

    3D cartoon ladybug flying at supersonic speeds through a cloudy sky. --weird 1500

    The ladybug looks like it is made out of painted wood! What would a chair in the same style look like? 

    Let’s find out.

    photo of a wooden workshop chair in an empty artist’s studio, shot with a Fujica ST605.

    I added a camera model just for fun. Fujica ST605 is a budget household camera from the 1970s. I keep a list of cameras from different eras to have something to work with. Sometimes, if you want your image to have a vibe of a specific period, it is easier to specify a camera model ubiquitous during that era than to use phrases like ”in the style of the 1970s” or whatever. Sometimes, not always. Random image generator, remember?
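
    The camera trick can be sketched as a small lookup. This is a hypothetical helper, not a Midjourney feature: the table and function names are made up for illustration, and the only camera in it is the Fujica ST605 mentioned above. When no camera is listed for a decade, it falls back to the generic style phrase.

```python
# Hypothetical lookup table: decade -> camera models common in that decade.
# Only the Fujica ST605 comes from the text; extend the table with your own list.
CAMERAS_BY_DECADE = {
    "1970s": ["Fujica ST605"],
}


def add_period_vibe(prompt: str, decade: str) -> str:
    """Prefer a period camera; fall back to a generic style phrase."""
    cameras = CAMERAS_BY_DECADE.get(decade)
    base = prompt.rstrip(". ")  # drop a trailing period so the suffix reads cleanly
    if cameras:
        return f"{base}, shot with a {cameras[0]}."
    return f"{base}, in the style of the {decade}."


print(add_period_vibe(
    "photo of a wooden workshop chair in an empty artist's studio", "1970s"
))
```

    Whether the camera name actually changes the image is, as noted, a matter of chance; the sketch only assembles the text.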

    I’m unsure how much the Fujica affected the result, but I like the chair image. Very cool floor and lovely lighting. The chair in itself, though, is perhaps the most unsafe-for-kids piece of furniture I have ever seen.

    But – I have no use for an image of a chair. I am trying to generate some sort of cartoon ladybug!

    Fast forward 4 weeks

    Suddenly, I need a picture of a chair. I remember the ladybug chair and look it up in my Midjourney archive. It is close to what I’m looking for but not spot on. So I do some experiments with the chair picture. I use it as a style reference and as an image prompt. I do a lot of variations. Remixes. I try some different prompts. I won’t show all of them here, but after a while, I get this:

    Professional studio photo of a wooden chair on a well-lit, white background. --stylize 300

    Pretty nice. The hardest part was generating ok-looking legs.
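
    For readers who like to see the mechanics, here is a minimal sketch of how a prompt with parameters is put together. In Midjourney the flags are typed with two hyphens, as --weird and --stylize. The function name is hypothetical, and the value ranges are assumptions taken from Midjourney's public documentation at the time of writing, so check the current docs before relying on them.

```python
from typing import Optional

# Assumed ranges (from Midjourney's documentation at the time of writing).
WEIRD_RANGE = (0, 3000)
STYLIZE_RANGE = (0, 1000)


def with_parameters(prompt: str, weird: Optional[int] = None,
                    stylize: Optional[int] = None) -> str:
    """Append --weird and --stylize flags to a prompt string."""
    parts = [prompt.strip()]
    for flag, value, (low, high) in (
        ("--weird", weird, WEIRD_RANGE),
        ("--stylize", stylize, STYLIZE_RANGE),
    ):
        if value is None:
            continue  # flag not requested, leave it out
        if not low <= value <= high:
            raise ValueError(f"{flag} must be between {low} and {high}, got {value}")
        parts.append(f"{flag} {value}")
    return " ".join(parts)


print(with_parameters(
    "3D cartoon ladybug flying at supersonic speeds through a cloudy sky.",
    weird=1500,
))
```

    The code only builds the text. What the machine does with it is still up to chance.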

    Ladybug goes to space

    So, what happened to the ladybug? I returned to it after a while, inspired by the loading screen from the 1983 Atari video game MULE, to make this picture:

    C64 Loading screen, dithering

    Why did I mix the ladybug with the loading screen from a 41-year-old Atari video game? I’m still trying to figure that out, but the idea came to me after a friend texted me about the game out of the blue.

    I downloaded the image and opened it in Photoshop to remove those weird things in the sky and adjust the colors somewhat. 

    The photoshopped version.

    Next, I uploaded the modified image to Midjourney again, used it as a reference, and prompted away.

    evil but cute pink and purple ladybug, 3D character concept art in the style of animated children’s movies

    Wow – a spacefaring ladybug robot!

    Frankenstein’s prompting

    As you can see, my process is a Frankenstein’s monster of ideas generating pictures generating ideas generating pictures… Everything is based on something else. Which, I guess, is the essence of generative AI?

    Questions like ”What prompt did you use to make this picture?” are largely meaningless because, most of the time, an AI image is not the result of a single, easy-to-show prompt. Instead, it is the result of a long, meandering, associative brainstorming session between a human and an AI.

    A prompt can actually be misleading. Take this image of a robot, for example. The final prompt was ”Manga drawing of a mecha in combat”:

    Manga drawing of a mecha in combat

    While it is correct that the prompt that generated the image read that way, to truly understand the process, one must rewind almost two months.

    When you look at someone’s AI artwork, it is essential to remember how much chance plays into the result. Using Midjourney as an artistic or creative tool is akin to action painting – where artists randomly throw paint on a canvas. In action painting, the artist chooses the paint, the canvas, and the location. The artist controls the setting, so to speak, but the end result is inherently random.

    ”Prompt engineering” is throwing paint over and over again until you get what you want. Frankly, it is not engineering at all, and we should stop calling it that. 

    It’s not engineering, it’s not math, it is not coding, it’s not mechanics.

    Let’s call it what it is: creativity.