Linux: Getting Started with Stable Diffusion

Stable Diffusion is an AI image generator that runs on your own machine.

There are plenty of programs for Windows, but we want to run it on Linux.

This video caught my attention.

As far as I understand, a powerful NVIDIA graphics card is required. I have a GeForce RTX 3080 Ti with 12 GB of VRAM and can generate images of 256x512 pixels within a few seconds, but not much larger.
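
To see which card and how much VRAM you have, the NVIDIA driver's nvidia-smi tool can be queried. This is only a small sketch: the function name check_gpu is my own, and it assumes the proprietary NVIDIA driver (which ships nvidia-smi) is installed.

```shell
# Sketch: print GPU name and total VRAM, or a hint if the NVIDIA tools are missing.
# check_gpu is a hypothetical helper name, not part of any package.
check_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Ask only for the fields we care about, one CSV line per GPU
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader \
            || echo "nvidia-smi is installed but could not query the GPU"
    else
        echo "nvidia-smi not found; is the NVIDIA driver installed?"
    fi
}

check_gpu
```

If the reported memory is well below 12 GB, expect to be limited to even smaller images than the ones mentioned here.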

Preparation & Setup

I use Ubuntu 22.04.

The Stable Diffusion source code is of little use without the trained AI model.

The stable-diffusion-v-1-4-original model can be downloaded from huggingface.co after a free registration. It is about 7 GB in size.

An Anaconda Python environment is recommended, because it lets you install all dependencies in isolation. See the Ubuntu guide for how to install Anaconda.
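
With a download this large it is worth checking that the file arrived completely. The following sketch prints the size of the checkpoint in whole GiB. It assumes the file landed in ~/Downloads under the name sd-v1-4-full-ema.ckpt (the name used in the setup commands in this post) and that GNU stat is available, as it is on Ubuntu. The helper name size_gib is made up for this example.

```shell
# Hypothetical helper: print a file's size in whole GiB (uses GNU stat).
size_gib() {
    bytes=$(stat -c %s "$1") || return 1
    echo $(( bytes / 1024 / 1024 / 1024 ))
}

# Assumed download location; adjust if your browser saves elsewhere
ckpt="$HOME/Downloads/sd-v1-4-full-ema.ckpt"
if [ -f "$ckpt" ]; then
    echo "checkpoint size: $(size_gib "$ckpt") GiB"
else
    echo "checkpoint not found at $ckpt"
fi
```

A result far below the expected size points to an interrupted download.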

bash Anaconda3-2022.05-Linux-x86_64.sh
# only directly after installation:
. ~/.bashrc

Anaconda hooks itself into ~/.bashrc and changes your prompt. If you want to undo that, you can do it by hand or with

conda init --reverse

Once everything is in place, it goes roughly as follows (based on the instructions on the project's GitHub page):

# Prepare the folder
mkdir ~/Stable\ Diffusion
cd ~/Stable\ Diffusion
git clone https://github.com/CompVis/stable-diffusion.git .
mkdir -p models/ldm/stable-diffusion-v1/
 
# Put the model in place
cd ~/Stable\ Diffusion
mv ~/Downloads/sd-v1-4-full-ema.ckpt models/ldm/stable-diffusion-v1
cd models/ldm/stable-diffusion-v1/
ln -s sd-v1-4-full-ema.ckpt model.ckpt
 
# Create the Python environment
cd ~/Stable\ Diffusion
conda env create -f environment.yaml
conda activate ldm
conda install pytorch torchvision -c pytorch
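
Before generating anything, it can help to confirm that PyTorch inside the ldm environment actually sees the GPU. A minimal sketch, assuming the environment is active so that python points at its interpreter; check_torch_cuda is a name I made up, while torch.cuda.is_available() is the regular PyTorch call.

```shell
# Sketch: report the PyTorch version and whether CUDA is usable.
# check_torch_cuda is a hypothetical helper name.
check_torch_cuda() {
    if python -c 'import torch' >/dev/null 2>&1; then
        python -c 'import torch; print("torch", torch.__version__, "CUDA available:", torch.cuda.is_available())'
    else
        echo "PyTorch is not importable; is the ldm environment active?"
    fi
}

check_torch_cuda
```

If this reports that CUDA is not available, the driver or the installed PyTorch build is the first place to look.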

Once everything is done, you can generate images.

cd ~/Stable\ Diffusion
python scripts/txt2img.py --plms --W 512 --H 256 --prompt "a photograph of an astronaut riding a horse"

The images then end up in outputs/ (by default in the txt2img-samples subfolder).
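
After a few runs the output folder fills up quickly. This small sketch lists the most recently written PNG files so you can find the latest result. It assumes GNU find (for -printf) and that you run it from the Stable Diffusion folder; latest_images is a hypothetical helper name.

```shell
# Hypothetical helper: list the newest PNG files below a directory.
# Usage: latest_images DIR [COUNT]
latest_images() {
    dir=$1
    count=${2:-5}
    # Prefix each path with its modification time, sort newest first,
    # then strip the timestamp again (paths may contain spaces)
    find "$dir" -type f -name '*.png' -printf '%T@ %p\n' \
        | sort -rn | head -n "$count" | cut -d' ' -f2-
}

if [ -d outputs ]; then
    latest_images outputs 5
else
    echo "no outputs/ folder here yet"
fi
```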

An elephant on a cliff looking at a sunset

Colorful clouds in Japan

A drawing of a granny on the porch

Have fun!

From now on I can always generate blog images like this:

python scripts/txt2img.py --plms --W 1088 --H 128 --prompt "granny on an elephant passing a cloud"
python scripts/txt2img.py --plms --W 448 --H 256 --prompt "granny on an elephant passing a cloud"
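
All the sizes used in this post are multiples of 64. As far as I know, the v1 model wants width and height to be multiples of 64; treat that as my assumption rather than something documented in this post. Under that assumption, here is a sketch of a helper (round_to_64 is a made-up name) that snaps an arbitrary pixel size to the nearest multiple of 64.

```shell
# Hypothetical helper: round a pixel size to the nearest multiple of 64,
# never going below 64. The multiple-of-64 rule itself is an assumption.
round_to_64() {
    r=$(( ($1 + 32) / 64 * 64 ))
    if [ "$r" -lt 64 ]; then
        r=64
    fi
    echo "$r"
}

for size in 500 1088 100; do
    echo "$size -> $(round_to_64 "$size")"
done
```

The result can be passed straight to --W or --H, for example --W "$(round_to_64 500)".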

ai, linux