# Local Stable Diffusion
![astronaut rides horse](astronaut_rides_horse.png)
Stable diffusion (SD) is an AI technique for generating images from text prompts.
Similar to DALL-E, which drives the popular [craiyon](https://www.craiyon.com/), SD is available as an [online tool](https://huggingface.co/spaces/stabilityai/stable-diffusion).
These web tools are amazing and easy to use, but they can be frustrating - they're often under high load and impose long waiting times.
They use a good chunk of computational resources, specifically GPUs, and so have generally been out of reach even for people with powerful personal machines.
Now, however, SD has reached the point where it can be run on (admittedly high-end) consumer video cards.
Stability AI - the model's developers - recently [published a blog post](https://stability.ai/blog/stable-diffusion-v2-release) open-sourcing SD 2.
There's a README for getting started [here](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/README.md), but it has a couple of gotchas and assumptions that will trip up people (like myself) who aren't already familiar with the technologies in use, such as Python and CUDA.
This post describes my experience setting up SD 2 on my local workstation.
For hardware, I have an i7-6700k, an RTX 2080 Super and 48GB of RAM.
If you have an AMD video card, you won't be able to use CUDA, but you may be able to use GPU acceleration regardless with something like ROCm.
In this post I'm using Arch Linux, but I have successfully set it up on Windows too.
Python is an exceedingly portable language, so it should work wherever you're able to get a Python installation.
This post assumes that you already have a working Python installation.
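If you're not sure, a quick version check from a shell is enough - anything reasonably recent in the 3.x series should be fine:
```bash
python3 --version
```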
## Install CUDA
CUDA needs to be installed separately from Python dependencies.
It is quite large, and as with all NVIDIA driver installations, can be a bit confusing.
On Linux, it's straightforward to install it from your distribution's package manager.
```bash
sudo pacman -Syu
sudo pacman -S cuda
```
On Windows, you will need to go to NVIDIA's site to download the correct version of CUDA.
At the time of writing, the SD 2 script expects CUDA 11.7, and will not work if you install the latest 12.0 version.
To get older versions, go to their [download archive](https://developer.nvidia.com/cuda-toolkit-archive) and select the appropriate one.
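Whichever platform you're on, it's worth sanity-checking the installation before going any further - `nvidia-smi` reports the driver and the maximum CUDA version it supports, while `nvcc --version` reports the installed toolkit version:
```bash
nvidia-smi
nvcc --version
```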
## Set up a virtual environment and PyTorch
Python can be installed at a system level, but it's usually a good idea to set up a virtual environment for your project.
This isolates the project dependencies from the wider system, and makes your setup reproducible.
I will use [`pipenv`](https://pipenv.pypa.io/en/latest/index.html) as it's what I'm familiar with.
PyTorch is a deep-learning framework, used to put together machine learning pipelines.
To get a command to install the relevant dependencies, go to [PyTorch's site](https://pytorch.org/get-started/locally/) and choose the options for your setup.
In my case, I replaced `pip3` with `pipenv` as I want to install dependencies to a new virtual environment instead of to the system.
```bash
mkdir stable-diffusion && cd stable-diffusion
pipenv install torch torchvision torchaudio
```
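Before installing anything else, we can check that PyTorch can actually see the GPU - if this prints `False`, something is wrong with the CUDA setup:
```bash
pipenv run python -c "import torch; print(torch.cuda.is_available())"
```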
## Install Stable Diffusion
SD 2 is provided by the `diffusers` package.
We can install it in our virtual environment as follows:
```bash
pipenv shell
pip3 install git+https://github.com/huggingface/diffusers.git transformers accelerate scipy
exit
```
We use `pipenv shell` to enter a shell using the virtual environment, before running the `pip3` command described in their README.
After installing the dependencies, we can leave the virtual environment shell and return to our original one.
`transformers` and `accelerate` are optional, but they reduce memory usage and so are recommended.
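As a quick sanity check that the install worked, we can print the installed `diffusers` version from inside the virtual environment:
```bash
pipenv run python -c "import diffusers; print(diffusers.__version__)"
```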
## Create a Python script
Python does have an interactive environment, but to save our fingers let's use a `stable-diffusion.py` script to contain and run our Python code.
Here I'll mostly copy the Python included in their README:
```python
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "stabilityai/stable-diffusion-2"

# Use the Euler scheduler here instead of the default
scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")

# Load half-precision (fp16) weights to roughly halve VRAM usage,
# then move the pipeline onto the GPU
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Compute attention in slices to lower peak memory usage
pipe.enable_attention_slicing()

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt, height=768, width=768).images[0]
image.save("astronaut_rides_horse.png")
```
I've made two additions here.
First, I've added `import torch` at the top - I'm not sure why the code in the README omits this, but it's needed for the script to run.
I've also added `pipe.enable_attention_slicing()` - this is a more memory-efficient running mode, which reduces VRAM usage at the cost of taking longer.
If you have a monster video card, this may not be necessary.
At this point, we're done - after running the script successfully, you should have a new picture of an astronaut riding a horse on mars.
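To run the script without entering the virtual environment shell, `pipenv run` executes a command inside the environment. Note that the first run will download the model weights from Hugging Face, which can take a while:
```bash
pipenv run python stable-diffusion.py
```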
## Some nice-to-haves
In this basic script we only have the one, hardcoded prompt.
To change it, we need to edit the file itself.
Instead, we can change how `prompt` is set, and have it read from command-line arguments.
```python
# at the top of the file
import sys
...
prompt = " ".join(sys.argv[1:])
```
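Since we join the arguments with spaces, the prompt doesn't even need to be quoted:
```bash
pipenv run python stable-diffusion.py a photo of an astronaut riding a horse on mars
```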
While we're at it, we can also base the filename on the input prompt:
```python
image.save(f"{prompt.replace(" ", "_")}.png")
```
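Prompts can contain more than just spaces that are awkward in filenames (slashes, quotes and so on). A slightly more defensive sketch - my own addition, not from the README - collapses anything non-alphanumeric into underscores:
```python
import re

# Collapse any run of non-alphanumeric characters into a single
# underscore, then trim underscores from the ends
filename = re.sub(r"[^A-Za-z0-9]+", "_", prompt).strip("_")
image.save(f"{filename}.png")
```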
## Wrapping up
And that's it!
Enjoy making some generative art.
My favourites so far have been prefixing "psychedelic" to things.
I've also been enjoying generating descriptions with [ChatGPT](https://chat.openai.com/chat) and plugging them into SD, for some zero-effort creativity.
As always, if anything's out of place or if you'd like to get in touch, please [send me an email](mailto:me@ktyl.dev)!