Optimize SDXL generation: fix memory leak, add CPU offloading, update docs, and improve file saving
This commit is contained in:
parent 713bda3bfa
commit bba3318ab5
.gitignore (vendored): 1 line changed
@@ -1,6 +1,7 @@
 venv/
 hf_cache/
 models/
+output/
 __pycache__/
 *.pyc
 .DS_Store
README.md: 86 lines (new file)
@@ -0,0 +1,86 @@
# ⚡ SDXL-Lightning Image Generator for macOS

This project runs **Stable Diffusion XL Lightning (4-step)** natively on your Mac using **Apple Silicon (M1/M2/M3)** acceleration (MPS). It is optimized to generate high-quality 1024x1024 images in seconds with minimal setup.

## ✨ Features

- **Blazing Fast**: Uses the SDXL-Lightning 4-step UNet for rapid generation (roughly 10-30s per image on M1/M2).
- **Native Mac Support**: Leverages your Mac's GPU via Metal Performance Shaders (MPS).
- **Memory Optimized**: Automatic CPU offloading lets it run even on 8GB/16GB Macs without crashing.
- **Local Privacy**: All models run locally on your machine. No cloud API keys needed.
- **Auto-Download**: Automatically fetches the required model weights on first run.

## 🚀 Prerequisites

- macOS 12.3+ (Monterey or newer)
- Mac with Apple Silicon (M1, M2, M3)
- Python 3.9 or newer (check with `python3 --version`)

## 🛠️ Installation & Setup

1. **Clone the repository** (if you haven't already):
   ```bash
   git clone <your-repo-url>
   cd "Image Generation"
   ```

2. **Create a virtual environment** to keep dependencies clean:
   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```

3. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```
   *(This installs `torch`, `diffusers`, `transformers`, and `accelerate` optimized for Mac.)*
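For reference, a `requirements.txt` consistent with the note above might look like the following. This is a sketch: the actual file may pin versions or list extras; `safetensors` is an assumption, included because the script loads `.safetensors` weights via `load_file`.

```
torch
diffusers
transformers
accelerate
safetensors
```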
## 🎨 Usage

### Basic Generation

Generate an image with a simple text prompt. The first time you run this, it will download the necessary models (~6GB).

```bash
# Make sure your virtual environment is active!
source venv/bin/activate

# Run the generator
python generate.py "A futuristic cityscape at sunset, highly detailed, cyberpunk style, neon lights"
```

### Advanced Options

You can customize the resolution and quality settings:

```bash
python generate.py "An astronaut riding a horse on mars, realistic, 8k" --width 1024 --height 1024 --steps 4
```

| Flag | Default | Description |
| :--- | :--- | :--- |
| `prompt` | (Required) | The description of the image you want to generate. |
| `--width` | `1920` | Width of the image. Standard SDXL is optimized for `1024`. |
| `--height` | `1080` | Height of the image. |
| `--steps` | `4` | Number of inference steps. 4-8 is recommended for Lightning. |

**Output Location**:
Images are saved automatically to the `output/` folder in this directory.
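As a reading aid, the flag table above corresponds to a CLI parser along these lines. This is a hypothetical sketch: only the flag names and defaults come from the table; the actual argument handling in `generate.py` is not shown in this commit.

```python
import argparse

def parse_args(argv=None):
    # Flag names and defaults mirror the documented table;
    # everything else here is illustrative.
    parser = argparse.ArgumentParser(description="SDXL-Lightning image generator")
    parser.add_argument("prompt", help="Description of the image to generate")
    parser.add_argument("--width", type=int, default=1920,
                        help="Image width (standard SDXL is tuned for 1024)")
    parser.add_argument("--height", type=int, default=1080, help="Image height")
    parser.add_argument("--steps", type=int, default=4,
                        help="Inference steps; 4-8 recommended for Lightning")
    return parser.parse_args(argv)
```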
## ⚡ First Run Note

The first time you run the script, it will download:

1. **SDXL Base Model**: ~6GB (cached in `hf_cache/`)
2. **Lightning UNet**: ~5GB (saved in `models/`)

Use a fast internet connection! Subsequent runs skip the download entirely.

## 🔧 Troubleshooting

- **"MPS backend out of memory"**:
  Your Mac ran out of GPU memory. The script calls `pipe.enable_model_cpu_offload()` to mitigate this; if it still happens, try restarting your computer or closing other memory-heavy apps.
- **"Permission denied" errors**:
  You might see "mpsgraph" permission errors in the terminal. These are harmless warnings from macOS's Metal framework and can be ignored; image generation still works.
- **Slow First Generation**:
  Shader compilation happens on the very first run. Later generations will be much faster.

## 🤝 Contributing

Feel free to open issues or submit PRs to improve performance or add features!
generate.py: 10 lines changed
@@ -26,7 +26,9 @@ def generate_image(prompt, width=1920, height=1080, steps=4):

     # Load UNet from local file
     print("Loading UNet from local file...")
-    unet = UNet2DConditionModel.from_config(base, subfolder="unet").to(device, torch.float16)
+    # Fix FutureWarning: Load config first
+    unet_config = UNet2DConditionModel.load_config(base, subfolder="unet")
+    unet = UNet2DConditionModel.from_config(unet_config).to(device, torch.float16)
     unet.load_state_dict(load_file(local_unet, device=device))

     # Load Pipeline
@@ -36,7 +38,8 @@ def generate_image(prompt, width=1920, height=1080, steps=4):
     # Optimizations for Mac/MPS
     print("Enabling attention slicing for memory efficiency...")
     pipe.enable_attention_slicing()
-    # pipe.enable_model_cpu_offload() # Uncomment if running out of memory
+    print("Enabling model CPU offloading for memory efficiency...")
+    pipe.enable_model_cpu_offload()

     # Ensure scheduler is correct for Lightning
     pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
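The `timestep_spacing="trailing"` setting is what lets Lightning's few-step sampling start at the end of the noise schedule. A simplified sketch of the idea, assuming the usual 1000 training timesteps; the real `EulerDiscreteScheduler` implementation differs in details:

```python
def trailing_timesteps(num_steps, train_steps=1000):
    # "Trailing" spacing strides backwards from the last training
    # timestep, so the first sampled step is near full noise.
    stride = train_steps // num_steps
    return [train_steps - 1 - i * stride for i in range(num_steps)]

# For 4-step Lightning this yields timesteps near 999, 749, 499, 249.
```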
@@ -46,7 +49,8 @@ def generate_image(prompt, width=1920, height=1080, steps=4):
     image = pipe(prompt, num_inference_steps=steps, guidance_scale=0, width=width, height=height).images[0]

-    # Save
-    save_dir = os.path.expanduser("~/Documents/Image Generations")
+    # Save
+    save_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "output")
+    os.makedirs(save_dir, exist_ok=True)

     timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
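The new saving logic above can be exercised in isolation. A small sketch; the `prompt_slug` filename scheme is an assumption for illustration, not taken from `generate.py`:

```python
import os
from datetime import datetime

def build_output_path(script_dir, prompt_slug="image"):
    # Same idea as the diff: an output/ folder next to the script,
    # created if missing, plus a timestamped .png filename.
    save_dir = os.path.join(script_dir, "output")
    os.makedirs(save_dir, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return os.path.join(save_dir, f"{prompt_slug}_{timestamp}.png")
```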