diff --git a/README.md b/README.md new file mode 100644 index 0000000..26b9aba --- /dev/null +++ b/README.md @@ -0,0 +1,89 @@ +# StableAudioWebUI + +### A Lightweight Gradio Web interface for running Stable Audio Open 1.0 +
+
+ +![image](assets/screenshot.png) + +
+
+ +--- + + Start by cloning the repo: + + git clone https://github.com/Saganaki22/StableAudioWebUI.git + + +
+ Use the below deployment (tested on 24GB Nvidia VRAM): + + cd StableAudioWebUI + conda create -n saowebui python=3.10 + conda activate saowebui + pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121 + pip install -r requirements.txt + + + +
+ (Note if you have an older Nvidia GPU you may need to use CUDA 11.8) + + pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 + +
+ If you haven't got a hugging face account or have not used huggingface-cli before, create an account and then authenticate your Hugging face account with a token (create token at https://huggingface.co/settings/tokens) + + huggingface-cli login + + (paste your token and follow the instructions, token will not be displayed when pasted) + + ## ⚠ If you want to run it using CPU
+ skip 'pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121' and just run + + pip install -r requirements.txt + pip install -r requirements1.txt + +## +Run + + python gradio_app.py + +
+ +# Screenshots + +(All with random seeds)
+ +Prompt: a dog barking
+CFG: 7
+Sigma_Min: 0.3
+Sigma_Max: 500
+ +![image](https://github.com/Saganaki22/StableAudioWebUI/blob/main/assets/screenshot1.png) + +# +
+Prompt: people clapping
+CFG: 7
+Sigma_Min: 0.3
+Sigma_Max: 500
+ +![image](https://github.com/Saganaki22/StableAudioWebUI/blob/main/assets/screenshot2.png) + +# +
+Prompt: didgeridoo
+CFG: 7
+Sigma_Min: 0.3
+Sigma_Max: 500
+ +![image](https://github.com/Saganaki22/StableAudioWebUI/blob/main/assets/screenshot3.png) + +## Model Details + +- **Model type**: `Stable Audio Open 1.0` is a latent diffusion model based on a transformer architecture. +- **Language(s)**: English +- **License**: See the [LICENSE file](https://huggingface.co/stabilityai/stable-audio-open-1.0/blob/main/LICENSE). +- **Commercial License**: to use this model commercially, please refer to [https://stability.ai/membership](https://stability.ai/membership)