StableAudioWebUI
A lightweight Gradio web interface for running Stable Audio Open 1.0.
Generated files are saved in the directory Output/YYYY-MM-DD/
and named after the prompt, using the schema 'prompt.mp3'.
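As an illustration of the naming scheme above, here is a minimal sketch of how such an output path could be built. The function name `output_path`, its parameters, and the file-name sanitization step are assumptions made for this example; they are not taken from gradio_app.py.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def output_path(prompt: str, root: str = "Output", today: Optional[date] = None) -> Path:
    """Build Output/YYYY-MM-DD/<prompt>.mp3 for a given prompt (hypothetical helper)."""
    day = (today or date.today()).isoformat()  # isoformat() yields YYYY-MM-DD
    # Assumption: characters that are invalid in file names are replaced with "_".
    safe = re.sub(r'[\\/:*?"<>|]+', "_", prompt).strip() or "untitled"
    return Path(root) / day / f"{safe}.mp3"


if __name__ == "__main__":
    target = output_path("a dog barking")
    target.parent.mkdir(parents=True, exist_ok=True)  # create the dated folder
    print(target)
```

Passing `today` explicitly makes the helper easy to test; in normal use it defaults to the current date.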
Recommended Settings
Prompt: Any
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500
Duration: Max 47s
Seed: Any
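The recommended values can also be kept together in one small settings object, which makes it easy to reject a duration above the 47 s limit before starting a generation. This is only a sketch: the class `GenerationSettings`, its field names, and the validation rules are hypothetical and do not come from the web UI code.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DURATION_S = 47  # maximum duration listed in the recommended settings


@dataclass
class GenerationSettings:
    """Hypothetical container for the recommended settings."""
    prompt: str
    cfg_scale: float = 7.0
    sigma_min: float = 0.3
    sigma_max: float = 500.0
    duration_s: float = 47.0
    seed: Optional[int] = None  # assumption: None means "pick a random seed"

    def validate(self) -> "GenerationSettings":
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 0 < self.duration_s <= MAX_DURATION_S:
            raise ValueError(f"duration must be in (0, {MAX_DURATION_S}] seconds")
        if not 0 < self.sigma_min < self.sigma_max:
            raise ValueError("expected 0 < sigma_min < sigma_max")
        return self


if __name__ == "__main__":
    print(GenerationSettings(prompt="didgeridoo").validate())
```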
Start by cloning the repo:
git clone https://github.com/Saganaki22/StableAudioWebUI.git
Then set up the environment as follows (tested on an Nvidia GPU with 24 GB of VRAM):
cd StableAudioWebUI
conda create -n saowebui python=3.10
conda activate saowebui
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
(Note: if you have an older Nvidia GPU, you may need to install the CUDA 11.8 build instead)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
If you don't have a Hugging Face account or have not used huggingface-cli before, create an account and then authenticate it with a token (create a token at https://huggingface.co/settings/tokens)
huggingface-cli login
(paste your token and follow the instructions; the token will not be displayed when pasted)
⚠ If you want to run it on CPU,
skip 'pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121' and just run:
pip install -r requirements.txt
pip install -r requirements1.txt
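After either install path, it can be useful to confirm which device PyTorch will actually use. The snippet below is a small sketch with a hypothetical helper name, `pick_device`; it relies only on `torch.cuda.is_available()` and falls back to "cpu" when PyTorch is not installed.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if a CUDA-enabled PyTorch build sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Running on: {pick_device()}")
```

If this prints "cpu" on a machine with an Nvidia GPU, the installed torch wheel is probably a CPU-only build or does not match the installed driver.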
Run
python gradio_app.py
Screenshots
(All with random seeds)
Prompt: a dog barking
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500
Prompt: people clapping
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500
Prompt: didgeridoo
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500
Model Details
- Model type: Stable Audio Open 1.0 is a latent diffusion model based on a transformer architecture.
- Language(s): English
- License: See the LICENSE file.
- Commercial License: to use this model commercially, please refer to https://stability.ai/membership
Huggingface | Stable Audio Tools | Stability AI