
StableAudioWebUI

A lightweight Gradio web interface for running Stable Audio Open 1.0







Start by cloning the repo:

git clone https://github.com/Saganaki22/StableAudioWebUI.git

Then set up the environment with the commands below (tested on an Nvidia GPU with 24 GB of VRAM):
cd StableAudioWebUI
conda create -n saowebui python=3.10
conda activate saowebui
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

(Note: if you have an older Nvidia GPU, you may need to use the CUDA 11.8 build instead)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
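
After installing, it can help to confirm which device PyTorch will actually use before launching the app. The snippet below is a minimal sketch rather than part of this repo; the helper name `describe_device` is hypothetical, and it relies only on `torch.cuda.is_available()`.

```python
# Sanity check after installation: which device will PyTorch use?
# `describe_device` is a hypothetical helper, not part of StableAudioWebUI.
def describe_device():
    """Return 'cuda', 'cpu', or 'torch-missing' for the current environment."""
    try:
        import torch  # imported lazily so a missing install is reported cleanly
    except ImportError:
        return "torch-missing"
    # True only when a CUDA build of PyTorch finds a usable Nvidia GPU
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("PyTorch device:", describe_device())
```

If this reports `cpu` on a machine with an Nvidia GPU, the installed PyTorch build probably does not match your driver, and the alternative CUDA index URL may be worth trying.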

If you don't have a Hugging Face account or have not used huggingface-cli before, create an account and then authenticate it with an access token (create one at https://huggingface.co/settings/tokens):
huggingface-cli login

(Paste your token and follow the instructions; the token will not be displayed when pasted.)
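
To check that the login worked, one option is to ask the Hugging Face Hub who you are. This is a sketch under assumptions: it uses `whoami()` from the `huggingface_hub` package, and the `hf_login_status` helper and its `lookup` parameter are hypothetical names introduced here for illustration.

```python
# Hypothetical helper, not part of StableAudioWebUI.
def hf_login_status(lookup=None):
    """Return the logged-in Hugging Face username, or None if not authenticated.

    `lookup` is an illustrative hook for swapping in a fake lookup function;
    by default the real huggingface_hub.whoami() is used.
    """
    if lookup is None:
        try:
            from huggingface_hub import whoami
        except ImportError:
            return None  # huggingface_hub is not installed
        lookup = whoami
    try:
        info = lookup()
    except Exception:
        # No stored token, an invalid token, or no network access
        return None
    return info.get("name")
```

Calling `hf_login_status()` with no arguments after a successful `huggingface-cli login` should return your username; `None` suggests the token was not stored or could not be verified.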

⚠ If you want to run it on CPU only

Skip 'pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121' and just run:

pip install -r requirements.txt
pip install -r requirements1.txt

Run

python gradio_app.py

Screenshots

(All with random seeds)

Prompt: a dog barking
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500

image


Prompt: people clapping
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500

image


Prompt: didgeridoo
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500

image
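
For readers wondering what the CFG, Sigma_Min and Sigma_Max values above control, the sketch below shows how they would map onto the `generate_diffusion_cond` call from `stable-audio-tools`, following the example on the Stable Audio Open 1.0 model card. It is an assumption that the web UI forwards its settings this way; `build_generation_kwargs` and `generate` are hypothetical helpers, and the `steps=100`, `seconds_total=30` and `sampler_type` defaults come from the model card example, not from the screenshots.

```python
# Hypothetical helpers illustrating the settings; not code from this repo.
def build_generation_kwargs(prompt, seconds_total=30, steps=100,
                            cfg_scale=7, sigma_min=0.3, sigma_max=500):
    """Collect the sampler settings shown in the screenshots into one dict."""
    return {
        "steps": steps,                # number of diffusion steps (assumed)
        "cfg_scale": cfg_scale,        # "CFG": how strongly to follow the prompt
        "conditioning": [{
            "prompt": prompt,
            "seconds_start": 0,
            "seconds_total": seconds_total,  # clip length in seconds (assumed)
        }],
        "sigma_min": sigma_min,        # "Sigma_Min": lowest noise level
        "sigma_max": sigma_max,        # "Sigma_Max": highest noise level
        "sampler_type": "dpmpp-3m-sde",
    }


def generate(prompt, **overrides):
    """Run one generation; downloads the model, so it needs the Hugging Face login."""
    import torch
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)
    return generate_diffusion_cond(
        model,
        sample_size=model_config["sample_size"],
        device=device,
        **build_generation_kwargs(prompt, **overrides),
    )
```

For example, `generate("a dog barking")` would use the same CFG and sigma values as the first screenshot, with a random seed.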

Model Details

  • Model type: Stable Audio Open 1.0 is a latent diffusion model based on a transformer architecture.
  • Language(s): English
  • License: See the LICENSE file.
  • Commercial License: to use this model commercially, please refer to https://stability.ai/membership