From 66e82bb53222aa9b5a1008acf5a67b1636100c2b Mon Sep 17 00:00:00 2001
From: Blaise
Date: Tue, 30 Jul 2024 23:59:34 +0200
Subject: [PATCH] docs: add uvr

---
 docs/pages/uvr.mdx | 95 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 94 insertions(+), 1 deletion(-)

diff --git a/docs/pages/uvr.mdx b/docs/pages/uvr.mdx
index a2bfd3d..eaac579 100644
--- a/docs/pages/uvr.mdx
+++ b/docs/pages/uvr.mdx
@@ -1,3 +1,96 @@
 # UVR
 
-🚧 Page under construction!
+Learn how to use the `uvr_cli.py` script to perform various operations with UVR.
+
+## Usage
+
+To use the UVR CLI, navigate to the directory containing `uvr_cli.py` in your terminal and execute the script using the following syntax:
+
+```
+python uvr_cli.py --audio_file [options]
+```
+
+Replace `` with the path to the audio file you want to process and `[options]` with the necessary arguments. For a detailed list of arguments available for each mode, run:
+
+
+```
+python uvr_cli.py -h
+```
+
+This will display a help message with explanations for each argument.
+
+## Modes
+
+### Info and Debugging
+
+| Argument | Description | Type | Default | Required |
+| --------------------- | ------------------------------------------------------------- | ---- | ------- | -------- |
+| `-d`, `--debug` | Enable debug logging. Equivalent to `--log_level=debug`. | bool | False | No |
+| `-e`, `--env_info` | Print environment information and exit. | bool | False | No |
+| `-l`, `--list_models` | List all supported models and exit. | bool | False | No |
+| `--log_level` | Log level, e.g. `info`, `debug`, `warning` (default: `info`). | str | info | No |
+
+### Separation I/O Params
+
+| Argument | Description | Type | Default | Required |
+| ------------------------ | ----------------------------------------------------------------------------------------------------------- | ---- | -------------------------------------------------- | -------- |
+| `-m`, `--model_filename` | Model to use for separation. Example: `-m 2_HP-UVR.pth` | str | `model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt` | No |
+| `--output_format` | Output format for separated files, any common format (default: `WAV`). Example: `--output_format=MP3` | str | `WAV` | No |
+| `--output_dir` | Directory to write output files (default: ``). Example: `--output_dir=/app/separated` | str | `None` | No |
+| `--model_file_dir` | Model files directory (default: `uvr/tmp/audio-separator-models/`). Example: `--model_file_dir=/app/models` | str | `uvr/tmp/audio-separator-models/` | No |
+
+### Common Separation Parameters
+
+| Argument | Description | Type | Default | Required |
+| ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------ | ----- | ------- | -------- |
+| `--invert_spect` | Invert secondary stem using spectrogram (default: `False`). Example: `--invert_spect` | bool | False | No |
+| `--normalization` | Max peak amplitude to normalize input and output audio to (default: `0.9`). Example: `--normalization=0.7` | float | 0.9 | No |
+| `--single_stem` | Output only single stem, e.g. `Instrumental`, `Vocals`, `Drums`, `Bass`, `Guitar`, `Piano`, `Other`. Example: `--single_stem=Instrumental` | str | None | No |
+| `--sample_rate` | Modify the sample rate of the output audio (default: `44100`). Example: `--sample_rate=44100` | int | 44100 | No |
+
+### MDX Architecture Parameters
+
+| Argument | Description | Type | Default | Required |
+| ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------- | ----- | ------- | -------- |
+| `--mdx_segment_size` | Larger consumes more resources, but may give better results (default: `256`). Example: `--mdx_segment_size=256` | int | 256 | No |
+| `--mdx_overlap` | Amount of overlap between prediction windows, 0.001-0.999. Higher is better but slower (default: `0.25`). Example: `--mdx_overlap=0.25` | float | 0.25 | No |
+| `--mdx_batch_size` | Larger consumes more RAM but may process slightly faster (default: `1`). Example: `--mdx_batch_size=4` | int | 1 | No |
+| `--mdx_hop_length` | Usually called stride in neural networks, only change if you know what you're doing (default: `1024`). Example: `--mdx_hop_length=1024` | int | 1024 | No |
+| `--mdx_enable_denoise` | Enable denoising during separation (default: `False`). Example: `--mdx_enable_denoise` | bool | False | No |
+
+### VR Architecture Parameters
+
+| Argument | Description | Type | Default | Required |
+| ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | ----- | ------- | -------- |
+| `--vr_batch_size` | Number of batches to process at a time. Higher = more RAM, slightly faster processing (default: `4`). Example: `--vr_batch_size=16` | int | 4 | No |
+| `--vr_window_size` | Balance quality and speed. `1024` = fast but lower, `320` = slower but better quality. (default: `512`). Example: `--vr_window_size=320` | int | 512 | No |
+| `--vr_aggression` | Intensity of primary stem extraction, `-100` - `100`. Typically `5` for vocals & instrumentals (default: `5`). Example: `--vr_aggression=2` | int | 5 | No |
+| `--vr_enable_tta` | Enable Test-Time-Augmentation; slow but improves quality (default: `False`). Example: `--vr_enable_tta` | bool | False | No |
+| `--vr_high_end_process` | Mirror the missing frequency range of the output (default: `False`). Example: `--vr_high_end_process` | bool | False | No |
+| `--vr_enable_post_process` | Identify leftover artifacts within vocal output; may improve separation for some songs (default: `False`). Example: `--vr_enable_post_process` | bool | False | No |
+| `--vr_post_process_threshold` | Threshold for post_process feature: `0.1`-`0.3` (default: `0.2`). Example: `--vr_post_process_threshold=0.1` | float | 0.2 | No |
+
+### Demucs Architecture Parameters
+
+| Argument | Description | Type | Default | Required |
+| --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | --------- | -------- |
+| `--demucs_segment_size` | Size of segments into which the audio is split, `1-100`. Higher = slower but better quality (default: `Default`). Example: `--demucs_segment_size=256` | str | `Default` | No |
+| `--demucs_shifts` | Number of predictions with random shifts, higher = slower but better quality (default: `2`). Example: `--demucs_shifts=4` | int | 2 | No |
+| `--demucs_overlap` | Overlap between prediction windows, 0.001-0.999. Higher = slower but better quality (default: `0.25`). Example: `--demucs_overlap=0.25` | float | 0.25 | No |
+| `--demucs_segments_enabled` | Enable segment-wise processing (default: `True`). Example: `--demucs_segments_enabled=False` | bool | `True` | No |
+
+### MDXC Architecture Parameters
+
+| Argument | Description | Type | Default | Required |
+| ------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------- | ---- | ------- | -------- |
+| `--mdxc_segment_size` | Larger consumes more resources, but may give better results (default: `256`). Example: `--mdxc_segment_size=256` | int | 256 | No |
+| `--mdxc_override_model_segment_size` | Override model default segment size instead of using the model default value. Example: `--mdxc_override_model_segment_size` | bool | False | No |
+| `--mdxc_overlap` | Amount of overlap between prediction windows, `2-50`. Higher is better but slower (default: `8`). Example: `--mdxc_overlap=8` | int | 8 | No |
+| `--mdxc_batch_size` | Larger consumes more RAM but may process slightly faster (default: `1`). Example: `--mdxc_batch_size=4` | int | 1 | No |
+| `--mdxc_pitch_shift` | Shift audio pitch by a number of semitones while processing. May improve output for deep/high vocals. (default: `0`). Example: `--mdxc_pitch_shift=2` | int | 0 | No |
+
+**Example:**
+
+```bash
+uvr_cli.py --audio_file "my_song.mp3" --output_format MP3 --output_dir "/path/to/output" --model_filename "2_HP-UVR.pth" --vr_aggression 10
+```
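
Reviewer note (not part of the patch): the tables above describe a flat set of `--name=value` options, so the documented example could also be driven from Python. The sketch below is only an illustration under stated assumptions. `build_uvr_command` and `run_uvr` are hypothetical helper names that do not exist in the repository. The sketch also assumes that boolean options such as `--invert_spect` are bare switches, as their table examples suggest, and that `uvr_cli.py` sits in the current working directory.

```python
import subprocess
import sys


def build_uvr_command(audio_file, options=None, script="uvr_cli.py"):
    """Build an argv list for uvr_cli.py (hypothetical helper, not part of UVR)."""
    command = [sys.executable, script, "--audio_file", audio_file]
    for name, value in (options or {}).items():
        flag = f"--{name}"
        if value is True:
            # Assumption: boolean options are bare switches, e.g. --invert_spect.
            command.append(flag)
        elif value is False or value is None:
            # Leave the option out so the CLI falls back to its own default.
            continue
        else:
            # Same --name=value form used in the table examples.
            command.append(f"{flag}={value}")
    return command


def run_uvr(audio_file, options=None):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_uvr_command(audio_file, options), check=True)


if __name__ == "__main__":
    # Mirrors the example at the end of the docs page; prints the command only.
    example = build_uvr_command(
        "my_song.mp3",
        {
            "output_format": "MP3",
            "output_dir": "/path/to/output",
            "model_filename": "2_HP-UVR.pth",
            "vr_aggression": 10,
        },
    )
    print(" ".join(example))
```

An option that the docs show with an explicit value, such as `--demucs_segments_enabled=False`, would need to be passed as the string `"False"` here, because a Python `False` is treated as "omit the flag".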