Commit

preparing uvr
blaisewf committed Apr 14, 2024
1 parent 8b17804 commit 6803565
Showing 69 changed files with 12,632 additions and 5 deletions.
28 changes: 24 additions & 4 deletions .gitignore
@@ -1,9 +1,29 @@
# Ignore compiled executables
*.exe

# Ignore model files
*.pt
*.onnx
*pth
*.pth

# Ignore Python bytecode files
*.pyc

logs
env
venv
# Ignore audio files
*.wav
*.flac
*.mp3

# Ignore generated logs
logs/

# Ignore environment and virtual environment directories
env/
venv/

# Ignore cached files
.cache/

# Ignore specific project directories
/tracks/
/lyrics/
77 changes: 77 additions & 0 deletions README.md
@@ -10,6 +10,7 @@
2. [Getting Started](#getting-started)
- [Inference](#inference)
- [Training](#training)
- [Audio Separator](#audio-separator)
- [Additional Features](#additional-features)
3. [API](#api)
4. [Credits](#credits)
@@ -202,6 +203,81 @@ python main.py index --model_name "model_name" --rvc_version "rvc_version"

_Refer to `python main.py index -h` for additional help._

### Audio Separator

```bash
python audio_separator.py [audio_file] [options]
```

#### Info and Debugging

| Parameter Name | Required | Default | Valid Options | Description |
| --------------------- | -------- | ------- | ------------------------- | ---------------------------------------------------------------------- |
| `audio_file` | Yes | None | Any valid audio file path | The path to the audio file you want to separate, in any common format. |
| `-d`, `--debug` | No | False | | Enable debug logging. |
| `-e`, `--env_info` | No | False | | Print environment information and exit. |
| `-l`, `--list_models` | No | False | | List all supported models and exit. |
| `--log_level` | No | info | info, debug, warning | Log level. |

#### Separation I/O Params

| Parameter Name | Required | Default | Valid Options | Description |
| ------------------------ | -------- | ---------------------------- | ------------------------- | ---------------------------------- |
| `-m`, `--model_filename` | No | UVR-MDX-NET-Inst_HQ_3.onnx | Any valid model file path | Model to use for separation. |
| `--output_format` | No | WAV | Any common audio format | Output format for separated files. |
| `--output_dir` | No | None | Any valid directory path | Directory to write output files. |
| `--model_file_dir` | No | /tmp/audio-separator-models/ | Any valid directory path | Model files directory. |
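
When the separator is driven from another script, the I/O flags in this table can be assembled programmatically. The sketch below is illustrative only: `build_separator_command` is a hypothetical helper that is not part of this repository, and the file names `song.wav` and `separated` are placeholders. It relies only on the flag names documented above and omits any option left unset, so the script's own defaults apply.

```python
import shlex


def build_separator_command(audio_file, model_filename=None, output_format=None,
                            output_dir=None, model_file_dir=None):
    """Assemble an argv list for audio_separator.py (hypothetical helper).

    Options left as None are omitted so the script falls back to its defaults.
    """
    command = ["python", "audio_separator.py", audio_file]
    options = {
        "--model_filename": model_filename,
        "--output_format": output_format,
        "--output_dir": output_dir,
        "--model_file_dir": model_file_dir,
    }
    for flag, value in options.items():
        if value is not None:
            command += [flag, str(value)]
    return command


command = build_separator_command(
    "song.wav",                                   # placeholder input path
    model_filename="UVR-MDX-NET-Inst_HQ_3.onnx",  # documented default model
    output_format="FLAC",
    output_dir="separated",                       # placeholder output directory
)
# shlex.join quotes arguments safely for display or logging.
print(shlex.join(command))
```

The resulting list can be passed directly to `subprocess.run(command, check=True)`; passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.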

#### Common Separation Parameters

| Parameter Name | Required | Default | Valid Options | Description |
| ----------------- | -------- | ------- | ------------------------------------------------------- | ---------------------------------------------------------- |
| `--invert_spect` | No | False | | Invert secondary stem using spectrogram. |
| `--normalization` | No | 0.9 | Any float value | Max peak amplitude to normalize input and output audio to. |
| `--single_stem` | No | None | Instrumental, Vocals, Drums, Bass, Guitar, Piano, Other | Output only a single stem. |
| `--sample_rate` | No | 44100 | Any integer value | Modify the sample rate of the output audio. |

#### MDXC Architecture Parameters

| Parameter Name | Required | Default | Valid Options | Description |
| ------------------------------- | -------- | ------- | ----------------- | ----------------------------------------------------------------------------------------------- |
| `--mdxc_segment_size` | No | 256 | Any integer value | Size of segments for MDXC architecture. |
| `--mdxc_use_model_segment_size` | No | False | | Use model default segment size instead of the value from the config file for MDXC architecture. |
| `--mdxc_overlap` | No | 8 | 2 to 50 | Amount of overlap between prediction windows for MDXC architecture. |
| `--mdxc_batch_size` | No | 1 | Any integer value | Batch size for MDXC architecture. |
| `--mdxc_pitch_shift` | No | 0 | Any integer value | Shift audio pitch by a number of semitones while processing for MDXC architecture. |

#### MDX Architecture Parameters

| Parameter Name | Required | Default | Valid Options | Description |
| ---------------------- | -------- | ------- | ----------------- | ------------------------------------------------------------------ |
| `--mdx_segment_size` | No | 256 | Any integer value | Size of segments for MDX architecture. |
| `--mdx_overlap` | No | 0.25 | 0.001 to 0.999 | Amount of overlap between prediction windows for MDX architecture. |
| `--mdx_batch_size` | No | 1 | Any integer value | Batch size for MDX architecture. |
| `--mdx_hop_length` | No | 1024 | Any integer value | Hop length for MDX architecture. |
| `--mdx_enable_denoise` | No | False | | Enable denoising during separation for MDX architecture. |
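
The overlap fraction can be read as how much each prediction window shares with the previous one. The sketch below shows a generic sliding-window scheme in which the step between windows is `segment_length * (1 - overlap)`. It is not taken from the separator's source: `window_starts` is a hypothetical helper, the lengths are in arbitrary units, and the actual implementation may pad or blend windows differently.

```python
def window_starts(total_length, segment_length, overlap):
    """Return start offsets of windows that overlap by the given fraction."""
    if not 0.0 < overlap < 1.0:
        raise ValueError("overlap must be strictly between 0 and 1")
    # A larger overlap means a smaller step, so more windows are processed.
    step = max(1, int(segment_length * (1.0 - overlap)))
    return list(range(0, total_length, step))


starts = window_starts(total_length=1000, segment_length=400, overlap=0.25)
print(starts)  # → [0, 300, 600, 900]
```

Under this scheme, raising the overlap increases the number of windows and therefore the processing time. Windows that run past the end of the signal would need padding.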

#### Demucs Architecture Parameters

| Parameter Name | Required | Default | Valid Options | Description |
| --------------------------- | -------- | ------- | ----------------- | ----------------------------------------------------------------- |
| `--demucs_segment_size` | No | Default | Any integer value | Size of segments for Demucs architecture. |
| `--demucs_shifts` | No | 2 | Any integer value | Number of predictions with random shifts for Demucs architecture. |
| `--demucs_overlap` | No | 0.25 | 0.001 to 0.999 | Overlap between prediction windows for Demucs architecture. |
| `--demucs_segments_enabled` | No | True | | Enable segment-wise processing for Demucs architecture. |

#### VR Architecture Parameters

| Parameter Name | Required | Default | Valid Options | Description |
| ----------------------------- | -------- | ------- | ----------------- | --------------------------------------------------------------------- |
| `--vr_batch_size` | No | 4 | Any integer value | Batch size for VR architecture. |
| `--vr_window_size` | No | 512 | Any integer value | Window size for VR architecture. |
| `--vr_aggression` | No | 5 | -100 to 100 | Intensity of primary stem extraction for VR architecture. |
| `--vr_enable_tta` | No | False | | Enable Test-Time-Augmentation for VR architecture. |
| `--vr_high_end_process` | No | False | | Mirror the missing frequency range of the output for VR architecture. |
| `--vr_enable_post_process` | No | False | | Identify leftover artifacts within vocal output for VR architecture. |
| `--vr_post_process_threshold` | No | 0.2 | 0.1 to 0.3 | Threshold for post-process feature for VR architecture. |
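
Several flags in the tables above list a bounded range under Valid Options. As a hedged illustration of how such ranges could be enforced with the standard `argparse` module, the sketch below defines a hypothetical `bounded` type factory. Only the flag names, defaults, and ranges come from the tables; this is not the parser used by `audio_separator.py`.

```python
import argparse


def bounded(cast, low, high):
    """Build an argparse type that casts a value and enforces low <= value <= high."""
    def parse(text):
        value = cast(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(f"{value} is outside [{low}, {high}]")
        return value
    return parse


# Hypothetical parser covering only the range-limited flags documented above.
parser = argparse.ArgumentParser(description="Range checks for documented flags")
parser.add_argument("--vr_aggression", type=bounded(int, -100, 100), default=5)
parser.add_argument("--vr_post_process_threshold", type=bounded(float, 0.1, 0.3), default=0.2)
parser.add_argument("--mdx_overlap", type=bounded(float, 0.001, 0.999), default=0.25)
parser.add_argument("--mdxc_overlap", type=bounded(int, 2, 50), default=8)

args = parser.parse_args(["--vr_aggression", "10", "--mdx_overlap", "0.5"])
print(args.vr_aggression, args.mdx_overlap, args.mdxc_overlap)
```

With this approach an out-of-range value such as `--vr_aggression 500` makes `argparse` print an error and exit before any processing starts.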

### Additional Features

#### Model Extract
@@ -325,6 +401,7 @@ The RVC CLI builds upon the foundations of the following projects:
- [Gradio](https://github.com/gradio-app/gradio) by gradio-app
- [FFmpeg](https://github.com/FFmpeg/FFmpeg) by FFmpeg
- [audio-slicer](https://github.com/openvpi/audio-slicer) by openvpi
- [python-audio-separator](https://github.com/karaokenerds/python-audio-separator) by karaokenerds
- [VITS](https://github.com/jaywalnut310/vits) by jaywalnut310
- [RMVPE](https://github.com/Dream-High/RMVPE) by Dream-High
- [FCPE](https://github.com/CNChTu/FCPE) by CNChTu