codename0og/SmartCutter

SmartCutter

👇 You can join my Discord server below (RVC / AI audio friendly) 👇

Codename's Sanctuary

👆 To stay up to date with advancements, hang out or get support 👆

A lil bit more about the project:

Machine Learning based silence-truncation.
Made with Applio / RVC and my Codename-RVC-Fork-4 in mind. ✨

Features:

  • Automatically truncates silences, whether clean or dirty / noisy (within limits; it's not a noise-gate trimmer after all),
    while trying to keep spacings roughly consistent at ~100 ms (some deviations are present and expected).
  • Respects zero-crossing boundaries.
  • Respects breathing (hopefully; no promises if breaths are too quiet or too noise-like).
  • Doesn't damage word tails or inter-phonetic gaps (unlike gating).
  • Truncated areas are automatically replaced by pure silence (in case of noise contamination between words or sentences).
  • No user input is needed to adjust any params or values. Everything is handled automatically.
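
To make "zero-crossing boundaries" and "replaced by pure silence" a bit more concrete, here is a minimal sketch. It is NOT SmartCutter's actual code: the real tool finds silences with an ML model, while this sketch assumes the silent span is already known. The function names, the `max_search` window and the plain-list audio format are all assumptions made for illustration.

```python
def nearest_zero_crossing(samples, idx, max_search=256):
    """Return the index closest to idx where the signal crosses (or touches) zero."""
    n = len(samples)
    for offset in range(max_search + 1):
        for i in (idx - offset, idx + offset):
            if 1 <= i < n and (
                samples[i] == 0 or (samples[i - 1] < 0) != (samples[i] < 0)
            ):
                return i
    return idx  # no crossing nearby; fall back to the requested index


def truncate_gap(samples, start, end, sample_rate, keep_ms=100):
    """Replace the span [start, end) with keep_ms of pure digital silence.

    Hypothetical helper: start/end are assumed to come from some detector.
    """
    keep = sample_rate * keep_ms // 1000
    # Snap both cut points to zero crossings to avoid clicks at the joins.
    start = nearest_zero_crossing(samples, start)
    end = nearest_zero_crossing(samples, end)
    if end - start <= keep:
        return list(samples)  # gaps at or below the target spacing stay untouched
    # Drop the (possibly noisy) gap and insert clean zeros instead.
    return list(samples[:start]) + [0.0] * keep + list(samples[end:])
```

The point of cutting at zero crossings is that the waveform is at (or near) zero amplitude on both sides of the join, so the inserted silence doesn't introduce an audible click.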

Scenarios it handles very reliably:

image

Scenarios it might fail or the reliability is uncertain:

1. image

2. image

Therefore, for such "hard cases" (1, 2), spectral de-noising (or gating, if you're careful) is recommended.


⚠️IMPORTANT⚠️

  • For now, only CUDA (NVIDIA) or CPU is supported.
  • Supported sample rates: 32, 40 and 48 kHz.
  • Silence / sub-silence (noisy) spacings below 100 ms are ignored / not processed, by design.
  • There are limits: it is still a truncator focused on very-low-noise or pure silence (so keep in mind the models might hiccup on some really hard cases).
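
If you want to sanity-check a file before feeding it in, something like the sketch below could help. It is only an illustration of the constraints listed above, not part of SmartCutter: the helper names are made up, and the standard-library `wave` module used here only reads PCM `.wav` files (not `.flac`).

```python
import wave

SUPPORTED_SAMPLE_RATES = (32000, 40000, 48000)  # 32, 40 and 48 kHz, per the list above
MIN_GAP_MS = 100  # spacings below this are ignored by design


def min_gap_samples(sample_rate):
    """Number of samples in the 100 ms minimum spacing at a given rate."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate} Hz")
    return sample_rate * MIN_GAP_MS // 1000


def wav_sample_rate(path):
    """Read the sample rate from a PCM WAV header (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()
```

For example, at 40 kHz the 100 ms threshold works out to 4000 samples, so any gap shorter than that would be left alone.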

✨ to-do list ✨

  • Better pretrained models. Done.
  • Eventually (potentially) move over to the v5 arch ~ a bigger dataset is needed.

💡 Ideas / concepts 💡

  • Currently none. Open to your ideas ~

❗ For contact, please join my Discord server ❗


Getting Started:

INSTALLATION:

Run the installation script:

  • Double-click install.bat.

PRETRAINED MODELS:

INFERENCE:

To start inference:

  • First, put the concatenated sample or samples (.wav or .flac) into the "infer_input" dir.
  • Double-click run-infer.bat.
  • Results will land in the "infer_output" dir.
    (Concatenated = simply join all samples / segments into one file.)

    NOTE: Multiple samples and multiple sample rates are supported.
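
If your dataset is still split into many small segments, one way to concatenate them is sketched below. This is just a standard-library example, not something shipped with SmartCutter: `concatenate_wavs` is a hypothetical helper, and it assumes all inputs are PCM `.wav` files with the same channel count, bit depth and sample rate (`.flac` would need a third-party library).

```python
import wave


def concatenate_wavs(input_paths, output_path):
    """Join several PCM WAV files end to end into a single file."""
    params = None
    chunks = []
    for path in input_paths:
        with wave.open(str(path), "rb") as src:
            current = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if params is None:
                params = current
            elif current != params:
                # Mixing formats in one file would corrupt the audio.
                raise ValueError(f"{path} has format {current}, expected {params}")
            chunks.append(src.readframes(src.getnframes()))
    if params is None:
        raise ValueError("no input files given")
    with wave.open(str(output_path), "wb") as dst:
        dst.setnchannels(params[0])
        dst.setsampwidth(params[1])
        dst.setframerate(params[2])
        for chunk in chunks:
            dst.writeframes(chunk)
    return output_path
```

Usage could look like `concatenate_wavs(sorted(Path("my_segments").glob("*.wav")), Path("infer_input") / "dataset.wav")` with `from pathlib import Path`, where `my_segments` and `dataset.wav` are placeholder names. Segments with different sample rates would go into separate concatenated files.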

TRAINING:

  • Training of custom pretrains is supported.
    Instructions will be published in the future.
