NeMo/examples/tts/conf/hifigan/model/generator/v2.yaml at 7d8b7d679a2a4c9d3a6b2d8def8888e53c804ded - maxmustermann/NeMo - tilera Git

Felix Kreuk 7d8b7d679a

HifiGAN MelSpectrogram Vocoder Model (#1706)

* HifiGAN: initial commit

Signed-off-by: Felix Kreuk <felixkreuk@gmail.com>

* switched to manual optimization, added grads to FilterBanks object,
added v1,v2,v3 configurations for the generator
added multiple audio/spec examples in wandb

Signed-off-by: Felix Kreuk <felixkreuk@gmail.com>

* added trg_mel_fn for different mel params in L1 loss

Signed-off-by: Felix Kreuk <felixkreuk@gmail.com>

* fixed ci checks

Signed-off-by: Felix Kreuk <felixkreuk@gmail.com>

* moved losses to separate classes, fixed exp_manager in yaml, style fixes

Signed-off-by: Felix Kreuk <felixkreuk@gmail.com>

* fixed `min_duration` in yamls of MelGAN and HifiGAN

Signed-off-by: Felix Kreuk <felixkreuk@gmail.com>

* fixed f_max in loss calculation, added docs

Signed-off-by: Felix Kreuk <felixkreuk@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>

2021-02-16 16:56:29 -05:00

9 lines

272 B

YAML

# @package _group_
_target_: nemo.collections.tts.modules.hifigan_modules.Generator
resblock: 1
upsample_rates: [8, 8, 2, 2]
upsample_kernel_sizes: [16, 16, 4, 4]
upsample_initial_channel: 128
resblock_kernel_sizes: [3, 7, 11]
resblock_dilation_sizes: [[1, 3, 5], [1, 3, 5], [1, 3, 5]]
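
The upsampling values above determine how many audio samples the generator emits per mel frame. The following Python sketch works through that arithmetic for this v2 configuration. It rests on two assumptions taken from the reference HiFi-GAN design and not confirmed by this file: each transposed convolution uses padding (kernel - stride) // 2, and the channel count halves at every upsampling stage. The helper names are hypothetical and are not part of NeMo.

```python
from math import prod

# Values copied from the v2 generator config above.
upsample_rates = [8, 8, 2, 2]
upsample_kernel_sizes = [16, 16, 4, 4]
upsample_initial_channel = 128


def transposed_conv_out_len(length, stride, kernel):
    """Output length of a 1-D transposed convolution.

    Assumption: padding = (kernel - stride) // 2, as in the reference
    HiFi-GAN generator. The NeMo module may differ.
    """
    padding = (kernel - stride) // 2
    return (length - 1) * stride - 2 * padding + kernel


def output_samples(n_frames):
    """Audio samples produced for n_frames mel frames (hypothetical helper)."""
    length = n_frames
    for stride, kernel in zip(upsample_rates, upsample_kernel_sizes):
        length = transposed_conv_out_len(length, stride, kernel)
    return length


# Overall upsampling factor: the product of the per-stage rates.
total_upsampling = prod(upsample_rates)

# Assumption: channels halve after each upsampling stage.
channels = [upsample_initial_channel // (2 ** i)
            for i in range(len(upsample_rates) + 1)]

print("samples per mel frame:", total_upsampling)
print("channels per stage:", channels)
print("samples for 10 frames:", output_samples(10))
```

Under these assumptions, every kernel size is twice its stride, so each stage multiplies the sequence length by exactly its rate. The product of the rates would then need to match the hop length of the mel spectrogram used as input.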