Add Delightful-TTS model #2095

loganhart02 · 2022-10-25T08:48:22Z

model implementation from: https://arxiv.org/pdf/2110.12612.pdf

erogol

We must test the model the way we test vits.py at the very least; testing individual layers would be even better.

TTS/tts/layers/delightful_tts/conformer.py

erogol · 2022-10-29T10:44:44Z

TTS/tts/layers/delightful_tts/encoders.py

+        encoding: torch.Tensor,
+    ) -> torch.Tensor:
+        """
+        x --- [N, seq_len, encoder_embedding_dim]


These shape docstrings also need to be reformatted like those in the other models so they are compatible with our documentation.
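As a rough illustration of the kind of layout being requested, the sketch below documents tensor shapes in a dedicated `Shapes:` section of the docstring. The function `add_positional_encoding` and its arguments are hypothetical stand-ins, not code from this PR, and the exact section layout is an assumption that should be checked against the other models in the repo.

```python
import torch


def add_positional_encoding(x: torch.Tensor, encoding: torch.Tensor) -> torch.Tensor:
    """Add a positional encoding to a batch of encoder inputs.

    Hypothetical example; it only illustrates the docstring layout.

    Args:
        x (torch.Tensor): Encoder input features.
        encoding (torch.Tensor): Positional encoding, broadcast over the batch.

    Returns:
        torch.Tensor: Input with the positional encoding added.

    Shapes:
        - x: :math:`[B, T, C]`
        - encoding: :math:`[1, T, C]`
        - output: :math:`[B, T, C]`
    """
    return x + encoding


# Toy tensors: batch of 2, sequence length 5, embedding dim 8.
x = torch.zeros(2, 5, 8)
encoding = torch.ones(1, 5, 8)
out = add_positional_encoding(x, encoding)
```

Naming the batch, time, and channel axes once (`B`, `T`, `C`) and reusing them in every docstring keeps the shapes comparable across layers.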

TTS/tts/layers/delightful_tts/acoustic_model.py

loganhart02 · 2022-11-01T19:50:33Z

@erogol I know the most recent push of code works, and it is currently training a model. After I confirm it converges, I'll clean up the code and write the docs for the model.

loganhart02 · 2022-11-30T13:06:36Z

@erogol I'm working on fixing a bug in the unit tests, but the model code is ready for review.

erogol · 2022-12-20T15:12:03Z

TTS/tts/configs/delightful_tts_config.py

+
+@dataclass
+class DelightfulTTSConfig(BaseTTSConfig):
+


Consider writing docstrings for the config arguments. It'd help you understand the architecture better.
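A minimal sketch of what documented config arguments could look like is shown below. It uses a plain dataclass rather than `BaseTTSConfig` so that it runs on its own, and every field name and default here is hypothetical, chosen only to show the layout.

```python
from dataclasses import dataclass


@dataclass
class ExampleTTSConfig:
    """Illustrative config; the fields are hypothetical, not the PR's.

    Args:
        n_hidden (int):
            Hidden dimension shared by the encoder and decoder. Defaults to 384.
        n_layers (int):
            Number of conformer blocks per stack. Defaults to 6.
        dropout (float):
            Dropout probability applied inside each block. Defaults to 0.1.
    """

    n_hidden: int = 384
    n_layers: int = 6
    dropout: float = 0.1


# Override one documented argument; the others keep their defaults.
config = ExampleTTSConfig(n_layers=4)
```

Stating the type, the role in the architecture, and the default for each argument is what makes the docstring useful as a map of the model.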

TTS/tts/datasets/dataset.py

TTS/tts/layers/delightful_tts/acoustic_model.py

erogol · 2022-12-20T15:18:32Z

TTS/tts/layers/delightful_tts/acoustic_model.py

+        encoder_outputs_res = encoder_outputs
+
+        # Pitch predictor
+        pitch_pred, avg_pitch_target, pitch_emb = self.pitch_adaptor.get_pitch_embedding_train(


Do we normalize the ground truth pitch somewhere?
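The thread does not record an answer to this question. For reference, one common way to normalize ground-truth pitch is a z-score computed over voiced frames only. The sketch below is an assumption about what such a step could look like, not a description of what this PR does; `normalize_pitch` and the toy F0 values are hypothetical.

```python
from statistics import mean, pstdev


def normalize_pitch(f0, f0_mean, f0_std):
    """Z-score voiced frames; unvoiced frames (f0 == 0) stay at 0."""
    return [(v - f0_mean) / f0_std if v > 0 else 0.0 for v in f0]


# Toy F0 contour in Hz; zeros mark unvoiced frames.
f0 = [0.0, 100.0, 120.0, 140.0, 0.0]
voiced = [v for v in f0 if v > 0]

# Statistics come from voiced frames only, so silence does not skew them.
f0_mean = mean(voiced)
f0_std = pstdev(voiced)

normalized = normalize_pitch(f0, f0_mean, f0_std)
```

In practice the mean and standard deviation would be computed once over the training set (or per speaker) and reused at inference time to map predictions back to Hz.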

TTS/tts/layers/delightful_tts/variance_predictor.py

TTS/tts/models/delightful_tts.py

TTS/tts/utils/emotions.py

erogol · 2022-12-20T15:36:21Z

tests/tts_tests/test_delightful_tts_layers.py

@@ -0,0 +1,89 @@
+import torch


You need to add the gradient pass test, as we discussed before.
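A gradient pass test of the kind requested could be sketched as follows: run one forward and backward pass, then check that every trainable parameter received a finite gradient. `TinyLayer` is a hypothetical stand-in for a real layer from this PR, and `params_with_bad_grad` is an illustrative helper, not an existing utility in the repo.

```python
import torch
from torch import nn


class TinyLayer(nn.Module):
    """Hypothetical stand-in for a layer under test."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.inp = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(torch.tanh(self.inp(x)))


def params_with_bad_grad(module: nn.Module, inputs: torch.Tensor) -> list:
    """Run one forward/backward pass; list params with missing or non-finite grads."""
    module.zero_grad()
    loss = module(inputs).pow(2).mean()
    loss.backward()
    return [
        name
        for name, p in module.named_parameters()
        if p.requires_grad and (p.grad is None or not torch.isfinite(p.grad).all())
    ]


torch.manual_seed(0)
# Input shape [B, T, C] = [2, 5, 8]; an empty result means gradients reached every parameter.
bad = params_with_bad_grad(TinyLayer(), torch.randn(2, 5, 8))
```

A parameter that shows up in the returned list is either disconnected from the loss or produced a NaN/Inf gradient, which is what this kind of test is meant to catch early.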

stale · 2023-01-20T15:36:57Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look our discussion channels.

iamkhalidbashir · 2023-02-23T12:30:12Z

Any idea when this will be merged? And will it have a pre-trained model?

iamkhalidbashir · 2023-03-14T08:10:37Z

Is this PR for Delightful TTS 1 or Delightful TTS 2 (https://arxiv.org/abs/2207.04646)?

stale · 2023-05-12T20:32:23Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look our discussion channels.

erogol · 2023-05-14T10:38:44Z

@loganhart420 let's wrap up this PR

iamkhalidbashir · 2023-05-14T10:39:41Z

Would we have a trained model?

loganhart02 · 2023-05-14T13:15:03Z

> @loganhart420 let's wrap up this PR

Doing it now. Should I just put the pretrained weights in a draft release?

loganhart02 · 2023-05-14T13:15:17Z

> Would we have a trained model?

Yeah.

loganhart02 · 2023-05-14T13:15:54Z

> Is this PR for Delightful TTS 1 or Delightful TTS 2 (https://arxiv.org/abs/2207.04646)?

1

iamkhalidbashir · 2023-05-14T13:16:01Z

Awesome!

* add configs
* Update config file
* Add model configs
* Add model layers
* Add layer files
* Add layer modules
* change config names
* Add emotion manager
* fIX missing ap bug
* Fix missing ap bug
* Add base TTS e2e class
* Fix wrong variable name in load_tts_samples
* Add training script
* Remove range predictor and gaussian upsampling
* Add helper function
* Add vctk recipe
* Add conformer docs
* Fix linting in conformer.py
* Add Docs
* remove duplicate import
* refactor args
* Fix bugs
* Removew emotion embedding
* remove unused arg
* Remove emotion embedding arg
* Remove emotion embedding arg
* fix style issues
* Fix bugs
* Fix bugs
* Add unittests
* make style
* fix formatter bug
* fix test
* Add pyworld compute pitch func
* Update requirments.txt
* Fix dataset Bug
* Chnge layer norm to instance norm
* Add missing import
* Remove emotions.py
* remove ssim loss
* Add init layers func to aligner
* refactor model layers
* remove audio_config arg
* Rename loss func
* Rename to delightful-tts
* Rename loss func
* Remove unused modules
* refactor imports
* replace audio config with audio processor
* Add change sample rate option
* remove broken resample func
* update recipe
* fix style, add config docs
* fix tests and multispeaker embd dim
* remove pyworld
* Make style and fix inference
* Split tts tests
* Fixup
* Fixup
* Fixup
* Add argument names
* Set "random" speaker in the model Tortoise/Bark
* Use a diff f0_cache path for delightfull tts
* Fix delightful speaker handling
* Fix lint
* Make style

---------

Co-authored-by: loganhart420 <[email protected]>
Co-authored-by: Eren Gölge <[email protected]>

loganhart02 added the model implementation label Oct 25, 2022

loganhart02 requested a review from erogol October 25, 2022 08:48

loganhart02 self-assigned this Oct 25, 2022

erogol requested changes Oct 29, 2022

erogol approved these changes Nov 1, 2022

erogol approved these changes Nov 3, 2022

loganhart02 mentioned this pull request Nov 4, 2022

Delightful TTS implementation #1715

Closed

erogol approved these changes Nov 7, 2022

loganhart02 marked this pull request as ready for review November 30, 2022 13:05

erogol requested changes Dec 20, 2022

stale bot added the wontfix label Jan 20, 2023

stale bot closed this Jan 27, 2023

loganhart02 reopened this Jan 27, 2023

stale bot removed the wontfix label Jan 27, 2023

erogol approved these changes Feb 13, 2023

erogol approved these changes Feb 23, 2023

stale bot added the wontfix label May 12, 2023

stale bot removed the wontfix label May 14, 2023

loganhart02 and others added 15 commits July 6, 2023 11:27

refactor model layers

f9c80a6

remove audio_config arg

658bd79

Rename loss func

759df28

Rename to delightful-tts

cd03d67

Rename loss func

0dd3aef

Remove unused modules

7b934e4

refactor imports

6160cd2

replace audio config with audio processor

378370a

Add change sample rate option

ced8f34

remove broken resample func

7a8b825

update recipe

156557c

fix style, add config docs

cfece08

fix tests and multispeaker embd dim

21dad7a

remove pyworld

03007a5

Make style and fix inference

a026cfc

erogol force-pushed the delightful-tts branch from b05da58 to a026cfc July 6, 2023 09:28

erogol and others added 11 commits July 7, 2023 20:44

Split tts tests

c49a418

Fixup

96841c6

Fixup

09a2424

Fixup

4a287c1

Add argument names

2abb754

Set "random" speaker in the model Tortoise/Bark

1362cb1

Use a diff f0_cache path for delightfull tts

1fe6a53

Fix delightful speaker handling

6349950

Fix lint

dd093e0

Make style

3fde149

Merge branch 'dev' into delightful-tts

b5bf9e6

erogol merged commit 6fdb88f into dev Jul 24, 2023
38 of 44 checks passed

erogol deleted the delightful-tts branch July 24, 2023 11:41
