Replies: 1 comment
-
I think you are using the master version of openai/whisper, which changed the behavior of [...]. We prepared the same change in #225, but I wanted to wait for OpenAI to cut a new version before merging it. If you use this branch, you should get the same transcription as openai-whisper. However, note that faster-whisper has a more robust way to deal with silence with the [...] argument.
-
I've been working with an audio clip in Spanish that contains speech along with long silent segments. The faster-whisper transcription fails when 'no_speech_threshold' is set to 0.6, its default value. I've tested large-v2 and medium and observed similar behavior. However, if you increase it to 1.0, the transcription is correct.
I've compared the results with Whisper, and its transcription is correct using the default values.
If anyone wants to play with it, you could download the audio clip from here:
https://www.dropbox.com/s/1r4cjvdkqcc0nr1/test_audio.wav?dl=0
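For anyone unfamiliar with what raising the threshold to 1.0 does: the sketch below illustrates the no-speech gating rule as it works in openai/whisper's transcribe loop (a simplified assumption, not the exact faster-whisper source). A segment is dropped as silence only when its `no_speech_prob` exceeds `no_speech_threshold` while its average log-probability is also below `logprob_threshold`, so a threshold of 1.0 effectively disables the gate.

```python
# Simplified sketch of Whisper-style no-speech gating (an assumption
# based on openai/whisper's transcribe loop, not faster-whisper's
# exact implementation).

def keep_segment(no_speech_prob: float,
                 avg_logprob: float,
                 no_speech_threshold: float = 0.6,
                 logprob_threshold: float = -1.0) -> bool:
    """Return True if the decoded segment should be kept.

    A segment is treated as silence (and skipped) only when BOTH
    conditions hold: the model is confident there is no speech, and
    the decoded text itself is low-confidence.
    """
    is_silent = (no_speech_prob > no_speech_threshold
                 and avg_logprob < logprob_threshold)
    return not is_silent

# A borderline segment (no_speech_prob 0.7, weak avg_logprob -1.5)
# is dropped at the default threshold of 0.6:
print(keep_segment(0.7, -1.5))                            # False
# Raising no_speech_threshold to 1.0 keeps it:
print(keep_segment(0.7, -1.5, no_speech_threshold=1.0))   # True
```

This is why setting `no_speech_threshold=1.0` recovers the transcription here: since `no_speech_prob` can never exceed 1.0, no segment is ever classified as silence, at the cost of potentially transcribing noise during the genuinely silent stretches.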