Vocal Concepts

Paralanguage

Metacommunication that modifies the meaning of what is said through prosody, pitch, volume, and intonation

Loudness

Subjective perception of sound pressure.

Sound pressure is measured in pascals (Pa); sound pressure level is expressed in decibels (dB) relative to a reference pressure. https://en.wikipedia.org/wiki/Sound_pressure#Sound_pressure_level
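To make the relation between Pa and dB concrete, here is a minimal sketch of the standard sound pressure level formula, SPL = 20 * log10(p / p_ref). The function name is made up for these notes, and the reference pressure of 20 µPa is the conventional value for sound in air.

```python
import math

# Conventional reference sound pressure in air: 20 micropascals.
P_REF = 20e-6


def sound_pressure_level(p_rms: float, p_ref: float = P_REF) -> float:
    """Return the sound pressure level in dB for an RMS pressure given in Pa."""
    if p_rms <= 0 or p_ref <= 0:
        raise ValueError("pressures must be positive")
    return 20 * math.log10(p_rms / p_ref)


print(round(sound_pressure_level(1.0), 1))    # 1 Pa is about 94.0 dB
print(round(sound_pressure_level(20e-6), 1))  # the reference pressure is 0.0 dB
```

Note that this gives a physical level, not loudness itself: loudness is the subjective impression and also depends on frequency and the listener.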

Prosody

Study of the elements of speech that are not individual phonetic segments (vowels, consonants); they are properties of syllables.

Attributes of Prosody

Speech Tempo

Number of speech units within a given amount of time

Most commonly measured in words per minute (WPM); also measured in sounds per second

9.4 sounds per second for poetry reading
13.83 sounds per second for sports commentary
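Both measures are simple ratios of a unit count to elapsed time. A minimal sketch, assuming a transcript and its duration are already available; the function names are made up for these notes, and counting words by splitting on whitespace is a simplification.

```python
def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Speech tempo in words per minute, counting whitespace-separated words."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return len(transcript.split()) * 60.0 / duration_seconds


def sounds_per_second(sound_count: int, duration_seconds: float) -> float:
    """Speech tempo in sounds per second, given a pre-counted number of sounds."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return sound_count / duration_seconds


# Hypothetical example: a 6-word utterance spoken in 3 seconds.
print(words_per_minute("I didn't take the test yesterday", 3.0))  # 120.0

# Hypothetical example: 94 sounds counted in a 10-second clip.
print(sounds_per_second(94, 10.0))  # 9.4
```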

Intonation

Variation in pitch to indicate the speaker's attitude and emotions

Isochrony or Rhythm

A language feature: the rhythmic division of time into equal portions by a language.

"French, Telugu and Yoruba ... are syllable-timed languages, ... English, Russian and Arabic ... are stress-timed languages."

Three ways a language can divide time:

Syllable timed: French, Italian, Spanish, Romanian, Brazilian Portuguese, Icelandic, Singlish, Cantonese, Mandarin Chinese, Armenian, Turkish and Korean
Mora timed (a mora is a timing unit equal to or shorter than a syllable): Japanese, Gilbertese, Slovak and Ganda
Stress timed: English, Thai, Lao, German, Russian, Danish, Swedish, Norwegian, Faroese, Dutch, European Portuguese

Stress

Means of making a syllable, a word, or part of a sentence prominent

Prosodic Stress

Emphasizing words or ideas

*I* didn't take the test yesterday. (Somebody else did.)
I *didn't* take the test yesterday. (I did not take it.)
I didn't *take* the test yesterday. (I did something else with it.)
I didn't take *the* test yesterday. (I took one of several, or I didn't take the specific test that would have been implied.)
I didn't take the *test* yesterday. (I took something else.)
I didn't take the test *yesterday*. (I took it some other day.)

Stress can be realized by:

Pitch variation
Increased duration
Increased Loudness
Timbre differences???

Pause

Interruption of sound. Can convey hesitation or importance.

Filled Pauses (eh, uh)
Paralinguistic pauses (sighs)
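One common way to locate silent pauses in a recording is to look for runs of low-energy frames. A minimal sketch, assuming per-frame energies have already been computed from the audio; the function name, the 10 ms frame length, the energy threshold, and the 200 ms minimum pause length are all illustrative assumptions, not values from the sources above. Filled pauses (eh, uh) and sighs carry energy, so this approach would not find them.

```python
from typing import List, Sequence, Tuple


def find_silent_pauses(
    frame_energies: Sequence[float],
    frame_ms: int = 10,         # assumed frame length in milliseconds
    threshold: float = 0.01,    # assumed energy level below which a frame is silent
    min_pause_ms: int = 200,    # assumed shortest silence that counts as a pause
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) spans where energy stays below the threshold."""
    pauses: List[Tuple[int, int]] = []
    start = None  # index of the first frame in the current silent run
    for i, energy in enumerate(frame_energies):
        if energy < threshold:
            if start is None:
                start = i
        elif start is not None:
            # The silent run ended at frame i; keep it only if long enough.
            if (i - start) * frame_ms >= min_pause_ms:
                pauses.append((start * frame_ms, i * frame_ms))
            start = None
    # Handle a silent run that reaches the end of the signal.
    if start is not None and (len(frame_energies) - start) * frame_ms >= min_pause_ms:
        pauses.append((start * frame_ms, len(frame_energies) * frame_ms))
    return pauses


# Hypothetical energies: speech, a 300 ms silence, speech, a 50 ms gap, speech.
energies = [0.5] * 10 + [0.0] * 30 + [0.5] * 10 + [0.0] * 5 + [0.5] * 5
print(find_silent_pauses(energies))  # [(100, 400)]
```

The 50 ms gap is dropped because it is shorter than the minimum pause length; very short silences often come from stop consonants rather than from pausing.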

Chunking

Pattern of pausing or lack of pausing:

"You know what I mean?" - "No wada meeen?"
"y lo sabes" - "ylosaes"

Timbre

Perceived sound quality; unique to each voice

Interesting Info

Singer Identity Representation

Singer Identity Representation Learning using Self-Supervised Techniques

@inproceedings{torres2023singer,
  title={Singer Identity Representation Learning using Self-Supervised Techniques},
  author={Torres, Bernardo and Lattner, Stefan and Richard, Gael},
  booktitle={International Society for Music Information Retrieval Conference (ISMIR 2023)},
  year={2023}
}

Speaker Identification

https://www.researchgate.net/publication/360961643_Speaker_Identification_using_Speech_Recognition

It wasn't clear which features are extracted to identify speakers. Found sample audio: a large-scale (1000 hours) corpus of read English speech.

Read further

https://en.wikipedia.org/wiki/Psychoacoustics
