PY2507.2 ~ {Phonetics}

Source-Filter Theory (Fant, 1960)

describes speech production as a 2-stage process: a sound source is modified by the vocal tract [vt], acting as a filter, to produce distinct speech sounds

resonance freq

the freq at which a system vibrates most readily when it is stimulated [e.g. a pendulum: freq depends on length of string, so the longer the string the slower the movement]

resonance

the reinforcement or prolongation of sound by reflection from a surface or by the synchronous vibration of a neighbouring object

what do peaks represent

preferred freq responses of that system [resonant freq] + because these are diff for every sound, the vt is referred to as a filter

what happens with buzz like sound

taken through air-filled column to the lips and any freq components near these peaks [resonant freq] get boosted + pass through the system well

formants

resonances of vt + spaced uniformly in freq when the vocal tract has a constant width along its length [case for the Schwa vowel] + are numbered [from low to high freq] F1, F2, F3 etc
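
The uniform spacing of formants for a constant-width vt can be illustrated with the standard quarter-wavelength tube model. This is a minimal sketch, not something stated in these notes: it assumes an idealized tube closed at the glottis and open at the lips, a length of 17.5 cm and a speed of sound of 35,000 cm/s. The function name `uniform_tube_formants` is hypothetical.

```python
# Resonances of an idealized uniform tube, closed at one end (glottis)
# and open at the other (lips): F_n = (2n - 1) * c / (4 * L).
# Tube length and speed of sound are assumed values for illustration.

def uniform_tube_formants(length_cm, n_formants=3, speed_of_sound_cm_s=35000.0):
    """Return the first n resonant frequencies (Hz) of the tube."""
    return [(2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
            for n in range(1, n_formants + 1)]

formants = uniform_tube_formants(17.5)
print(formants)  # [500.0, 1500.0, 2500.0]
```

Under these assumptions F1, F2 and F3 are evenly spaced (1000 Hz apart), which matches the schwa case described above.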

what does changing the shape of vt do

changes position of formant freq

why is it called source filter theory

separate control of the 2 parts of the system: source [pitch: which is the buzzing from glottal source] and filter [vowel type]

how to change vowel identity

we have independent control of the 2 parts of the system; we can adjust the rate of vocal-fold vibration to change voice pitch + shape of vt to change vowel identity

what makes a spoken consonant diff from a vowel

all consonants involve narrowing or constriction of vt [somewhere along its length] + many consonants involve abrupt changes in vt shape [can't sing a consonant]

place of articulation

point of max constriction along the vocal tract [there are many places where that can happen]

frication [fricative noise]

if constriction is narrow enough to produce air turbulence, a hiss-like sound is produced + modified by shape of vt

plosion [plosive burst]

if airflow is completely blocked & abruptly released, a brief burst of noise is produced + modified by shape of vt

8 diff places of articulation

bilabial [ba] + labiodental [fa] + dental + alveolar [da, sa] + retroflex + palato-alveolar + palatal + velar [ga]

3 characteristics that classify consonants

place of articulation + manner of articulation + voicing

cross-speaker variation

refers to the diff in acoustic characteristics, such as speaking rate, intensity, & affect, that exist between diff speakers, which can pose challenges for technologies like Automatic Speech Recognition

vocal tract length

measured from the glottis to the lips; a crucial factor in speech production + the vt is where sound produced at the larynx (or syrinx in birds) is filtered and shaped into speech

what does length of vt indicate

varies considerably between men + women: the longer the vt of the speaker, the lower the freq of the formants produced when articulating a particular vowel + acoustic properties of consonants are also affected

what isn't affected by vt

ratio of formant freq to one another [hence why we listeners can understand men, women and children talking about the same thing]
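
A rough numerical illustration of why formant ratios survive a change in vt length. This sketch assumes the simplest possible model, in which every formant scales inversely with vt length. The formant values and the lengths are illustrative numbers, not data from these notes, and the helper names are hypothetical.

```python
# Assumption: shortening the vocal tract scales every formant by the
# same factor (reference length / new length). Real speakers only
# approximate this, so treat the numbers as illustrative.

def scale_formants(formants_hz, ref_length_cm, new_length_cm):
    """Scale a set of formants from a reference vt length to a new one."""
    factor = ref_length_cm / new_length_cm
    return [f * factor for f in formants_hz]

def formant_ratios(formants_hz):
    """Express each formant relative to F1."""
    return [f / formants_hz[0] for f in formants_hz]

adult = [730.0, 1090.0, 2440.0]             # illustrative F1-F3 for one vowel
child = scale_formants(adult, 17.5, 12.0)   # shorter vt -> higher formants

print([round(f) for f in child])
print([round(r, 3) for r in formant_ratios(adult)])
print([round(r, 3) for r in formant_ratios(child)])
```

The absolute frequencies rise for the shorter tract, but the two ratio lists come out the same, which is the property the listener can rely on.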

tongue movement in speech

does not complete its movements towards idealized positions for vowel articulation [e.g. less movement from the centre of the mouth on the front-back + high-low dimensions]

vowel undershoot

makes all vowels acoustically more similar to the neutral [schwa] vowel 'uh' + it reflects inertia of the articulators + becomes more marked as the rate of speech increases

coarticulation

the articulation of 2 or more speech sounds together so that one influences the other
particularly important for consonant production

the anticipatory principle

articulations of neighbouring phonemes interact: the vowel in 'kee' is produced with front tongue position + spread lips whilst the vowel in 'koo' is produced with back tongue position + rounded lips [initial stop is produced diff in anticipation of following vowel]

how is speech info transmitted

in parallel, one segment of speech signal may carry info about more than one phoneme at once [non-linear process]

speech spectrogram

visual representation of speech produced by a sound spectrograph - in free-flowing speech the freq spectrum is almost continuously changing

what does a spectrogram show you

x axis = time + y axis = freq, and how much energy there is at any particular freq at any point in time is shown by how dark the trace is [so dark is where energy is concentrated]
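
How a spectrogram is built can be sketched with a short-time Fourier analysis: cut the signal into overlapping frames and measure the energy in each freq bin of each frame. This is a minimal pure-Python sketch under assumed settings (8 kHz sample rate, 64-sample frames, a test tone instead of real speech). The function `spectrogram` is hypothetical and far slower than a real FFT library.

```python
import cmath
import math

def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of bin magnitudes per frame (time runs along the
    outer list, freq along the inner list)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Hann window to reduce leakage between freq bins
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1)))
                    for i, x in enumerate(frame)]
        mags = []
        for k in range(frame_len // 2 + 1):
            total = sum(x * cmath.exp(-2j * math.pi * k * i / frame_len)
                        for i, x in enumerate(windowed))
            mags.append(abs(total))
        frames.append(mags)
    return frames

sample_rate = 8000  # assumed
# Test signal: a 500 Hz tone that jumps to 1500 Hz halfway through
signal = [math.sin(2 * math.pi * (500 if n < 256 else 1500) * n / sample_rate)
          for n in range(512)]

spec = spectrogram(signal)
# Freq (Hz) of the strongest bin in each frame: the "darkest" part of the trace
peak_hz = [max(range(len(f)), key=f.__getitem__) * sample_rate / 64 for f in spec]
print(peak_hz[0], peak_hz[-1])  # 500.0 1500.0
```

Plotting `spec` with time along x, freq along y and magnitude as darkness would give the picture described above.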

stimuli used in speech research

recordings of natural utterances
modified natural utterances
synthesized artificial speech-like stimuli

recording of natural utterances

you get lots of people to speak various bits of text and you record them then look for common features that seem reliably to be present when diff people speak particular phonemes or syllables in particular contexts

modified natural utterances

take natural recordings but this time deliberately modify them in some way - so you deliberately distort or eliminate some of the features present in speech, then measure the impact that has on intelligibility of the speech [to listeners - essentially if you eliminate a critical part of speech it isn't intelligible]

synthesized artificial speech-like stimuli

most widely used + most flexible - deliberately synthesize speech-like stimuli [advantage - you can choose to simulate a particular subset of features, then establish whether that is capable of supporting intelligibility] again we use listeners to measure effectiveness

speech intelligibility

refers to clarity with which speech can be understood by a listener - a measure of how well a person's speech can be perceived + interpreted based on ability of listener to recognize individual words, sounds + sentences

articulation tests

most widely used method of assessing intelligibility - you have a panel of listeners and you play them a set of recordings that are carefully crafted to contain a list of items [such as sentences, words, or nonsense syllables]

articulation score

in an articulation test - the % of items correctly perceived
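
The articulation score is simple arithmetic, sketched here for one listener. The word list, the responses and the function name `articulation_score` are made up for illustration.

```python
def articulation_score(presented, reported):
    """Percentage of presented items that the listener reported correctly."""
    correct = sum(p.strip().lower() == r.strip().lower()
                  for p, r in zip(presented, reported))
    return 100.0 * correct / len(presented)

presented = ["ship", "sheep", "chip", "cheap"]   # hypothetical word list
reported = ["ship", "sheep", "ship", "keep"]     # hypothetical responses

print(articulation_score(presented, reported))  # 50.0
```

In a real test the score would be averaged over the panel of listeners and over a much longer list.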

phonetically balanced lists

we have roughly 40 speech sounds in English - these are represented in these lists in roughly the same proportion as they occur in the English language

what happens if you keep listening conditions constant

the scores that people get vary with the type of list you give them - highest scores for sentence lists, middle scores for word lists and lowest for nonsense words

why do scores vary with type of list you give them

a sentence can be understood fully even if not every word would be perceived correctly when presented in isolation [context of sentence also provides info about what's being said]

borderline intelligibility

normal convo can be carried out without too much difficulty in conditions that would give 50% articulation scores on typical word lists [if you give people a list of isolated words and they get 50% of them right you can have a decent convo with them]

white noise

noise with a flat spectrum - it's got roughly equal energy across the whole range of audible freqs + is a hiss-like sound

signal-to-noise [S/N] ratios of 20 dB + 0 dB

20 dB - has no effect on intelligibility
0 dB - gives borderline intelligibility
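
For reference, S/N ratio in dB is 10 x log10 of the ratio of speech power to noise power, so 0 dB means equal power and 20 dB means the speech has 100 times the power of the noise. The sketch below rescales a signal to a chosen S/N against a fixed noise. It is a sketch under stated assumptions: a sine tone stands in for speech, the noise is seeded Gaussian noise, and the helper names are hypothetical.

```python
import math
import random

def mean_power(samples):
    """Mean-square value of a list of samples."""
    return sum(x * x for x in samples) / len(samples)

def snr_db(speech, noise):
    """Signal-to-noise ratio in dB: 10 * log10(P_speech / P_noise)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))

def scale_to_snr(speech, noise, target_db):
    """Rescale speech so that it sits at target_db relative to noise."""
    gain = math.sqrt(mean_power(noise) / mean_power(speech)
                     * 10 ** (target_db / 10.0))
    return [gain * x for x in speech]

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
speech = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(2000)]  # stand-in

for target in (20.0, 0.0, -6.0):
    scaled = scale_to_snr(speech, noise, target)
    print(target, round(snr_db(scaled, noise), 2))
```

A negative target simply means the noise carries more power than the speech.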

negative S/N ratios

speech can still be intelligible when the noise is more intense than the speech - if the noise comes from a diff direction, is interrupted, or fluctuates [periodic gaps in it] [so we're robust to the noise]

what do filtering studies show

no particular freq region is essential for speech recognition
transmission systems [e.g. telephone systems] don't do a perfect job of reproducing all the content of the original signal: they only pass some freqs and the rest are filtered out

high pass [+ low pass] filter

systems that transmit all freqs above a specific freq, while freqs below it get cut out [low pass does the opposite]

pass band of landline telephone

3.2 kHz wide

peak clipping

if you overload an amplifier - it chops off peaks and troughs at a fixed level - produces highly unnatural but still intelligible speech [has to be really severe for you to not understand at all]
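
Peak clipping is easy to picture on a sampled waveform: any sample beyond a threshold is flattened to that threshold. A minimal sketch follows. The threshold, the test waveform and the function name `peak_clip` are arbitrary choices for illustration.

```python
import math

def peak_clip(samples, threshold):
    """Flatten every peak above +threshold and every trough below -threshold."""
    return [max(-threshold, min(threshold, x)) for x in samples]

wave = [math.sin(2 * math.pi * n / 16) for n in range(32)]
clipped = peak_clip(wave, 0.3)

print(max(clipped), min(clipped))  # 0.3 -0.3
```

Note that the clipped wave still crosses zero at the same moments as the original, so its timing information survives even though the shape of the waveform is badly distorted.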

speech & cues

speech is often intelligible even when the acoustic cues are highly degraded because the speech wave contains multiple cues to its message - so any one distortion destroys some cues but others remain that can allow speech recognition in adverse listening conditions

sine-wave speech

an acoustic 'cartoon' of normal speech - you start with a spectrogram of a real utterance, track each formant and replace it with a pure-tone whistle that follows the trace of that formant + changes only in freq & level
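
The construction can be sketched as a sum of sinusoids, one per formant, each following a frame-by-frame track of freq and level. This is a minimal sketch, assuming the formant tracks have already been measured. The track values below are invented, and `sine_wave_speech` is a hypothetical name.

```python
import math

def sine_wave_speech(tracks, sample_rate=8000, frame_dur=0.01):
    """tracks: one list per formant, each holding a (freq_hz, level) pair
    per frame. Returns the summed pure-tone 'whistles' as samples."""
    n_frames = len(tracks[0])
    samples_per_frame = round(sample_rate * frame_dur)
    phases = [0.0] * len(tracks)   # running phase keeps each tone continuous
    out = []
    for frame in range(n_frames):
        for _ in range(samples_per_frame):
            value = 0.0
            for i, track in enumerate(tracks):
                freq, level = track[frame]
                phases[i] += 2 * math.pi * freq / sample_rate
                value += level * math.sin(phases[i])
            out.append(value)
    return out

# Invented tracks: F1 rising, F2 falling, F3 steady, over 5 frames
f1 = [(300 + 50 * t, 1.0) for t in range(5)]
f2 = [(2200 - 100 * t, 0.5) for t in range(5)]
f3 = [(2900, 0.25) for _ in range(5)]

samples = sine_wave_speech([f1, f2, f3])
print(len(samples))  # 400
```

Each tone changes only in freq and level, so everything else about natural speech (voicing buzz, noise bursts, harmonic structure) is discarded.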

listening in speech mode

most people when hearing this unnatural sounding version of speech can begin to understand these stimuli only after learning that they are degraded speech

what don't we need in terms of understanding real speech

all the acoustic complexity of real speech - although synthesised speech can sound very unnatural

which formants are critical for vowel identity

experiments using synthetic speech have shown that the 1st three [esp F1 + F2] are most important for vowel recognition

F1 freq

inversely proportional to vowel height [higher tongue position the lower the 1st formant freq]

F2 freq

proportional to vowel frontedness [front vowels have a high F2 whilst back vowels have a low F2]

what affects formant freq of vowels

cross-speaker variation + undershoot

formant freq

must be normalized by perceiver to take account of differences in vocal tract length between men, women + children

vowel undershoot

when a person doesn't move their tongue to the ideal position in the mouth during conversational speech [so F1 + F2 of diff vowels are not far enough apart, making it harder for listener to distinguish diff vowels from each other]

pattern playback synthesiser

started with a hand-drawn spectrogram and then changed that into synthetic speech-like sounds that corresponded to it

context-dependent

the freq of the plosive burst was varied, then diff individual vowels followed
so the consonant heard is dependent on freq of plosive burst + freq of vowel formants following

coarticulation

whenever you're producing the current speech sound [phoneme] you're already preparing for the next one [so they interact]

absence of plosive burst

if you have a rapid change in freq at the beginning of the 2nd formant (F2 transition) that can generate the percept of a stop consonant + vowel syllable [e.g. hat vs hot]

why is speech described as non-linear + non-invariant

as much as phonemes seem like beads on a string, one after another, real-life speech involves complex interactions between the neighbouring sounds we're articulating

what does satisfactory phoneme perception typically require

relating acoustic features at several diff points in time as well as at diff points in the freq spectrum

audiovisual integration

the McGurk effect - made both audio and visual recordings of bi-syllables, deliberately mismatched them and then played them at the same time

why is speech perception not purely an auditory skill

McGurk effect shows how important visual cues are in speech perception, although syllable was acoustically the same each time, seeing the articulatory movements could alter what was heard

effect of linguistic cues

expectation can influence speech intelligibility - the rules of language constrain the possible identities of the speech signal far more than most people realize [e.g. Christmas always occurs in the month of.....]

effect of linguistic cues: phonological rules + lexical constraints

most languages tend to restrict possible combinations of phonemes - 'ngees' is an impossible construction in the English language + when you see 'sh...p' you'll assume 'sheep' not 'shoop'

effect of linguistic cues: sentence structure

meaning [semantics] + context [speaker identity / subject of convo] = allows you to infer what words might come next
