Date of Award

Spring 1-1-2016

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Linguistics

First Advisor

Rebecca Scarborough

Second Advisor

Eliana Calunga

Third Advisor

Bhuvana Narasimhan

Fourth Advisor

David Rood

Fifth Advisor

Sarel Van Vuuren

Abstract

This investigation explores the potential effect on speech perception of visual cues associated with Arabic gutturals (AGs) and Arabic emphatics (AEs); AEs are pharyngealized phonemes characterized by a visually salient primary articulation but a largely invisible secondary articulation produced deep in the pharynx. The corpus consisted of 72 minimal pairs, each containing two contrasting consonants of interest (COIs): an emphatic versus a non-emphatic, or a guttural paired with another guttural. In order to assess the potential effect that visual speech information in the lips, chin, cheeks, and neck has on the perception of the COIs, production data elicited from 4 native Lebanese speakers were captured on videos that were edited to allow perceivers to see only certain regions of the face. Fifty-three Lebanese perceivers watched the muted videos, each presented alongside a minimal pair containing the word uttered in the video, and in a forced-choice identification task selected the word they thought they saw the speaker say.

The speakers’ speech was analyzed to help explore what in their production informed correct identification of the COIs. Perceivers were above chance at correctly identifying AEs and AGs, though AEs were better perceived than AGs. In the emphatic category, measurement differences between a word and its pair, and their effect on perception, were submitted to automatic speech recognition. The machine learning models were generally successful at correctly classifying COIs as emphatic or non-emphatic across vowel contexts; the models were able to predict the probability of perceivers’ accuracy in identifying certain COIs produced by certain speakers; in addition, an overlap was found between the measurements selected by the computer and those selected by human perceivers. No difference in the perception of AEs according to the part of the face that was visible was observed, suggesting that the lips, present in all of the videos, were most important for the perception of emphasis. Conversely, in the perception of AGs, the lips were not as informative, and perceivers relied more on the cheeks and chin. The presence of visible cues associated with the AEs, particularly in the lips, suggests that such visual cues might be informative for non-native learners as well, if they were trained to attend to them.
