Automatic speech recognition (ASR) and natural language processing (NLP) are expected to benefit from an effective, simple, and reliable method to automatically parse conversational speech. The ability to parse conversational speech depends crucially on the ability to identify boundaries between prosodic phrases. This is done naturally by the human ear, yet has proved surprisingly difficult to achieve reliably and simply in an automatic manner. Efforts to date have focused on detecting phrase boundaries using a variety of linguistic and acoustic cues. We propose a method which does not require model training and utilizes two prosodic cues that are based on ASR output. Boundaries are identified using discontinuities in speech rate (pre-boundary lengthening and phrase-initial acceleration) and silent pauses. The resulting phrases preserve syntactic validity, exhibit pitch reset, and compare well with manual tagging of prosodic boundaries. Collectively, our findings support the notion of prosodic phrases that represent coherent patterns across textual and acoustic parameters.

Citation: Biron T, Baum D, Freche D, Matalon N, Ehrmann N, Weinreb E, et al. (2021) Automatic detection of prosodic boundaries in spontaneous speech. PLoS ONE 16(5):

Editor: Claudia Männel, Max-Planck-Institut fur Kognitions- und Neurowissenschaften, GERMANY

Received: Aug; Accepted: Ap; Published: May 3, 2021

Copyright: © 2021 Biron et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The input speech data is available at.

Funding: This work was supported in part by the ISF grant number 1385_16, the Yeda-Sela Fund and the Minerva Foundation, Germany. Author PI Elisha Moses was awarded Yeda Sela grant no. 713218 and the Braginsky Centre grant no. 435300353612.

Competing interests: The authors have declared that no competing interests exist.

Information in spoken language is conveyed not only through words but concurrently through acoustic cues: fundamental frequency (pitch), intensity (volume), speech rate and rhythm, and timbre, collectively termed prosody. It is also widely recognized that the distribution of prosodic information throughout the flow of speech is neither uniform nor random (e.g., question/statement boundary tones). Speakers produce short, often distinctive phrases, bounded by prosodic cues, that conveniently avail to the interlocutor a variety of linguistic functions: sentence mode (e.g., assertion vs. question), saliency of information via emphasis, conversation action, discourse function, attitudes and sentiments. These units are often referred to as intonational phrases or intonation units (IUs), and although a precise definition is hard to come by, the notion of a well-defined ('single') pitch contour is often regarded as a necessary trait. There is no widespread agreement on the nature of intonation units, and even their existence has been contested by some scholars.

The reasons that we converse using short phrases are unclear. Answers posed in the literature involve human physiology or aspects of the cognitive processes related to the production of vocal output. Several lines of evidence support the contribution of the latter: the capacity of our working memory is estimated at 4–7 words.
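The two boundary cues named in the abstract, discontinuities in speech rate and silent pauses, can be illustrated with a minimal sketch. It is not the authors' implementation. The `Word` record, the seconds-per-character rate proxy, the function names, and both thresholds (`min_pause`, `min_slowdown_ratio`) are assumptions chosen for illustration. The only input is word-level timestamps of the kind an ASR system can output.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word record, e.g. from ASR word-level timestamps."""
    text: str
    start: float  # seconds
    end: float    # seconds


def seconds_per_char(word: Word) -> float:
    """Crude speech-rate proxy (an assumption): larger means slower speech."""
    return (word.end - word.start) / max(len(word.text), 1)


def detect_boundaries(words, min_pause=0.3, min_slowdown_ratio=2.0):
    """Return indices i such that a boundary is proposed after words[i].

    Both thresholds are illustrative values, not taken from the paper.
    """
    boundaries = []
    for i in range(len(words) - 1):
        cur, nxt = words[i], words[i + 1]
        # Cue 1: a silent pause between consecutive words.
        has_pause = (nxt.start - cur.end) >= min_pause
        # Cue 2: a slow (lengthened) word followed by a fast word, i.e.
        # pre-boundary lengthening then phrase-initial acceleration.
        ratio = seconds_per_char(cur) / max(seconds_per_char(nxt), 1e-9)
        has_rate_jump = ratio >= min_slowdown_ratio
        if has_pause or has_rate_jump:
            boundaries.append(i)
    return boundaries


# Toy input: "went" is lengthened before a fast "and"; a pause precedes "home".
words = [
    Word("so", 0.00, 0.10),
    Word("we", 0.10, 0.20),
    Word("went", 0.20, 0.80),
    Word("and", 0.80, 0.92),
    Word("then", 0.92, 1.12),
    Word("home", 1.70, 1.95),
]
print(detect_boundaries(words))
```

On this toy input the sketch proposes a boundary after "went" (rate discontinuity) and after "then" (silent pause). Comparing adjacent words only is the simplest possible choice; a practical detector would likely smooth the rate over a window and normalize per speaker.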