Generation of Text and Speech Corpora

Delivery time: Available within 14 days

71,90 

For Computer Processing and Recognition of Bangla

ISBN: 3659777129
ISBN 13: 9783659777127
Author: Khan, Md Farukuzzaman
Publisher: LAP LAMBERT Academic Publishing
Length: 168 pp.
Publication date: 6 September 2019
Edition: 1/2019
Format: 1.1 x 22 x 15
Weight: 268 g
Product form: Paperback
Binding: Paperback
Item number: 7957232

Description

Recent trends in the development of language technology show an unavoidable need for language resources and for acquiring knowledge from them. In this context, corpus-based methods for Bangla language processing are receiving a strong push from laboratories around the world. As a continuation of these efforts, a new Bangla text corpus, BdNC01, and several speech corpora were generated in this work. The texts were collected from the web editions of several leading Bangla newspapers over a long period to avoid time dependency in word frequencies. More than eleven million word tokens were collected over six years. The corpus was manually checked and error-corrected each time before being stored in the final repository as ASCII and Unicode text. Using popular words derived from the text corpus, we recorded the largest speech corpora in the Bangla language. These corpora were designed specifically for research on HMM-based speaker-independent speech recognition.

About the author

Dr. Md. Farukuzzaman Khan is a professor at Islamic University, Bangladesh. His work now focuses on language technology, with an emphasis on Bangla. He received his Ph.D. and M.Phil. from Islamic University in 2014 and 2003, respectively. Dr. Khan earned his M.Sc. and B.Sc. in Applied Physics and Electronics from the University of Rajshahi. He was born in Bangladesh on 10 August 1966.

Manufacturer information:


BoD - Books on Demand
In de Tarpen 42
22848 Norderstedt
DE

E-Mail: info@bod.de
