DNN-HMM based Automatic Speech Recognition for HRI Scenarios

Bibliographic details
Title: DNN-HMM based Automatic Speech Recognition for HRI Scenarios
Authors: Juan Pablo Escudero, Jorge Wuth, Néstor Becerra Yoma, Rodrigo Mahu, José Novoa, Josué Fredes
Source: HRI
Publication information: ACM, 2018.
Publication year: 2018
Subject terms: Black box (phreaking), Computer science, Speech recognition, Testbed, Word error rate, 020206 networking & telecommunications, 02 engineering and technology, Markov model, 030507 speech-language pathology & audiology, 03 medical and health sciences, 0202 electrical engineering, electronic engineering, information engineering, Robot, Loudspeaker, Noise (video), 0305 other medical science, Hidden Markov model
Description: In this paper, we propose to replace the classical black box integration of automatic speech recognition technology in HRI applications with the incorporation of the HRI environment representation and modeling, and the robot and user states and contexts. Accordingly, this paper focuses on the environment representation and modeling by training a deep neural network-hidden Markov model (DNN-HMM) based automatic speech recognition engine that combines clean utterances with the acoustic-channel responses and noise obtained from an HRI testbed built with a PR2 mobile manipulation robot. This method avoids recording a training database in all the possible acoustic environments of a given HRI scenario. Moreover, different speech recognition testing conditions were produced by recording two types of acoustic sources, i.e., a loudspeaker and human speakers, using a Microsoft Kinect mounted on top of the PR2 robot while it performed head rotations and movements towards and away from the fixed sources. In this generic HRI scenario, and with a limited amount of training data, the resulting automatic speech recognition engine provided a word error rate at least 26% and 38% lower than that of publicly available speech recognition APIs on the playback (i.e., loudspeaker) and human testing databases, respectively.
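The description mentions combining clean utterances with acoustic-channel responses and noise to build training data. The record does not give the authors' exact procedure, so the following is only a minimal sketch of one common way to do this: convolve the clean signal with a measured impulse response, then add noise scaled to a chosen signal-to-noise ratio. The function name `simulate_hri_utterance`, the `snr_db` parameter, and the truncation to the clean signal's length are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def simulate_hri_utterance(clean, impulse_response, noise, snr_db):
    """Hypothetical helper: channel-distorted, noisy copy of a clean utterance.

    All inputs are 1-D float arrays at the same sampling rate (an assumption).
    """
    # Apply the acoustic channel by linear convolution, keeping the
    # original utterance length.
    reverberant = np.convolve(clean, impulse_response)[: len(clean)]

    # Repeat or trim the noise recording so it covers the whole utterance.
    reps = int(np.ceil(len(reverberant) / len(noise)))
    noise_seg = np.tile(noise, reps)[: len(reverberant)]

    # Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db.
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise_seg ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    return reverberant + gain * noise_seg
```

Under these assumptions, each clean utterance could be paired with several impulse responses and noise recordings to produce multiple training conditions without recording speech in each one.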
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::023ce19e5b6b4789f2734547fc1d5fdb
https://doi.org/10.1145/3171221.3171280
Rights: OPEN
Accession number: edsair.doi...........023ce19e5b6b4789f2734547fc1d5fdb
Database: OpenAIRE