Search results - "Shigeki Karita"

1

A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition

Authors: Llion Jones, Michiel Bacchiani, Yotaro Kubo, Shigeki Karita

Subject terms: FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Computation and Language, Computer science, Character (computing), Speech recognition, Training methods, Computer Science - Sound, Tokenization (data security), Connectionism, Moving average, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Noise (video), Computation and Language (cs.CL), Word (computer architecture), Electrical Engineering and Systems Science - Audio and Speech Processing

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fa3b42d54fba63daa5b94328fa765023

2

Self-Distillation for Improving CTC-Transformer-Based ASR Systems

Authors: Ryo Masumura, Shigeki Karita, Tsubasa Ochiai, Yusuke Shinohara, Tomohiro Tanaka, Takafumi Moriya, Takanori Ashihara, Marc Delcroix, Hiroshi Sato

Source: INTERSPEECH

Subject terms: law, business.industry, Computer science, Transformer, Process engineering, business, Distillation, law.invention

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::6a1a691481df0d15208d23e5e1df3cc3
https://doi.org/10.21437/interspeech.2020-1223

3

ESPnet-ST: All-in-One Speech Translation Toolkit

Authors: Hirofumi Inaguma, Kevin Duh, Nelson Yalta, Shigeki Karita, Tomoki Hayashi, Shinji Watanabe, Shun Kiyono

Source: ACL (demo)

Subject terms: FOS: Computer and information sciences, Sound (cs.SD), Machine translation, Computer science, Feature extraction, 02 engineering and technology, 010501 environmental sciences, computer.software_genre, Translation (geometry), 01 natural sciences, Computer Science - Sound, Audio and Speech Processing (eess.AS), Speech translation, 0202 electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering, 0105 earth and related environmental sciences, Computer Science - Computation and Language, business.industry, Speech processing, Benchmark (computing), 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Computation and Language (cs.CL), Natural language processing, Decoding methods, Range (computer programming), Electrical Engineering and Systems Science - Audio and Speech Processing

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c9d806aea6f1f867569cc1aae2be74f3
http://arxiv.org/abs/2004.10234

4

Unsupervised Learning of Disentangled Speech Content and Style Representation

Authors: Andros Tjandra, Yu Zhang, Ruoming Pang, Shigeki Karita

Subject terms: FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Computation and Language, Computer science, Speech recognition, Latent variable, Speaker recognition, Computer Science - Sound, Audio and Speech Processing (eess.AS), Encoding (memory), FOS: Electrical engineering, electronic engineering, information engineering, Unsupervised learning, Representation (mathematics), Encoder, Computation and Language (cs.CL), Utterance, Word (computer architecture), Electrical Engineering and Systems Science - Audio and Speech Processing

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5f71c65a8f1936418a6997949654d9a2

5

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

Authors: Wen-Chin Huang, Shinji Watanabe, Xuankai Chang, Jing Shi, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Takaaki Hori, Pengcheng Guo, Yosuke Higuchi, Aswin Shanmugam Subramanian, Wangyou Zhang, Tomoki Hayashi, Florian Boyer, Chenda Li

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::20f88a4e9c4e7182e9495ec7e814ad57

6

Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration

Authors: Marc Delcroix, Tomohiro Nakatani, Atsunori Ogawa, Shinji Watanabe, Shigeki Karita, Nelson Yalta

Source: INTERSPEECH

Subject terms: Connectionism, End-to-end principle, Computer science, Speech recognition, Language model, Transformer (machine learning model)

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::9420576d3b4396bffb5521f72b10f12e
https://doi.org/10.21437/interspeech.2019-1938

7

Improved Deep Duel Model for Rescoring N-Best Speech Recognition List Using Backward LSTMLM and Ensemble Encoders

Authors: Shigeki Karita, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani

Source: INTERSPEECH

Subject terms: Computer science, Speech recognition, Encoder

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::2108d0663a431633d81583d4a5ed92fe
https://doi.org/10.21437/interspeech.2019-1949

8

End-to-End SpeakerBeam for Single Channel Target Speech Recognition

Authors: Keisuke Kinoshita, Tomohiro Nakatani, Marc Delcroix, Tsubasa Ochiai, Atsunori Ogawa, Shinji Watanabe, Shigeki Karita

Source: INTERSPEECH

Subject terms: End-to-end principle, Computer science, Speech recognition, Channel (broadcasting)

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::925cd8e3d1ca14a52c398bd4bb6dc090
https://doi.org/10.21437/interspeech.2019-1856

9

A Comparative Study on Transformer vs RNN in Speech Applications

Authors: Takenori Yoshimura, Shinji Watanabe, Ryuichi Yamamoto, Wangyou Zhang, Shigeki Karita, Hirofumi Inaguma, Nelson Yalta, Takaaki Hori, Nanxin Chen, Xiaofei Wang, Tomoki Hayashi, Masao Someki, Ziyan Jiang

Source: ASRU

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::09c827041846ec3822fb8eafc296343f
http://arxiv.org/abs/1909.06317

10

Semi-supervised End-to-end Speech Recognition Using Text-to-speech and Autoencoders

Authors: Marc Delcroix, Atsunori Ogawa, Shigeki Karita, Tomoharu Iwata, Tomohiro Nakatani, Shinji Watanabe

Source: ICASSP

Subject terms: Computer science, Speech recognition, Speech coding, Word error rate, 020206 networking & telecommunications, Speech synthesis, 02 engineering and technology, Semi-supervised learning, 010501 environmental sciences, computer.software_genre, 01 natural sciences, Autoencoder, 0202 electrical engineering, electronic engineering, information engineering, Joint (audio engineering), Encoder, computer, 0105 earth and related environmental sciences

Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::1dc9a7f37b0066ad8f5d0df363288bc0
https://doi.org/10.1109/icassp.2019.8682890
