Publications

2021

  1. SUPERB: Speech processing Universal PERformance Benchmark
    Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, and others
    arXiv preprint arXiv:2105.01051, 2021
  2. Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings
    Xuankai Chang, Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, and Takuya Yoshioka
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021
  3. Recent developments on ESPnet toolkit boosted by Conformer
    Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, and others
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021
  4. Highland Puebla Nahuatl Speech Translation Corpus for Endangered Language Documentation
    Jiatong Shi, Jonathan D. Amith, Xuankai Chang, Siddharth Dalmia, Brian Yan, and Shinji Watanabe
    In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, 2021
  5. ESPnet-SE: end-to-end speech enhancement and separation toolkit designed for ASR integration
    Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Boeddeker, Zhuo Chen, and others
    In IEEE Spoken Language Technology Workshop (SLT), 2021
  6. Investigation of end-to-end speaker-attributed ASR for continuous multi-talker recordings
    Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, and Takuya Yoshioka
    In IEEE Spoken Language Technology Workshop (SLT), 2021

2020

  1. The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans
    Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, and others
    arXiv preprint arXiv:2012.13006, 2020
  2. Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals
    Jing Shi, Xuankai Chang, Pengcheng Guo, Shinji Watanabe, Yusuke Fujita, Jiaming Xu, Bo Xu, and Lei Xie
    Advances in Neural Information Processing Systems, 2020
  3. End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming
    Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Shinji Watanabe, and Yanmin Qian
    Proc. Interspeech, 2020
  4. End-to-end multi-speaker speech recognition with transformer
    Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, and Shinji Watanabe
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
  5. CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings
    Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, and others
    arXiv preprint arXiv:2004.09249, 2020
  6. Improving end-to-end single-channel multi-talker speech recognition
    Wangyou Zhang, Xuankai Chang, Yanmin Qian, and Shinji Watanabe
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020
  7. End-to-End ASR with Adaptive Span Self-Attention
    Xuankai Chang, Aswin Shanmugam Subramanian, Pengcheng Guo, Shinji Watanabe, Yuya Fujita, and Motoi Omachi
    Proc. Interspeech, 2020

2019

  1. MIMO-Speech: End-to-end multi-channel multi-speaker speech recognition
    Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, and Shinji Watanabe
    In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
  2. End-to-end monaural multi-speaker ASR system without pretraining
    Xuankai Chang, Yanmin Qian, Kai Yu, and Shinji Watanabe
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019
  3. Knowledge Distillation for End-to-End Monaural Multi-Talker ASR System
    Wangyou Zhang, Xuankai Chang, and Yanmin Qian
    Conference of the International Speech Communication Association (InterSpeech), 2019

2018

  1. Single-channel multi-talker speech recognition with permutation invariant training
    Yanmin Qian, Xuankai Chang, and Dong Yu
    Speech Communication, 2018
  2. Monaural Multi-Talker Speech Recognition with Attention Mechanism and Gated Convolutional Networks
    Xuankai Chang, Yanmin Qian, and Dong Yu
    Conference of the International Speech Communication Association (InterSpeech), 2018
  3. Adaptive permutation invariant training with auxiliary information for monaural multi-talker speech recognition
    Xuankai Chang, Yanmin Qian, and Dong Yu
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
  4. Past review, current progress, and challenges ahead on the cocktail party problem
    Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, and Dong Yu
    Frontiers of Information Technology & Electronic Engineering, 2018

2017

  1. Recognizing Multi-Talker Speech with Permutation Invariant Training
    Dong Yu, Xuankai Chang, and Yanmin Qian
    Conference of the International Speech Communication Association (InterSpeech), 2017

2016

  1. Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC
    Yimeng Zhuang, Xuankai Chang, Yanmin Qian, and Kai Yu
    In Conference of the International Speech Communication Association (InterSpeech), 2016