Embry-Riddle Scholarly Commons · Journal article (JAAER)
Low-Resource Automatic Speech Recognition Domain Adaptation – A Case-Study in Aviation Maintenance
Attribution
This is the abstract and citation. Full text lives at Embry-Riddle Scholarly Commons — we link out rather than host. All credit to the authors and Embry-Riddle Aeronautical University.
Abstract
Verbatim from Embry-Riddle Scholarly Commons. Not paraphrased, not summarized.
With timeliness and efficiency being critical in the aviation maintenance industry, the need has been growing for smart technological solutions that optimize and streamline the different underlying tasks (Bergkvist & Sabbagh, 2021). One such task is the technical documentation of the performed maintenance operations (Chandola et al., 2022). Instead of manual documentation, voice tools that transcribe spoken logbook entries allow technicians to document their work right away in a hands-free and time efficient manner. However, an accurate automatic speech recognition (ASR) model requires large training corpora (Siyaev & Jo, 2021a), which are lacking in the domain of aviation maintenance. In addition, ASR models which are trained on huge corpora in standard English perform poorly in such a technical domain with non-standard terminology (Siyaev & Jo, 2021a, 2021b). Hence, this study investigates the extent to which fine-tuning an ASR model, pre-trained on standard English corpora, on limited in-domain data improves its recognition performance in the technical domain of aviation maintenance. We present a case study on one such pre-trained ASR model, wav2vec 2.0 (Baevski et al., 2020). Results showed that fine-tuning the model on a limited anonymized dataset of maintenance logbook entries significantly reduced its error rates when tested on not only an anonymized in-domain dataset, but also a non-anonymized one. This suggests that any available aviation maintenance logbooks, even if anonymized for privacy, can be used to fine-tune general-purpose ASR models and enhance their in-domain performance. Lastly, an analysis on the influence of voice characteristics on model performance stressed the need for balanced datasets representative of the population of aviation maintenance technicians.
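The abstract reports reduced "error rates" without naming the metric. As an illustration only, the sketch below assumes the common ASR metric, word error rate (WER): the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The function name `word_error_rate` and the sample logbook phrases are hypothetical and are not taken from the paper or its dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: simple lowercase + whitespace tokenization; real
    evaluations usually apply more careful text normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical logbook-style entry, invented for this example.
reference = "replaced left main landing gear tire"
hypothesis = "replaced left main landing year tire"
wer = word_error_rate(reference, hypothesis)
```

Here one of six reference words is substituted, so `wer` is 1/6. Comparing this value before and after fine-tuning, on the same test set, is one way a reduction in error rate could be measured.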
Authors
- Amin, Nadine, M.S., Embry-Riddle Aeronautical University
- Yother, Tracy L., Ph.D., Embry-Riddle Aeronautical University
- Rayz, Julia, Ph.D., Embry-Riddle Aeronautical University
Keywords
- automatic speech recognition
- aviation maintenance logbooks
- domain adaptation
- Artificial Intelligence and Robotics
- Maintenance Technology
Citation: Amin, Nadine, M.S., Yother, Tracy L., Ph.D., Rayz, Julia, Ph.D. (2024). Low-Resource Automatic Speech Recognition Domain Adaptation – A Case-Study in Aviation Maintenance. Embry-Riddle Aeronautical University. Embry-Riddle Scholarly Commons ID oai:commons.erau.edu:jaaer-2052. https://commons.erau.edu/jaaer/vol33/iss4/8