The study opens by reviewing the current state of civil aviation English teaching corpus construction and its main challenges: high professional barriers, significant time costs, and limited use in teaching. It notes that the rise of large language models offers new ways to address these problems, given their strengths in handling massive language data, text generation, translation, and analysis. The study aims to evaluate how effective large language models are in practice at each stage of building a civil aviation English teaching corpus, and it poses three core research questions.
The next chapter first discusses the application of artificial intelligence and large language models in corpus construction. Natural language processing technologies have long been used for foundational corpus tasks, and as the technology has advanced, AI has taken on more complex work such as data collection and text cleaning. Domestic scholars have also attempted to use AI to construct corpora, and large language models have shown an ability to integrate knowledge across domains. However, systematic research on building teaching corpora for English for Specific Purposes (ESP) remains insufficient. The chapter then examines how civil aviation English teaching corpora are currently used, noting a growing body of applied research within ESP. Civil aviation English teaching relies on authentic language resources, and corpora hold potential for terminology teaching and for the comprehension of professional discourse. Nevertheless, existing studies often depend on small-scale corpora, and systematic empirical research is lacking on the role of emerging intelligent technologies in constructing civil aviation English corpora and in terminology teaching.
The methodology chapter sets out the research design and experimental method for applying large language models to the construction and use of civil aviation English teaching corpora. The study has two main phases: corpus construction and teaching application. It combines qualitative and quantitative methods to evaluate how large language models perform in data collection, processing, alignment, and teaching application, and compares the results with traditional methods. The experiment uses three large language models (Doubao, Kimi, and DeepSeek), with authoritative documents from the International Civil Aviation Organization (ICAO) as the benchmark corpus; the experimental data are current as of June 30, 2025. For experimental control, a unified baseline prompt is designed and then adjusted to each model's instruction-following capability. The corpus construction phase mainly assesses the efficiency and accuracy of large language models in recommending data sources, preprocessing (denoising), and bilingual alignment. A control group that uses traditional manual methods together with tools such as Tmxmall provides the comparison, and quantitative metrics such as processing time and alignment accuracy are recorded. The teaching application phase tests how large language models can help develop new corpus-based teaching applications: generating teaching resources and tools, computing terminology co-occurrence statistics and extracting contextual examples, building a "Three-Dimensional Filtering Model for Teaching Adaptability" to screen example sentences, and generating varied test questions. Feasibility, adaptability, and practicality are evaluated through manual review.
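The comparison of alignment accuracy and processing time described above could be tabulated along the following lines. This is a minimal sketch, not the study's procedure: the summary does not define how alignment accuracy is computed, so the code assumes it is the share of gold-standard sentence pairs that a method reproduces exactly. The names `AlignmentRun` and `alignment_accuracy`, the method labels, and the placeholder pairs are all hypothetical and carry no experimental data.

```python
from dataclasses import dataclass


@dataclass
class AlignmentRun:
    """One alignment attempt by one method (hypothetical record type)."""
    method: str
    predicted: list[tuple[str, str]]  # (source sentence, target sentence) pairs
    seconds: float                    # wall-clock processing time


def alignment_accuracy(predicted, gold):
    """Share of gold pairs reproduced exactly (assumed definition of accuracy)."""
    if not gold:
        raise ValueError("gold set must not be empty")
    gold_set = set(gold)
    correct = sum(1 for pair in set(predicted) if pair in gold_set)
    return correct / len(gold_set)


# Placeholder pairs only; these are not results from the study.
gold = [("EN-1", "ZH-1"), ("EN-2", "ZH-2"), ("EN-3", "ZH-3"), ("EN-4", "ZH-4")]
runs = [
    AlignmentRun("method_a", list(gold), 60.0),
    AlignmentRun(
        "method_b",
        [("EN-1", "ZH-1"), ("EN-2", "ZH-3"), ("EN-3", "ZH-2"), ("EN-4", "ZH-4")],
        600.0,
    ),
]

# Map each method to (accuracy, seconds) for side-by-side comparison.
summary = {
    run.method: (alignment_accuracy(run.predicted, gold), run.seconds)
    for run in runs
}

for method, (accuracy, seconds) in summary.items():
    print(f"{method}: accuracy={accuracy:.2%}, time={seconds:.0f}s")
```

Exact-match scoring is strict: a pair that differs from the gold standard only in punctuation or whitespace counts as wrong, so normalizing both sides before comparison may be appropriate.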
The corpus construction chapter covers the use of large language models in data collection, processing, and alignment. In data collection, large language models can recommend source texts well suited to specific teaching scenarios, such as air traffic control operations and safety management, and can extract publicly available text from web pages. In data processing, large language models can denoise PDF files, though problems such as truncation of long texts and errors in recognizing tables and charts may arise; manual preprocessing and splitting documents into segments improve the usability of the text. In corpus alignment, large language models are accurate and efficient when handling Chinese-English bilingual text from ICAO standard documents. Doubao and Kimi in particular perform stably and outperform traditional manual methods combined with alignment tools. The models can also output text in XML format, which facilitates later research and use.
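To illustrate what XML output of aligned text might look like, the sketch below serializes sentence pairs with Python's standard `xml.etree.ElementTree` module. The summary does not specify the XML schema the models produced, so the element names (`corpus`, `unit`, `seg`) and the `lang` attribute are assumptions made for illustration, as is the function name `pairs_to_xml`. The sample sentences are illustrative and are not taken from the study's corpus.

```python
import xml.etree.ElementTree as ET


def pairs_to_xml(pairs, src_lang="en", tgt_lang="zh"):
    """Serialize aligned sentence pairs to an XML string.

    The element and attribute names are assumed for this sketch,
    not a schema confirmed by the study.
    """
    root = ET.Element("corpus")
    for idx, (src, tgt) in enumerate(pairs, start=1):
        unit = ET.SubElement(root, "unit", id=str(idx))
        ET.SubElement(unit, "seg", lang=src_lang).text = src
        ET.SubElement(unit, "seg", lang=tgt_lang).text = tgt
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(root, encoding="unicode")


# Illustrative pairs only.
pairs = [
    ("Cleared for take-off.", "可以起飞。"),
    ("Hold short of runway.", "跑道外等待。"),
]

xml_text = pairs_to_xml(pairs)
print(xml_text)

# Round trip: parse the string back and read the pairs out again.
parsed = ET.fromstring(xml_text)
recovered = [(unit[0].text, unit[1].text) for unit in parsed]
```

Building the tree with a library rather than string concatenation means characters such as `&` and `<` in the source text are escaped automatically.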
The teaching application chapter explores three uses of the large language model-assisted civil aviation English teaching corpus: terminology teaching, selection of teaching example sentences, and test question generation. In terminology teaching, combining large language models with the corpus addresses difficulties of traditional teaching such as "polysemy" and "synonym differentiation." Taking the term "clearance" as an example, the system analyzes the bilingual corpus and generates a visual teaching resource package containing contextual examples and collocation word clouds, which lowers the barrier for teachers to use the corpus. For example sentence selection, a "Three-Dimensional Filtering Model for Teaching Adaptability" screens sentences on three dimensions (sentence length, vocabulary difficulty, and domain relevance), overcoming the limitations of manual selection. In test question generation, large language models draw on the professional corpus to produce high-quality, multi-dimensional test questions, which eases teachers' workload in test design and improves the professional validity and coverage of assessments.
Through empirical experiments, the study confirms that large language models are efficient and accurate in constructing and applying civil aviation English teaching corpora, that they reduce construction costs, and that they support the development of various teaching application tools. The study has limitations, including its dependence on specific model versions and the absence of long-term classroom application. Future research should test more model versions and integrate intelligent tools into teaching practice for in-depth evaluation.