Establishing a COVID-19 lemmatized word list for journalists and ESP learners

Hadeel A Saed, Riyad F. Hussein, Ahmad S Haider, Saleh Al-Salman, Iyad M. Odeh


The aim of this research is twofold: first, to explore the most frequent COVID-19-inspired words in medical news reporting contexts, and second, to classify them into different categories. This paper adopts a corpus-based approach to build a lemmatized academic word list (AWL) inspired by the COVID-19 pandemic. Factiva was used to retrieve the pandemic-related articles published in News Rx from January 1 to October 31, 2020, and a corpus of 18,249,093 words was compiled. The corpus linguistic software program WordSmith (WS-6) (Scott, 2012) was used to generate a word list based on the compiled corpus. After the AWL was compiled, lemmatized, and analyzed, six categories were identified, namely acronyms and abbreviations, diseases, COVID-19, biology, medicine, and scientific disciplines, all of which are essential for media workers and ESP learners of journalism, medicine, nursing, pharmacy, and allied health sciences. Building such a discipline-specific glossary will be of special pedagogical value for health journalists, textbook writers, curriculum designers, instructors, and ESP learners in the health sciences field. One of the major contributions of this research is establishing lemmas for a large set of AWL items, which can be used by news media workers, health communication specialists, and ESP learners. Lemmatization will ensure rapid dissemination of the word list and its integration into the linguistic system through derivation and other word-formation processes.


COVID-19; ESP; journalism; lemmatization; pedagogy




Abu Melhim, A. (2013). Exploring the historical development of ESP and its relation to English language teaching today. European Journal of Social Sciences, 40(4), 615-627.

Akut, K. B. (2020). Morphological analysis of the neologisms during the COVID-19 pandemic. International Journal of English Language Studies, 2(3), 1-7.

Al-Abbas, L. S., & Haider, A. S. (2020). The representation of homosexuals in Arabic-language news outlets. Equality, Diversity and Inclusion: An International Journal, 1-29.

Al-Salman, S., & Haider, A. S. (2021a). COVID-19 trending neologisms and word formation processes in English. Russian Journal of Linguistics, 25(1).

Al-Salman, S., & Haider, A. S. (2021b). Jordanian university students’ views on emergency online learning during COVID-19. Online Learning, 25(1), 286-302.

Al-Salman, S., & Haider, A. S. (2021c). The representation of Covid-19 and China in Reuters’ and Xinhua’s headlines. Search (Malaysia), 13(1), 93-110.

Almahasees, Z., Mohsen, K., & Omer, M. (2021). Faculty’s and students’ perceptions of online learning during COVID-19. Frontiers in Education.

Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34(2), 213-238.

Crystal, D. (2008). A Dictionary of Linguistics and Phonetics (6th ed.). Blackwell Publishing.

Csomay, E., & Prades, A. (2018). Academic vocabulary in ESL student papers: A corpus-based study. Journal of English for Academic Purposes, 33, 100-118.

Durrant, P. (2016). To what extent is the Academic Vocabulary List relevant to university student writing? English for Specific Purposes, 43, 49-61.

Gablasova, D., Brezina, V., & McEnery, T. (2017). Collocations in corpus-based language learning research: Identifying, comparing, and interpreting the evidence. Language Learning, 67(S1), 155-179.

Gilmore, A., & Millar, N. (2018). The language of civil engineering research articles: A corpus-based approach. English for Specific Purposes, 51, 1-17.

Green, C. (2019). Enriching the academic wordlist and Secondary Vocabulary Lists with lexicogrammar: Toward a pattern grammar of academic vocabulary. System, 87, 1-10.

Green, C., & Lambert, J. (2018). Advancing disciplinary literacy through English for academic purposes: Discipline-specific wordlists, collocations and word families for eight secondary subjects. Journal of English for Academic Purposes, 35, 105-115.

Haider, A. S. (2019). Using corpus linguistic techniques in (Critical) discourse studies reduces but does not remove bias: Evidence from an Arabic corpus about refugees. Poznan Studies in Contemporary Linguistics, 55(1), 89-133.

Haider, A. S., & Al-Salman, S. (2020). Dataset of Jordanian university students’ psychological health impacted by using e-learning tools during COVID-19. Data in Brief, 32, 1-8.

Heidari, F., Jalilifar, A., & Salimi, A. (2020). Developing a corpus-based word list in pharmacy research articles: A focus on academic culture. International Journal of Society, Culture & Language, 8(1), 1-15.

Hsu, W. (2011). A business word list for prospective EFL business postgraduates. The Asian ESP Journal, 7(4), 63-99.

Hsu, W. (2014). Measuring the vocabulary load of engineering textbooks for EFL undergraduates. English for Specific Purposes, 33, 54-65.

Hunston, S. (2008). Collection strategies and design decisions. In A. Lüdeling & M. Kytö (Eds.), Corpus linguistics: An international handbook (Vol. 1, pp. 154-167). Mouton de Gruyter.

Hussein, R. F., Haider, A. S., & Al-Sayyed, S. (2021). A corpus-driven study of terms used to refer to articles and methods in research abstracts in the fields of economics, education, English literature, nursing, and political science. Journal of Educational and Social Research, 11(3), 119-131.

Johns, A. M. (2013). The history of English for specific purposes research. In B. Paltridge & S. Starfield (Eds.), The handbook of English for specific purposes (Vol. 5, pp. 1-47). Wiley-Blackwell.

Johns, A. M., & Dudley-Evans, T. (1991). English for specific purposes: International in scope, specific in purpose. TESOL Quarterly, 25(2), 297-314.

Liu, J., & Han, L. (2015). A corpus-based environmental academic word list building and its validity test. English for Specific Purposes, 39, 1-11.

Loong, Y. C., & Chan, L. (2012). A study of vocabulary learning strategies adopted by dentistry students in Hong Kong in learning specialized dental vocabulary. The Asian ESP Journal, 8, 28-49.

Masaya, K. (2020). Making a scientific research article word list. 言語教育研究, (30), 73-98.

Negro Alousque, I. (2016). Developments in ESP: from register analysis to a genre-based and CLIL-based approach. LFE. Revista de Lenguas para Fines Específicos, 22(1), 190-212.

Nekrasova‐Beker, T., Becker, A., & Sharpe, A. (2019). Identifying and teaching target vocabulary in an ESP course. TESOL Journal, 10(1), e00365.

Ramírez, C. G. (2015). English for specific purposes: Brief history and definitions. Revista de Lenguas Modernas, (23), 379-386.

Roig-Marín, A. (2020). English-based coroneologisms: A short survey of our Covid-19-related vocabulary. English Today, 1-3.

Sarré, C., & Whyte, S. (2017). New developments in ESP teaching and learning research. Research-publishing.net.

Scott, M. (2012). WordSmith tools version 6. In Lexical Analysis Software.

The United Nations. (2020). COVID-19: an unprecedented news story for journalists.

The World Health Organization. (2020). Tips for professional reporting on COVID-19 vaccines.

Tongpoon-Patanasorn, A. (2018). Developing a frequent technical words list for finance: A hybrid approach. English for Specific Purposes, 51, 45-54.

Wang, J., Liang, S.-l., & Ge, G.-c. (2008). Establishment of a medical academic word list. English for Specific Purposes, 27(4), 442-458.

Wang, P. (2017). A corpus-based study of English vocabulary in art research articles. Journal of Arts and Humanities, 6(8), 47-53.

West, M. (1953). A general service list of English words. Longman.

Williams, C. (2014). The future of ESP studies: building on success, exploring new paths, avoiding pitfalls. ASp. la revue du GERAS, (66), 137-150.

Yang, M.-N. (2015). A nursing academic word list. English for Specific Purposes, 37, 27-38.




This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.