CORPUS SELECTION AND DATA COLLECTION METHODS FOR ENGLISH AND UZBEK LANGUAGE DICTIONARIES

Authors

  • Jumaboyeva Dildora Munis kizi, Urgench Ranch Technological University

Keywords:

corpus linguistics, data collection, lexicography, bilingual dictionary, English, Uzbek, lexical resources, corpus design, language data, digital lexicography

Abstract

This article examines the principles and methods of corpus selection and data collection for developing English and Uzbek language dictionaries. It emphasizes that well-balanced, representative corpora are needed to ensure the accuracy, relevance, and usability of lexical entries. It also discusses sources of linguistic data, including written and spoken texts, electronic databases, and user-generated content. By comparing practices in the two languages, the paper identifies effective strategies for compiling bilingual lexicographic resources. The findings are intended to improve dictionary-making practices and to promote the integration of modern corpus linguistics into lexicography.

Published

2025-08-12

How to Cite

Jumaboyeva Dildora Munis kizi. (2025). CORPUS SELECTION AND DATA COLLECTION METHODS FOR ENGLISH AND UZBEK LANGUAGE DICTIONARIES. Ethiopian International Multidisciplinary Research Conferences, 150–153. Retrieved from https://eijmr.org/conferences/index.php/eimrc/article/view/1251