المرجع الالكتروني للمعلوماتية



corpus, plural corpora (n.)  
  
Date: 2023-07-28, 10:06 a.m.
Author : David Crystal
Book or Source : A Dictionary of Linguistics and Phonetics
Page and Part : 117-3




A collection of LINGUISTIC DATA, either written texts or a TRANSCRIPTION of recorded speech, which can be used as a starting-point of linguistic description or as a means of verifying hypotheses about a LANGUAGE (corpus linguistics). Linguistic DESCRIPTIONS which are ‘corpus-restricted’ have been the subject of criticism, especially by GENERATIVE GRAMMARIANS, who point to the limitations of corpora (e.g. that they are samples of PERFORMANCE only, and that one still needs a means of PROJECTING beyond the corpus to the language as a whole). In fieldwork on a new language, or in HISTORICAL study, it may be very difficult to get beyond one’s corpus (i.e. it is a ‘closed’ as opposed to an ‘extendable’ corpus), but in languages where linguists have regular access to NATIVE-SPEAKERS (and may be native-speakers themselves) their approach will invariably be ‘corpus-based’, rather than corpus-restricted. Corpora provide the basis for one kind of COMPUTATIONAL LINGUISTICS. A computer corpus is a large body of machine-readable texts. Increasingly large corpora (especially of English) have been compiled since the 1980s, and are used both in the development of natural language processing software and in such applications as lexicography, speech recognition, and machine translation.
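
As a rough illustration of how a machine-readable corpus can be queried, the sketch below counts word forms and builds keyword-in-context lines of the kind used in lexicography. It is a minimal example under stated assumptions, not a method described in the entry: the three-sentence corpus and the names `CORPUS`, `tokenize`, `word_frequencies`, and `concordance` are hypothetical, and the tokenizer is deliberately naive.

```python
import re
from collections import Counter

# Hypothetical toy corpus: three short texts standing in for a
# (normally far larger) body of machine-readable text.
CORPUS = [
    "The cat sat on the mat.",
    "A dog sat by the door.",
    "The cat saw the dog.",
]


def tokenize(text):
    """Lowercase a text and split it into word tokens (naive, ASCII only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(texts):
    """Count how often each word form occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, width=2):
    """Return keyword-in-context lines with up to `width` tokens per side."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append(f"{left} [{tok}] {right}".strip())
    return lines


freqs = word_frequencies(CORPUS)
sat_lines = concordance(CORPUS, "sat")
missing = concordance(CORPUS, "unicorn")  # empty: the word is absent from this sample
```

The empty result for a word that is not in the sample mirrors the limitation noted above: a corpus records only what happened to be produced, so absence from a closed corpus says nothing by itself about the language as a whole.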