Corpus Stylistics - HumBox

Outline:
- Background and introduction to current work
- Methodology in Corpus Stylistics
- Applications of Corpus Stylistics
- References

Background: What is Corpus Stylistics?
- The statistical study of style, i.e. the study of the relative frequency of elements in a text
  - Augustus de Morgan, 1851: disputes about the authenticity of some of the writings of St Paul could be settled by measuring the length of the words used in the various Epistles
  - T.C. Mendenhall, 1887: analysis of several authors' frequency distributions of word length
- Corpus: a body or collection of linguistic data for use in research
  - Since the early 1960s: interest in computer corpora, or machine-readable corpora
  - Statements about the relative frequency of various linguistic items in a corpus have become very accurate

Some uses of statistical analysis of style through corpora:
- Education, e.g. EFL textbook writing
- Establishment of authorship, e.g. of unascribed manuscripts
- Interpretive stylistics, e.g. study of the writer's ideology and point of view

Methodology
- Simple things may characterise different styles
  - average sentence length
  - average word length
  - type:token ratio (vocabulary richness)
    - number of types = number of different words
    - number of tokens = total number of words
  - vocabulary growth (homogeneity of text)
    - number of new types in the 1st, 2nd, ..., nth 1000 words
    - in a rich, varied text, the number will climb steadily
- Especially when used comparatively
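The simple measures listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regex tokenizer, the naive sentence splitter, and the function names `tokenize`, `split_sentences`, `basic_measures` and `vocabulary_growth` are hypothetical choices made for this example, not part of the original slides, and the input text is assumed to be non-empty.

```python
import re

def tokenize(text):
    """Lowercased word tokens from a deliberately simple regex (an assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def split_sentences(text):
    """Naive split after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def basic_measures(text):
    """Token/type counts, type:token ratio, and average word and sentence length."""
    tokens = tokenize(text)
    sentences = split_sentences(text)
    types = set(tokens)  # types = different words
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
        "avg_sentence_length": len(tokens) / len(sentences),
    }

def vocabulary_growth(tokens, chunk_size=1000):
    """Number of previously unseen types in each successive chunk of tokens."""
    seen = set()
    new_per_chunk = []
    for start in range(0, len(tokens), chunk_size):
        chunk = set(tokens[start:start + chunk_size])
        new_per_chunk.append(len(chunk - seen))
        seen |= chunk
    return new_per_chunk

sample = "The cat sat on the mat. The dog sat on the log."
measures = basic_measures(sample)
# A tiny chunk size is used here only because the sample is tiny.
growth = vocabulary_growth(tokenize(sample), chunk_size=6)
print(measures)
print(growth)
```

The type:token ratio falls as a text gets longer, so it is best compared across samples of equal length. `vocabulary_growth` reports new types per chunk; a running total of those values gives the steadily climbing curve mentioned above.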

Methodology (contd)
- More complex analyses can give a more interesting picture
  - specific syntactic structures
  - degree of modification in NPs
  - types of verbs (e.g. verbs of persuasion, speech verbs, action verbs, descriptive verbs)
  - distribution of pronouns (1st/2nd/3rd person)
  - etc. (anything you can think of!)
- Quite sophisticated mathematical techniques can give an overall picture
  - e.g. factor analysis: identifies, from a (big) range of variables, which ones best identify/characterise differences

Methodology (contd): Multidimensional analysis
- Collect a huge range of measures of a wide variety
  - some simple word counts
  - syntactic features
  - classes and subclasses of N, V, Adj, Adv
- Factor analysis
  - choose a range of features to measure, see which ones are correlated
  - 150 features in all

Methodology (contd)
- Example: work based on corpora trying to quantify and characterise genre and register differences
- Work pioneered by Douglas Biber*
- Biber used statistical measures to identify stylistic factors that co-occurred, and could therefore be definitional of text types and genres
- E.g. conjuncts like "therefore" and "nevertheless", together with use of the passive, indicate a more formal style
* D. Biber, S. Conrad & R. Reppen, Corpus Linguistics: Investigating Language Structure and Use, Ch. 5: The study of discourse characteristics

Methodology (contd)
- Corpora are useful not only for counting frequencies of features, but also for:
  - Concordancing
    - List occurrences of a word in context
    - Identify syntactic use of the word
    - Identify range of meanings
    - Identify relative frequency of different uses/meanings
  - Collocation
    - What words occur together?
    - Compare distribution of close synonyms

Methodology (contd): Vocabulary in context
- "Concordance", also known as a KWIC list (key word in context)
- Allows us to see the (immediate) environment in which a word appears
- Listings can be customised to show what you want more clearly, e.g.
  - sorted according to next or previous word
  - showing more or less context

Methodology (contd): Collocation
- Term coined by J R Firth (1957) to characterise (part of) his theory of meaning
  - "You shall know a word by the company it keeps"
- "The occurrence of two or more words within a short space of each other in a text" (Sinclair 1991)
- "The relationship a lexical item has with items that appear with greater than random probability in its (textual) context" (Hoey 1991)

Style and Corpora

Methodology (contd): Collocation, text type and style
- Example: distinguish between general, more usual collocations and technical or more personal ones
  - e.g. in a general corpus "time" collocates with save, spend, waste, fritter away
  - but in a corpus of sports reports "time" collocates with half, full, extra, injury, first, second, third, ...

Applications: Stylometry
- An attempt to capture the essence of the style of a particular author by reference to a variety of quantitative criteria, usually lexical, called discriminators
- Study of frequently occurring features (word/sentence length; choice and frequency of words; vocabulary richness)
- The ideal situation for authorship studies is when there are large amounts of undisputed text, or few contenders for the authorship of the disputed text(s)

Applications (contd): Author attribution
- Establishing the author of an unascribed manuscript:
  - Build corpora
    - A: works definitely by author A
    - B: works definitely by author B
    - C: works of disputed authorship, but probably written by A or B
  - Then select discriminants and associated measures
  - When the technique has been shown to discriminate effectively between A and B, try it on C
(M. Oakes, "Computational Stylometry", in Handbook of Corpus Linguistics)

Applications (contd): Language Learning
- Frequency, in particular word frequency, had a role in language learning in the days before electronic corpora existed
- The corpus revolution made frequency information about language use available in a totally unprecedented way
- Frequency dictionaries and frequency-based grammatical information are becoming more and more available, and new sources of frequency information from the Web are being tapped
- Various kinds of knowledge found in present-day language textbooks (grammatical, collocational, semantic) are becoming frequency-based
- In general, corpora represent real usage of language
- In addition, more frequent can equal "more important" in many aspects of language learning

Applications (contd): Interpretive stylistics
- Programmes like WordSmith Tools and other Windows-based applications allow researchers to derive a list of keywords (words which occur
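
Returning to the concordancing and collocation slides under Methodology: a KWIC listing and a window-based collocate count can be sketched as follows. This is a minimal illustration, not how WordSmith Tools or any other package works. The function names `kwic` and `collocates`, the regex tokenizer, and the sample sentences are all assumptions made for this example. The windows here ignore sentence boundaries, and the counts are raw co-occurrences; deciding whether a pairing is "greater than random probability" would need an association measure on top.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercased word tokens from a deliberately simple regex (an assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every occurrence."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

def collocates(tokens, node, span=2):
    """Count words occurring within `span` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Invented sample text, echoing the "time" example from the slides.
text = ("They waste time. We spend time wisely. "
        "Extra time was played. Injury time was short.")
tokens = tokenize(text)

hits = kwic(tokens, "time", width=2)
# Customise the listing: sort by the context that follows the key word.
hits_by_next = sorted(hits, key=lambda h: h[2])
for left, kw, right in hits_by_next:
    print(f"{left:>15}  {kw}  {right}")

neighbours = collocates(tokens, "time", span=1)
print(neighbours.most_common())
```

Changing `width` shows more or less context, and sorting on `h[0]` instead of `h[2]` orders the listing by the preceding words.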
