Tuesday, October 30, 2018

Week 5: Using Corpus Analysis Software to Analyse Specialised Texts

1. What is a corpus? 
A corpus can be generally defined as a collection of naturally-occurring texts in a computer-readable format which can be retrieved and analyzed using corpus analysis software.

2. Sources of language corpora 
- Subscribe to a large corpus provider such as the British National Corpus (BNC) 
- Use web concordancing (for instance http://corpus.leeds.ac.uk or http://corpus.byu.edu/)
- Compile your own corpora and analyze the data using corpus analysis software such as ‘AntConc’, ‘WordSmith’ or ‘ParaConc’.
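
To give a feel for what concordancing does, here is a minimal keyword-in-context (KWIC) sketch in Python. It is only an illustration, not how AntConc or the web concordancers above are implemented; the function name `kwic`, the `width` parameter and the sample sentence are all made up for this example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per match of keyword in text."""
    lines = []
    # Match the keyword as a whole word, ignoring case (an assumption of this sketch).
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines

# Hypothetical sample text, used only to demonstrate the function.
sample = "A corpus is a collection of texts. Each corpus can be analyzed with software."
for line in kwic(sample, "corpus", width=20):
    print(line)
```

Each printed line shows one occurrence of the search word with its surrounding context, which is the basic view a concordancer gives you.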

3. Designing a specialized corpus 
- Corpus size
- Text extracts vs. full texts
- Number of texts
- Medium
- Subject and text type
- Other considerations

4. Sources of specialized texts
- Printed materials
- Word document texts
- CD-ROMs
- Text on the Web
- Online databases

5. Getting started with AntConc
Download the latest version and watch YouTube tutorials from 
http://www.antlab.sci.waseda.ac.jp/antconc_index.html

6. Creating a specialized corpus profile
- Size: 96,736 words
- Source of corpus data: From the Internet (anybookfree.com/series/book/the-hunger-games)
- Number of texts: 25 texts
- Medium: Spoken
- Subject: The Hunger Games series
- Text type: New article
- Authorship:
- Language: Texts written in English, mostly by native speakers
- Publication date: Recent texts (retrieved in August 2018)
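
The size and number of texts in a profile like this can be checked with a short script. The following is a rough sketch in Python, assuming the corpus is saved as plain-text `.txt` files in a single folder; the function names and the tokenization rule are my own assumptions, and tools such as AntConc may count words differently.

```python
import re
from pathlib import Path

def count_words(text):
    """Count words, treating any run of letters, digits or apostrophes as one word."""
    # This is a simple assumption; other tools may tokenize differently.
    return len(re.findall(r"[A-Za-z0-9']+", text))

def profile_corpus(folder):
    """Return the number of .txt files in folder and their total word count."""
    files = sorted(Path(folder).glob("*.txt"))
    total = sum(count_words(f.read_text(encoding="utf-8")) for f in files)
    return {"number_of_texts": len(files), "size_in_words": total}
```

For example, calling `profile_corpus("my_corpus")` on a hypothetical folder named `my_corpus` returns a dictionary with the two figures, which can then be compared with what the corpus analysis software reports.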
