Using Corpus Analysis Software to Analyse Specialised Texts
1. What is a corpus?
A corpus can be generally defined as a collection of
naturally-occurring texts in a computer-readable format which can be retrieved
and analyzed using corpus analysis software.
2. Sources of language corpora
- Subscribe to a large corpus provider such as the British
National Corpus (BNC)
- Use web concordancing (for instance http://corpus.leeds.ac.uk, http://corpus.byu.edu/)
- Compile your own corpora and analyze the data using corpus analysis
software such as AntConc, WordSmith Tools or ParaConc.
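To make the idea of concordancing concrete, the following is a minimal sketch of a keyword-in-context (KWIC) search, the kind of display a concordancer produces. It is not how AntConc or any other tool is implemented; the function name `kwic`, the `width` parameter and the sample sentence are assumptions made for illustration only.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of `keyword`.

    Hypothetical helper: matches whole words, ignoring case, and shows
    up to `width` characters of context on each side.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines


# Illustrative sample text (not taken from any real corpus).
sample = (
    "A corpus is a collection of texts. "
    "Each corpus can be analyzed with software."
)

for line in kwic(sample, "corpus", width=20):
    print(line)
```

Each printed line centres one occurrence of the search word, which makes recurring patterns to its left and right easier to spot.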
3. Designing a specialized corpus
- Corpus size
- Text extracts vs. full texts
- Number of texts
- Medium
- Subject and text type
- Other considerations
4. Sources of specialized texts
- Printed materials
- Word document texts
- CD-ROMs
- Text on the Web
- Online databases
5. Getting started with AntConc
Download the latest version and watch the YouTube tutorials from
http://www.antlab.sci.waseda.ac.jp/antconc_index.html
6. Creating a specialized corpus profile
Size | 96,736 words
Source of corpus data | From the internet (anybookfree.com/series/book/the-hunger-games)
Number of texts | 25 texts
Medium | Spoken
Subject | The Hunger Games series
Text type | New article
Authorship |
Language | Texts written in English, mostly by native speakers
Publication date | Recent texts (retrieved in August 2018)
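The size and number-of-texts entries of a profile like the one above can be checked with a short script. The following is a rough sketch, assuming the corpus is stored as plain-text `.txt` files in a single folder; the function names, the token pattern and the demo files are all hypothetical. Word counts depend on how a "word" is defined, so the totals may differ slightly from those reported by AntConc or other tools.

```python
import re
import tempfile
from pathlib import Path

# Assumed token definition: runs of letters, optionally joined by an
# apostrophe or hyphen (e.g. "it's", "well-known"). Other tools may differ.
TOKEN = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")


def count_words(text):
    """Count word tokens in a string using the pattern above."""
    return len(TOKEN.findall(text))


def profile_corpus(folder):
    """Return the number of .txt files in `folder` and their word counts."""
    files = sorted(Path(folder).glob("*.txt"))
    words_per_text = {
        f.name: count_words(f.read_text(encoding="utf-8")) for f in files
    }
    return {
        "number_of_texts": len(words_per_text),
        "size_in_words": sum(words_per_text.values()),
        "words_per_text": words_per_text,
    }


# Demo on two throwaway files, so the sketch runs without a real corpus.
with tempfile.TemporaryDirectory() as demo_folder:
    Path(demo_folder, "text01.txt").write_text(
        "A first short text.", encoding="utf-8"
    )
    Path(demo_folder, "text02.txt").write_text(
        "A second, slightly longer text.", encoding="utf-8"
    )
    demo_profile = profile_corpus(demo_folder)
    print("Number of texts:", demo_profile["number_of_texts"])
    print("Size in words:", demo_profile["size_in_words"])
```

To profile a real corpus, pass the path of the folder holding the text files to `profile_corpus` in place of the temporary demo folder.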