CORPORA
Currently a loosely knit index of websites especially relevant to building and designing corpora.
Initiatives and Organizations
Center for Turkish Language and Speech Processing
Corpus Encoding Standard Page
Text Encoding Initiative Page
ELSNET
Trans-European Language Resources Infrastructure (TELRI)
UCREL at University of Lancaster
ACL SIGLEX home page
ELRA homepage
EAGLES Home
Tuscan Word Centre
Columbia Annotation Repository Project
Personal Homepages and Indices
Tony Berber Sardinha's Home Page
(Courses, Bibliography, Links etc.)
Barlow's Corpus Linguistics Page
Statistical NLP and Corpus Linguistics Links
Corpora List's Archive
(Useful discussions, other links)
Individual Corpora
Linguistic Data Consortium's Annotations Page
ICE-GB Corpora of Adult British English (Demo and Software)
British National Corpus
NEGRA Corpus
Cambridge International Corpus (CIC)
A Compendium of Multilingual Language Resources
Contemporary Turkish Literature
Speech Corpora
TalkBank
Course Pages
Chris Manning's Statistical NLP Page
Nivre's Statistical NLP Course at Göteborg University
Data-Intensive Linguistics Course Page
General Corpus Linguistics Page
Software (Textual Analysers, Taggers, etc.)
Alembic Workbench
QTAG Tagger
WinBrill for Windows 95/98
Constraint Grammar Tagger
CLAWS Tagset and Tagger
TnT Tagger
IMS Corpus Workbench
AUTASYS Tagger and Lemmatiser
Quirk Textual Analysis System
WordSmith Tools
Oxford Text Archive's Software Links
(SGML-XML editors, etc.)
and more
Web Robots for collecting texts from the web
Ainat's Information Extraction and Retrieval on the Web Site
Peter Ruthven-Stuart's Pages
Bruce Guthrie's Utilities
Conferences
Linguistically Interpreted Corpora Workshop (EACL 99)
METU Turkish Corpus Project Page
Morphologically and Syntactically Annotated Treebank Corpus Project Page