ABSTRACT
Tagging and Parsing Linguistic Corpora for Language and Speech Applications
Alex Chengyu Fang, Department of Phonetics and Linguistics, University College London
Linguistic "corpora" are collections of texts organised, and often
analysed, according to the specifics of their intended applications.
They have been applied extensively in cognitive psychology and in
linguistics, e.g. to show what usages and idioms actually exist in
language used in different places (such as the different countries
where English is an official language) or by different groups.
Before computer programs can exploit corpora for practical applications,
e.g. support for translation or language teaching, the texts must be
preprocessed to add linguistic information (part-of-speech tags, other
information derived from parsing and even semantic information).
The seminar will introduce how this can be done automatically, using
knowledge extracted from corpora at lexical, grammatical and syntactic
levels, and will summarise the state of the art in such computing.
Open research issues, and possible future applications of corpora (the
two topics are linked), will also be described.