Automatic Vocabulary Indexing of Large Text Corpora: The Example of the DWDS
Abstract
In recent years, a large number of electronic text corpora for German have been created, owing to
the increased availability of electronic resources. Filtering the lexical material in these
corpora appropriately is a particular challenge for computational lexicography, since machine-readable lexicons alone
are insufficient for systematic classification. In this paper we show, on the basis of the corpora of
the DWDS, how lexical knowledge can be classified in a more fine-grained way using morphological and
shallow syntactic parsing methods. One result of this analysis is that the number of distinct lemmas
contained in the corpora is several times larger than the number of distinct headwords in current large monolingual German
dictionaries.