The Role of Algorithm Bias vs Information Source in Learning Algorithms for Morphosyntactic Disambiguation

Morphosyntactic disambiguation (part-of-speech tagging) is a useful benchmark problem for system comparison because it is typical of a large class of Natural Language Processing (NLP) problems that can be defined as disambiguation in local context. This paper adds to the literature on the systematic and objective evaluation of different methods for automatically learning this type of disambiguation problem. We systematically compare two inductive learning approaches to tagging: MXPOST (based on maximum entropy modeling) and MBT (based on memory-based learning). We investigate the effect of different sources of information on accuracy when the two approaches are compared under the same conditions. Results indicate that earlier observed differences in accuracy can be attributed largely to differences in the information sources used, rather than to algorithm bias.


Guy De Pauw and Walter Daelemans, The Role of Algorithm Bias vs Information Source in Learning Algorithms for Morphosyntactic Disambiguation. In: Proceedings of CoNLL-2000 and LLL-2000, Lisbon, Portugal, 2000.