
Named Entity Recognition Using a Character-based Probabilistic Approach

We present a named entity recognition and classification system that uses only probabilistic character-level features. Classifications by multiple orthographic tries are combined in a hidden Markov model framework to incorporate both internal and contextual evidence. As part of the system, we perform a preprocessing stage in which capitalisation is restored to sentence-initial and all-caps words with high accuracy. We report F-values of 86.65 and 79.78 for the English datasets, and 50.62 and 54.43 for the German datasets.
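
To make the idea of a character-level trie classifier concrete, the following is a minimal sketch only, not the authors' implementation. It assumes a prefix trie whose nodes store entity-class counts for the training words that pass through them, and it estimates class probabilities for a new word from the deepest node that word reaches. The names CharTrie, add, and class_probs, the toy training words, and the labels are all hypothetical; the abstract does not specify how the tries are built or how their outputs feed the hidden Markov model, so that combination step is not shown.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}       # next character -> TrieNode
        self.counts = Counter()  # class label -> number of training words seen here


class CharTrie:
    """Prefix trie whose nodes hold class counts (illustrative assumption)."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, word, label):
        # Count the label at the root and at every prefix node of the word.
        node = self.root
        node.counts[label] += 1
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            node.counts[label] += 1

    def class_probs(self, word):
        # Follow the word as far as the trie allows; the deepest node
        # reached supplies the relative-frequency estimate.
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                break
            node = nxt
        total = sum(node.counts.values())
        if total == 0:
            return {}
        return {label: n / total for label, n in node.counts.items()}


# Toy training data (hypothetical, for illustration only).
trie = CharTrie()
for word, label in [("Edmonton", "LOC"), ("Edinburgh", "LOC"),
                    ("Edward", "PER"), ("the", "O")]:
    trie.add(word, label)

probs = trie.class_probs("Edmund")  # longest shared prefix is "Edm"
```

In this sketch an unseen word is classified purely from the characters it shares with the training words, which is the sense in which the evidence is internal to the word; contextual evidence would have to come from a sequence model over neighbouring words.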


Casey Whitelaw and Jon Patrick, Named Entity Recognition Using a Character-based Probabilistic Approach. In: Proceedings of CoNLL-2003, Edmonton, Canada, 2003, pp. 196-199.
Last update: June 11, 2003. erikt@uia.ua.ac.be