
Inducing translation lexicons via diverse similarity measures and bridge languages

This paper presents a method for inducing translation lexicons between two distant languages without the need for either parallel bilingual corpora or a direct bilingual seed dictionary. The algorithm successfully combines several similarity measures: temporal occurrence similarity across dates in news corpora, wide and local cross-language context similarity, weighted Levenshtein distance, relative frequency, and burstiness. These measures are integrated with the bridge language concept under a robust method of classifier combination, and the approach is applied to both the Slavic and Northern Indian language families.
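
As an illustration of one of the measures named above, a weighted Levenshtein distance can be sketched as follows. This is a minimal sketch of the general technique, not the authors' implementation: the abstract does not give the weights used, so the cost function `sub_cost` (cheaper vowel-for-vowel substitutions), the `VOWELS` set, and the uniform `indel` cost are all hypothetical choices made for the example.

```python
from typing import Callable

# Hypothetical character class for the example; not taken from the paper.
VOWELS = set("aeiou")


def sub_cost(a: str, b: str) -> float:
    """Assumed substitution cost: vowel-for-vowel swaps cost less."""
    if a == b:
        return 0.0
    if a in VOWELS and b in VOWELS:
        return 0.5
    return 1.0


def weighted_levenshtein(
    s: str,
    t: str,
    sub: Callable[[str, str], float] = sub_cost,
    indel: float = 1.0,
) -> float:
    """Edit distance between s and t with a pluggable substitution cost."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [j * indel for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [i * indel]
        for j, b in enumerate(t, start=1):
            curr.append(
                min(
                    prev[j] + indel,          # delete a from s
                    curr[j - 1] + indel,      # insert b into s
                    prev[j - 1] + sub(a, b),  # substitute a with b (or match)
                )
            )
        prev = curr
    return prev[-1]
```

A lower distance between two spellings would count as evidence that the words are related. In a combined system, a score like this would be only one input among the several measures the abstract lists.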


Charles Schafer and David Yarowsky, Inducing translation lexicons via diverse similarity measures and bridge languages. In: Dan Roth and Antal van den Bosch (eds.), Proceedings of CoNLL-2002, Taipei, Taiwan, 2002, pp. 146-152.
Last update: September 07, 2002. erikt@uia.ua.ac.be