Modeling Category Structures with a Kernel Function

We propose a TOP (Tangent vector Of the Posterior log-odds) kernel and apply it to text categorization. In many categorization tasks, including text categorization, negative examples are usually more common than positive examples, and there may be several different types of negative examples. We therefore construct a TOP kernel in which the probabilistic model of negative examples is a mixture of component models, each corresponding to one of the given categories. Since each component model of the mixture is expressed using a one-dimensional Gaussian-type function, the proposed kernel has an advantage in computation time. We also show that this computational advantage is shared by a more general class of models. In our experiments, the proposed kernel used with Support Vector Machines outperformed both the linear kernel and the Fisher kernel based on the Probabilistic Latent Semantic Indexing model.
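
To make the construction concrete, the following is a minimal toy sketch of a TOP-style kernel and not the model used in the paper. It assumes the usual reading of the TOP kernel as an inner product of feature vectors made of the posterior log-odds and its gradient with respect to the model parameters. Everything else is a hypothetical simplification for illustration: scalar inputs, unit-variance Gaussians, equal class priors, equal mixture weights, means as the only parameters, a numerical gradient, and the names `gaussian`, `log_odds`, `top_features` and `top_kernel`.

```python
import math


def gaussian(x, mean, var=1.0):
    """Density of a one-dimensional Gaussian (unit variance by assumption)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_odds(x, theta):
    """Posterior log-odds of the positive class under equal class priors.

    theta[0] is the mean of the positive model; theta[1:] are the means of
    the components of the negative mixture (equal weights, an assumption).
    """
    pos_mean, neg_means = theta[0], theta[1:]
    p_pos = gaussian(x, pos_mean)
    p_neg = sum(gaussian(x, m) for m in neg_means) / len(neg_means)
    return math.log(p_pos) - math.log(p_neg)


def top_features(x, theta, eps=1e-5):
    """Feature vector: the log-odds followed by its gradient w.r.t. theta.

    The gradient is approximated by central differences to keep the sketch
    short; an analytic gradient would normally be used.
    """
    feats = [log_odds(x, theta)]
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        feats.append((log_odds(x, up) - log_odds(x, down)) / (2.0 * eps))
    return feats


def top_kernel(x1, x2, theta):
    """Kernel value: inner product of the two feature vectors."""
    f1 = top_features(x1, theta)
    f2 = top_features(x2, theta)
    return sum(a * b for a, b in zip(f1, f2))


# Hypothetical parameters: one positive mean, two negative component means.
theta = [2.0, -1.0, -3.0]
k = top_kernel(1.0, 0.5, theta)
```

In this toy setting each mixture component contributes one gradient entry, so the feature vector has one entry for the log-odds plus one per parameter. A kernel matrix built this way could be passed to any SVM implementation that accepts precomputed kernels.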

Hiroya Takamura, Yuji Matsumoto and Hiroyasu Yamada, Modeling Category Structures with a Kernel Function. In: Proceedings of CoNLL-2004, Boston, MA, USA, 2004, pp. 57-64.

Last update: May 13, 2003. erikt@uia.ua.ac.be