Previous abstract | Contents | Next abstract
We present a novel approach to the problem of overfitting in the training of stochastic models for selecting parses generated by attribute-valued grammars. In this approach, statistical features are merged according to the frequency of linguistic elements within the features. The resulting models are more general than the original models, and contain fewer parameters. Empirical results from the task of parse selection suggest that the improvement in performance over repeated iterations of iterative scaling is more reliable with such generalized models than with ungeneralized models.