In the early 1990s, a paradigm shift (a real one) occurred in computational linguistics. Until then, the accepted methodology was to handcraft computational models of language processing tasks. These models were mostly rule-based and inspired by linguistic theories. Their problem was that they were brittle, had limited coverage, and were difficult to build (the knowledge acquisition bottleneck). With the statistical revolution, the field switched to a methodology in which computational models are induced, using machine learning methods, from corpora annotated with linguistic representations. We found Memory-based learning, an extension of k-NN pattern matching techniques, to be well suited to language data with its many subregularities and exceptions. The inductive approach is not without its own problems. Comparing different learning algorithms, or measuring the importance of specific annotations for solving a language processing task, turns out to be difficult; our work on machine learning methodology addresses this problem. The knowledge acquisition bottleneck has been replaced by a corpus annotation bottleneck, which may be alleviated by techniques like Active Learning. Statistically trained models are brittle as well, in that they overfit their training data and show sharp drops in accuracy when applied to domains other than those they were trained on. We therefore need automatic Domain Adaptation methods. Finally, we did research on Ensemble Methods, which combine different machine-learned classifiers through classifier combination or metalearning.
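As a rough illustration of the k-NN idea behind memory-based learning, the sketch below stores all training examples and classifies a new instance by the majority class of its nearest neighbours under a simple overlap distance. It is a simplified toy and not TiMBL: the function names, the feature layout, and the example data are all assumptions made for this illustration, and feature weighting is left out.

```python
from collections import Counter

def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def classify(memory, instance, k=1):
    """Return the majority class among the k nearest stored examples."""
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical memory: (previous tag, word suffix, next tag) -> tag of the word.
MEMORY = [
    (("DET", "ing", "NOUN"), "ADJ"),
    (("DET", "ing", "VERB"), "NOUN"),
    (("PRON", "ing", "DET"), "VERB"),
    (("PRON", "ed", "DET"), "VERB"),
    (("DET", "ed", "NOUN"), "ADJ"),
]

if __name__ == "__main__":
    # An unseen combination is still classified, by analogy with stored cases.
    print(classify(MEMORY, ("PRON", "ed", "NOUN"), k=3))
```

Because nothing is abstracted away at training time, exceptional cases stay in memory and can still serve as nearest neighbours for similar new instances.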
Walter Daelemans and Antal van den Bosch. `Memory-based learning.' In: A. Clark, C. Fox, and S. Lappin (Eds.) Handbook of Computational Linguistics and Natural Language Processing, Oxford, UK, Wiley-Blackwell Publishers, 154-179, 2010. [pdf]
Walter Daelemans, Jakub Zavrel, Ko van der Sloot, and Antal van den Bosch, TiMBL: Tilburg Memory Based Learner, version 6.3, Reference Guide. ILK & CLiPS Research Groups Technical Report Series no. 10-01, 66 pages, 2010. [pdf]
Walter Daelemans and Antal van den Bosch. Memory-Based Language Processing, Cambridge, UK: Cambridge University Press, 2005. [Book website]
Daelemans, Walter, Memory-Based Language Processing. Introduction to the Special Issue. In: Journal of Experimental and Theoretical AI (JETAI), 11:3, 1999. [ps]
Daelemans, Walter, Antal van den Bosch, and Jakub Zavrel, Forgetting exceptions is harmful in language learning. In: Machine Learning, 34:1/3, 1999, pp. 11-43. [ps]
Daelemans, Walter, Antal van den Bosch, and Ton Weijters. "IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms." In D. Aha (ed.), Artificial Intelligence Review, special issue on Lazy Learning, 1997. [ps]
Daelemans, W., A. van den Bosch, and J. Zavrel. "A Feature-Relevance Heuristic for Indexing and Compressing Large Case Bases." In M. van Someren and G. Widmer (eds.) 9th European Conference on Machine Learning - Poster Papers. Prague: Laboratory of Intelligent Systems, 29-38, 1997. [ps]
Zavrel, Jakub and Walter Daelemans. "Memory-Based Learning: Using Similarity for Smoothing." In Proceedings of 35th Annual Meeting of the ACL, Madrid, Spain, July 1997. [ps]
Daelemans, Walter. "Memory-Based Lexical Acquisition and Processing." In P. Steffens (ed.) Machine Translation and the Lexicon, Springer Lecture Notes in Artificial Intelligence 898, 85-98, 1995. [ps]
Daelemans, Walter, and Antal van den Bosch. "Generalization Performance of Backpropagation Learning on a Syllabification Task." In M.F.J. Drossaers and A. Nijholt (eds.) Connectionism and Natural Language Processing. Proceedings Third Twente Workshop on Language Technology, 27-38, 1992. [ps]
Hoste Véronique, Daelemans Walter, `Comparing learning approaches to language learning: there is more to it than bias.' In: Proceedings of Benelearn 2006, Ghent, Belgium, 131-138, 2006. [pdf]
Daelemans, Walter, Véronique Hoste, Fien De Meulder and Bart Naudts, `Combined Optimization of Feature Selection and Algorithm Parameter Interaction in Machine Learning of Language.' Proceedings of the 14th European Conference on Machine Learning (ECML-2003), Lecture Notes in Computer Science 2837, Springer-Verlag, Cavtat-Dubrovnik, Croatia, 84-95, 2003. [pdf]
Daelemans, Walter and Hoste, Véronique, `Evaluation of Machine Learning Methods for Natural Language Processing Tasks.' Proceedings of LREC-2002, the third International Conference on Language Resources and Evaluation, Las Palmas, Spain, 755-760, 2002. [pdf]
Kool, Anne, Walter Daelemans, and Jakub Zavrel. Genetic Algorithms for Feature Relevance Assignment in Memory-Based Language Processing. In: Proceedings of CoNLL-2000. [ps]
De Pauw, Guy, and Walter Daelemans, The Role of Algorithm Bias vs Information Source in Learning Algorithms for Morphosyntactic Disambiguation. Proceedings of the Fourth Conference on Computational Language Learning (CoNLL-2000), Lisbon, Portugal, 19-24, 2000. [ps]
Halteren, Hans van, Jakub Zavrel, Walter Daelemans, Improving accuracy in word class tagging through combination of machine learning systems. Computational Linguistics 27 (2), 199-230, 2001. (Preprint). [ps]
Hoste, Veronique, Anne Kool, and Walter Daelemans. `Classifier Optimization and Combination in the English All Words Task.' In: Judita Preiss and David Yarowsky (eds.), Proceedings of SENSEVAL-2. Second International Workshop on Evaluating Word Sense Disambiguation Systems. New Brunswick: ACL, 83-86, 2001. [ps]
Hoste, Véronique and Walter Daelemans, Comparing bagging and boosting for natural language processing tasks: a typicality approach. In: Ad Feelders (ed.), Proceedings of the Tenth Belgian-Dutch Conference on Machine Learning (Benelearn 2000), pp. 101-108, 2000. [ps]
Hoste, Véronique, Walter Daelemans, Erik Tjong Kim Sang and Steven Gillis, Meta-Learning for Phonemic Annotation of Corpora. In: Pat Langley (ed.), Proceedings of ICML-2000, Stanford University, CA, USA, pp. 375-382. [ps]
Tjong Kim Sang, Erik F., Walter Daelemans, Hervé Déjean, Rob Koeling, Yuval Krymolowski, Vasin Punyakanok and Dan Roth, Applying System Combination to Base Noun Phrase Identification. In: Proceedings of COLING 2000, Saarbruecken, Germany. [ps]
Zavrel, J., S. Degroeve, A. Kool, W. Daelemans, K. Jokinen, Diverse classifiers for NLP disambiguation tasks. Comparisons, Optimization, Combination, and Evolution. In: Jokinen et al. (eds.), TWLT 18. Learning to Behave. CEvoLE 2, Ieper, Belgium, p. 201-221, 2000. [ps]
Zavrel, Jakub, and Walter Daelemans, Bootstrapping a Tagged Corpus through Combination of Existing Heterogeneous Taggers. In: Proc. of the 2nd International Conference on Language Resources and Evaluation (LREC-2000), Athens, Greece, 31 May - 2 June, 2000. [ps]
Van Halteren, Hans, Jakub Zavrel, and Walter Daelemans. "Improving data driven wordclass tagging by system combination." In Proceedings of COLING-ACL '98, August 1998, Montreal, Canada, pp. 491-497. [ps]
Vincent Van Asch and Walter Daelemans. ‘Using domain similarity for performance estimation.’ Proceedings of the ACL 2010 Workshop on Domain Adaptation for Natural Language Processing (DANLP), ISBN 978-1-932432-80-0, Uppsala, Association for Computational Linguistics, p. 31-36, 2010. [pdf]
Daelemans W., Groenewald H.J. and van Huyssteen G.B. ‘Prototype-based active learning for lemmatization’, Proceedings of the 7th International Conference on Recent Advances in Natural Language Processing (RANLP), Borovec, Bulgaria, p. 65–70, 2009. [pdf]
Antal van den Bosch and Walter Daelemans, `Improving sequence segmentation learning by predicting trigrams.' In: Proceedings of the Ninth Conference on Natural Language Learning, CoNLL-2005, June 29-30, 2005, Ann Arbor, MI, 80-87, 2005. [pdf]
Daelemans Walter, `A mission for computational natural language learning.' In: Proceedings of the Tenth Conference on Natural Language Learning (CoNLL-X), New York City, USA, June 8-9, 1-5, 2006. [pdf]
Daelemans, Walter `Machine Learning of Natural Language'. In: G. Altmann, R. Koehler, and R. Piotrowski (eds.) Quantitative Linguistics. An international handbook, Berlin and New York: Walter De Gruyter, 821-833, 2005. [pdf]
Daelemans, Walter, Antal van den Bosch, Jakub Zavrel, Jorn Veenstra, Sabine Buchholz, and Bertjan Busser. "Rapid development of NLP modules with memory-based learning." In Proceedings of ELSNET in Wonderland, pp. 105-113. Utrecht: ELSNET, 1998. Also in R. Basili and M.T. Pazienza (Eds.), ECML-98 TANLPS Workshop Notes, Technische Universitaet Chemnitz, 1998, pp. 1-17. [ps]
One design principle that has survived the statistical revolution is the use of cascaded, pipelined, or heterarchically organized systems of modules, each of which performs a specific language processing task and can be reused in larger text analysis architectures. Although direct text-to-text transformation has become a serious option in inductive approaches to CL (statistical machine translation being the most impressive example), most applications still use a modular approach. We have applied memory-based learning and many other machine learning methods to a wide range of modules in phonology, morphology, syntax, semantics, and discourse. We studied the issue of modularity (how many and which modules are necessary) in the context of grapheme-to-phoneme conversion and (shallow) parsing.
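The cascaded design can be sketched as a chain of functions, each adding one layer of analysis to a shared document record. The module names, the toy lexicon, and the chunking rule below are hypothetical placeholders chosen for illustration; in a real system each module would typically be a trained classifier.

```python
from functools import reduce

def tokenize(doc):
    """Split the raw text into whitespace-separated tokens."""
    return {**doc, "tokens": doc["text"].split()}

def tag(doc):
    """Assign a part-of-speech tag to each token from a toy lexicon."""
    lexicon = {"the": "DET", "cat": "NOUN", "sleeps": "VERB"}  # illustrative only
    return {**doc, "tags": [lexicon.get(t.lower(), "UNK") for t in doc["tokens"]]}

def chunk(doc):
    """Group uninterrupted runs of DET/NOUN tokens into noun-phrase chunks."""
    chunks, current = [], []
    for token, pos in zip(doc["tokens"], doc["tags"]):
        if pos in {"DET", "NOUN"}:
            current.append(token)
        elif current:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return {**doc, "np_chunks": chunks}

def run_pipeline(text, modules):
    """Feed the output of each module into the next one."""
    return reduce(lambda doc, module: module(doc), modules, {"text": text})

if __name__ == "__main__":
    print(run_pipeline("The cat sleeps", [tokenize, tag, chunk]))
```

Since every module reads and extends the same record, a module can be replaced or reused in another architecture without touching the rest of the cascade.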
Véronique Hoste, Walter Daelemans and Steven Gillis. `Using rule-induction techniques to model pronunciation variation in Dutch', Computer Speech and Language 18(1), 2004, 1-23. (preprint). [pdf]
Decadt Bart, Jacques Duchateau, Walter Daelemans, and Patrick Wambacq. `Memory-Based Phoneme-to-Grapheme Conversion.' In: M. Theune, A. Nijholt, and H. Hondorp (eds.) Computational Linguistics in the Netherlands 2001. Selected Papers from the Twelfth CLIN Meeting, Amsterdam - New York: Rodopi, 47-61, 2002. [ps]
Decadt, Bart, Jacques Duchateau, Walter Daelemans, and Patrick Wambacq, `Transcription of Out-of-vocabulary Words in Large Vocabulary Speech Recognition based on Phoneme-to-grapheme Conversion'. Proceedings of ICASSP-02, the International Conference on Acoustics, Speech and Signal Processing, Volume I, Orlando, USA, 861-864, 2002. [ps]
Daelemans, Walter and Antal van den Bosch. `TreeTalk: Memory-Based Word Phonemisation.' In: R. I. Damper (Ed.) Data-Driven Techniques in Speech Synthesis. Kluwer Academic Publishers, 149-172, 2001. [ps]
Hoste, Véronique, Walter Daelemans, and Steven Gillis, A Rule Induction Approach to Modeling Regional Pronunciation Variation. In: Proceedings of COLING 2000, Saarbrücken, Germany. San Francisco: Morgan Kaufmann Publishers, 2000, pp. 327-333. [ps]
Daelemans, Walter, and Antal van den Bosch. "Language-Independent Data-Oriented Grapheme-to-Phoneme Conversion." In Van Santen, J., R. Sproat, J. Olive, and J. Hirschberg, Progress in Speech Synthesis. New York: Springer Verlag, 77-90, 1996. [ps]
Van den Bosch, Antal, and Walter Daelemans. "Data-Oriented Methods for Grapheme-to-Phoneme Conversion." In Proceedings of the Sixth conference of the European chapter of the ACL, ACL, 45-53, 1993. [ps]
Walter Daelemans. `POS Tagging.' In: Claude Sammut and Geoffrey I. Webb (eds.) Encyclopedia of Machine Learning. Springer Verlag, Heidelberg, 2010. ISBN: 978-0-387-30768-8
Daelemans, Walter, Jakub Zavrel, Antal van den Bosch and Ko van der Sloot. `MBT: Memory Based Tagger, version 3.2, Reference Guide.' ILK Research Group and CLiPS Technical Report Series 10-04, 15 pages, 2010. [pdf]
Antal van den Bosch, Bertjan Busser, Sander Canisius, and Walter Daelemans. `An efficient memory-based morphosyntactic tagger and parser for Dutch.' In: Peter Dirix et al. (eds.) Computational Linguistics in the Netherlands 2006. Selected papers from the seventeenth CLIN meeting. LOT, Utrecht, 191-206, 2007. [pdf]
Van Eynde, Frank, Jakub Zavrel and Walter Daelemans, Part of speech tagging and lemmatisation for the Spoken Dutch Corpus. In: Proceedings of the second international conference on language resources and evaluation (LREC-2000). Athens, Greece, 2000, pp. 1427-1434. [ps]
Daelemans, Walter, Machine Learning Approaches. In: Hans van Halteren (ed.), Syntactic Wordclass Tagging. Kluwer Academic Publishers, pp. 285-304, 1999. [ps]
Daelemans, Walter, Jakub Zavrel, Peter Berck, and Steven Gillis. "MBT: A memory-based part of speech tagger-generator." In Fourth Workshop on Very Large Corpora, edited by E. Ejerhed and I. Dagan, 14-27. Copenhagen, 1996. [ps]
Vaneyghen Joris, de Pauw Guy, van Compernolle Dirk, Daelemans Walter, `A mixed word/morphological approach for extending CELEX for high coverage on contemporary large corpora.' In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, 931-934, 2006. [pdf]
Daelemans, Walter ‘Computational Linguistics.’ In: G. Booij, Ch. Lehmann, and J. Mugdan (eds.), Morphology. A Handbook on Inflection and Word Formation, Berlin and New York: Walter De Gruyter, 1893-1900, 2004. [pdf]
Guy De Pauw, Tom Laureys, Walter Daelemans and Hugo Van hamme. `A Comparison of Two Different Approaches to Morphological Analysis of Dutch.' In Proceedings of the ACL 2004 Workshop on Current Themes in Computational Phonology and Morphology, 62-69, Barcelona, Spain, July 2004. [pdf]
Tom Laureys, Guy De Pauw, Hugo Van Hamme, Walter Daelemans, and Dirk Van Compernolle. `Evaluation and adaptation of the Celex Dutch morphological database.' In: M.T. Lino e.a. (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), 1247-1250, 2004. [pdf]
Van den Bosch, Antal and Walter Daelemans, Memory-based morphological analysis. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, ACL'99, University of Maryland, USA, June 20-26, 1999, pp. 285-292. [ps]
Daelemans, Walter, Peter Berck, and Steven Gillis. "Unsupervised discovery of phonological categories through supervised learning of morphological rules." In 16th International Conference on Computational Linguistics (COLING-96), 95-100. Copenhagen, 1996. [ps]
Daelemans, Walter, and Gert Durieux, Inductive Lexica. In: Van Eynde, F. and D. Gibbon (eds.) Lexicon Development for speech and language processing, Kluwer Academic Publishers, 115-139, 2000. [ps]
Tom De Smedt, Vincent Van Asch and Walter Daelemans. Memory-based Shallow Parser for Python. CLiPS Technical Report Series, CTRS-002, 2010.
Hammerton, James, Miles Osborne, Susan Armstrong, and Walter Daelemans (Eds.) Special issue on Machine Learning Approaches to Shallow Parsing. Journal of Machine Learning Research 2, 551-719, 2002.
Hammerton, James, Miles Osborne, Susan Armstrong, and Walter Daelemans. `Introduction to the Special issue on Machine Learning Approaches to Shallow Parsing.' Journal of Machine Learning Research 2, 551-558, 2002. [ps]
Buchholz, Sabine, Jorn Veenstra, Walter Daelemans, Cascaded Grammatical Relation Assignment. In: Proceedings of EMNLP/VLC-99, University of Maryland, USA, June 21-22, 1999, pp. 239-246. [ps]
Daelemans, Walter, Sabine Buchholz, Jorn Veenstra, Memory-Based Shallow Parsing. In: Proceedings of CoNLL-99, Bergen, Norway, June 12, 1999, pp. 53-60. [ps]
Van Asch Vincent, Daelemans Walter. ‘Prepositional phrase attachment in shallow parsing.’ Proceedings of the 7th International Conference on Recent Advances in Natural Language Processing (RANLP). Borovec, Bulgaria, 2009, p. 12–17. [pdf]
Zavrel, Jakub, Walter Daelemans, and Jorn Veenstra. "Resolving PP Attachment Ambiguities with Memory-Based Learning." In Proceedings of the workshop on Computational Natural Language Learning (CoNLL'97), edited by: Mark Ellison, Madrid, 11 July 1997. [ps]
Canisius Sander, van den Bosch Antal, Daelemans Walter, `Discrete versus probabilistic sequence classifiers for domain-specific entity chunking.' In: Proceedings of the Eighteenth Belgian-Dutch Conference on Artificial Intelligence, BNAIC-2006, Namur, Belgium, 75-82, 2006. [pdf]
Fien De Meulder and Walter Daelemans. `Memory-Based Named Entity Recognition using Unannotated Data.' Proceedings of the Seventh Conference on Natural Language Learning, 2003, pp. 208-211. [pdf]
De Meulder, Fien, Walter Daelemans, and Véronique Hoste. `A Named Entity Recognition System for Dutch.' In: M. Theune, A. Nijholt, and H. Hondorp (eds.) Computational Linguistics in the Netherlands 2001. Selected Papers from the Twelfth CLIN Meeting, Amsterdam - New York: Rodopi, 77-88, 2002. [ps]
Decadt, Bart, Veronique Hoste, Walter Daelemans and Antal van den Bosch, `GAMBL Genetic Algorithm Optimization of Memory-Based WSD.' In Rada Mihalcea and Phil Edmonds (eds.), Proceedings of the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text (Senseval-3), 108-112, 2004. [pdf]
Decadt, Bart and Walter Daelemans, `Verb Classification - Machine Learning Experiments in Classifying Verbs into Semantic Classes.' L. Guthrie e.a. (eds.), Proceedings of the LREC 2004 Workshop ``Beyond Named Entity Recognition - Semantic Labelling for NLP Tasks'', 25-30, 2004. [pdf]
Hendrickx, Iris, Antal van den Bosch, Veronique Hoste, Walter Daelemans. `Dutch Word Sense Disambiguation: Optimizing the Localness of Context.' In: Phil Edmonds, Rada Mihalcea, Patrick Saint-Dizier (eds.) Word Sense Disambiguation: Recent Successes and Future Directions. New Brunswick: ACL, 61-66, 2002. [ps]
Hoste, Véronique, Walter Daelemans, Iris Hendrickx, Antal van den Bosch. `Evaluating the Results of a Memory-Based Word-Expert Approach to Unrestricted Word Sense Disambiguation.' In: Phil Edmonds, Rada Mihalcea, Patrick Saint-Dizier (eds.) Word Sense Disambiguation: Recent Successes and Future Directions. New Brunswick: ACL, 95-101, 2002. [ps]
Hoste, Véronique, Iris Hendrickx, Walter Daelemans, and Antal van den Bosch. `Parameter Optimization for Machine-Learning of Word Sense Disambiguation.' Natural Language Engineering, Special Issue on Word Sense Disambiguation Systems 8 (4): 311-325, 2002. [pdf]
Veenstra, Jorn, Antal van den Bosch, Sabine Buchholz, Walter Daelemans, Jakub Zavrel, Memory-Based Word Sense Disambiguation. In: Computers and the Humanities, special issue on Senseval, Word Sense Disambiguation, Ed. Adam Kilgarriff and Martha Palmer, 34:1-2, 171-177, 2000. [ps]
Morante Roser, Walter Daelemans, Vincent Van Asch. ‘A combined memory-based semantic role labeler of English.’ In: Proceedings of the 12th Conference on Computational Natural Language Learning, 2008, p. 208-212. [pdf]
Antal van den Bosch, Sander Canisius, Walter Daelemans, Iris Hendrickx, Erik Tjong Kim Sang, `Memory-based semantic role labeling: optimizing features, algorithms, and output.' Proceedings of CoNLL 2004, Boston, USA, 102-105, 2004. [pdf]
Hendrickx Iris, Gosse Bouma, Frederik Coppens, Walter Daelemans, Véronique Hoste, Geert Kloosterman, Anne-Marie Mineur, Joeri van der Vloet, Jean-Luc Verschelde. ‘Coreference resolution for extracting answers for Dutch.’ In: Proceedings of LREC, Marrakech, Morocco, 2008. [pdf]
Hendrickx Iris, Véronique Hoste, Walter Daelemans. ‘Semantic and syntactic features for anaphora resolution for Dutch.’ In: Proceedings of the CICLing-2008 Conference, Haifa, Israel, Berlin: Springer, 2008, p. 351-361. [pdf]
Véronique Hoste, Iris Hendrickx and Walter Daelemans, `Disambiguation of the neuter pronoun and its effect on pronominal coreference resolution.' In: Text, Speech and Dialogue. Proceedings of the 10th International Conference TSD 2007, Plzen, Czech Republic, Springer Lecture Notes in Computer Science Volume 4629, Berlin, Heidelberg: Springer, 48-55, 2007. [pdf]
Iris Hendrickx, Véronique Hoste, and Walter Daelemans, `Evaluating hybrid versus data-driven coreference resolution.' In: Anaphora: Analysis, Algorithms and Applications. Lecture Notes in Computer Science Volume 4410, Berlin, Heidelberg: Springer Verlag, 137-150, 2007. [pdf]
Véronique Hoste and Walter Daelemans. `Learning Dutch Coreference Resolution.' In: Ton van der Wouden, Michaela Poss, Hilke Reckman, Crit Cremers (eds.) Computational Linguistics in the Netherlands 2004. Selected papers from the fifteenth CLIN meeting, Utrecht: LOT, 133-148, 2005. [pdf]
Véronique Hoste and Walter Daelemans. `Comparing Learning Approaches to Coreference Resolution. There is More to it Than 'Bias'.' In: Proceedings of the Workshop on Meta-Learning (held in conjunction with ICML-2005), Bonn, Germany, 20-27, 2005. [pdf]
Marsi Erwin, Martin Reynaert, Antal van den Bosch, Walter Daelemans, Véronique Hoste. `Learning to predict pitch accents and prosodic boundaries in Dutch.' Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan, 489-496, 2003. [pdf]
Marsi, E., B. Busser, W. Daelemans, V. Hoste, M. Reynaert, A. van den Bosch, `Combining information sources for memory-based pitch accent placement.' Proceedings of ICSLP-2002, International Conference on Spoken Language Processing, Denver, USA, 1273-1276, 2002. [ps]
Busser, Bertjan, Walter Daelemans, Antal van den Bosch. `Predicting phrase breaks with memory-based learning' Proceedings 4th ISCA Tutorial and Research Workshop on Speech Synthesis. Perthshire Scotland, August 29th - September 1st, 2001. [ps]
Text analysis modules can be combined to create applications. One important category is Text Mining or Text Analytics, a group of applications that improve upon simple keyword search by addressing content rather than form. Summarization systems reduce documents or sentences to their essence; Information Extraction systems understand documents as far as some predefined aspects of content are concerned; Ontology Extraction systems produce concept models or hierarchies from text within a specific domain; Question Answering systems answer questions rather than returning sets of relevant documents from which the answer still has to be extracted. We are also working on text mining in a specific domain: Biomedical Text Mining.
Erwin Marsi, Emiel Krahmer, Iris Hendrickx, Walter Daelemans. ‘Sentence compression: Beyond deletion models?’ Empirical Methods in Natural Language Generation, Lecture Notes in Computer Science 5980, Springer, Berlin Heidelberg. [pdf]
Glickman Oren, Dagan Ido, Daelemans Walter, Keller Mikaela, Bengio Sammy, `Investigating lexical substitution scoring for subtitle generation.' In: Proceedings of the Tenth Conference on Natural Language Learning (CoNLL-X), New York City, USA, June 8-9, 45-52, 2006. [pdf]
Daelemans, Walter, Anja Hoethker and Erik Tjong Kim Sang, `Automatic Sentence Simplification for Subtitling in Dutch and English.' In: M.T. Lino e.a. (eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), 1045-1048, Lisbon, 2004. [pdf]
Piperidis Stelios, Iason Demiros, Prokopis Prokopidis, Peter Vanroose, Anja Hoethker, Walter Daelemans, Elsa Sklavounou, Manos Konstantinou, and Yannis Karavidas, `Multimodal Multilingual Resources in the Subtitling Process.' Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), 205-208, Lisbon, 2004. [pdf]
Erik Tjong Kim Sang, Walter Daelemans, Anja Hoethker. `Reduction of Dutch Sentences for Automatic Subtitling.' In: Decadt, B, V. Hoste, and G. De Pauw (eds). Computational Linguistics in the Netherlands 2003. Antwerp Papers in Linguistics, 111, 109-123, 2004. [pdf]
De Sitter, An, Toon Calders, and Walter Daelemans, `A formal framework for evaluation of information extraction.' Technical report, University of Antwerp, Dept. of Mathematics and Computer Science, TR 2004-04, 12 pages, 2004. [pdf]
De Sitter, An and Walter Daelemans, `Information Extraction via Double Classification.' Proceedings of the International Workshop on Adaptive Text Extraction and Mining. Cavtat-Dubrovnik, Croatia, 66-73, September 2003. [Also: Department of Mathematics and Computer Science, University of Antwerp, Technical Report 2003-06.] [pdf]
Zavrel Jakub and Walter Daelemans. `Feature-Rich Memory-Based Classification for Shallow NLP and Information Extraction.' Jurgen Franke, Gholamreza Nakhaeizadeh and Ingrid Renz (eds.), Text Mining, Theoretical Aspects and Applications, Springer Physica-Verlag, 33-54, 2003. [pdf]
Reinberger, Marie-Laure and Walter Daelemans, Unsupervised Text Mining for Ontology Extraction: An Evaluation of Statistical Measures. Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), 491-494, Lisbon, 2004. [pdf]
Reinberger, Marie-Laure, Peter Spyns, A. Johannes Pretorius and Walter Daelemans, `Automatic Initiation of an Ontology'. Proceedings of ODBase'04, Ayia Napa, Cyprus, Lecture Notes in Computer Science, Springer-Verlag, 600-617, 2004. [pdf]
York Sure, Asuncion Gomez-Perez, Walter Daelemans, Marie-Laure Reinberger, Nicola Guarino and Natalya Noy. `Why Evaluate Ontology Technologies? Because It Works!' IEEE Intelligent Systems, Trends & Controversies, 19(4), 74-81, Jul/Aug 2004. [pdf] [preprint]
Reinberger, Marie-Laure and Walter Daelemans. `Is shallow parsing useful for the unsupervised learning of semantic clusters?' Gelbukh, Alexander (ed.) Proceedings of the 4th Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2003), Mexico City, Mexico, LNCS 2588, Springer Verlag, 2003, pp. 304-313. [pdf]
Reinberger, Marie-Laure, Peter Spyns, Walter Daelemans, and Robert Meersman. `Mining for lexons: applying unsupervised learning methods to create ontology bases.' Meersman, R., Zahir Tari, and Douglas Schmidt (eds.) On the Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE, Lecture Notes in Computer Science 2888, Springer-Verlag, Catania, Italy, 803-819, 2003. [pdf]
Buchholz, Sabine and Walter Daelemans. `Complex answers: a case study using a WWW question answering system.' Natural Language Engineering 7 (4): 301-323, 2001. (preprint). [ps]
Buchholz, Sabine, and Walter Daelemans. `SHAPAQA: Shallow Parsing for Question Answering on the World Wide Web.' In: Galia Angelova et al. (Eds.) Proceedings Euroconference Recent Advances in Natural Language Processing, Tsigov Chark, Bulgaria, 5-7 September, 47-51, 2001. [ps]
Roser Morante, Sarah Schrauwen and Walter Daelemans. `Corpus-based approaches to processing the scope of negation cues: an evaluation of the state of the art.' In: Johan Bos and Stephen Pulman (editors), Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011), pp. 350-354. Oxford, UK. [pdf]
Walter Daelemans, Vincent Van Asch, Roser Morante. `Memory-based biomedical Named-Entity tagging.' In: Dietrich Rebholz-Schuhmann and Udo Hahn (eds.) Proceedings of the First CALBC Workshop, Cambridgeshire, EBI, p. 31-32, 2010. [pdf]
Thomas Abeel, Sofie Van Landeghem, Roser Morante, Vincent Van Asch, Yves Van de Peer, Walter Daelemans, Yvan Saeys. `Highlights of the BioTM 2010 workshop on advances in bio text mining.' BMC Bioinformatics 11, Suppl 5, p. 1-3, 2010. [pdf]
Roser Morante, Vincent Van Asch, Walter Daelemans. `Memory-based resolution of in-sentence scopes of hedge cues.' Proceedings of the Fourteenth Conference on Computational Natural Language Learning: Shared Task, Uppsala, Sweden, Association for Computational Linguistics, 40-47, 2010. [pdf]
Roser Morante, Vincent Van Asch and Walter Daelemans. `Extraction of Biomedical Events.' In: Eline Westerhout, Thomas Markus, Paola Monachesi (eds.) Computational Linguistics in the Netherlands. Selected papers from the twentieth CLIN meeting, LOT, Utrecht, p. 91-106, 2010. [pdf]
Morante Roser, van Asch Vincent, Daelemans Walter. ‘A memory-based learning approach to event extraction in biomedical texts.’ Proceedings of the Workshop on BioNLP: Shared Task, Boulder, Colorado, Association for Computational Linguistics, June 2009, p. 59–67. [pdf]
Morante Roser, Daelemans Walter. ‘A metalearning approach to processing the scope of negation.’ Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL), Boulder, Colorado, Association for Computational Linguistics, June 2009, p. 21–29. [pdf]
Morante Roser, Daelemans Walter. ‘Learning the scope of hedge cues in biomedical texts.’ Proceedings of the Workshop on BioNLP, Boulder, Colorado, Association for Computational Linguistics, 2009, p. 28–36. [pdf]
Morante Roser, Anthony Liekens, and Walter Daelemans. ‘Learning the Scope of Negation in Biomedical Texts.’ Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, Honolulu, 2008, p. 715–724. [pdf]
The well-known text categorization approach, in which relevant keywords extracted from documents are combined with supervised machine learning, can easily be extended to assign meta-information about the author to a text. The extensions involve finding suitable linguistic features (using text analysis) and machine learning methods suited to the task. The meta-information that can be predicted from text includes the author's identity, personality, gender, age, and region, among others. Authorship attribution, both in literary science and in forensic applications, is the most important application area.
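A minimal sketch of this extension, under stated assumptions: each text is represented by the relative frequencies of a few function words (a commonly used kind of stylistic feature), and a new text receives the label of the most similar training text. The word list, the distance measure, the labels, and the example texts are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter

# Illustrative feature set; real systems use far richer linguistic features.
FUNCTION_WORDS = ("the", "of", "and", "i", "you", "not")

def style_vector(text):
    """Relative frequency of each function word in the text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]

def predict_label(training, text):
    """Return the label of the training text with the closest style vector."""
    target = style_vector(text)

    def distance(item):
        return sum(abs(a - b) for a, b in zip(style_vector(item[0]), target))

    return min(training, key=distance)[1]

# Hypothetical training texts with hypothetical author labels.
TRAINING = [
    ("the history of the city and the fall of the empire", "author_A"),
    ("i know you do not think i care", "author_B"),
]

if __name__ == "__main__":
    print(predict_label(TRAINING, "you and i do not agree"))
```

The same setup carries over to other author properties by swapping the labels, for example age group or gender instead of author identity.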
Mike Kestemont, Walter Daelemans, Guy De Pauw. `Weigh Your Words -- memory-based lemmatization for Middle Dutch.' Literary and Linguistic Computing, Vol. 25, No. 3, 287-301, 2010. [pdf]
Frederik Vaassen and Walter Daelemans. `Emotion Classification in a Serious Game for Training Communication Skills.' In: Eline Westerhout, Thomas Markus, Paola Monachesi (eds.) Computational Linguistics in the Netherlands. Selected papers from the twentieth CLIN meeting, LOT, Utrecht, p. 155-169, 2010. [pdf]
Mike Kestemont, Walter Daelemans and Guy De Pauw. `Space traveling: Assessing the 'soundness' of class labels in memory-based learning and the case of Middle Dutch spelling variation' (extended abstract). In: Proceedings of the 19th Annual Belgian-Dutch Conference on Machine Learning (Benelearn 2010), Leuven, 2010, 2 pages. [pdf]
Luyckx Kim and Walter Daelemans. ‘Authorship attribution and verification with many authors and limited data.’ In: Proceedings of the 22nd International Conference on Computational Linguistics (COLING 2008), Manchester, 2008, p. 513-520. [pdf]
Luyckx Kim and Walter Daelemans. ‘Personae: a corpus for author and personality prediction from text.’ In: Proceedings of LREC-2008, the Sixth International Language Resources and Evaluation Conference, Marrakech, 2008. [pdf]
Luyckx Kim, Walter Daelemans. ‘Using syntactic features to predict author personality from text.’ In: Proceedings of Digital Humanities 2008, Oulu, Finland, 2008, p. 146-149. [pdf]
Luyckx Kim, Daelemans Walter, Vanhoutte Edward, `Stylogenetics: clustering-based stylistic analysis of literary corpora.' In: Proceedings of LREC-2006: the 5th International Language Resources and Evaluation Conference, Workshop Towards Computational Models of Literary Analysis, Genova, ILC, 30-35, 2006. [pdf]
Kim Luyckx and Walter Daelemans. `Shallow Text Analysis and Machine Learning for Authorship Attribution.' In: Ton van der Wouden, Michaela Poss, Hilke Reckman, Crit Cremers (eds.) Computational Linguistics in the Netherlands 2004. Selected papers from the fifteenth CLIN meeting, Utrecht: LOT, 149-160, 2005. [pdf]
Lieve Macken and Walter Daelemans. ‘A Chunk-Driven Bootstrapping Approach to Extracting Translation Patterns.’ In: A. Gelbukh (ed.), 11th International Conference on Intelligent Text Processing and Computational Linguistics. Iasi, Romania, Vol. 6009 of Lecture Notes in Computer Science, Springer Verlag, Heidelberg, pp. 394-405, 2010. [pdf]
Walter Daelemans and Véronique Hoste (eds.) Evaluation of Translation Technology. LANS 8/2009. Brussels: Academic and Scientific Publishers. 261 pages. 2010. [Website]
Walter Daelemans and Véronique Hoste. ‘Evaluation of Translation Technology.’ In: Daelemans et al. (eds.) Evaluation of Translation Technology. LANS 8/2009, p. 9–13, 2010. [pdf]
Lieve Macken and Walter Daelemans, ‘Aligning linguistically motivated phrases.’ Selected Papers from the 18th Computational Linguistics in the Netherlands Meeting. Verberne et al. (eds.), Nijmegen, The Netherlands, 2008, p. 37–52. [pdf]
Walter Daelemans, Review of ‘Recent Advances in Example-based Machine Translation’ by Michael Carl and Andy Way (editors). Computational Linguistics 30(4), 516-520, December 2004. [pdf]
Memory-based (exemplar-based / lazy learning / instance-based) models also have strong potential for explaining empirical findings about language acquisition and processing, and for giving computational form to usage-based models in cognitive linguistics. We have investigated phonological (word stress), morphological (inflection), and syntactic (thematic fit) phenomena with exemplar-based models.
Vandekerckhove Bram, Sandra Dominiek, Daelemans Walter. ‘A robust and extensible exemplar-based model of thematic fit.’ Lascarides et al. (eds.) Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, Association for Computational Linguistics, 2009, p. 826–834. [pdf]
Emmanuel Keuleers and Walter Daelemans, `Memory-Based Learning Models of Inflectional Morphology: A Methodological Case Study.' Lingue e Linguaggio VI.2, 151-174, 2007. [pdf]
Keuleers Emmanuel, Sandra Dominiek, Daelemans Walter, Gillis Steven, Durieux Gert, Martens Evelyn, `Dutch plural inflection: the exception that proves the analogy.' In: Cognitive Psychology, 54:4, 283-318, 2007. [pdf]
Martens, E., W. Daelemans, S. Gillis, and H. Taelman, `Where do syllables come from?' In: W. Gray and C. Schunn (Eds.) Proceedings of the Twenty-Fourth Annual Conference of the Cognitive Science Society, Fairfax, Virginia, George Mason University, 657-664, 2002. [pdf]
Gillis, Steven, Walter Daelemans, and Gert Durieux, Lazy Learning: A comparison of Natural and Machine Learning of Stress. In: P. Broeder and J.M.J. Murre (Eds.), Models of Language Acquisition: inductive and deductive approaches. Oxford University Press, 76-99, 2000. [ps]
Gillis, Steven, Gert Durieux, and Walter Daelemans. "A computational model of P&P: Dresher & Kaye (1990) revisited." In M. Verrips & F. Wijnen (eds.) Approaches to Parameter Setting. Amsterdam Series in Child Language Development 5 (1995): 135-173. [ps]
Daelemans, Walter, Steven Gillis, and Gert Durieux. "The acquisition of stress: A data-oriented approach." Computational Linguistics 20 (1994): 421-451. [ps]
Gillis, Steven, Walter Daelemans, and Gert Durieux. "Are children Lazy Learners? A comparison of natural and machine learning of stress." In Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society, edited by A. Ram and K. Eiselt, 369-374. Hillsdale: Erlbaum, 1994. [ps]
Steven Gillis, Walter Daelemans, Gert Durieux and Antal van den Bosch. "Learnability and Markedness: Dutch Stress Assignment." In Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society, Boulder Colorado, USA, Hillsdale: Lawrence Erlbaum Associates, 452-457, 1993. [ps]
Daelemans, Walter. `Review of: Knowledge and learning in language, Charles D. Yang', Glot International 6(5), 137-142, 2002. [pdf] (preprint)
Daelemans, W. `A comparison of analogical modeling of language to memory-based language processing.' In: R. Skousen, D. Lonsdale and D. Parkinson (eds.). Analogical Modeling. Amsterdam: John Benjamins, 157-179, 2002. [pdf] (preprint)
Daelemans, Walter, Review of Learnability in Optimality Theory, Computational Linguistics 27 (2), 316-317, 2001. [ps]
Van den Bosch, Antal, and Walter Daelemans, A Distributed, Yet Symbolic Model of Text-to-Speech Processing. In: P. Broeder and J.M.J. Murre (Eds.), Models of Language Acquisition: inductive and deductive approaches. Oxford University Press, 55-75, 2000. [ps]
Durieux, Gert, Walter Daelemans and Steven Gillis, On the Arbitrariness of Lexical Categories. In: Frank Van Eynde, Ineke Schuurman and Ness Schelkens (eds.), Computational Linguistics in The Netherlands 1998, Amsterdam: Rodopi, pp. 19-36, 1999. [ps]
Daelemans, Walter, Toward an exemplar-based computational model for cognitive grammar. In: Van Der Auwera et al. (eds), English as a Human Language. Munchen: LINCOM, 73-82, 1998. [ps]
Daelemans, Walter, Peter Berck, and Steven Gillis. "Data mining as a method for linguistic analysis: Dutch diminutives." Folia Linguistica 31 (1997): 57-75. [ps]
Van den Bosch, Antal, Alain Content, Walter Daelemans, and Beatrice De Gelder. Measuring the complexity of Writing Systems. In Journal of Quantitative Linguistics, 1, 3, 178-188, 1994. [ps] (preprint)
Maruster, Laura, Wil van der Aalst, Ton Weijters, Antal van den Bosch, Walter Daelemans. `Automatic Discovery of Workflow Models from Hospital Data.' Proceedings BNAIC-01, Amsterdam, October 25-26, 183-194, 2001. [ps]
Maruster, Laura, Ton Weijters, Geerhard de Vries, Antal van den Bosch, and Walter Daelemans. Logistic-based patient grouping for multi-disciplinary treatment. Artificial Intelligence in Medicine 26, 87-107, 2002. [ps]
Daelemans, Walter, and Koen De Smedt. "Default Inheritance in an Object-Oriented Representation of Linguistic Categories." In International Journal Human-Computer Studies 41, 149-177, 1994. [ps]
Walter Daelemans, D. Binnenpoorte, F. de Vriend, J. Sturm, H. Strik & C. Cucchiarini. `Establishing priorities in the development of HLT resources: the Dutch-Flemish experience.' In: W. Daelemans, T. du Plessis, C. Snyman & L. Teck (eds.), Multilingualism and Electronic Language Management. Proceedings of the 4th International MIDP Colloquium, 22-23 September 2003, Bloemfontein, South Africa (Studies in Language Policy in South Africa 4). Pretoria: Van Schaik Publishers, 9-23, 2005. [pdf]
Walter Daelemans, Theo du Plessis, Cobus Snyman and Lut Teck (eds.), Multilingualism and Electronic Language Management. Proceedings of the 4th International MIDP Colloquium, 22-23 September 2003, Bloemfontein, South Africa (Studies in Language Policy in South Africa 4). Pretoria: Van Schaik Publishers, 2005, ISBN: 062702601X
Walter Daelemans and Helmer Strik (Eds.) Actieplan voor het Nederlands in de taal- en spraaktechnologie: prioriteiten voor basisvoorzieningen, Report for the Nederlandse Taalunie, 165 pages, 2002. (Action Plan for Dutch in Language and Speech technology: Priorities for Basic Resources).
Strik, H., W. Daelemans, D. Binnenpoorte, J. Sturm, F. De Vriend, C. Cucchiarini. `Dutch HLT resources: From BLARK to priority lists.' Proceedings of ICSLP-2002, Denver, USA, 1549-1552, 2002. [pdf]