You shall know a word by the (visual) company it keeps: Towards a multimodal distributional semantics
Many computational models of lexical semantics rely on the distributional hypothesis, that is, the idea that the meaning of a word can be approximated by the set of linguistic contexts in which the word occurs. In practice, this contextual distribution is encoded in a vector recording the word's co-occurrence frequencies with a set of collocates in a large text corpus. On closer inspection, the distributional hypothesis actually makes two separate claims: 1) that meaning is approximated by context, and 2) that we can limit ourselves to linguistic contexts. The latter restriction has probably been adopted by computational linguists more out of necessity than out of theoretical conviction: It is easy to extract the linguistic contexts in which a word occurs from corpora, whereas, until recently, it was not clear how other kinds of contextual information could be harvested on a large scale. But this has changed: Thanks to the Web, we now have access to huge numbers of multimodal documents where words co-occur with images (tagged Flickr pictures, illustrated news stories, YouTube videos...). And thanks to progress in computer vision, we can represent images in terms of automatically extracted discrete features, which can in turn be treated as visual collocates of the words associated with the images, enriching the vector-based representation of words with visual information. In this talk, I will briefly introduce the relevant techniques from computer vision, and report the results of ongoing experiments in our lab in which we combine text- and image-derived collocates to derive distributional vectors that paint a richer picture of word meaning.
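To make the idea of combining textual and visual collocates concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method presented in the talk: the toy sentences, the tagged images, the discrete visual feature names (vw_fur, vw_wheel, ...), the helper function names, and the weighted concatenation controlled by alpha are all hypothetical choices made for this example.

```python
from collections import Counter
import math

# Toy data (hypothetical): tokenized sentences and tagged images.
# Each image is a pair (set of word tags, list of discrete visual features).
SENTENCES = [
    ["cat", "chases", "mouse"],
    ["dog", "chases", "cat"],
    ["car", "needs", "fuel"],
]
IMAGES = [
    ({"cat"}, ["vw_fur", "vw_eye"]),
    ({"dog"}, ["vw_fur", "vw_eye", "vw_fur"]),
    ({"car"}, ["vw_wheel", "vw_metal"]),
]

def text_vector(target, sentences):
    """Count the words that co-occur with target in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        if target in sentence:
            counts.update(w for w in sentence if w != target)
    return counts

def visual_vector(target, images):
    """Count the visual features of every image tagged with target."""
    counts = Counter()
    for tags, visual_words in images:
        if target in tags:
            counts.update(visual_words)
    return counts

def l2_normalize(counts):
    """Scale a sparse count vector to unit length (empty if all zero)."""
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {k: v / norm for k, v in counts.items()} if norm else {}

def multimodal_vector(target, alpha=0.5):
    """Concatenate normalized text and image vectors, weighted by alpha.

    Assumption: weighted concatenation is just one simple way to combine
    the two channels; it is not claimed to be the speaker's approach.
    """
    text = l2_normalize(text_vector(target, SENTENCES))
    image = l2_normalize(visual_vector(target, IMAGES))
    combined = {("text", k): alpha * v for k, v in text.items()}
    combined.update({("image", k): (1 - alpha) * v for k, v in image.items()})
    return combined

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

if __name__ == "__main__":
    cat, dog, car = (multimodal_vector(w) for w in ("cat", "dog", "car"))
    print("cat vs dog:", cosine(cat, dog))
    print("cat vs car:", cosine(cat, car))
```

In this toy setting, "cat" and "dog" share both a textual collocate and visual features, so they come out as more similar to each other than either is to "car". Normalizing each channel before concatenating keeps the channel with larger raw counts from dominating the combined vector.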
Marco Baroni is a tenured researcher in the CLIC group of CIMeC, the Center for Mind/Brain Sciences of the University of Trento. He is also a member of the DISCoF Department. His research areas are computational linguistics and cognitive science. In 2011, he was awarded an ERC Starting Grant for the 5-year COMPOSES project on compositionality in distributional semantics, which is now his main focus of research.
The colloquium takes place in Annexe, Lange Winkelstraat, 2000 Antwerp (building 3 on the campus map).