The Secret Life of Words: Exploring Regularity and Systematicity :: Department of Theoretical and Applied Linguistics

The Secret Life of Words: Exploring Regularity and Systematicity

November 11, 2020
Starts at 17:00 (Moscow time)

Ekaterina Vylomova

Postdoctoral Fellow
University of Melbourne

Ryan Cotterell

Assistant Professor of Computer Science
ETH Zürich / University of Cambridge

Abstract

In this talk, we discuss computational approaches to modeling form–meaning interactions.

The first part of the talk presents computational models of morphology. More specifically, we focus on the task of morphological inflection and address the following research question: how well are data-driven models able to learn declension and/or conjugation systems? We first introduce a universal morphological annotation schema, UniMorph, that allows an inflected word from any language to be defined by its lexical meaning, typically carried by the lemma, and a bundle of universal morphological features defined by the schema. We then describe a series of shared tasks (2016–2020) in which systems competed in their ability to learn representations of inflectional systems across many typologically diverse languages. Finally, we provide a taxonomy of the errors made by most systems and of the challenges the systems typically face.
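As a rough illustration of the lemma-plus-feature-bundle idea, the sketch below parses one UniMorph-style entry. It assumes the tab-separated `lemma`, `form`, `features` layout with semicolon-separated tags; the names `InflectedForm` and `parse_unimorph_line` are hypothetical helpers introduced here, not part of any UniMorph tooling.

```python
from typing import FrozenSet, NamedTuple


class InflectedForm(NamedTuple):
    """One inflected word: its lemma, surface form, and feature bundle."""
    lemma: str
    form: str
    features: FrozenSet[str]


def parse_unimorph_line(line: str) -> InflectedForm:
    """Parse a line assumed to look like 'lemma<TAB>form<TAB>TAG;TAG;...'."""
    lemma, form, tags = line.rstrip("\n").split("\t")
    # The feature bundle is unordered, so a frozenset is a natural fit.
    return InflectedForm(lemma, form, frozenset(tags.split(";")))


# Illustrative entry: English 'run' in the past tense.
entry = parse_unimorph_line("run\tran\tV;PST")
```

In the inflection task, a system would be given `entry.lemma` and `entry.features` and asked to produce `entry.form`.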

The second part of the talk addresses a longstanding debate in semiotics about the relationship between linguistic signs and their corresponding semantics: is the relationship between a word form and its meaning arbitrary, or does some systematic phenomenon pervade? For instance, does the character bigram ‘gl’ have any systematic relationship to the meaning of words like ‘glisten’, ‘gleam’ and ‘glow’? We offer a holistic quantification of the systematicity of the sign using mutual information and recurrent neural networks. We employ these in a data-driven and massively multilingual approach to the question, examining 106 languages. We find a statistically significant reduction in entropy when modeling a word form conditioned on its semantic representation. Encouragingly, we also recover well-attested English examples of systematic affixes. We conclude with a meta-point: our approximate effect size (measured in bits) is quite small. Despite some amount of systematicity between form and meaning, an arbitrary relationship and its resulting benefits dominate human language.
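The entropy reduction mentioned above can be read as the standard mutual information identity. The symbols here are assumptions made for illustration: $W$ denotes a word form and $V$ its semantic representation. In practice, both entropies would presumably be approximated by the cross-entropy of a trained model such as a recurrent neural network.

```latex
% Systematicity as mutual information between form W and meaning V (in bits).
I(W; V) = H(W) - H(W \mid V)
% I(W; V) = 0 would indicate a fully arbitrary form-meaning relationship;
% a small positive value indicates a small amount of systematicity.
```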

Presentation

Video

Part 1

Part 2

Part 3