The Future of Artificial Language Learning in Linguistics
4 September, 2024

Schedule

Below you will find the workshop schedule.
All talks take place in G.03 on the ground floor of the Bayes Centre at the University of Edinburgh. Lunch, coffee, and the reception following the workshop will all be in the Bayes Centre Atrium. To join virtually on Teams, use this link.

Wednesday, 4 September

9h00–9h30 Welcome and coffee (Bayes Centre Atrium)
9h30–10h15 Dr. Alexander Martin
Using artificial language learning cross-linguistically: A test of a proposed word order universal

Strong typological tendencies are often claimed to be driven by features of cognition. Here, we examine an oft-discussed word order universal, Greenberg’s U20, and its potential cognitive underpinnings. Building on previous work, we use a simple version of the artificial language learning paradigm (using L1 lexical items) to show that speakers of both English and Thai show structural biases in line with the proposed universal. We then show that both English and Kîîtharaka speakers (whose language actually goes against the word order universal) show the same structural biases in word order preferences when faced with an entirely novel artificial language. This is the clearest experimental evidence to date that the typological tendency is due to universal features of human cognition.

10h15–11h00 Dr. Mora Maldonado
Uncovering representational constraints in functional domains: how can experiments help?

Categorization systems vary widely across languages, but not all ways of partitioning a semantic space are equally likely cross-linguistically. Regularities in categorization are often thought to reflect representational constraints—specific ways of representing meaning. In this talk, I will present joint work with Jenny Culbertson and Noga Zaslavsky, in which we use an experimental approach to investigate constraints on the representation of person systems (i.e., systems used to refer to individuals based on their role in an utterance, such as speaker and addressee). I will begin by presenting a series of artificial language learning experiments that explore representational biases during learning. Our results suggest that learners prefer person systems that (a) are based on natural classes and (b) maintain distinct forms for the speaker, indicating that these are two independent forces influencing the learnability of person systems. Next, I will discuss card sorting results which suggest that these two constraints are grounded in language-independent representations of pronominal referents and are shared by speakers of different languages. Interestingly, when integrated into a model that also incorporates environmental and communicative biases, these constraints help account for key aspects of cross-linguistic patterns in person systems. I will finish by briefly outlining some recent attempts to extend this approach to new functional domains, such as number and spatial indexicality, highlighting the challenges of applying an experimental approach to a functional domain.

11h00–11h45 Dr. Carmen Saldaña
Cognitive and typological biases in paradigmatic morphosyntax: Matches and mismatches

Artificial Language Learning (ALL) experiments have become a widely accepted method for experimentally testing the relationship between human cognition and linguistic typology. These studies rest on the assumption that ALL experiments reveal probabilistic biases driven by individuals’ mental representations or by efficiency principles common to all linguistic populations, which affect how languages are learned and used in communication. These biases are thought to play a crucial role in shaping linguistic structures as they evolve over time, leading to cross-linguistic regularities. However, fully testing these assumptions presents (at least) three significant methodological challenges, which are common shortcomings in existing research. The first challenge is to collect sufficiently large and representative linguistic samples and to statistically evaluate the proposed biases in typology, ensuring they can be described as cross-linguistic regularities independent of effects specific to certain language families. The second challenge involves disentangling which, how, and when biases apply and permeate linguistic systems at the population level. The third challenge is to gather experimental data from a diverse sample of linguistic populations to evaluate whether these biases are indeed fundamental features shared across human cognition and communication. While the third challenge is addressed in previous talks in this workshop, in this talk I will discuss a series of ALL studies that combine large-scale statistical models of both synchronic and diachronic typological biases to address the first two challenges. These studies focus on the link between cognition and cross-linguistic regularities in the paradigmatic structure of morphosyntactic marking, examining instances where cognitive and typological biases not only match but also where they mismatch.

11h45–12h30 Dr. Wataru Uegaki
Lexicalisation universals and artificial language learning: the case of deontic priority

The notion of lexicalisation plays a central role in a number of hypothesised semantic universals. For example, the monotonicity universal for quantifiers (Barwise & Cooper 1981) states that all *lexical* quantifiers are monotonic. In our cross-linguistic data collection (Uegaki, Mucha et al. 2024), we have identified a new generalisation in the *lexicalisation* of impossibility expressions, which we call Deontic Priority. According to this generalisation, if a lexicalised impossibility modal has an epistemic flavour, it also has a deontic flavour. Why do these semantic constraints operate only on lexicalised items, and not on compositional expressions? In this talk, we explore the idea that pressure for efficient communication drives the lexicalisation of meanings with high communicative utility (among other properties). Focusing on the Deontic Priority generalisation, we present results of an interactive artificial language learning experiment designed to test this hypothesis.

12h30–13h30 Lunch break (Bayes Centre Atrium)
13h30–14h15 Dr. András Bárány
Artificial language learning of case and agreement patterns

In this talk I present preliminary results of two ongoing ALL studies of case and agreement in ditransitive constructions. In the first of these (with Michelle Sheehan and John Williams), we explore a typological gap in ditransitive constructions. The expression of theme and recipient in ditransitives varies across languages (cf. I gave the flowers to Mary vs. I gave Mary the flowers). In languages with object agreement, it also varies whether the theme (the flowers) or the recipient (Mary) can control agreement. But not all combinations of agreement and morphological expression are attested. I report preliminary results of an ALL experiment testing how attested and unattested patterns are learned.

The second study (with Jennifer Culbertson) explores the person–case constraint (or PCC). Many languages disallow combinations of first person themes in the context of third person recipients (e.g. Mary introduced me to him.), while they allow the opposite (e.g. ... him to me.). We tested whether providing participants with different frequencies of common (him to me) vs. uncommon (me to him) patterns would influence their responses in a production task.

Finally, I address opportunities and difficulties of testing complex case and agreement patterns using ALL studies.

14h15–15h00 Dr. Gaja Jarosz
The Joint Roles of Types and Tokens in Generalization and Learning

We examine the joint roles of type frequency and token frequency in three artificial language learning experiments involving lexically-conditioned plural allomorphy. The primary role of type frequency in productivity is well-established, but debates about the precise relationship between type frequency and productivity continue. The effect of token frequency on productivity is even more controversial: some lines of research suggest that token frequency and productivity are inversely related, other results indicate that they are positively related, and yet others argue that token frequency plays no role in productivity. We take up both of these questions. Our learning framework makes it possible to examine the effects of these variables on generalization to novel forms as well as to examine how sensitivity to these factors affects the time-course of learning. The first two experiments differentiate the predictions for generalization of three distinct hypotheses about the relationship between productivity and type frequency, while the third experiment investigates the independent role of token frequency, above and beyond type frequency. We find that both type and token frequency independently and positively contribute to learning rates and generalization across the three experiments, with type frequency playing a dominant role. Implications for learning models and linguistic theories of item-specificity are discussed.

15h00–15h15 Coffee break (Bayes Centre Atrium)
15h15–15h45 Dr. Isabelle Dautriche
Elementary semantic abilities in non-human primates

Human languages share a number of universal features, from the properties of lexical meanings to the way these meanings are composed and realised on the surface (i.e. word order). Artificial Language Learning experiments have been instrumental in better understanding which aspects of cognition crystallise into language. In this talk, I will show how this type of paradigm can be readily adapted to non-linguistic species to investigate the cognitive origins of the shared features of language. I will report three sets of studies suggesting that baboons (Papio papio) (i) like to manipulate ‘concepts’ of the same shape as ours, (ii) can compose mental representations, and (iii) display attentional preferences consistent with some of the word order patterns found in language.

15h45–16h30 Summary (Jenny Culbertson)
16h30–17h00 Open Discussion (all!)
17h00–19h00 Reception (Bayes Centre Atrium)