International Centre for Language and Communicative Development: Defaulting Effects Contribute to the Simulation of Cross-linguistic Differences in Optional Infinitive Errors, 2014-2020

This paper describes an extension to the MOSAIC model which aims to increase MOSAIC’s fit to the cross-linguistic occurrence of Optional Infinitive (OI) errors. While previous versions of MOSAIC have successfully simulated these errors as truncated compound finites with missing modals or auxiliaries, they have tended to underestimate the rate of OI errors in (some) obligatory subject languages. Here, we explore defaulting effects, where the most frequent form of a given verb is substituted for less frequent forms, as an additional source of OI errors. It is shown that defaulting in English tends to result in the production of bare forms that are indistinguishable from the infinitive, while defaulting in Spanish is less pronounced, and tends to result in the production of 3rd person singular forms. Dutch verb forms are dominated by the stem in corpus-wide statistics, and the infinitive in utterance-final position, suggesting defaulting in Dutch may change qualitatively across development. Defaulting is shown to increase MOSAIC’s fit to English and Dutch without affecting its already good fit to Spanish, and provides a potential way of simulating the cross-linguistic pattern of verb-marking errors in children with SLI.

The International Centre for Language and Communicative Development (LuCiD) will bring about a transformation in our understanding of how children learn to communicate, and deliver the crucial information needed to design effective interventions in child healthcare, communicative development and early years education. Learning to use language to communicate is hugely important for society. Failure to develop language and communication skills at the right age is a major predictor of educational and social inequality in later life. To tackle this problem, we need to know the answers to a number of questions: How do children learn language from what they see and hear? What do measures of children's brain activity tell us about what they know? And how do differences between children and differences in their environments affect how children learn to talk? Answering these questions is a major challenge for researchers. LuCiD will bring together researchers from a wide range of different backgrounds to address this challenge. The LuCiD Centre will be based in the North West of England and will coordinate five streams of research in the UK and abroad. It will use multiple methods to address central issues, create new technology products, and communicate evidence-based information directly to other researchers and to parents, practitioners and policy-makers.

LuCiD's RESEARCH AGENDA will address four key questions in language and communicative development:
1. ENVIRONMENT: How do children combine the different kinds of information that they see and hear to learn language?
2. KNOWLEDGE: How do children learn the word meanings and grammatical categories of their language?
3. COMMUNICATION: How do children learn to use their language to communicate effectively?
4. VARIATION: How do children learn languages with different structures and in different cultural environments?

The fifth stream, the LANGUAGE 0-5 PROJECT, will connect the other four streams. It will follow 80 English-learning children from 6 months to 5 years, studying how and why some children's language development is different from others. A key feature of this project is that the children will take part in studies within the other four streams. This will enable us to build a complete picture of language development from the very beginning through to school readiness. Applying different methods to study children's language development will constrain the types of explanations that can be proposed, helping us create much more accurate theories of language development.
We will observe and record children in natural interaction as well as studying their language in more controlled experiments, using behavioural measures and correlations with brain activity (EEG). Transcripts of children's language and interaction will be analysed and used to model how these two are related using powerful computer algorithms.

LuCiD's TECHNOLOGY AGENDA will develop new multi-method approaches and create new technology products for researchers, healthcare and education professionals. We will build a 'big data' management and sharing system to make all our data freely available; create a toolkit of software (LANGUAGE RESEARCHER'S TOOLKIT) so that researchers can analyse speech more easily and more accurately; and develop a smartphone app (the BABYTALK APP) that will allow parents, researchers and practitioners to monitor, assess and promote children's language development.

With the help of six IMPACT CHAMPIONS, LuCiD's COMMUNICATIONS AGENDA will ensure that parents know how they can best help their children learn to talk, and give healthcare and education professionals and policy-makers the information they need to create intervention programmes that are firmly rooted in the latest research findings.

To determine the potential effects of defaulting across the three languages, corpora of child-directed speech were analysed to derive counts for the different verb inflections. Counts were collected from a range of speakers. For English, the adult speech directed at all 12 children in the Manchester corpus (Theakston et al., 2001) was pooled. For Dutch, the pooled data from the Groningen corpus (Bol, 1996) were used. The Spanish counts were derived from the corpora of Juan and Lucia from the Nottingham corpus (Aguado-Orea, 2004) and combined with those of the Fern-Aguado corpus.
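
The counting step described above can be illustrated with a minimal Python sketch. It is not the MOSAIC implementation and does not read the deposited data. It assumes the corpus has already been reduced to (lemma, inflected form) pairs, and the function names and the toy tokens are hypothetical, invented purely for illustration. It tallies how often each form of a verb occurs and reports the most frequent form, which is the form a defaulting account would expect to be substituted for less frequent ones.

```python
from collections import Counter, defaultdict

# Toy (lemma, form) pairs. These are invented for illustration and are
# NOT taken from the Manchester, Groningen or Nottingham corpora.
TOY_TOKENS = [
    ("go", "go"), ("go", "go"), ("go", "go"), ("go", "goes"), ("go", "going"),
    ("eat", "eat"), ("eat", "eat"), ("eat", "eats"),
]


def count_verb_forms(tokens):
    """Tally how often each inflected form occurs for each verb lemma."""
    counts = defaultdict(Counter)
    for lemma, form in tokens:
        counts[lemma][form] += 1
    return counts


def default_forms(counts):
    """Return the single most frequent form of each lemma (the candidate default)."""
    return {lemma: forms.most_common(1)[0][0] for lemma, forms in counts.items()}


def default_share(counts):
    """Return the proportion of each lemma's tokens taken up by its most frequent form."""
    return {
        lemma: max(forms.values()) / sum(forms.values())
        for lemma, forms in counts.items()
    }


if __name__ == "__main__":
    counts = count_verb_forms(TOY_TOKENS)
    for lemma, form in default_forms(counts).items():
        share = default_share(counts)[lemma]
        print(f"{lemma}: default form '{form}' ({share:.0%} of tokens)")
```

In this toy input the bare form is the most frequent form of both verbs, which mirrors the English pattern reported in the abstract. A higher share for the most frequent form would correspond to a stronger potential defaulting effect under this simplified reading.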

Identifier
DOI https://doi.org/10.5255/UKDA-SN-853921
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=50e63bd53e9ac1acfc8bc1f6257282a155a8994336fb35b63e28cab972cb0922
Provenance
Creator Freudenthal, D, University of Liverpool; Pine, J, University of Liverpool; Jones, G, Nottingham Trent University; Gobet, F, University of Liverpool
Publisher UK Data Service
Publication Year 2021
Funding Reference Economic and Social Research Council
Rights Daniel Freudenthal, University of Liverpool; The Data Collection is available to any user without the requirement for registration for download/access.
OpenAccess true
Representation
Language English
Resource Type Other
Discipline Humanities; Linguistics; Psychology; Social and Behavioural Sciences
Spatial Coverage United Kingdom