
Abstract

This article focuses on the early acquisition of fricative and affricate consonants in Setswana, a Bantu language spoken in Botswana and South Africa (Tswana S30). We describe a series of intriguing patterns displayed in the speech of Setswana-learning children between the ages of 1 and 3 years. The data display clear trends also expected from the larger literature on phonological and phonetic development, including the stopping, affrication and debuccalization of target fricatives |s, ʃ, f, χ|, and the simplification (deaffrication; deaspiration) of target affricates |tɫ, tɫʰ, ʦ, ʦʰ, ʧ, ʧʰ, ʤ|. Beyond these general trends, the data reveal intriguing asymmetries: (a) the fact that coronal fricatives |s/ʃ| display much less debuccalization to laryngeals than the non-coronal fricatives |f, χ|; (b) the observation that while the non-lateral affricates |ʦ, ʦʰ, ʧ, ʧʰ, ʤ| are generally produced with their target place of articulation, the lateral affricates |tɫ, tɫʰ| can variably be replaced by velar stops [k, kʰ] instead. We first discuss why an analysis of the observations stated above transcends traditional models of phonology based on phonological features. We then argue that, in addition to phonological conditioning, issues in speech phonetics may also influence how children analyze the speech forms of their language. Our analyses reconcile the data with phonological theory, also in ways that offer additional insight into the origins of speech sound substitutions in child language.

Keywords

child phonology, features, speech sounds, phonological development, developmental asymmetries, phonological processes, phonetic factors, sound substitution, affricates, fricatives

Introduction

Mainstream models of phonological theory coalesce around sets of phonological features, taken as the atoms of phonological representation. Features can be used to express formal contrasts between speech sounds or encode phonological operations, for example, in terms of contextual allophonic variation or as the result of morpho-phonological alternations (Ewen et al., 2011; Goldsmith et al., 2011). Likewise, models of phonological acquisition often adopt similar formalisms to capture the types of substitutions we observe in child speech: Rules and/or constraints making reference to phonological features can be used to capture the types of speech sound substitutions that children commonly display as they learn to produce the sounds and sound combinations of their native languages. For example, it is common for children to add voicing to voiceless obstruents, in words such as papa [papa] produced as [baba]. This type of substitution can be captured via rules or constraints making reference to phonological features such as [±voiced] or [voiced]/[voiceless] feature contrasts in monovalent systems.

However, as discussed throughout the literature on child language acquisition, children often display speech sound substitutions that defy the type of logic that normally defines phonological patterning in adult systems (Priestly, 1977; Rose & Inkelas, 2011). Further, while substitution of |r| by [w]¹ (e.g. ring produced as [wɪŋ]) is frequently observed in the speech of English-learning children, it is not clear whether it should be captured in terms of place of articulation (labiality) or sonority (liquid-to-glide substitution) (see Rose & Penney, 2022, for recent discussions).

In this article, we focus on similar challenges, this time from the early acquisition of fricative and affricate consonants by first-language learners of Setswana, the national language of Botswana, aged between 1 and 3 years old.² Beyond well-documented general trends, the data reveal asymmetrical patterns between different types of fricatives as well as completely unexpected patterns of place substitutions affecting lateral affricates, at least if considered from traditional analyses based on phonological features. In order to reconcile data with theory, we claim that the patterns observed have their origins in the children’s own analyses of the target speech forms, which can be influenced, at least in part, by phonetic conditioning, especially in the area of auditory perception. Our interpretation of the data embraces an emergentist approach to phonological knowledge, in particular concerning how speech phonetics can shape the child’s own representation of the speech forms they attempt to reproduce through their own speech. We take as examples two prominent asymmetries. The first is the observation that while debuccalization of target fricatives to laryngeals is noticeable across the sample set, coronal fricatives |s/ʃ| are much less affected by this process than the non-coronal fricatives |f, χ|. Second, while the non-lateral affricates |ʦ, ʦʰ, ʧ, ʧʰ, ʤ| are generally reduced to simpler stops or fricatives with their target places of articulation, as coronals, the lateral affricates |tɫ, tɫʰ| may also be replaced by velar stops [k, kʰ], a place substitution pattern that is clearly surprising at first sight. We frame our discussion of these patterns within the A-map model of phonological emergence (McAllister Byun et al., 2016; Rose et al., 2021), which focuses on the mechanisms by which the child learner must map auditory categories of speech onto the types of articulatory gestures required to reproduce these categories in speech. As predicted by the A-map, early auditory-articulatory mappings of speech forms may yield variable and/or surprising results, as the child must first discover the correct mappings, a non-trivial challenge and, from there, learn to reproduce them faithfully and reliably in their own speech productions.

This study does not contribute developmental norms for the language, a task that extends well beyond the scope of our empirical investigation. However, this study offers keys to understanding some of the challenges that Setswana-learning children must cope with while learning the sets of fricative and affricate consonants that compose the language. By extension, it contributes insight toward the development of speech assessment materials and services for Setswana-learning children, an emerging area of public service in Botswana.

We begin with an overview of the phonological system of Setswana, in the next section. We then provide background to some theoretical as well as empirical issues in the acquisition of Setswana phonology, in section ‘Background’. Building on this background, we introduce our study methods in section ‘Current study’, followed in section ‘Results’ by the relevant data obtained from our empirical sampling. In section ‘Phonetic factors affecting the acquisition of fricative and affricate places of articulation’, we discuss the theoretical implications of these findings and, for each pattern, the types of phonetic factors that may give rise to their emergence in the children’s speech. We conclude briefly in section ‘Conclusion’.

Setswana Phonology

Consonant Phonemes

Setswana, also known as Tswana (S30), is a Bantu language spoken in Botswana, South Africa, Namibia, and Zimbabwe. Setswana is the official language of Botswana.³ In Botswana, it is spoken by approximately 1.3 million individuals (approximately 79% of the population; Batibo et al., 2003). In South Africa, Setswana is 1 of 11 official languages, while in Namibia and Zimbabwe it is a minority language. Table 1 provides the 28 distinctive consonants of Setswana (Department of African Languages & Literature [DALL], 1999).

Table 1.

Setswana Consonant Inventory.

Categories	Labial	Alveolar	Latero-alveolar	Alveo-palatal	Palatal	Velar	Uvular	Laryngeal
Plosives	p b pʰ	t tʰ				k kʰ	qʰ
Fricatives	f	s		ʃ			χ	h
Affricates		ʦ ʦʰ	tɫ tɫʰ	ʧ ʧʰ ʤ
Nasals	m	n			ɲ	ŋ
Trill		r
Lateral			[ɫ/d]
Glides	w				j

Source. Adapted from DALL (1999, p. 10).

[d] is an allophone of /ɫ/ and the two occur in complementary distribution. [d] is found before the [+high] vowels /u/ and /i/ while [ɫ] precedes the vowels /ɪ ɛ a ɔ ʊ/. In addition to the 28 consonants, Setswana includes in its inventory a series of consonants with a secondary, labio-velarized place of articulation (Cole, 1955; Gouskova et al., 2011; Mathangwane, 1999; Mogapi, 1984; Otlogetswe, 2017; Rogers, 2009), including the plosives (/tʷ tʰʷ kʷ kʰʷ qʰʷ/), fricatives (/sʷ ʃʷ χʷ/), affricates (/ʦʷ ʦʰʷ tɫʷ tɫʰʷ ʤʷ ʧʷ ʧʰʷ/), nasals (/nʷ ɲʷ ŋʷ/), and the liquids (/rʷ ɫʷ/).
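The complementary distribution just described can be stated as a small selection rule. The following is a minimal sketch only: the function name `lateral_allophone` and the two vowel sets are illustrative assumptions, and the rule covers only the vowels explicitly listed above.

```python
# Illustrative sketch of the [d] ~ [ɫ] distribution described in the text.
# The names below are hypothetical and not taken from any cited source.
HIGH_VOWELS = {"i", "u"}                      # the [+high] vowels /i/ and /u/
NON_HIGH_VOWELS = {"ɪ", "ɛ", "a", "ɔ", "ʊ"}   # vowels listed as following [ɫ]


def lateral_allophone(next_vowel: str) -> str:
    """Return the surface form of /ɫ/ given the vowel that follows it."""
    if next_vowel in HIGH_VOWELS:
        return "d"
    if next_vowel in NON_HIGH_VOWELS:
        return "ɫ"
    # Vowels not mentioned in the description are deliberately left undecided.
    raise ValueError(f"vowel not covered by the stated distribution: {next_vowel!r}")
```

For instance, `lateral_allophone("i")` selects `"d"`, while `lateral_allophone("a")` selects `"ɫ"`.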

The discussion below focuses on Setswana’s rich series of fricative and affricate consonants. Voiceless fricatives are attested across all major places of articulation (labial, coronal, dorsal), in addition to the laryngeal /h/. Coronal affricates are attested at the alveolar, alveo-palatal and latero-alveolar places of articulation. As we will see in the data, different places of articulation for fricatives and affricates in Setswana may yield different patterns of development.

Syllable Structure

Setswana, just like many other Bantu languages, displays a maximally CV syllable structure. A word can begin either with a single consonant or a single vowel, but must always end with a vowel, and there are no consonant clusters word medially (e.g. mosadi /mʊsadi/ ‘woman’, peo /pe.ʊ/ ‘seed’, aba /a.ba/ ‘give away’; DALL, 1999, p. 31; Otlogetswe, 2017, p. 404). The nasals, the trill, and the lateral are syllabic, and display the same distribution as vowels (Cole, 1955) in word initial position (e.g. mpho [m.pʰɔ] ‘gift’; ntate [n̩.ta.tɛ] ‘my father’; nko [ŋ.kɔ] ‘nose’; nnyaya [ɲ.ɲa.ja] ‘no’; lla [l.la] ‘cry’; rra [r.ra] ‘sir/Mr.’); word medial position (e.g. mampharing [ma.m.pʰa.ri.ŋ] ‘lizard’; monna [mʊ.n.na] ‘adult male’; monngame [mʊ.ŋ.ŋa.me] ‘my chief’; mollo [mʊ.l.lɔ] ‘fire’); and word final position (e.g. leng [lɪː.ŋ̩] ‘what’).

Given the relative simplicity of Setswana syllable structure, the children’s data in the area of syllable structure development in the language are relatively unremarkable, also given the early acquisition of syllabic consonants (Tsonope, 1993). While it is possible that extremely early developmental data (e.g. during the transition between the babble and early speech stages) reveal phonological conditioning in the development of Setswana syllables, for example, concerning the syllabic nature of the sonorant consonants, our data do not suggest any particular patterns. On this note, we introduce the background to our study, in the next section.

Background

Phonological Processes in Child Language

Stampe (1973, p. 1) defines child language phonological processes as ‘mental operations that apply in speech to substitute, for a class of sounds or sound sequences presenting a common difficulty to the speech capacity of the individual, an alternative class identical but lacking in the difficult property’. Phonological processes are important because they reveal the types of constraints at play, be they phonological, in relation to various issues in structural complexity, perceptual, from the perspective of the child’s auditory categorization of speech sounds in context, or in relation to issues in oro-motor development (Green et al., 2002; McAllister Byun et al., 2016). As recent research clearly highlights, phonological processes must also be interpreted in the context of individual languages, given that they do not manifest themselves the same way across languages or even, at times, between individual learners of the same language (see Rose & Penney, 2022, for a recent summary). It is thus important to document child speech across a maximum number of languages, to properly identify each contributing factor and its relative influence on the data.

In light of these general observations, our understanding of phonological development requires that we interpret the data both phonologically and phonetically, for example in terms of the phonological contrasts relevant to the learner’s target language(s) as well as how these contrasts manifest themselves phonetically. For example, while rhotic consonants (or ‘r’ sounds) are commonly analyzed through the feature [rhotic] within the phonological literature, these consonants employ a wide range of places and manners of articulation across languages (e.g. apical, retroflex or uvular, produced as taps, trills or even fricatives). In turn, these different phonetic attributes may influence how rhotics are acquired by learners of different languages (Bernhardt & Stemberger, 2018; Rose & Penney, 2022). These learners must indeed acquire rhoticity as a phonological category, its phonological distribution as well as how each position manifests itself within the speech stream. Moving from rhotics, which have been discussed within the recent literature, we now turn to the acquisition of fricatives and affricates in Setswana.

Fricatives and Affricates Across Languages and in Setswana

From an articulatory standpoint, fricatives are produced by forcing air through a narrow point of constriction, resulting in turbulent noise within the speech signal throughout the duration of the consonant. As turbulent airflow may involve different speech articulators and configurations (e.g. tongue grooving or spreading), the production of fricatives involves high degrees of articulatory precision (Kent, 1992; Rose et al., 2021).

In comparison, affricates begin with a complete blocking of the airflow released into a fricative constriction at the same point of articulation (Ladefoged & Maddieson, 1996, pp. 90–91; Stevens, 1993). Affricates are thus structurally more complex than fricatives, given their sequenced manners of articulation; the fricative release involves similar levels of articulatory complexity at each point of articulation (Chomsky & Halle, 1968; Ladefoged & Maddieson, 1996; Lombardi, 1991; Rubach, 1994; Sagey, 1986; Stevens, 1993).

From an acoustic standpoint, the frication present in fricatives and affricates generally falls into one of two categories: louder, or strident, versus weaker, or non-strident. The former involves noticeably higher sound energy (Stevens, 1993, p. 251), compared to the latter. At a more formal level, one may assign the feature [+strident] to fricatives articulated within the alveolar and alveo-palatal places of articulation (e.g. [s z ʃ ʒ]) as well as corresponding affricates (see LaCharité, 1993, for an early analysis; see also Kim et al., 2015 and references therein). In contrast to this, fricatives articulated across virtually all other places of articulation are assigned the feature [-strident] (e.g. [f, v, θ, ð, x, ɣ, h, ɦ]). As defined by Jakobson et al. (1952):

Strident phonemes are primarily characterized by a noise which is due to turbulence at the point of articulation. This strong turbulence, in its turn, is a consequence of a more complex impediment which distinguishes the strident from the corresponding mellow consonants [. . .] (p. 24)

From an auditory standpoint, stridency translates into robust cues to consonantal places of articulation. In contrast to this, non-strident fricatives and affricates display weaker cues to place of articulation. This also holds true of lateral affricates, whose second formant, its main cue to place of articulation, is of generally low amplitude (Ladefoged & Maddieson, 1996, p. 206). Below we show how the respective characteristics of these consonants influence their acquisition.
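The stridency classification outlined above can be made concrete as a simple lookup. This is a sketch under stated assumptions: the two sets reproduce the example segments listed in the text, the uvular [χ] is grouped with the non-strident fricatives as in the data tables further below, and the function name `is_strident` is hypothetical.

```python
# Illustrative lookup for the [±strident] classification described in the text.
# Set membership follows the examples given above; [χ] is added to the
# non-strident set as an assumption based on how the study groups it.
STRIDENT = {"s", "z", "ʃ", "ʒ"}
NON_STRIDENT = {"f", "v", "θ", "ð", "x", "ɣ", "h", "ɦ", "χ"}


def is_strident(fricative: str) -> bool:
    """Return True for [+strident] fricatives and False for [-strident] ones."""
    if fricative in STRIDENT:
        return True
    if fricative in NON_STRIDENT:
        return False
    raise ValueError(f"segment not classified in this sketch: {fricative!r}")
```

Under this classification, the Setswana fricatives |s, ʃ| fall on the strident side, while |f, χ, h| fall on the non-strident side.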

As predicted by the phonetics of both frication (partial constriction) and affrication (sequencing of constrictions), patterns of acquisition, in the spirit of Stampe (1973), predictably include stopping (complete closure at the point of constriction) and deaffrication to a single manner, either stop or fricative. These general predictions are borne out by the data, as we summarize in the next paragraphs.

Across languages, fricatives and affricates have been reported among later-developing sound categories, as they tend to be mastered after stops, nasals, and glides (Ferguson, 1978; Jakobson, 1941; Smit et al., 1990; Stoel-Gammon, 1985). Dodd et al. (2003) report the late acquisition of fricatives going beyond the age of 6 years (e.g. age 6;11). Ferguson (1978), Stoel-Gammon (1985), Dyson (1988) and Shriberg and Kwiatkowski (1994) observe the following order of acquisition for fricatives for English-speaking children: the earlier acquisition of |f s ʃ|, followed by |v z| and, later, by |θ ð ʒ|.⁴ Affricates, on the other hand, are reported to emerge later than the fricatives in this general order: |ʧ ʤ| are acquired later than the fricatives |f s|. However, these affricates can be acquired alongside |ʃ| (Ingram, 1978; Shriberg & Kwiatkowski, 1994). The same general order was reported for Putonghua (Mandarin Chinese) by Hua and Dodd (2000). The production of these consonants can vary significantly between learners, including both typically developing children (Ferguson & Farwell, 1975; Smith, 1973) and children with speech sound disorders (Ingram et al., 1980; Shriberg, 1993; see also Bernhardt et al., 2015). Finally, fricatives and affricates may display different orders of acquisition across languages. For example, Cook (2006) reports that children learning Chipewyan (Dëne Sųłiné) acquire affricates and stops before fricatives. Further, Smith (1973) reports substitutions of both affricates and fricatives by stops in his longitudinal study of English acquisition; Smit (1993) reports similar patterns in her general survey of English acquisition.

Building on such observations, stopping appears to be the most common pattern affecting the acquisition of fricatives and affricates across languages (McLeod, 2007; Smith, 1973; Watts, 2018). Concerning place of articulation, depalatalization of alveo-palatal or palatal fricatives and affricates to their alveolar counterparts is frequently observed, with the opposite substitution robustly attested as well (Hodson & Paden, 1981; Hua & Dodd, 2000; McLeod, 2007). Finally, the literature reveals a general pattern of fricative debuccalization to [h] across languages, which appears to be much more prominent in younger than in older learners (Levelt, 1994; Smit, 1993). These general results are borne out by studies on the acquisition of African languages as well. This includes the acquisition of Akan (Amoako, 2020), isiXhosa (Lewis, 1994; Mowrer & Burger, 1991; Tuomi et al., 2001), isiZulu (Naidoo et al., 2005), Sesotho (Demuth, 1992, 2007), Swahili (Gangji, 2012) and Setswana (Mahura, 2014; Mahura & Pascoe, 2016; Matlhaku, 2023).

Concerning the general pattern of variation affecting the acquisition of precise coronal places of articulation, we follow the general literature on acquisition suggesting that during early stages of acquisition children have not yet acquired the fine distinctions between different coronal articulations. This observation can be captured in terms of lack of feature or feature specification within phonological representations (Fikkert & Levelt, 2008; Levelt, 1994; Levelt & van Oostendorp, 2007), also in relation to models of phonological underspecification (Lahiri & Reetz, 2010).

In sum, the general patterns affecting the manner of articulation of target fricatives and affricates can be analyzed straightforwardly both phonologically and phonetically. However, as we argue below, additional, and at times more subtle, patterns exist in the data which reveal further ways in which child learners may interpret the speech data of their target language. We introduce these patterns next, which we discuss in light of recent proposals on phonological emergence.

Factors Influencing the Emergence of Phonological Processes

According to McAllister Byun et al. (2016) and Rose et al. (2021), phonological explanations alone, which focus on phonological features and their development within the learners’ systems, cannot capture in sufficient detail the range of facts observed in the acquisition data. Following Priestly (1977), they note that many of the phonological processes observed in child language can hardly, if at all, be encoded within formal models of phonology. They claim that these challenging patterns may in fact arise from pressures that lie outside the realm of phonology, including the general factors we overview in the next paragraphs.

One general factor concerns the physiological configuration of the vocal tract, in particular the differences in vocal tract sizes and configurations that exist between young children and adults. For example, a young child’s vocal tract is smaller, with the tongue disproportionately large and forward positioned, as it almost completely fills the entire length of the vocal cavity (Kent, 1992). This configuration in turn significantly affects the child’s lingual manoeuvrability and, consequently, hinders accurate production of linguo-palatal sounds (Kent, 1992; Vorperian et al., 2005).

Another general factor relates to the fact that children are contending with motor control limitations which may also hinder the production of certain sounds (Inkelas & Rose, 2003; Rose et al., 2021). Kent (1992) broadly differentiates between two types of constrictions, ballistic vs. controlled, as follows: Ballistic sounds typically involve movements of short duration, high velocity with rapid acceleration and deceleration of the speech articulators (Kent, 1992, p. 85). These include oral stops and nasals. In contrast to this, controlled sounds require more refined levels of articulatory precision and timing, for example the fine constriction required to create airflow turbulence, or the precise manipulation of tongue body gestures involved in the production of rhotics and laterals (Green et al., 2002; Kent, 1992). Together, physiological and articulatory constraints on speech production ‘contribute to the emergence of phonological processes in child language’ (Rose et al., 2021, p. 579).

A third factor relates to the acquisition of the various components of the human speech chain more generally, given that children must first be able to accurately perceive the speech sounds of their languages before they can learn to reproduce them in their own speech (Curtin et al., 2017; Rose & Penney, 2022). In a nutshell, auditory issues may result in the child misinterpreting the speech signal of the sound at hand, especially during the early period of development, resulting in either an incomplete (ill-defined) or erroneous speech sound or sound combination to reproduce. If a sound is not perceived correctly (or well enough), it logically cannot be reproduced accurately. Rose and Penney (2022) take as an example the acquisition of the uvular fricative rhotic |ʀ| in four languages (Dutch, German, French, Portuguese), whose articulation involves subtle constrictions of the velopharyngeal area affecting the aerodynamic control of airflow making its way through these constrictions (Ohala, 1983). They found that |ʀ| was predominantly substituted by [h] in German and, to a lesser extent, in Dutch, while it was deleted for children learning French and Portuguese. Rose and Penney link these observations to language-specific differences in aspiration between German and Dutch, even if both of these languages display an /h/ in their respective phonemic inventories, while French and Portuguese display neither aspiration nor a laryngeal fricative in their phonological inventories.

Considerations such as these are inherent to emergentist models of phonology such as the Linked-Attractor model (Menn et al., 2009, 2013) and the A-map model (McAllister Byun et al., 2016). Similar views form the basis of computational models of speech acquisition (Guenther, 1994; Guenther et al., 2006; see also Lin & Mielke, 2008). For example, within the A-map, learning consists of attaining phonological representations that successfully relate the speaker’s internal knowledge of auditory inputs (present in the ambient language) and the motor plans required to reproduce these inputs through their own speech articulations.

In the next section, we introduce the data relevant to the current discussion. We begin with a general description of our study. We then turn our focus to the more subtle patterns we have been alluding to across our different discussions above.

Current Study

Our study follows the general tradition of naturalistic, longitudinal research on child language acquisition. In the next subsections we overview how we recorded, processed, and analyzed speech data from three young children learning Setswana as their first language.

Participants

Our study documents three typically developing monolingual learners of Setswana code-named W, T, and B, respectively. We also use WTB to refer to this group of learners as a whole. WTB were learning the SeKwena dialect of Setswana, which is predominantly spoken in villages located in the southeastern part of Botswana. W is from Molepolole; T and B are from Mankgodi. The children were selected on the basis of their families’ use of Setswana in everyday communication as well as the caregivers’ willingness to participate in the study. Table 2 provides an overview of these participants.

Table 2.

Participants in the Study.

Child’s name	Gender	Age at 1st recording	Number/frequency of sessions
W	Male	1;10.18	11
T	Female	2;05.03	12
B	Male	3;02.22	10

WTB were recorded during a period of approximately 4 months, between W’s ages of 1;10.18 and 2;02.02; T’s ages of 2;05.03 and 2;08.15; and B’s ages of 3;02.22 and 3;06.05. We also highlight that W and B are boys and T is a girl. W has an older sibling and both are living within an extended family setup. B also has an older sibling and the children live with their mother. T is the first-born child with a younger sibling and, like W, lives within an extended family setup. Given the small size of our study and the children’s similar socio-economic backgrounds, we did not take into consideration individual factors such as biological sex or socio-economic status (SES) which have been reported to play an important role in children’s acquisition (Winitz, 1969; Wells, 1985, 1986). However, we note that while T, the girl, was not the oldest child in our study, she generally displayed more accurate productions than the two boys. This sex-based difference was also observed by Mahura (2014) (see also Dodd et al., 2003; Maccoby & Jacklin, 1974; McCormack & Knighton, 1996; Smit et al., 1990; So & Dodd, 1995). In comparison, W’s productions were the most variable of the three children, as we expected given that he was both a boy and the youngest participant in our study.

Audio Recording

We audio recorded each of the children in their own homes. The recordings were done by a native speaker of Setswana with advanced training in linguistics, also in the presence of a parent or caregiver. The recordings took place at regular intervals, once or twice a week, for a period of approximately 4 months. None of the children attended pre-school prior to the recordings or were exposed in any meaningful way to a secondary language such as English. Prior to carrying out the recordings, we obtained from the caregivers their informed written consent concerning both the data recording procedures and the later use of the children’s audio-recorded speech data for research. The caregivers also completed a questionnaire consisting of a list of common lexical items such as nouns (i.e. depicting domestic animals, modes of transportation, nature, food items, utensils), verbs, adjectives, and question words to ensure that all children had the expected knowledge of basic Setswana words. We further verified with the caregivers that all the children were developmentally and socially healthy and that they also had no vision or hearing problems.

We used picture books to elicit words that together maximally cover the sounds of Setswana across all CV positions (initial, medial and final) within which they can appear. Elicitation involved making use of single-word naming and image description, which started with the children being asked to informally and spontaneously name objects or actions they identified in the picture books; we did not use any fixed word lists or structured speech elicitation protocols. However, prompting questions were offered to the child if they did not immediately name the object or action they saw in the picture, for example, golo mo ke eng? ‘what is this?’; ke eng selo se? ‘what is this thing?’; o dira eng? ‘what is he/she doing?’; se dira eng? ‘what is it doing?’; or ba/di⁵ dira eng? ‘what are they doing?’ Besides our use of prompts, the children were generally in control of the trajectory of the conversation, during which they were positively encouraged to spontaneously produce their own speech. As part of these interactions, the adult interviewer focused on repeating the child’s productions using the adult form. This provided the child with a stimulating environment for speech production and language learning, and also served the subsequent identification of the speech forms attempted by the child. However, a consequence of this approach is that it could not guarantee that all children produced all the sounds in all of the recordings. This limitation had no measurable impact on the results we present below.

Data Transcription and Analysis

The children’s speech productions were orthographically and phonetically transcribed and analyzed by a native speaker and trained linguist using the Phon software program (Rose et al., 2006, 2013; Rose & MacWhinney, 2014). Phon enables us to make one-on-one comparisons between attempted (target) and produced (actual) sounds and sound combinations. Using the analytic and query functions of Phon, we then extracted the data relevant to our study, in particular the patterns of phonological production that manifest themselves in the data such as consonant substitution or deletion across all syllable positions within which they appear in the children’s productions.
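To illustrate the kind of target-to-actual comparison described above, the sketch below labels a single target/actual consonant pair with a process name. It is not Phon's implementation or API: the `Segment` type, the `FEATURES` table (which covers only a handful of segments), and the `classify` function are all hypothetical, and the process labels simply follow the terms used in this article.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    place: str   # e.g. "labial", "coronal", "dorsal", "laryngeal"
    manner: str  # e.g. "stop", "fricative", "affricate", "glide"


# Hypothetical, hand-written feature table covering a few segments only.
FEATURES = {
    "f": Segment("labial", "fricative"),
    "s": Segment("coronal", "fricative"),
    "ʃ": Segment("coronal", "fricative"),
    "χ": Segment("dorsal", "fricative"),
    "h": Segment("laryngeal", "fricative"),
    "p": Segment("labial", "stop"),
    "t": Segment("coronal", "stop"),
    "k": Segment("dorsal", "stop"),
    "ʦ": Segment("coronal", "affricate"),
    "ʧ": Segment("coronal", "affricate"),
    "w": Segment("labial", "glide"),
    "j": Segment("palatal", "glide"),
}


def classify(target: str, actual: Optional[str]) -> str:
    """Label one target/actual pair; `actual` is None when the target was deleted."""
    if actual is None:
        return "deletion"
    if actual == target:
        return "accurate"
    t, a = FEATURES[target], FEATURES[actual]
    if t.manner == "fricative":
        # Debuccalization is checked first: loss of oral place to a laryngeal.
        if t.place != "laryngeal" and a.place == "laryngeal":
            return "debuccalization"
        if a.manner == "stop":
            return "stopping"
        if a.manner == "affricate":
            return "affrication"
        if a.manner == "glide":
            return "gliding"
    if t.manner == "affricate" and a.manner in {"stop", "fricative"}:
        return "deaffrication"
    return "other"
```

For example, `classify("s", "ʦ")` yields `"affrication"` and `classify("f", "h")` yields `"debuccalization"`. Tallying such labels per target consonant and per child gives counts of the kind reported in the results tables.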

Note that we combined the results for target alveolar |s| and post-alveolar |ʃ| within a single category below for the following two reasons: (a) they displayed similar patterns of behaviour, and (b) |ʃ| had far fewer attempts (n = 55, of which 43 were by T and 11 by B) than alveolar |s| (n = 1,539). T mostly affricated |ʃ| to [ʧʰ] (in 65% of her attempts) and |s| to [ʦ] and [ʦʰ] (66% and 11%, respectively). B displayed coronal stopping for both |s| and |ʃ|, while W’s few attempts yielded variable outcomes. We return to these results in the next section, where we describe the trends observed in the children’s acquisition of Setswana’s fricatives and affricates.

Results: WTB’s Acquisition of Fricatives and Affricates

In this section, we present the children’s patterns of acquisition for the target fricatives and affricates of Setswana. The data descriptions below combine target consonants across initial, medial and final syllable onsets, as we have not found differences in behaviours across these positions.

Fricatives

Our corpus documents 4,630 attempts at fricatives, including 532 labiodental |f|, 1,594 coronal |s/ʃ|, 2,437 uvular |χ|, and 67 laryngeal |h|.⁶ This results in a category of fricatives for each major place of articulation (Labial, Coronal, Dorsal, and Laryngeal). We present in Table 3 the general rates of substitution affecting each category of fricatives.

Table 3.

WTB’s Combined Attempts at Fricatives.

Target	Total # of attempts	Substitutions	Deletions	Accurate productions
|f|	532	249 (46.8%)	2 (0.4%)	281 (52.8%)
|s/ʃ|	1,594	1,392 (87.3%)	16 (1%)	186 (11.7%)
|χ|	2,437	878 (36%)	32 (1.3%)	1,527 (62.7%)
|h|	67	13 (19.4%)	0	54 (80.6%)
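The proportions in Table 3 follow directly from the raw counts. The short sketch below recomputes them; the counts are copied from the table, while the dictionary `COUNTS` and the helper `rates` are illustrative names introduced here.

```python
# Raw counts copied from Table 3, as (substitutions, deletions, accurate).
COUNTS = {
    "f": (249, 2, 281),
    "s/ʃ": (1392, 16, 186),
    "χ": (878, 32, 1527),
    "h": (13, 0, 54),
}


def rates(substitutions: int, deletions: int, accurate: int) -> tuple:
    """Return the three outcomes as percentages of all attempts, to one decimal."""
    total = substitutions + deletions + accurate
    return tuple(round(100 * n / total, 1) for n in (substitutions, deletions, accurate))


for target, counts in COUNTS.items():
    sub, dele, acc = rates(*counts)
    print(f"|{target}| n={sum(counts)}: substituted {sub}%, deleted {dele}%, accurate {acc}%")
```

Summing the three outcomes for each target recovers the total number of attempts per category, and summing across categories recovers the overall number of fricative attempts in the corpus.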

As we can see in this table, the coronals |s/ʃ| display an outsized proportion of substitutions compared to the other categories, with an accuracy rate of less than 12%, while |f, χ, h| range in accuracy from 53% to 81%. T made the most attempts at all the fricatives. W and B made a similar number of attempts. Table 4 presents the individual differences in the children’s attempts at each fricative.

Table 4.

Overview of WTB’s Target Fricatives |s/ʃ, f, χ, h|.

The Children	Trends’ labels	strident	non-strident
W	Target	\|s/ʃ\|	\|f\|	\|χ\|	\|h\|
	Total no.	337	64	433	18
	Accurate	67 (20%)	18 (28%)	248 (57%)	12 (67%)
	Inaccurate	270 (80%)	46 (72%)	185 (43%)	6 (33%)
T	Target	\|s/ʃ\|	\|f\|	\|χ\|	\|h\|
	Total no.	1,028	389	1,688	53
	Accurate	66 (6%)	251 (65%)	1,122 (66.5%)	16 (30%)
	Inaccurate	962 (94%)	138 (35%)	566 (33.5%)	37 (70%)
B	Target	\|s/ʃ\|	\|f\|	\|χ\|	\|h\|
	Total no.	229	79	322	13
	Accurate	37 (16%)	10 (13%)	131 (41%)	8 (62%)
	Inaccurate	192 (84%)	69 (87%)	191 (59%)	5 (38%)

Of all the fricatives, target |h| recorded the lowest number of attempts; because of these low numbers, and as the development of |h| is tangential to the current discussion, we will not discuss it further. We identified six patterns for target fricatives. We classify these patterns in terms of place and manner features, the latter in particular in the case of affrication. Patterns involving both place and manner substitutions include debuccalization and gliding. The category ‘other’ groups together all substitutions that failed to form a distinguishable pattern. We now turn to a more detailed description of each of these patterns, starting with |s/ʃ| in the next subsection.

Substitution Patterns for Coronal |s/ʃ|

As we can see in Table 5, the children display largely individual patterns of variation in their productions for |s/ʃ|. These target fricatives yield relatively stable substitutions for T and B: T predominantly replaced |s/ʃ| with the affricates [ʦ ʦʰ], whereas B displayed a combination of stopping to coronals [t/d] and affrication to [ʦ]. We note in this context that the majority of B’s stop productions appear to be the result of consonant harmony⁷ with another coronal consonant present within the word. Finally, W, the youngest of the three children, displayed much more variable patterns, with productions ranging between affrication, stopping, gliding, and debuccalization to laryngeal consonants throughout the recording period.

Table 5.

WTB’s Production Patterns for Target |s/ʃ|.

Patterns	W	T	B	Total
[h ʔ] substitution	69 (26%)	37 (3.9%)	14 (7%)	120
Affrication	77 (29%)	857 (89%)	51 (27%)	985
Stopping to coronals	58 (22%)	39 (4%)	116 (60%)	213
[w j] substitution	58 (22%)	15 (1.5%)	5 (3%)	78
Other^a	0	8 (0.83%)	4 (2%)	12
Deletion	8 (3%)	6 (0.6%)	2 (1%)	16

The ‘other’ productions for T’s and B’s |s/ʃ| all involve substitutions to [r/l].

We also note that, proportional to each child’s level of productivity, most of the marginal cases of debuccalization we find in the data for |s/ʃ| come from W, the youngest and least advanced learner. This developmental observation offers additional context for our description of |f| and |χ| below, characterized by higher rates of debuccalization in all three children’s productions.

Substitution Patterns for Non-Coronal |f| and |χ|

As we can see in Tables 6 and 7, the non-coronal consonants |f| and |χ| were produced with lower rates of substitutions, especially by T, our most proficient learner. We identified six different patterns in these data, involving different manner and place dimensions; in particular, we highlight the high rate of debuccalization to [h ʔ], especially for |f|.

Table 6.

WTB’s Production Patterns for Target |f|.

Patterns	W	T	B	Total
[h ʔ] substitution	18 (39%)	68 (49%)	22 (32%)	108
Affrication	1 (2%)	19 (14%)	10 (15%)	30
Stopping to labials	11 (24%)	44 (32%)	31 (45%)	86
[w j] substitution	16 (35%)	3 (2%)	5 (7%)	24
Other	0	2 (1%)	1 (1%)	3
Deletion	0	2 (1%)	0	2

Table 7.

WTB’s Production Patterns for Target |χ|.

Patterns	W	T	B	Total
[h ʔ] substitution	72 (39%)	12 (2%)	53 (28%)	137
Affrication to various places	3 (2%)	6 (1%)	3 (1.5%)	12
Stopping to various places	47 (25%)	129 (23%)	60 (31%)	236
[w j] substitution	59 (32%)	38 (7%)	35 (18%)	132
Other^a	2 (1%)	12 (2%)	20 (10.5%)	34
Deletion	2 (1%)	10 (6%)	20 (10.5%)	32

The category ‘other’ for W consists of substitutions to [l]; T’s consists of substitutions to [f]/[l]/[r]; B’s consists of a vast majority of substitutions to [l].

We also observe a more marginal tendency for |f| to be stopped to [p b], while the uvular |χ| displays a wide range of variation. W and T mainly stopped |χ| to uvular [qʰ] and velar [k], while B displayed substitutions of this consonant by coronal and labial stops.

In sum, the children displayed much less accuracy in their realization of places of articulation for the non-coronal (and non-strident) fricatives compared to the coronal (strident) ones. Recall from above the observation that coronal fricatives, the strident targets, tend to maintain their general places of articulation even when they are affected by phonological processes. Before we discuss our interpretation of these facts, we turn to our second segmental context, this time defined by the series of affricate consonants that Setswana presents.
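As an illustrative sketch (not part of the original analysis), the debuccalization asymmetry between coronal and non-coronal fricatives can be recomputed from the counts reported above. The names `DEBUCCALIZED`, `INACCURATE`, and `debuccalization_share` are hypothetical labels introduced here; the denominators are assumed to be the ‘Inaccurate’ totals of Table 4, which is the basis the percentages in Tables 5 to 7 appear to use.

```python
# Counts copied from the tables above; all names are illustrative assumptions.
# [h ʔ] substitutions per child (Tables 5, 6 and 7).
DEBUCCALIZED = {
    "s/ʃ": {"W": 69, "T": 37, "B": 14},
    "f": {"W": 18, "T": 68, "B": 22},
    "χ": {"W": 72, "T": 12, "B": 53},
}
# Inaccurate productions per child (Table 4), used as denominators.
INACCURATE = {
    "s/ʃ": {"W": 270, "T": 962, "B": 192},
    "f": {"W": 46, "T": 138, "B": 69},
    "χ": {"W": 185, "T": 566, "B": 191},
}


def debuccalization_share(target, child=None):
    """Percentage of inaccurate productions realized as [h ʔ].

    With child=None, the counts are pooled over all three children.
    """
    children = [child] if child is not None else list(DEBUCCALIZED[target])
    hits = sum(DEBUCCALIZED[target][c] for c in children)
    total = sum(INACCURATE[target][c] for c in children)
    return 100 * hits / total


for target in DEBUCCALIZED:
    per_child = ", ".join(
        f"{c}: {debuccalization_share(target, c):.1f}%" for c in "WTB"
    )
    print(f"|{target}| pooled: {debuccalization_share(target):.1f}% ({per_child})")
```

Pooled over the three children, this computation gives approximately 8% debuccalization for |s/ʃ|, against approximately 43% for |f| and 15% for |χ|.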

Affricates

Our corpus documents 4,701 attempts at affricates, namely 3,298 non-lateral |ʦ ʦʰ ʤ ʧ ʧʰ| and 1,403 lateral |tɬ tɫʰ|.⁸ By far the most common process affecting the children’s productions of these consonants is deaffrication. Table 8 breaks down the deaffrication rate of each affricate for each child. In the speech of both boys (W, B), lateral affricates were much more prominently affected by deaffrication than non-lateral ones, irrespective of the place of articulation of the resulting consonant. In comparison, T continued to display higher levels of achievement, with much lower rates of deaffrication overall.

Table 8.

WTB’s Rates of Deaffrication for Non-Lateral and Lateral Affricates.

The Children	Non-lateral				Lateral
	ʦ	ʦʰ	ʧʰ	ʤ	t͡ɬ	t͡ɬʰ
W	183/352 (52%)	42/114 (37%)	16/72 (22%)	34/40 (85%)	66/85 (78%)	200/256 (78%)
T	46/1,064 (4%)	19/335 (6%)	5/41 (12%)	70/181 (39%)	45/547 (8%)	17/358 (5%)
B	126/191 (66%)	32/141 (23%)	10/22 (45%)	31/60 (52%)	129/163 (79%)	45/50 (90%)
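To make the lateral versus non-lateral comparison easier to read, the cells of Table 8 can be pooled within each class. The following Python sketch is an illustration only, not the authors' procedure; `NON_LATERAL`, `LATERAL`, and `pooled_rate` are hypothetical names, and each pair is (deaffricated productions, attempts) as printed in Table 8.

```python
# (deaffricated, attempts) pairs copied from Table 8; names are illustrative.
# Non-lateral cells are listed in the order ʦ, ʦʰ, ʧʰ, ʤ.
NON_LATERAL = {
    "W": [(183, 352), (42, 114), (16, 72), (34, 40)],
    "T": [(46, 1064), (19, 335), (5, 41), (70, 181)],
    "B": [(126, 191), (32, 141), (10, 22), (31, 60)],
}
# Lateral cells are listed in the order plain, aspirated.
LATERAL = {
    "W": [(66, 85), (200, 256)],
    "T": [(45, 547), (17, 358)],
    "B": [(129, 163), (45, 50)],
}


def pooled_rate(cells):
    """Pooled deaffrication rate (in %) over a list of (deaffricated, attempts)."""
    deaffricated = sum(d for d, _ in cells)
    attempts = sum(n for _, n in cells)
    return 100 * deaffricated / attempts


for child in "WTB":
    non_lat = pooled_rate(NON_LATERAL[child])
    lat = pooled_rate(LATERAL[child])
    print(f"{child}: non-lateral {non_lat:.1f}%, lateral {lat:.1f}%")
```

Under this pooling, W and B deaffricate roughly 48% of their non-lateral targets against roughly 78% and 82% of their lateral targets, respectively, while T remains below 10% in both classes.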

Substitution Patterns for Non-Lateral |ʦ ʦʰ ʧʰ ʤ|

We begin in Table 9 with a summary of each child’s accuracy rates, for each target affricate.

Table 9.

Overview of WTB’s Target Affricates |ʦ ʦʰ ʧʰ ʤ|.

The Children	Target	\|ʦ\|	\|ʦʰ \|	\|ʧʰ \|	\|ʤ\|
W	Total no.	378	119	73	42
	Accurate	83 (22%)	26 (22%)	45 (62%)	8 (19%)
	Inaccurate	295 (78%)	93 (78%)	28 (38%)	34 (81%)
T	Total no.	1,070	363	41	192
	Accurate	834 (78%)	298 (82%)	32 (78%)	105 (55%)
	Inaccurate	236 (22%)	65 (18%)	9 (22%)	87 (45%)
B	Total no.	202	145	22	74
	Accurate	39 (19%)	68 (47%)	10 (45%)	27 (36%)
	Inaccurate	163 (81%)	77 (53%)	12 (55%)	47 (64%)

The following four tables break down the patterns, for each child, for target |ʦ ʦʰ ʧʰ ʤ|, respectively. Starting with |ʦ ʦʰ|, in Tables 10 and 11, we observe nine different patterns of substitution, which we categorized in terms of place and manner of articulation, similar to our classification of fricative productions in the previous section.

Table 10.

WTB’s Production Patterns for Target |ʦ|.

Patterns	W	T	B	Total
Aspiration	40 (14%)	173 (73%)	14 (9%)	227
Palatalization	45 (15%)	11 (5%)	11 (7%)	67
Stopping to coronals	149 (51%)	17 (7%)	106 (65%)	272
[h ʔ] substitution	10 (3%)	10 (4%)	1 (1%)	21
Spirantization	9 (3%)	4 (2%)	1 (1%)	14
[w j] substitution	10 (3%)	1 (<1%)	5 (3%)	16
[ɫ] substitution	1 (<1%)	8 (3%)	10 (6%)	19
Other	5 (2%)	6 (3%)	4 (3%)	15
Deletion	26 (9%)	6 (3%)	11 (7%)	43

Table 11.

WTB’s Production Patterns for Target |ʦʰ|.

Patterns	W	T	B	Total
Deaspiration	6 (7%)	24 (37%)	12 (16%)	42
Palatalization to \|ʧʰ\|	40 (43%)	18 (28%)	29 (38%)	87
Stopping to coronals	39 (42%)	14 (22%)	25 (32%)	78
[h ʔ] substitution	1 (1%)	1 (2%)	0	2
Spirantization	0	4 (6%)	2 (3%)	6
[w j] substitution	0	0	1 (1%)	1
Other	2 (2%)	0	4 (5%)	6
Deletion	5 (5%)	2 (3%)	4 (5%)	11

For target |ʧʰ| in Table 12 below, we also observed nine different patterns of substitution, categorized mainly in terms of place and manner of articulation.

Table 12.

WTB’s Production Patterns for Target |ʧʰ|.

Patterns	W	T	B	Total
Deaspiration to [ʧ]	1 (4%)	0	0	1
Deaspiration and depalatalization to [ʦ]	2 (7%)	0	2 (17%)	4
Depalatalization to [ʦʰ]	4 (14%)	1 (11%)	0	5
Stopping to coronals	11 (39%)	1 (11%)	2 (17%)	14
[h ʔ] substitution	0	1 (11%)	4 (33%)	5
Spirantization to [s ʃ]	4 (14%)	4 (44%)	3 (25%)	11
[w j] substitution	1 (4%)	0	0	1
[ɫ] substitution	4 (14%)	1 (11%)	0	5
Deletion	1 (4%)	1 (11%)	0	2

Finally, we report on target |ʤ| in Table 13. As we can see, this voiced affricate yielded more variable patterns, also with higher rates of deletion overall.

Table 13.

WTB’s Production Patterns for Target |ʤ|.

Patterns	W	T	B	Total
Devoicing^a	4 (12%)	21 (24%)	2 (4%)	27
Stopping to coronals	0	0	4 (9%)	4
[b] substitution	3 (9%)	5 (6%)	1 (2%)	9
[h ʔ] substitution	0	3 (3%)	10 (21%)	13
Spirantization to [ð]	1 (3%)	2 (2%)	1 (2%)	4
[w j] substitution^b	16 (47%)	11 (13%)	15 (32%)	42
[ɫ] substitution	4 (12%)	8 (3%)	0	12
[n ɲ] substitution	1 (3%)	33 (38%)	0	34
Other	0	1 (1%)	0	1
Deletion	5 (15%)	11 (13%)	14 (29%)	30

Including substitutions to non-palatal |ʦ ʦʰ| and palatal |ʧ ʧʰ|.

W’s and T’s gliding is exclusively to [j], while B glides to [w].

Despite the relative variability observed in the data, the majority of which involve manner substitutions, we highlight that these affricates were generally produced with their target coronal place of articulation.⁹ Further, we note the absence of any pattern of velar substitution. As we describe next, lateral affricates present a much different picture in both these respects.

Substitution Patterns Affecting the Lateral Affricates |t͡ɫ t͡ɫʰ|

As can be seen in Table 14, the lateral affricates |t͡ɫ| and |t͡ɫʰ| also displayed noticeable rates of substitution, for each child, in line with our general predictions about affricate development.

Table 14.

Overview of WTB’s Target Affricates |t͡ɫ t͡ɫʰ|.

The Children		\|t͡ɫ\|	\|t͡ɫʰ\|
W	Total no.	82	256
	Accurate	n = 2 (2.5%)	n = 13 (5%)
	Inaccurate	n = 80 (97.5%)	n = 243 (95%)
T		\|t͡ɫ\|	\|t͡ɫʰ\|
	Total no.	521	358
	Accurate	n = 432 (83%)	n = 332 (93%)
	Inaccurate	n = 89 (17%)	n = 26 (7%)
B		\|t͡ɫ\|	\|t͡ɫʰ\|
	Total no.	163	50
	Accurate	n = 1 (1%)	n = 0 (0%)
	Inaccurate	n = 162 (99%)	n = 50 (100%)

Table 15.

WTB’s Production Patterns for Target |t͡ɫ|.

Patterns	W	T	B	Total
Stopping to coronals	21 (25%)	27 (31%)	125 (77%)	173
Stopping to labials	12 (15%)	0	1 (<1%)	13
Stopping to velars	23 (29%)	12 (14%)	2 (1%)	37
[h ʔ] substitutions	10 (12%)	15 (15%)	23 (14%)	48
[w j] substitutions	4 (5%)	0	0	4
[ɫ] substitutions	0	2 (2%)	0	2
Delateralization	4 (5%)	17 (20%)	3 (2%)	24
Other	3 (4%)	3 (5%)	1 (<1%)	7
Deletion	3 (4%)	13 (15%)	7 (4%)	23

Table 16.

WTB’s Production Patterns for Target |t͡ɫʰ|.

Patterns	W	T	B	Total
Stopping to coronals	132 (54%)	7 (27%)	19 (38%)	158
Stopping to labials	5 (2%)	0	23 (46%)	28
Stopping to velars	49 (20%)	4 (15%)	0	53
[h ʔ] substitutions	2 (1%)	3 (12%)	0	5
[w j] substitutions	5 (2%)	0	0	5
[ɫ] substitutions	0	3 (12%)	1 (2%)	4
Delateralization	40 (16.5%)	5 (19%)	5 (10%)	50
Other	7 (3%)	0	2 (4%)	9
Deletion	3 (1%)	4 (15%)	0	7

Similar asymmetries can be observed in the data for target |t͡ɫʰ|, as we can see in Table 16.

Over the course of the observation period, we could also witness W’s productions evolving from velar to coronal stopping. We can also see instances of substitutions of the lateral affricates by non-lateral affricates, but the children never displayed lateral substitutions for the non-lateral counterparts.
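The weight of velar stopping in each child's lateral affricate errors can be estimated from Tables 14 to 16. The sketch below is illustrative and rests on one assumption: the ‘Inaccurate’ totals of Table 14 serve as denominators. The names `VELAR_STOPPING` and `velar_share` are hypothetical.

```python
# (stopping to velars, inaccurate productions) per child and target.
# Numerators come from Tables 15 and 16, denominators from Table 14.
VELAR_STOPPING = {
    "W": {"plain": (23, 80), "aspirated": (49, 243)},
    "T": {"plain": (12, 89), "aspirated": (4, 26)},
    "B": {"plain": (2, 162), "aspirated": (0, 50)},
}


def velar_share(child):
    """Share (in %) of a child's inaccurate lateral affricates stopped to velars."""
    cells = VELAR_STOPPING[child].values()
    velar = sum(v for v, _ in cells)
    inaccurate = sum(n for _, n in cells)
    return 100 * velar / inaccurate


for child in "WTB":
    print(f"{child}: {velar_share(child):.1f}% velar stopping")
```

On these counts, velar stopping accounts for roughly 22% of W's inaccurate lateral affricates, roughly 14% of T's, and under 1% of B's, in keeping with the observation that the pattern is most prominent in the youngest child.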

Interim Summary

In sum, our findings about WTB’s acquisition of the fricatives and affricates of Setswana fall broadly in line with expectations based on the cross-linguistic literature. However, a few trends in the data transcend these expectations, in particular (a) the relative persistence of debuccalization as a process affecting non-coronal (and non-strident) fricatives and (b) the unexpected pattern of velar substitution affecting lateral affricates. Arguably, both of these processes are representative of early stages in phonological development. As reported above, fricative debuccalization is more prominent in the speech of younger child speakers cross-linguistically. Similarly, velar substitution for lateral affricates (also a non-strident category of consonants) manifests itself much more prominently in the early productions of W, our youngest child participant. We discuss the potential origins of these production patterns in the next section.

Phonetic Factors Affecting the Acquisition of Fricative and Affricate Places of Articulation

In this section, we address our developmental observations from the perspective of segmental emergence. We first highlight some of the difficulties involved in the analysis of these patterns using formal models of phonology. We then consider these patterns in light of the phonetic properties of the target system and how these properties may ultimately be interpreted by the child learners. We frame this discussion using the A-map model of segmental development (McAllister Byun et al., 2016; Rose et al., 2021).

Factors Affecting the Emergence of Fricatives

In section ‘Results’, we observed in WTB’s data an asymmetry whereby debuccalization as a substitution process affecting target fricatives is much more prominently observed with non-coronal, non-strident fricatives than with coronal, strident ones.

To our knowledge, no model of phonology can encode this asymmetry in a straightforward fashion. First, we are aware of no language where fricative debuccalization is a necessary function of coronality, except perhaps in the case of lenition processes related to specific syllable positions (e.g. /s/ debuccalization or deletion in syllable codas across different dialects of Spanish; Harris, 1969). To distinguish between categories of consonants more or less prone to debuccalization, an option would be through [±strident] distinctions, given that labial and velar fricatives cannot be easily combined within a single place category. However, even if a formal distinction can be established based on this feature, any rule of debuccalization affecting [-strident] sounds would merely consist of a formal restatement of the phenomenon observed. Further, a rule should predict categorical behaviours, which makes it challenging to capture optionality within the data; such a rule would also need to be associated with relatively early stages of phonological development. The same criticisms apply to constraint-based models of phonology unless one were to ground the formalism in speech phonetics (see Archangeli & Pulleyblank, 2022, for a recent argument), also in relation to the acquisition of phonological knowledge.

Building on these observations and related criticisms, we take as a starting point the observation that stridency (or absence thereof) played a role in the children’s acquisition of each type of fricative consonant, as follows: The speech cues to the place of articulation of coronal fricatives were carried through strident frication, enhancing the learner’s ability to identify the coronal fricatives within the signal, which were then reliably produced as coronals as soon as the children had acquired the ability to reproduce coronality through their own speech. While frication (and affrication) remained a challenge, yielding generalized patterns of stopping, the children displayed no problem in realizing these stopped consonants within their general place of articulation.

The only exception to this general scenario within the data on coronal fricatives concerns the noticeable pattern of debuccalization presented by W, our youngest learner, whose productions were highly variable. For example, throughout the observation period, W produced setilo |sɪtilɔ| ‘chair’ as [hɪtiwɔ]/[ʔɪtiwɔ], sekuta |sɪkuta| ‘motor-bike’ as [ʔututa], and the possessive pronoun saaka |saːka| ‘mine’ as [ʔaːka]. These facts are also compatible with the general framework of phonetically grounded phonological emergence, as they represent a state where the child clearly perceives the presence of a fricative sound but has yet to acquire the place of articulation for this consonant.

In contrast to this, the place of articulation of non-coronal, non-strident fricatives may at times be hindered or, minimally, not perceived as accurately by the learner, given that cues to place of articulation are weaker and more variable when carried through non-strident frication. Under an emergentist approach, this predicts the same general developmental trajectory as with strident fricatives, but with slower rates of development as well as more variability in the data leading to the mastery stage. To the overall weakness of the acoustic signal, we also add the fact that non-strident fricatives involve a contrast between two different places of articulation (labial, velar), a factor potentially adding to the challenge of acquiring these fricatives. Not only do the learners of these contrasts have to cope with weak auditory cues, they also must identify and replicate contrasting places of articulation based on these cues.

This explanation captures the characteristics of early stages of development for non-strident fricatives concerning their slower rates of acquisition, the substitution patterns affecting their early productions and the overall variability observed throughout the acquisition period. In contrast to this, as stated above, it would be extremely challenging to capture these data and the variability within them using a formal, rule- or constraint-based model of phonology. Any such analysis would also have to contend with explaining the origins of these data in the first place, which the current analysis captures in a straightforward way, as it is based on the properties of the speech to which the children are exposed. We expand on this discussion based on WTB’s substitution patterns for affricates, in the next section.

Factors Affecting the Emergence of Affricates

The general pattern of deaffrication displayed by WTB reflects the inherent complexity of affrication. Substitution by deaffrication can thus be analyzed, in general terms, as the child’s reduction of both the structural complexity of the consonant and of its related phonetic attributes (see also Demuth, 2007; Mowrer & Burger, 1991; Tuomi et al., 2001).

As we highlighted above, more intriguing in the data is the place asymmetry we observe between non-lateral and lateral affricates in the children’s substitutions. While non-lateral affricates are overwhelmingly reduced to coronal consonants, the lateral affricates display more variable behaviours, including unexpected substitutions to the velar place of articulation.

Building on the logic developed above in the context of fricatives, we contend that formal modelling of these observations is possible, for example, through a combination of [±strident] and [±lateral] features, whether encoded in terms of phonological rules or constraint-based analyses. However, both the developmental characteristics of the phenomenon and its variability in children’s speech make it extremely challenging to capture within formal models. Likewise, this modelling would hardly provide an answer as to the origins of the phenomenon. Similar to fricative development above, we argue that an approach grounded in speech phonetics offers better insight into the phenomenon.

Starting with stridency, we first note that, in comparison to the non-lateral affricates, the lateral affricates tend to display relatively impoverished cues to their places of articulation, through a second formant (F2) of generally low amplitude (Ladefoged & Maddieson, 1996, p. 206). We also note that coronal + lateral sequences of speech articulations generally display a blurry contrast with velar + lateral sequences (Davidson & Shaw, 2012; Hallé & Best, 2007; Hallé et al., 1998). For example, Hallé et al. (1998), Hallé and Best (2007), and Pitt (1998) show that /tl/ and /dl/ sequences, which are unattested in word-initial positions in languages such as French or English, generally tend to be perceived by speakers of these languages as /kl/ and /gl/, respectively. Note that the speakers’ misperceptions must relate at least in part to the fact that French and English listeners are phonotactically biased against /tl/ and /dl/ clusters, given the absence of such sequences in their languages. However, these speakers never associate individual alveolar stops or laterals to velar segments. The perception of /tl/ and /dl/ sequences as involving a velar place of articulation is thus specific to the coronal stop + lateral phonetic sequence.

We argue that any analysis of the optional velar substitutions observed in the data must incorporate these phonetic facts, given that they provide an answer about the origins of the perception of coronal stop + lateral phonetic sequences as involving a velar articulation. While the learners of Setswana are evidently exposed to speech phonotactics different from those of English or French, in particular to the presence of genuine lateral affricates in the language, these learners nonetheless face the challenge of auditorily interpreting the lateral affricates as coronal, given also the presence of velar consonants within the language. In light of the relative confusability of coronal stop + lateral phonetic sequences, it is thus not surprising that at least a portion of the target lateral affricates present in the input were misinterpreted by the children as involving a velar place of articulation. As stated by Davidson and Shaw (2012), the phonotactics of a language are not the only triggers of perceptual illusion; other triggers which are not language specific, such as acoustic similarity, may also yield perceptual confusion. For example, fricative-initial sequences may lead to prothesis illusions; stop-nasal sequences may lead to the illusion that the initial consonant is either not present in the string or is present in some modified form, while stop-stop sequences may lead to vowel epenthesis. Building on the research above and on our own observations from the acquisition of Setswana lateral affricates, we add to this list alveolar stop + lateral sequences as potential triggers of velar place perception. This hypothesis also dovetails with the typological observation reported above that /tl/ and /dl/ sequences, either as consonant clusters or, in the case of Setswana, as lateral affricates, are hardly attested in the phonological inventories of the world’s languages (Hallé et al., 1998; Maddieson, 2005).

While universal constraints on the perception of given phonetic sequences can be incorporated into constraint-based models of phonology, such analyses are also typically grounded in speech phonetics. We leave the development of such a formalism for future research.

Conclusion

In summary, we highlighted the general trends observed in the speech of three first-language learners of Setswana, most of which find parallels across languages involving similar sound categories. We then focused on asymmetries in the data which pose challenges to formal models of phonology. We analyzed these patterns through the lens of speech phonetics, which can capture both the a priori unexpected patterns observed in the data and the variability that these patterns display.

This research thus highlights that while phonological systems encode systematic phone distribution and patterns within languages, their acquisition must be understood at least in part from the perspective of the ways these systems present themselves to the learners, through the phonetics of the ambient language. This forms the basis of recent emergentist models of phonological acquisition, as opposed to any attempt at describing child phonological behaviours as fully similar to those of adult speakers of the target language.

In regard to potential formalisms to encode these phenomena, we highlight that our data descriptions embrace both the main patterns and the variation present in the data. Any formal analysis of these phenomena would thus require similar contextualization of the variable data, a topic which transcends the goal of the current paper. It is, however, our hope that the type of understanding that stems from our discussion above will offer useful steps toward this goal; we also hope that this understanding will offer insight for speech clinicians and educators who may encounter these phenomena in the speech of Setswana-learning children, or that of any other typologically similar language.

Footnotes

ORCID iDs

Keneilwe Matlhaku

Yvan Rose

Author Contributions

Keneilwe Matlhaku: Conceptualization; Formal analysis; Investigation; Methodology; Software; Validation; Writing – original draft; Writing – review & editing. Yvan Rose: Conceptualization; Software; Validation; Writing – review & editing.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

Archangeli

Pulleyblank

(2022). Emergent phonology. Language Science Press.

Amoako

(2020). Assessing phonological development among Akan-speaking children [MA thesis, University of British Columbia].

Anderson

L.-G.

Janson

(1997). Languages in Botswana: Language ecology in Southern Africa. Longman.

Batibo

Mathangwane

Tsonope

(2003). A study of the third language teaching in Botswana. Ministry of Education.

Bernhardt

B. M.

Másdóttir

Stemberger

J. P.

Leonhardt

Hansson

G. Ó

. (2015). Fricative acquisition in English- and Icelandic-speaking preschoolers with protracted phonological development. Clinical Linguistics & Phonetics, 29, 642–665. https://doi.org/10.3109/02699206.2015.1036463

Bernhardt

B. M.

Stemberger

(1998). Handbook of phonological development from the perspective of constraint-based nonlinear phonology. Academic Press.

Bernhardt

B. M.

Stemberger

(2018). Tap and trill clusters in typical and protracted phonological development: Conclusion. Clinical Linguistics & Phonetics, 32(5–6), 563–575. https://doi.org/10.1080/02699206.2017.1370496

Chomsky

Halle

(1968). The sound pattern of English. Harper & Row.

Cole

D. T.

(1955). An introduction to Tswana grammar. Longmans.

10.

Cook

(2006). The patterns of consonantal acquisition and change in Chipewyan (Dëne Sųłiné). International Journal of American Linguistics, 72(2), 236–263. https://www.jstor.org/stable/10.1086/507166

11.

Curtin

Hufnagle

Mulak

K. E.

Escudero

(2017). Speech perception: Development. In McEwen

B. S.

(Ed.), Reference module in neuroscience and bio-behavioral psychology (pp. 1–7). Elsevier.

12.

Davidson

Shaw

J. A.

(2012). Sources of illusion in consonant cluster perception. Journal of Phonetics, 40(2), 234–248. https://doi.org/10.1016/j.wocn.2011.11.005

13.

Demuth

(1992). Acquisition of Sesotho. In Slobin

D. I.

(Ed.), The cross-linguistic study of language acquisition (Vol. 3, pp. 557–638). Lawrence Erlbaum Associates.

14.

Demuth

(2007). Sesotho speech acquisition. In Mcleod

(Ed.), The international guide to speech acquisition (pp. 526–538). Thomson Delmar Learning.

15.

Department of African Languages & Literature (DALL). (1999). The sound system of Setswana (2nd ed.). Lentswe La Lesedi.

16.

Dodd

Holm

Hua

Crosbie

(2003). Phonological development: A normative study of British English-speaking children. Clinical Linguistics & Phonetics, 17(8), 617–643.

17.

Dyson

A. T.

(1988). Phonetic Inventories of 2 and 3-year old children. Journal of Speech & Hearing Disorders, 53, 89–93.

18.

Ewen Hume

Oostendorp

M. V.

Rice

(Eds.). (2011). The Blackwell companion to phonology. Wiley-Blackwell.

19.

Ferguson

C. A.

(1978). Fricatives in child language acquisition. In Honsa

Hardman-de-Bautista

M. J.

(Eds.), Papers on linguistics and child language (Vol. 6, pp. 93–115). Mouton de Gruyter.

20.

Ferguson

C. A.

Farwell

C. B.

(1975). Words and Sounds in Early Language Acquisition. Language, 51, 419–439.

21.

Fikkert

Levelt

C. C.

(2008). How does place fall into place? The Lexicon and emergent constraints in children’s developing grammars. In Avery

Dresher

B. E.

Rice

(Eds.), Contrast in phonology: Theory, perception, acquisition (pp. 231–268). Mouton de Gruyter.

22.

Gangji

(2012). Phonological development in Swahili: A descriptive, cross-sectional study of typically developing pre-schoolers in Tanzania. Masters of Speech-Language Pathology. University of Cape Town.

23.

Goldsmith

Riggle

(2011). The handbook of phonological theory. John Wiley & Sons.

24.

Gouskova

Zsiga

Tlale-Boyer

(2011). Grounded constraints and the consonants of Setswana. Lingua, 121(15), 2120–2152.

25.

Green

J. R.

Moore

C. A.

Reilly

K. J.

(2002). The sequential development of jaw and lip control for speech. Journal of Speech, Language, and Hearing Research, 45(1), 66–79. https://doi.org/10.1044/1092-4388(2002/005)

26.

Guenther

F. H

(1994). A neural network model of speech acquisition and motor equivalent speech production. Biological Cybernatics, 72, (43-53).

27.

Guenther

F. H.

Ghosh

S. S.

Tourville

J. A.

(2006). Neural modeling and imaging of the cortical interactions underlying syllable production. Brain and Language, 96(3), 280–301.

28.

Hallé

P. A.

Best

C. T.

(2007). Dental-to-velar perceptual assimilation: A cross-linguistic study of the perception of dental stop+/l/ clusters. The Journal of the Acoustical Society of America, 121(5 pt 1), 2899–2914.

29.

Hallé

P. A.

Segui

Frauenfelder

Meunier

(1998). Processing of illegal consonant clusters: A case of perceptual assimilation? Journal of Experimental Psychology. Human Perception and Performance, 24(2), 592–608. https://doi.org/10.1037//0096-1523.24.2.592

30.

Harris

J. W.

(1969). Spanish phonology. MIT Press.

31.

Hodson

Paden

(1981). Phonological processes which characterize unintelligible and intelligible speech in early childhood. Journal of Speech and Hearing Disorders, 46, 369–373.

32.

Hua

Dodd

(2000). The phonological acquisition of Putonghua (Modern Standard Chinese Journal of Child Language, 27, 3–42.

33.

Ingram

(1978). The production of word-initial fricatives and affricates by normal and linguistically-deviant children. In Caramazza

Zurif

(Eds.), Language acquisition and language breakdown: Parallels and divergencies (pp. 63–85). Johns Hopkins University Press.

34.

Ingram

Christensen

Veach

Webster

(1980). The Acquisition of word-initial fricatives and affricates in English by children between 2 and 6 years. In YeniKomshian

G. H.

Kavanagh

J. F.

Ferguson

C. A.

(Eds.), Child Phonology. Academic Press.

35.

Inkelas

Rose

(2003). Velar fronting revisited. Boston University Conference on Language Development (BUCLD), 27, 334–345.

36.

Jakobson

(1941). Child language, aphasia and phonological universals. Mouton de Gruyter.

37.

Jakobson

Fant

Halle

(1952). Preliminaries to speech analysis. MIT Press.

38.

Kent

(1992). The biology of phonological development. In Ferguson

Menn

Stoel-Gammon

(Eds.), Phonological development: Models, research, implications (pp. 65–90). York Press.

39.

Kim

Clements

G. N.

Toda

(2015). The feature [strident]. In Rialland

Ridouane

van der Hulst

(Eds.), Features in phonology and phonetics: Posthumous writings by Nick Clements and coauthors (pp. 179–194). De Gruyter Mouton. https://doi.org/10.1515/9783110399981-009

40.

LaCharité

(1993). The internal structure of affricates [PhD dissertation, University of Ottawa].

41.

Ladefoged

Maddieson

(1996). The sounds of the world’s languages. Blackwell.

42.

Lahiri

Reetz

(2010). Distinctive features: Phonological underspecification in representation and processing. Journal of Phonetics, 38(1), 44–59. https://doi.org/10.1016/j.wocn.2010.01.002

43. Levelt (1994). On the acquisition of place. Holland Academic Graphics.

44. Levelt & van Oostendorp (2007). Feature co-occurrence constraints in L1 acquisition. In Los & van Koppen (Eds.), Linguistics in the Netherlands (pp. 162–172). John Benjamins.

45. Lewis (1994). Aspects of the phonological acquisition of clicks in Xhosa [MA thesis, University of Stellenbosch].

46. Lin & Mielke (2008). Discovering place and manner features: What can be learned from acoustic and articulatory data. University of Pennsylvania Working Papers in Linguistics, 14(1), Article 3136.

47. Lombardi (1991). Laryngeal features and laryngeal neutralization [PhD dissertation, University of Massachusetts].

48. Maccoby & Jacklin (1974). The psychology of sex differences. Stanford University Press. https://doi.org/10.1515/9781503620780

49. Maddieson (2005). Lateral consonants. In Dryer, M. S., Haspelmath, & Comrie (Eds.), The world atlas of language structures online (pp. 38–41). Oxford University Press. https://wals.info/chapter/8

50. Mahura (2014). The acquisition of Setswana phonology in children aged 3;0–6;0 years: A cross-sectional study [MA thesis, University of Cape Town].

51. Mahura & Pascoe (2016). The acquisition of Setswana segmental phonology in children aged 3.0–6.0 years: A cross-sectional study. International Journal of Speech-Language Pathology, 18(6), 533–549. https://doi.org/10.3109/17549507.2015.1126639

52. Mathangwane, J. T. (1999). Ikalanga phonetics and phonology: A synchronic and diachronic study [PhD dissertation, University of California, Berkeley].

53. Matlhaku (2022). PhoBank Setswana Matlhaku corpus [Audio]. Phon. https://doi.org/10.21415/98C0-2Z80

54. Matlhaku (2023). Phonetic and phonological factors affecting the early consonantal development in Setswana [PhD dissertation, Memorial University of Newfoundland].

55. McAllister Byun, Inkelas, & Rose (2016). The A-map model: Articulatory reliability in child-specific phonology. Language, 92(1), 141–178. https://doi.org/10.1353/lan.2016.0000

56. McCormack & Knighton (1996). Gender differences in the speech patterns of two and a half year old children. In McCormack & Russell (Eds.), Proceedings of the sixth Australian international conference on speech science and technology. Australian Speech Science and Technology Association. https://assta.org/proceedings/sst/SST-96/cache/SST-96-Chapter13-p15.pdf

57. McLeod (2007). The international guide to speech acquisition. Thomson Delmar Learning.

58. Menn, Schmidt, & Nicholas (2009). Conspiracy and sabotage in the acquisition of phonology: Dense data undermine existing theories, provide scaffolding for a new one. Language Sciences, 31(2–3), 285–304. https://doi.org/10.1016/j.langsci.2008.12.019

59. Menn, Schmidt, & Nicholas (2013). Challenges to theories, charges to a model: The linked-attractor model of phonological development. In Vihman & Keren-Portnoy (Eds.), The emergence of phonology: Whole-word approaches & cross-linguistic evidence (pp. 460–502). Cambridge University Press.

60. Mogapi (1984). Thutapuo ya Setswana: Mephato ya Magare (2nd ed.). Longman.

61. Mowrer & Burger (1991). A comparative analysis of phonological acquisition of consonants in the speech of 2½–6 year old Xhosa and English-speaking children. Clinical Linguistics & Phonetics, 5(2), 139–164.

62. Naidoo, Van der Merwe, Groenewald, & Naude (2005). Development of speech sounds and syllable structure of words in Zulu-speaking children. Southern African Linguistics and Applied Language Studies, 23, 59–79.

63. Nyathi-Ramahobo (1999). National language: A resource or a problem? Cambridge University Press.

64. Ohala, J. J. (1983). The origin of sound patterns in vocal tract constraints. In MacNeilage, P. F. (Ed.), The production of speech (pp. 189–216). Springer-Verlag.

65. Otlogetswe (2017). Setswana syllable structure and distribution. Nordic Journal of African Studies, 26(4), 403–430.

66. Pitt, M. A. (1998). Phonological processes and the perception of phonotactically illegal consonant clusters. Perception & Psychophysics, 60(6), 941–951.

67. Priestly, T. M. S. (1977). One idiosyncratic strategy in the acquisition of phonology. Journal of Child Language, 4, 45–65.

68. Rogers (2009). Phonetic evidence for complex Cw segments: An ultrasound and audio-visual study of Shona [Qualifying Paper]. University of British Columbia.

69. Rose, Hedlund, G. J., Byrne, Wareham, & MacWhinney (2013). Phon: A computational basis for phonological database building and model testing. In Villavicencio, Poibeau, Korhonen, & Alishahi (Eds.), Cognitive aspects of computational language acquisition (pp. 29–49). Springer.

70. Rose & Inkelas (2011). The interpretation of phonological patterns in first language acquisition. In Ewen, C. J., Hume, van Oostendorp, & Rice (Eds.), The Blackwell companion to phonology (pp. 2414–2438). Wiley-Blackwell.

71. Rose & MacWhinney (2014). The PhonBank project: Data and software-assisted methods for the study of phonology and phonological development. In Durand, Gut, & Kristoffersen (Eds.), The Oxford handbook of corpus phonology (pp. 380–401). https://doi.org/10.1093/oxfordhb/9780199571932.013.023

72. Rose, MacWhinney, Byrne, Hedlund, Maddocks, Brien, & Wareham (2006). Introducing Phon: A software solution for the study of phonological acquisition. Proceedings of the Annual Boston University Conference on Language Development, 2006, 489–500.

73. Rose, McAllister, & Inkelas (2021). Developmental phonetics of speech production. In Setter & Knight, R.-A. (Eds.), Cambridge handbook of phonetics (pp. 578–602). Cambridge University Press.

74. Rose & Penney (2022). Language and learner specific influences on the emergence of consonantal place and manner features. In MacWhinney, Kempe, & Brooks, P. J. (Eds.), Emergentist approaches to language (pp. 242–256). https://doi.org/10.3389/978-2-88974-483-1

75. Rubach (1994). Affricates as strident stops in Polish. Linguistic Inquiry, 25(1), 119–143.

76. Sagey (1986). The representation of features and relations in non-linear phonology [PhD dissertation, Massachusetts Institute of Technology].

77. Shriberg, L. D. (1993). Four new speech and prosody-voice measures for genetics research and other studies in developmental phonological disorders. Journal of Speech and Hearing Research, 36, 105–140.

78. Shriberg, L. D., & Kwiatkowski (1994). Developmental phonological disorders I: A clinical profile. Journal of Speech and Hearing Research, 37, 1100–1126.

79. Smit, A. B. (1993). Phonologic error distributions in the Iowa Nebraska Articulation Norms Project: Consonant singletons. Journal of Speech and Hearing Research, 36, 533–547.

80. Smit, A. B., Hand, Freilinger, J. J., Bernthal, J. E., & Bird (1990). The Iowa articulation norms project and its Nebraska replication. Journal of Speech and Hearing Disorders, 55, 779–798.

81. Smith, N. V. (1973). The acquisition of phonology: A case study. Cambridge University Press.

82. So, L. K. H., & Dodd, B. J. (1995). The acquisition of phonology by Cantonese-speaking children. Journal of Child Language, 22(3), 473–495.

83. Stampe (1973). A dissertation on natural phonology [PhD dissertation, University of Chicago].

84. Stevens, K. N. (1993). Modelling affricate consonants. Speech Communication, 13(1), 33–43. https://doi.org/10.1016/0167-6393(93)90057-R

85. Stoel-Gammon (1985). Phonetic inventories, 15–24 months. Journal of Speech, Language, and Hearing Research, 28(4), 505–512. https://doi.org/10.1044/jshr.2804.505

86. Tsonope (1993). Children’s acquisition of Bantu noun class prefixes. Botswana Notes and Records, 25, 111–117.

87. Tuomi, Gxilishe, & Matomela (2001). The acquisition of Xhosa phonemes. Per Linguam, 17(1), 14–23.

88. Vorperian, H. K., Kent, R. D., Lindstrom, M. J., Kalina, C. M., Gentry, L. R., & Yandell, B. S. (2005). Development of vocal tract length during early childhood: A magnetic resonance imaging study. The Journal of the Acoustical Society of America, 117(1), 338–350. https://doi.org/10.1121/1.1835958

89. Watts (2018). Markedness and implicational relationships in phonological development: A longitudinal, cross-linguistic investigation [PhD dissertation, Memorial University of Newfoundland].

90. Wells (1985). Language development in the pre-school years. Cambridge University Press.

91. Wells (1986). Variation in child language. In Fletcher & Garman (Eds.), Language acquisition (pp. 109–140). Cambridge University Press.

92. Winitz (1969). Articulatory acquisition and behavior. Appleton-Century-Crofts.

Phonetic Factors Affecting the Early Acquisition of Fricative and Affricate Consonants in Setswana
