Abstract
This paper jointly considers syntactic, semantic, and phonological/phonetic factors in approaching an understanding of
1 Introduction
This paper brings together different approaches in linguistics in investigating the phonological and phonetic properties of
In this paper, African American English (AAE) refers to a linguistic variety spoken by some—not all—African Americans that has set syntactic, morphological, phonological, semantic, pragmatic, and lexical properties that are intertwined with properties of General American English (GAE). More recently there has been a move to use the label African American Language as a means of including all variations of language in African American communities. Owing to overlap between properties of GAE and AAE, speakers of AAE also use features that are associated with GAE. In such cases, AAE speakers are using properties that are also part of GAE; they are not codeswitching into GAE. For instance, in AAE, zero auxiliary forms are acceptable, and in some contexts overt forms are obligatory. As such, when speakers use overt auxiliaries in certain contexts, they are not codeshifting to GAE; they are using variant forms that are also in the AAE grammar. In some situations, however, AAE speakers do code shift between AAE and GAE. Given speakers’ varying use of AAE properties owing to regional influences as well as other extralinguistic factors, it is useful to view AAE on a continuum. This avoids assumptions that all speakers are alike and that there is no variation in the linguistic variety. Not only can different speakers be thought of as occupying different places on the continuum, but, also, some speakers might move along the continuum given different situations—even closer to AAE-speaking communities or farther away. (See Baugh, 1983 for more discussion of the continuum.) Even in light of AAE on a continuum, it is still important to note that there are quite likely core structural properties that unify the different subvarieties.
1.1 Background
Three verbal markers have been shown to have similar pronunciations but subtly different meanings in some contexts in some varieties of AAE. In this paper, we use a different orthographic representation for each marker: (1) “I have been to Jamaica five times” / “I have been watching TV” / “I have been a bus monitor before” (2) “I have had this necklace for fifteen or sixteen years” (3) “Bruce has been in the kitchen for a long time”
The sentence in (1) is similar to
Previous research on
Building on the description in Rickford (1975, 1999) in which the label “remote phase” is used to capture
The description of The feature that is common to habituals, whether or not they are also iterative, is that they describe a situation which is characteristic of an extended period of time, so extended in fact that the situation referred to is viewed not as an incidental property of the moment but, precisely, as a characteristic feature of a whole period. (pp. 27–28)
In the other use, which we label BINCOMPLETE (abbreviated as BINCOMP),
The subcategorization of types of states in the BINSTATE category into continuous and habitual is not trivial. When
(4) BINSTATE a) BINSTATE—Continuous (BINSTATE-CONT) i. Bruce BIN running. “Bruce has been running for a long time” ii. Bruce BIN knowing/knew the answer. “Bruce has known the answer for a long time” iii. Bruce BIN married. “Bruce has been married for a long time” iv. That food BIN cooked. “The food has been in its cooked state for a long time” v. Bruce BIN in the kitchen. “Bruce has been in the kitchen for a long time” vi. Bruce BIN the teacher for that program. “Bruce has been the teacher for that program for a long time”
A common characteristic reported in previous descriptions of
When
(4) BINSTATE (b) BINSTATE—Habitual (BINSTATE-HAB) Trina “For a long time, Trina has had the habit of running” Literally: Trina started running a long time ago, and she runs from time to time.
One way the habitual constructions differ from the continuous constructions is that the latter allows adverbial modification without a pause before the temporal adverbial, so the sentence
Each running segment occurs for 30 minutes, and the eventuality is well established, having taken place over a long period. The (4) BINSTATE (c) Trina BIN running for 30 minutes. “Trina has had the habit of running for 30 minutes for quite some time”
This sentence refers to a situation such that Trina runs for 30-minute segments, and she has been doing this for quite some time. We do not know what the long period is, but we have some idea about what it takes for a habit to be established. The length of the long period might very well be revealed during the conversation, but it cannot occur in the same utterance as the
The BINSTATE-CONT and BINSTATE-HAB perfect uses refer to states that have held for a long time, thus the paraphrase “for a long time.” The subtle difference is that BINSTATE-HAB refers to a habitual state. There is some overlap between these uses and the present participle
(5) BINCOMPLETE (BINCOMP) Bruce BIN grew out that shirt. “Bruce grew out of that shirt a long time ago”
In addition, Winford (1998) notes that when the marker occurs with non-stative predicates “it conveys the sense of some event completed in the more or less distant past” (p. 128). It should be noted that the BINCOMP constructions differ from BINSTATE constructions in that they are not always compatible with the perfect. As it turns out, it is possible to use beenPPART in BINSTATE environments, such as
A brief summary should be given about the characterization of captures the fact that BIN refers to a situation whose instantiation began a long time ago (in the case of stative predicates) and continues in effect up till the present. In the case of active predicates, the situation occurred a long time ago in the past, and there is posterior time relevance (in the case of the past perfect) or present relevance (in the case of the present perfect). (p. 162)
Given the description in Comrie (1976), (6) Remember when you said you would give Sue that blue dress for her birthday back in 2018? Did you do that? BIN response: Yeah, I BIN gave her that dress. “I gave her that dress way back in 2018 (i.e. a long time ago)” #done response: Yeah, I done gave her that dress. “I have given her that dress”
The BIN utterance is a better response to the question about an event 3 years ago than the perfect marker
Data from auxiliary support also provide some evidence that shows that not all of the uses of (7) Auxiliary support for a. Bruce haven’t BIN running; he just started. b. Bruce ain’t BIN running; he just started. “Bruce hasn’t been running for a long time; he just started”
Note, also, that the auxiliary (8) A: Bruce went ahead and opened his gift a long time ago. Yes, he BIN opened his gift. B: I know he didn’t! #I know he ain’t/haven’t!
This is a case in which
The preceding example (8) is presented to show that
In addition to meaning and contexts of (9) Bruce ain’t BIN running; he just started. “Bruce hasn’t been running for a long time; he just started running”
The sentence in (9) shows that (10) Bruce BIN could walk on stilts. “For a long time, Bruce has been able to walk on stilts”
The positions of (11) Bruce BIN could’a went to Jamaica. “Bruce could have gone to Jamaica a long time ago” (12) Bruce could’a BIN went to Jamaica. “Bruce could have gone to Jamaica a long time ago”
The sentences in (13) and (14) are included to show that progressive verb forms cannot occur following the modal complex (
(13) (14) Bruce could’a BIN buying discount shoes/in Texas. “Bruce could have been buying discount shoes for a long time/Bruce could have been in Texas for a long time”
Finally, (15) A: Bruce is just paying the water bill now on his phone. B: What! Bruce BIN was supposed to pay the water bill.
The usage and meaning of
While there is a small (but growing) body of work on AAE intonation (Cole et al., 2008; Holliday, 2016, 2019; Jun & Foreman, 1996; Loman, 1975; McLarty, 2011, 2018; Tarone, 1973; Thomas, 2015), to our knowledge, there is no work that situates the pronunciation of
1.2 Research questions
Both Beyer et al.’s (2015) results and Weldon’s (2019, 2021) examples indicate that understanding the sound of
The two studies complement one another in the kind of data they provide: semi-spontaneous sociolinguistic interview data from multiple regions in the United States versus elicitation data in carefully controlled semantic/syntactic/discourse contexts within an isolated, homogeneous AAE-speaking community (see Section 3.1.1 for more on the community). The large collection of over 140 sociolinguistic interview recordings in CORAAL offered opportunities for us to explore when and how “been”-types surface in the wild—even the chance to discover how been-types are used and produced in ways we might not have thought of previously. However, there is no direct control over how frequently the specific contexts required for different
Our first research objective was to characterize range in the use and meaning of
Our second research objective was to build on Beyer et al. (2015) to phonetically characterize the difference between
Rather, we adopt the more general consensus view of the set of assumptions in Autosegmental-Metrical theoretic approaches to the intonational phonology of varieties of Englishes (Beckman & Pierrehumbert, 1986; Clopper & Smiljanic, 2011; Grabe, 1998; Gussenhoven, 2016; Jun & Foreman, 1996; Ladd, 1996; Pierrehumbert, 1980; Veilleux et al., 2006): (i) tones are arranged in a linear sequence; (ii) phonological structure is organized in a prosodic hierarchical structure with an Intonational Phrase (IntP) root node; (iii) tones can be characterized by how they are phonologically aligned/associated to the prosodic tree: either as pitch accents, which are associated to stressed syllables, or as prosodic boundary tones, which are aligned/associated to prosodic constituents (and some tones could be associated/aligned to both stressed syllables and constituents); (iv) pitch accents and prosodic boundary tones can be diagnosed based on how they phonetically align: pitch accent tones typically align close to a stressed syllable, while prosodic boundary tones typically align close to the edge of a prosodic constituent; and (v) F0 transitions between tones are approximately linearly interpolated, and unless there is a high or low boundary tone at an IntP edge (or an unspecified boundary tone, with F0 determined by an immediately flanking tonal event), then F0 at the IntP edge is expected to be mid-level in the speaker’s F0 range.
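Assumption (v) can be made concrete with a minimal sketch. The target times and F0 values below are hypothetical and chosen only for illustration (a mid-level phrase onset, a high pitch accent, and a low boundary tone); the function name is ours, and the sketch is not a model of any speaker's data.

```python
from bisect import bisect_right


def interpolate_f0(targets, t):
    """Linearly interpolate F0 (Hz) at time t (s) between tonal targets.

    targets: list of (time_s, f0_hz) pairs sorted by time (hypothetical values).
    Outside the first/last target, F0 is held at the nearest target value.
    """
    times = [time for time, _ in targets]
    if t <= times[0]:
        return targets[0][1]
    if t >= times[-1]:
        return targets[-1][1]
    i = bisect_right(times, t)  # index of the first target strictly after t
    (t0, f0), (t1, f1) = targets[i - 1], targets[i]
    return f0 + (f1 - f0) * (t - t0) / (t1 - t0)


# Hypothetical tonal targets: mid-level IntP onset, high pitch accent, low boundary tone
targets = [(0.0, 180.0), (0.4, 240.0), (1.0, 150.0)]

# Sample the idealized contour every 100 ms
contour = [interpolate_f0(targets, step / 10) for step in range(11)]
```

Under these assumptions, the contour rises in a straight line to the accent peak and then falls in a straight line to the boundary tone; real F0 tracks only approximate such a shape.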
Under this consensus view—besides the more general question of whether
Finally, a methodological research question underlying the two studies was comparing how the two methods/data sources helped to develop our understanding of
2 CORAAL study
2.1 Materials and methods
We used CORAAL to complete a corpus study of
Summary of Age and Socioeconomic Information for Speakers in the CORAAL Datasets.
Through the CORAAL Explorer online interface, there is access to both a sound file and a paired transcription. Specific details about the transcription conventions of CORAAL can be found in their online user guide, and most of the transcription conventions followed those established by the Sociolinguistic Archive and Analysis Project (SLAAP). CORAAL transcriptions were done by undergraduates and checked by a linguistics graduate student; no information is given about whether the transcribers had experience with AAE. The transcriptions represented reduced forms (i.e.,
Initial classifications of the different been-types were made by Green, a native speaker of a variety of AAE spoken in Southwest Louisiana. Some classifications were made in collaboration with graduate student Ayana Whitmal, who also has intuitions about AAE. She listened to each been-type construction, including the utterances preceding and following the construction for semantic and discourse context, and labeled it “
To better assess the usage of
Recordings of each
2.2 Results
Section 2.2.1 presents results on the usage and distribution of “been”-types found in CORAAL. Section 2.2.2 explicates the phonetics of representative utterances of
2.2.1 Remote past “been” examples found in CORAAL: BIN and beenppart + adverbial
The search for the orthographic “been” returned a total of 1,410 results. These results included instances of “been” used by both the speaker and the interviewer. After removing the interviewer productions, a total of 1,210 utterances remained. Of that number, only 20 (1.7%) were determined to be instances of
Table 2 presents the subject and predicate of the utterance for each of the
Description of Each of the
While there were only 20 instances of
2.2.2 The phonetic realization of BIN constructions in CORAAL
Although there were 20

F0 contour, waveform, and spectrogram of DCB_se1_ag2_m_01_1, utterance 1629, showing a

F0 contour, waveform, and spectrogram of DCB_se1_ag2_f_02_1, utterance 1275, showing the one

F0 contour, waveform, and spectrogram of DCB_se1_ag3_m_03_1, utterance 1370, showing a BINSTATE-CONT. This is the one example found where

F0 contour, waveform, and spectrogram of DCB_se3_ag4_m_01_1, utterance 233, an utterance with two
Three (DCB_se1_ag2_f_01_1, utterance 1603, PRV_se0_ag3_f_03_1, utterance 1348, and DCB_se1_ag1_f_01_1, utterance 1436) ended with non-falling F0 consistent with a phrase-final mid or high boundary tone. While DCB_se1_ag2_f_01_1, utterance 1603 ended with a low F0 inflection point preceding the final high F0, so that there was a clear F0 peak in

F0 contour of DCB_se1_ag1_f_01_1, utterance 1436, showing a

F0 contour of DCB_se3_ag3_m_02_1, utterance 2650, showing a
Figure 1 shows DCB_se1_ag2_m_01_1, utterance 1629, “They woulda
As noted in Section 1.1,
The one example found where an F0 peak clearly appeared in the post-
Figure 4 shows a series of two
In three cases, the
Finally, Figure 6 shows the one
A representative example of a beenPPART utterance, utterance 402 from DCB_se1_ag2_m_02_1, is shown in Figure 7. The F0 on beenPPART is lower than the F0 on immediately preceding

F0 contour of DCB_se1_ag2_m_02_1, utterance 402, showing a representative beenPPART utterance. F0 dips low on unaccented beenPPART between high F0 on
In addition to the

F0 contour of DCB_se1_ag3_f_01_1, utterance 177, showing an example of a beenPPART that was perceived as prominent.
The two utterances Green classified as being ambiguous between

F0 contour of PRV_se0_ag2_f_01_1 utterance 1092, an utterance perceived to be ambiguous between
2.3 Discussion
2.3.1 Been-type distribution in CORAAL
Overall, there does not appear to be any systematic demographic pattern that determines when a speaker will produce a
For
Examples in CORAAL of Long Time Semantic Contexts Compatible with Remote Past been Constructions, Where the “been” Constructions are Realized with
In some cases, it is necessary to rely on discourse context and rhetorical strategies to understand the long-time reference. In the following example, the speaker begins by describing a situation in the past in which LeBron James and Dwyane Wade played professional basketball during the same time period. The speaker rhetorically takes the role of D Wade and establishes that Wade had a history of taking the lead. The speaker (as Wade) responds by saying “I been this” to mean that he’s played that role for quite some time—thus
In the unspecified category, all the adverbials are explicitly long-time. The durative adverbials in this category were varied, but adverbials like “for a long time” were common. Where “for a long time”-type adverbials occur is important. Specifically, when the predicate following beenPPART is a VP, certain adverbials co-occur with certain VP forms. Recall that constructions in the unspecified long-time category are treated as true
In the example above, the adverbial “long time ago” follows
In comparing the unspecified time period tokens, we find 20 tokens of
Despite the fact that the interview is structured to elicit natural speech, interviewees are still cognizant enough of the setting that it affects their speech. This could also contribute to why
The unspecified adverbial constructions are semantically equivalent to the
By analyzing both the interviewer cues and the discourse surrounding the “been” + adverbial utterances, we found that very few interviewer cues directly asked about duration: 6 of the 86 specified since cases, 13 of the 92 specified other cases, and 10 of the 128 unspecified long-time cases. That is, the majority of the time, “been” + adverbial is used unprompted.
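As a quick arithmetic check on these counts, the share of prompted cases in each category can be computed as follows. This is a sketch that assumes each reported pair is (cases preceded by a duration cue, total cases in the category); the variable and function names are ours.

```python
# (cases preceded by a duration-related interviewer cue, total cases), as reported above
cue_counts = {
    "specified since": (6, 86),
    "specified other": (13, 92),
    "unspecified long-time": (10, 128),
}


def prompted_share(prompted, total):
    """Percentage of cases in a category that followed a duration cue."""
    return 100 * prompted / total


shares = {name: round(prompted_share(*pair), 1) for name, pair in cue_counts.items()}

# Pooled share across all three categories
total_prompted = sum(prompted for prompted, _ in cue_counts.values())
total_cases = sum(total for _, total in cue_counts.values())
overall = round(prompted_share(total_prompted, total_cases), 1)
```

On this reading, roughly 7%, 14%, and 8% of the cases in the three categories were prompted, or about 9.5% of the 306 cases overall.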
One suggestion here is that in the interview setting, the speaker is in the position to talk about the past and give as much information and as many details as possible that will characterize the past event accurately. As such, speakers give information about time as much as possible. A clear case in support of this is the example in Figure 2, in which the speaker uses
2.3.2 The phonetic realization of BIN constructions in CORAAL
There were both consistent acoustic properties as well as loci of variability among the
The sampling of contexts in which
3 Southwest Louisiana production experiment
To complement the CORAAL corpus data, we carried out a more narrowly focused, controlled elicitation task in a small-town AAE-speaking community in southwest Louisiana.
This task allowed us to further investigate the usage and realization of different semantic
3.1 Materials and methods
This section describes the speakers who participated in the production experiment (Sec. 3.1.1), stimulus construction in the context of the experimental design (Sec. 3.1.2), the experimental procedure (Sec. 3.1.3), and data analysis (Sec. 3.1.4).
3.1.1 Speakers
Speakers came from a small-town community in southwest Louisiana (SWLAT) in Jefferson Davis Parish. This community has a population of 2,800, which is predominantly European American and 11% African American. The community has been historically segregated by railroad tracks and streets, so African Americans live on one side of the town and non-African Americans on the other with a few exceptions. While residents live in separate areas, the groups are in contact in schools and several small shops. The members of the African American community are predominantly native AAE speakers who share some language patterns with the local European Americans, some of whom identify as Cajun. In fact, the history of the community records that the citizens are a mixture of Acadians, French, and Anglo-Americans, but there is no mention of the citizens of African descent. There is one elementary school (pre-kindergarten–6th) and one high school (7th–12th), which children in the town attend unless they attend one of the Christian schools in the neighboring city. There are also two small grocery stores, a discount store, and a few other businesses, such as convenience stores with fuel stations. The schools and businesses are on the non-African American side of the town. There are two amusement parks in the town, one on the traditionally non-African American side and a smaller one on the traditionally African American side.
Nine speakers—six women and three men between the ages of 25 and 67—participated in this study in August 2019. Their gender, age, education, and employment are given in Table 4. The speakers, who are natives of SWLAT or a neighboring town which is 8 miles north, were recruited to participate in an advertised pilot study “The sound of aspect in African American speech” through a community consultant. In this small-scale pilot study, the goal was to elicit data from adults in the community with the understanding that in a larger
Summary of Demographic Information of Louisiana Speakers Recorded.
3.1.2 Stimuli
In total, there were 71 stimuli with
(16) Target utterance: a. BINCOMP: The tables are lined up neatly and ready to be cleaned. The maintenance workers really did a good job of putting numbers on all of those tables and getting them ready to be hauled away. Did they just finish? I wanted to catch them before they left the building. b. BINSTATE-HAB: At the end of every year, they have to take inventory so they know how many tables are in that big reception hall. Those same maintenance workers come every year to count and number them. They didn’t just start coming to number the tables. c. BINSTATE-CONT: The maintenance workers arrived early this morning to get this room ready. They haven’t taken a single break and they still have quite a bit of work to do. I see they are working with the tables, putting numbers on them. Did they just start that project? d. beenPPART: The maintenance workers are just leaving the building. They came in to work on the tables—to put numbers on them and get them ready to be painted. We know what they were just doing. e. beenPPART +long time adverbial: Target utterance:
Fifteen fillers were also constructed, which included grammatical structures of AAE such as existential

Illustrations for
3.1.3 Procedure
Participants were recorded by the first author in a quiet room within the community with a Shure SM35 head-mounted condenser microphone on a Zoom H5 digital recorder at a 16-bit depth with a 44.1 kHz sampling rate. At the beginning of the experiment, the participant was read instructions for the task and completed three practice trials. For each stimulus during the experiment, the participant saw a slide showing the accompanying illustration and listened to the context. (See the slides in the OSF repository.) After the auditorily presented context finished playing, the target sentence to be uttered appeared on the slide for the participant to read.
Stimuli were presented in five blocks of 16–17 stimuli each, where no more than a single stimulus from an item set appeared within a block. Stimuli were pseudorandomized to avoid the same
3.1.4 Analysis
Recordings were segmented into individual utterances in Praat (Boersma & Weenink, 2019). Individual utterances were segmented into words with the Montreal Forced Aligner (McAuliffe et al., 2018) using the pretrained model for English, and then the word boundaries were hand-corrected. Two kinds of analyses were then performed: listener judgments and acoustic analysis. Results were then statistically analyzed.
Each recorded utterance was played together with its accompanying auditory context and illustration for listener judgments by Green and Whitmal. Listener judgments are a standard way to characterize AAE and other varieties of Englishes (Oetting & McDonald, 2002; Wyatt, 1991). Two kinds of judgments were made: (i) the acceptability of the utterance, given the context and (ii) an auditory classification of the
As described in Section 1.1, we expected potential cases of beenPPART usage in BINSTATE environments if speakers were choosing not to explicitly mark a long period of time. Thus, for consistency, all utterances perceived to be beenPPART in BINSTATE-HAB and BINSTATE-CONT environments were labeled as “accommodated.” In addition, six beenPPART environment items were detected during analysis to have had ambiguous contexts, so perceived non-beenPPART utterances for those items were similarly marked with “accommodated” labels (see Section 3.2.2). The
For fine-grained acoustic analyses, mean F0 and energy (i.e., intensity) measurements were taken over 10 evenly spaced subsections over each word using VoiceSauce (Shue et al., 2011), a program for automated voice analysis. The TANDEM-Straight F0 algorithm was used (Kawahara et al., 2016), with speaker-specific values for F0 floors and ceilings. Listener judgments and acoustic measurements were processed in R (R Core Team, 2018) using dplyr (Wickham et al., 2019), tidyr (Wickham & Henry, 2019), and ggplot2 (Wickham, 2016) packages. Durations were computed for each word, and mean and maximum F0 and energy values over the 10 subsections within a word were also computed. Then, the ratios between these measures over
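The word-level summaries described in this section can be sketched as follows. This is an illustrative Python re-implementation under stated assumptions, not the VoiceSauce/R pipeline used in the study: the frame-level values are hypothetical, the function names are ours, and the choice of which two words enter a ratio is left to the caller.

```python
from statistics import mean


def subsection_means(samples, n_sections=10):
    """Split a word's frame-level measurements into n roughly even
    subsections and return the mean of each subsection."""
    if len(samples) < n_sections:
        raise ValueError("need at least one frame per subsection")
    # Integer boundaries; strictly increasing because len(samples) >= n_sections
    bounds = [i * len(samples) // n_sections for i in range(n_sections + 1)]
    return [mean(samples[a:b]) for a, b in zip(bounds, bounds[1:])]


def word_summary(samples, n_sections=10):
    """Mean and maximum over the subsection means for one word."""
    sections = subsection_means(samples, n_sections)
    return {"mean": mean(sections), "max": max(sections)}


def measure_ratio(word_a, word_b, key):
    """Ratio of one summary measure between two words (pairing chosen by the caller)."""
    return word_a[key] / word_b[key]


# Hypothetical frame-level F0 values (Hz) for two words in one utterance
word_a = word_summary([100.0 + i for i in range(20)])  # rising F0 over 20 frames
word_b = word_summary([200.0] * 10)                    # level F0 over 10 frames
max_f0_ratio = measure_ratio(word_a, word_b, "max")
```

Working with ratios within an utterance, rather than raw values, reduces the influence of between-speaker differences in overall F0 range and loudness.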
3.2 Results
The results from the SWLAT production task are presented in four parts. Section 3.2.1 concerns task validation, and Section 3.2.2 integrates presentation of the results of perceived
3.2.1 Task validation
AAE is a spoken variety with no standard writing conventions, but our task relied on participants reading written English. To assess how well our task elicited natural AAE speech, we examined participants’ utterances of the fillers and their utterances of constructions that would be acceptable only with
3.2.2 Distribution of perceived BIN/been type and acceptability ratings across environments
We hypothesized that speakers would produce
Mean (± 1
These results can be better understood in the context of the distribution of acceptability ratings across environments, as shown in Table 6. Although they elicited a high proportion of beenPPART utterances, the BINSTATE-HABIT and BINSTATE-CONT environments yielded 0% and 1.2 ± 0.8% (1
Mean (± 1
Despite the complication of the accommodated cases, regression analyses nevertheless showed that perceived
Logistic Mixed Effects Model Output for the Effects of
3.2.3 Usage and acceptability of BIN/ beenPPART utterances in the beenPPART + long time adverbial environment
The remaining environment not yet discussed, the beenPPART + long time adverbial environment, was expected to elicit beenPPART utterances. However, as shown in Table 5, only 37.5 ± 12.3% (1
3.2.4 The phonetic realization of BIN utterances in the SWLAT production task
There was a total of 3,416 perceived
As described in Section 3.1.4, mean/max F0, mean/max intensity, and duration were measured over each word within an utterance. Then, the ratios of the acoustic measure over
Linear Mixed-Effects Model Output for the Effects of Whether the Utterance Was Perceived as

Distribution of within-item ratios of F0, intensity
Nine of the 20 CORAAL

Smoothed density plot comparing the ratio of max F0 in
What the purely acoustic analysis described in this section thus far cannot capture is variation in the phonological intonational tone choices and how those condition acoustic measures—this analysis aggregates across those phonological choices. An item-by-item intonational phonological analysis of the data is beyond the scope of this paper, but we show some representative F0 contours below, with reference to F0 contours observed in CORAAL.
The contrast between

Representative F0 contour of perceived

Representative F0 contour of perceived beenPPART in
The SWLAT productions revealed another prosodic parameter that could contribute to a

Representative F0 contour of perceived
For comparison, Figure 16 shows another utterance from the same speaker (Speaker la08) where

Representative F0 contour of an utterance perceived to be ambiguous between
Another exemplar of an utterance perceived as ambiguous between

Representative F0 contour of an utterance perceived to be ambiguous between
A final representative utterance, also from the been + long time adverbial environment, is shown in Figure 18. Like the utterance in Figure 17, this one also was classified as ambiguous between

Representative F0 contour of an utterance perceived to be ambiguous between
3.3 Discussion
Overall, results indicated that the production task was successful in eliciting AAE structures, and in particular,
The SWLAT productions also provided further information on the realization of
Besides drawing attention to the importance of the post-
The SWLAT data also built on the CORAAL data by allowing us to begin to get a sense of the range of variability in
Beyond the characterization of the realization of
4 General discussion and conclusion
Two characteristics of AAE that are often mentioned in general descriptions of the linguistic variety but not extremely well researched are its intonational patterns and tense/aspect properties. Following up on the call in Rickford (1975) to employ multiple methods in conducting research on constructions in spoken AAE that might not occur in data from interviews, we consulted corpus data and also elicitation tasks, yielding a wider data source and a study that can be replicated. These methods yielded data that provide insight into the phonology and semantics of AAE, and the contributions of the study go beyond descriptions of properties of AAE in those disparate areas and begin to provide information about the interplay between the two areas. In addition, a number of questions are also raised about the syntax of AAE and the structure of
Since the first observations about the meaning and use of
There are a number of factors that should be addressed in future
This study has also uncovered some possible areas of ambiguity in the perception and interpretation of
Acknowledgements
We gratefully acknowledge our southwest Louisiana community speakers and our funding sources. We are also grateful for feedback and discussion from Meghan Armstrong-Abrami, Alejna Brugos, Seth Cable, Alessa Farinella, editors Cynthia Clopper and Holger Mitterer, guest associate editor Shelome Gooden, four anonymous reviewers, and audiences from ETAP4 and NWAV 49.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This material is based upon work supported by the National Science Foundation under grant BCS-2042939, a UMass Amherst Faculty Research Grant/Healey Endowment Grant, a UMass Amherst Institute of Diversity Sciences Seed Grant, and the UMass Amherst Center for the Study of African American Language. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (or other funding sources).
