
Abstract

Although students are often taught to look for keywords when solving word problems, this strategy is erroneous. It is especially problematic when students solve inconsistent word problems that include a relational term, such as more but are not solved with the assumed operation (e.g., addition). In this study, we analyzed constructed equations on four word problems that included the word more for 112 Grade 3 students from a U.S. southwest school district. We compared students with and without mathematics difficulty and disaggregated based on dual-language status. Most students constructed accurate equations for the two consistent word problems, but fewer constructed accurate equations for the two inconsistent word problems. Students with mathematics difficulty, particularly those who were also dual-language learners, had the lowest rates of accurate equations on the inconsistent word problems. This analysis reinforces previous calls by researchers to avoid the ineffective keywords strategy.

Keywords

word problems keywords elementary mathematics

Ryan has 3 more books than Miguel. If Ryan has 12 books, how many books does Miguel have? In this word problem, if a student were to see the word more as a cue to add 12 + 3, they would solve the word problem incorrectly. Word problems such as this are referred to as inconsistent (Lewis & Mayer, 1987). Inconsistent word problems include a relational term, such as more, but are not solved with the keyword-cued operation (e.g., addition). Likewise, word problems that include the word less are not always solved with subtraction. Conversely, consistent word problems include a relational term that can be successfully used as a keyword cue. For decades, researchers have examined consistent versus inconsistent word problems because of the difficulty that inconsistent problems cause for students (e.g., Hegarty et al., 1995; Passolunghi et al., 2022).

In this study, we examined the performance of Grade 3 students on consistent and inconsistent word problems. Half of these students experienced mathematics difficulty (MD), and many were dual-language learners (DLLs). In this introduction, we review the research on inconsistent word problems and the use of a keyword strategy to solve word problems. Then, we describe students with MD, DLLs, and the word-problem features that might cause difficulty for these students. Finally, we present the purpose and research questions guiding our research.

Prior Research on Inconsistent Word Problems

Much of the prior research related to inconsistent word problems involved undergraduate participants (e.g., Hegarty et al., 1992, 1995; Jaffe & Bolger, 2023; Lewis & Mayer, 1987; Lubin et al., 2016). Across studies, undergraduate participants spent more time working on inconsistent problems compared with consistent problems and solved inconsistent problems with less accuracy. Hegarty et al. (1995) compared participants who successfully solved inconsistent problems with those who did not. Using eye-tracking technology, the researchers determined that participants who were unsuccessful at solving inconsistent word problems tended to fixate more on the numbers and relational terms (e.g., more and less) than their more successful counterparts. Hegarty et al. (1995) referred to this as the direct translation approach, hypothesizing that students who were less successful at solving word problems formulated their solution based primarily on the numbers and a perceived keyword (e.g., more).

Other researchers have focused on participants in elementary and middle school (e.g., Boonen et al., 2016; Pape, 2003; Passolunghi et al., 2022; Shum & Chan, 2020). Bartalis et al. (2023) conducted a similar study to Hegarty et al. (1995) with Grade 4 participants. In this eye-tracking study, students who were less successful at solving word problems also tended to fixate more on the relational term (e.g., more) when solving inconsistent word problems. Passolunghi et al. (2022) investigated the performance of Grade 4 and Grade 5 students on consistent versus inconsistent word problems. Similar to previous research, students were less successful in solving inconsistent problems. Notably, their participants did not have disabilities or MD and performed on grade level based on a standardized arithmetic assessment. Moreover, none of the participants were reported as being DLLs.

To date, few studies have examined the impact of keywords on the word-problem performance of a more diverse sample of elementary students. As such, implications for elementary classrooms, particularly in the United States, are limited. According to the National Center for Education Statistics (NCES), approximately 15% of U.S. public school students receive special education services, and approximately 11% are DLLs (NCES, 2024a, 2024b). Thus, it is crucial to extend the research on consistent versus inconsistent word problems by including a broader range of elementary students.

Keywords Instruction

Despite a breadth of research related to how difficult it is for students to solve inconsistent word problems such as those in which more does not mean to add, many elementary students are taught to look for keywords within word problems to determine the necessary operation (Powell, Berry & Benz, 2020). In fact, in an interview study with 70 elementary teachers, the most reported strategy taught to students during word-problem instruction was identifying keywords (Pearce et al., 2013). As further evidence of this, searching the popular website Teachers Pay Teachers (teacherspayteachers.com) for the terms word-problem keywords will return an abundance of downloadable products, including keyword posters, worksheets, and curricula. With the keywords strategy, teachers instruct students to add when they see terms such as more and altogether and subtract when they see words such as less and left, intending to help them be more successful problem solvers. When word problems contain consistent language, the keywords strategy may seem effective on the surface.

However, not all word problems have consistent language. Powell, Namkung and Lin (2022) analyzed 214 word problems from high-stakes tests in the United States. Among the one-step word problems, the keywords strategy would be effective less than 50% of the time. For multi-step problems, the keywords strategy would be effective less than 10% of the time. Importantly, even without the presence of instruction that ties keywords directly to operations, students can form these assumptions on their own, making inconsistent problems particularly difficult (Van Dooren et al., 2005). To strengthen the call for teachers to avoid reinforcing the ineffective keywords strategy, our study extends previous research by including students with MD.

Mathematics Difficulty

Students with MD demonstrate persistent low performance in mathematics. Students with MD may or may not have a school-identified learning disability. Often, researchers identify students as having MD if they score at or below a specific percentile (e.g., 25th) on a mathematics screener (Nelson & Powell, 2018). In this study, we categorized students as having MD if they performed at or below the 25th percentile on a word-problem screener described later in the manuscript. Students with MD often experience challenges with a variety of mathematical tasks, including fact fluency, computation, and word-problem solving (Andersson, 2008; Jordan et al., 2003; Mabbott & Bisanz, 2008). When solving word problems, students with MD often struggle to plan and effectively solve problems without explicit instruction on strategies (Powell, Doabler, et al., 2020). Another compounding factor related to word-problem proficiency is dual-language status.

Dual-Language Learners

We define DLLs as students who speak a language other than English in the home. The average percentage of DLLs enrolled in public schools, by state, is approximately 11% (NCES, 2024b). The largest percentage is in Texas, with more than 20% of students categorized as DLLs. To increase the generalizability of education research, it is essential for researchers to recruit diverse samples that are representative of today’s classrooms. Previous researchers have examined DLLs’ performance on word problems (e.g., Abedi & Lord, 2001; King & Powell, 2023; Martin & Fuchs, 2019; Powell, Urrutia, et al., 2022). For example, Abedi and Lord (2001) administered word problems to Grade 8 students, and DLLs scored significantly lower than non-DLLs. However, after revising the word problems to reduce linguistic complexity, the majority of students’ scores increased, particularly those of DLLs.

To examine the interaction between DLL status and MD status, Martin and Fuchs (2019) administered word problems to Grade 1 students in the fall and spring. In the fall, DLLs and non-DLLs with MD performed comparably. However, among students without MD, DLLs performed lower than their non-DLL peers. Finally, by spring, both DLLs with and without MD performed significantly lower than their non-DLL peers. This finding differed from that of Powell, Urrutia, et al. (2022), who administered word problems to Grade 3 students with MD, both with and without DLL status. Among these students with MD, DLL status did not result in significantly different word-problem performance.

To further examine the interaction between DLL status and MD, King and Powell (2023) analyzed the data of Grade 3 DLLs with MD. Prior to implementing a word-problem intervention, DLLs’ scores on an English-proficiency assessment in the areas of speaking, listening, reading, and writing correlated strongly with their word-problem performance. However, upon implementing the word-problem intervention, these correlations weakened, leaving only reading and writing as significant predictors of DLLs’ word-problem performance. In addition to MD and DLL status, word-problem performance can vary due to word-problem features, a topic we explore in the next section.

Word-Problem Features

Arsenault and Powell (2022) analyzed which word-problem features were the most challenging for students with and without MD. Nearly one third of their participants were DLLs. The features of interest included schema (i.e., problem type), position of the unknown, inclusion of irrelevant information, and relevant information presented in charts or graphs. Arsenault and Powell (2022) conducted their analysis using data from the same larger study (i.e., Powell et al., 2021) with which we conducted our analysis. Their sample included 692 Grade 3 students with MD and 2,149 Grade 3 students without MD. Word problems were scored as having correct or incorrect numerical answers.

First, Arsenault and Powell (2022) analyzed students’ performance on word problems according to the schema (i.e., the underlying structure of a word problem; Cooper & Sweller, 1987). Within elementary school, students first solve additive word problems (i.e., word problems that require addition or subtraction). The three additive schemas are Total, Difference, and Change. In Total problems, parts are put together into a total (e.g., Rosie planted 12 sunflowers and 8 marigolds. How many flowers did she plant altogether?). In Difference problems, two amounts are compared for a difference (e.g., How many more sunflowers did Rosie plant than marigolds?). Lastly, in Change problems, an amount increases or decreases to a new amount (e.g., Rosie had 20 tomato seeds. She planted 4. How many seeds does she have left?). Change problems can be further categorized into Change increase and Change decrease problems.

Next, Arsenault and Powell (2022) analyzed students’ performance on word problems according to the location of the unknown. In one-step word problems, the unknown information may be in the initial, medial, or final position. For Total problems, the unknown information may be the total or one of the parts. For Difference problems, the greater amount, lesser amount, or difference can be unknown. For example: Rosie planted 4 more sunflowers than marigolds. She planted 8 marigolds. How many sunflowers did she plant? In this Difference problem, the greater amount is unknown. For Change problems, the unknown information may be the start, change, or end amount. For example: Rosie planted some sunflowers, and then she planted 8 marigolds. Altogether, she planted 20 flowers. How many sunflowers did she plant? In this Change problem, the initial, or start amount is unknown.

The results of the Arsenault and Powell (2022) analysis demonstrated conflicting patterns of student performance among schemas and positions of the unknown, due in part to the inclusion of irrelevant information and information presented in charts or graphs. For example, previous research has suggested that Difference problems with an unknown difference typically elicit a high rate of accuracy (García et al., 2006; Powell et al., 2009). However, on such a word problem, students in Arsenault and Powell (2022) performed poorly. Importantly, this item also included irrelevant information and relevant information presented in a graph, which contributed to the difficulty. Conversely, students with and without MD had a markedly higher percentage of accuracy on the Difference problem with the greater amount unknown. Arsenault and Powell hypothesized this increase was due to the correct operation being addition, citing students’ tendency to choose addition rather than subtraction when solving word problems. However, this item also included the word more and was consistent. Thus, if students read the word more as a cue to add the two numbers in the word problem, they would have constructed an accurate equation.

Lastly, coinciding with previous research, students with MD generally had a higher percentage of accuracy on Change problems with an unknown end amount compared with those with an unknown start or change amount (Arsenault & Powell, 2022). However, there was an outlier. A Change decrease problem with an unknown change amount had one of the highest percentages of accuracy among students with MD. The authors hypothesized this may have occurred because the item included keywords related to subtraction (i.e., gave and give away) and was consistent. Therefore, with accurate computation, constructing a subtraction equation would result in the correct answer.

We aimed to build on the work of Arsenault and Powell (2022) by examining an additional word-problem feature that could adversely affect word-problem performance: the inclusion of keywords that may be tied directly to operations (e.g., more means to add). Furthermore, we controlled for the impact of computational skills by examining equation construction rather than correct answers and further disaggregated by DLL status.

Purpose and Research Questions

The purpose of our study was to further isolate and examine the impact of the inclusion of keywords on the word-problem performance of Grade 3 students. We focused specifically on word problems that included the word more, a word likely to be interpreted as a cue to add. We focused on the word more because it has a similar, if not identical, meaning inside and outside of mathematics classrooms. Thus, Grade 3 students are likely to understand this term. Other terms that could be tied directly to operations, such as left and fewer, have multiple meanings (e.g., left meaning the direction or left meaning a remaining amount) or a mathematical definition that may not have been directly taught (i.e., fewer means less). We also focused on the word more to better control for varying vocabulary knowledge, which is a contributing factor to word-problem success (Xu et al., 2022). Moreover, to control for calculation complexity and errors in students’ calculations, we analyzed the equations that students constructed to solve word problems. We asked the following research questions:

How does the inclusion of the word more affect the accuracy of students’ construction of equations when solving word problems?

How does the accuracy of constructed equations differ between students with and without MD, and students with and without DLL status?

Method

Context

We analyzed screening data collected during a randomized-controlled trial about the efficacy of a word-problem intervention (Powell et al., 2021) that had been approved by our university’s Institutional Review Board. This study was conducted in a large, urban school district in the Southwest of the United States, and we had received approval from the school district to conduct this study in their schools with Grade 3 students. Each year, for 3 years, we screened Grade 3 students from 1 of 26 elementary schools for eligibility into a study focused on efficacy of a word-problem intervention. At the time, this public school district served more than 75,000 students. On average, the district reported 55.5% of students as Hispanic, 29.6% as White, 7.1% as African American, and 7.7% as belonging to another race or ethnic category. Overall, 27.1% of students identified as DLLs, 52.4% qualified as economically disadvantaged, and 12.1% received special education services. In Cohort 1 (2015–2016), we screened 1,109 students. In Cohort 2 (2016–2017), we screened 914 students, and we screened 818 students in Cohort 3 (2017–2018).

Measure

Before describing the participants, we describe the measure of focus for this study because participants were selected based on their performance on this measure. We screened all Grade 3 students with the screening measure of Texas Word Problems Brief (Powell & Berry, 2015). This measure consisted of eight single-step Total, Difference, and Change word problems that required addition and subtraction within 100. The word problems did not include irrelevant information or information presented in tables or graphs. Cronbach’s alpha from a full sample of 2,841 Grade 3 students for the 8-item measure was .803. Four items contained the word more. Cronbach’s alpha from the full sample of Grade 3 students for these four items was .647.

Of these four word problems with the word more, two were Difference problems (i.e., amounts are compared for a difference); one was consistent, and the other was inconsistent. The first Difference problem read: The library has 23 books about dinosaurs. The library has 14 more books about space. How many books about space does the library have? In this Difference problem, the greater amount was unknown. For this word problem, if a student were to interpret the word more as a cue to add the two numbers together, the strategy would result in an accurate equation. Conversely, the second Difference problem read: Mr. Jones delivers packages. He delivered 26 packages on Thursday and 85 packages on Friday. How many more packages did he deliver on Friday? In this Difference problem, the difference was unknown. For this word problem, interpreting the word more as a cue to add the two numbers together would result in an inaccurate equation.

The other two items were Change problems (i.e., a starting amount increases or decreases to a new amount); one was consistent, and the other was inconsistent. Consider this item: Alfred drove 59 miles, and then he stopped for gas. Then, Alfred drove 34 more miles before stopping for lunch. How far did Alfred drive? In this Change increase problem, the end amount was unknown. To solve this problem correctly, students would add 59 plus 34. This is not the case for the other change problem: There were some students on the school bus. Then 19 more came on. There are now 34 students. How many were there to start? This is also a Change increase problem, but the start amount was unknown. To solve this problem correctly, students would need to construct one of the following equations: ? + 19 = 34 or 34 − 19 = ?. Adding 19 and 34 would result in an incorrect answer.
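To make the contrast concrete, the following Python sketch applies a literal keywords rule to the four items described above. It is an illustration only and not part of the study's procedures: `KEYWORD_OPERATION`, `Item`, and `keyword_answer` are hypothetical names, and the rule (combine the two numbers with the operation tied to the first keyword found) is an assumed simplification of how a student might apply the strategy.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed keyword-to-operation mapping (hypothetical; mirrors the cues named in the text).
KEYWORD_OPERATION = {"more": "+", "altogether": "+", "less": "-", "left": "-"}


@dataclass
class Item:
    label: str
    text: str
    numbers: Tuple[int, int]  # the two numbers stated in the problem
    correct_answer: int       # answer given by an accurate equation


def keyword_answer(item: Item) -> Optional[int]:
    """Apply the first keyword found to the two numbers, ignoring problem structure."""
    for word in re.findall(r"[a-z]+", item.text.lower()):
        operation = KEYWORD_OPERATION.get(word)
        if operation == "+":
            return item.numbers[0] + item.numbers[1]
        if operation == "-":
            return abs(item.numbers[0] - item.numbers[1])
    return None  # no keyword found


ITEMS = [
    Item("consistent Difference",
         "The library has 23 books about dinosaurs. The library has 14 more "
         "books about space. How many books about space does the library have?",
         (23, 14), 37),
    Item("consistent Change",
         "Alfred drove 59 miles, and then he stopped for gas. Then, Alfred drove "
         "34 more miles before stopping for lunch. How far did Alfred drive?",
         (59, 34), 93),
    Item("inconsistent Difference",
         "Mr. Jones delivers packages. He delivered 26 packages on Thursday and "
         "85 packages on Friday. How many more packages did he deliver on Friday?",
         (26, 85), 59),
    Item("inconsistent Change",
         "There were some students on the school bus. Then 19 more came on. "
         "There are now 34 students. How many were there to start?",
         (19, 34), 15),
]

# True where the keyword rule happens to give the correct answer.
results = {item.label: keyword_answer(item) == item.correct_answer for item in ITEMS}
for label, is_accurate in results.items():
    print(label, "->", "accurate" if is_accurate else "inaccurate")
```

Under these assumptions, the rule yields the correct answer on the two consistent items (37 and 93) and an incorrect answer on the two inconsistent items (111 instead of 59, and 53 instead of 15).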

Participants

Grade 3 students (N = 2,841) completed Texas Word Problems Brief to determine possible eligibility for a word-problem intervention efficacy trial (i.e., Powell et al., 2021). For our analysis, we began by selecting a sample of students with MD and then selected a comparison sample of students without MD (see Figure 1).

Figure 1.

Procedures for Participant Inclusion.

Participants With MD

Students were designated as having MD if they answered 50% or fewer of the items correctly on an additional screener, Single-Digit Word Problems (Jordan & Hanich, 2000). Approximately 25% of the sample answered 50% or fewer items correctly on the screener. Across the 3 years of the study, 150 students in Cohort 1, 159 students in Cohort 2, and 164 students in Cohort 3 were classified as having MD, for a total sample of 473 Grade 3 students with MD.

To be included in this analysis, students had to have (a) demographic information on record, (b) constructed an equation for all four items of interest on the Texas Word Problems Brief, and (c) used more than one operation on the Texas Word Problems Brief assessment. We only included students with demographic information on record to allow for disaggregation by DLL status. We only included students who wrote equations so there would be less ambiguity as to what operations each student intended to use. By analyzing students’ constructed equations instead of correct or incorrect solutions, we controlled for calculation complexity and students’ calculation errors. Finally, we excluded students who only used one operation on all eight items of the Texas Word Problems Brief. This controlled for students who chose operations regardless of the text of the word problems.

In an effort to include a wider range of students with MD, we included students who wrote equations but did not include the operational symbol if their intended operation was clear (e.g., 59 + 34 = 93). We also included students who misplaced the minuend and subtrahend in their subtraction equations (e.g., writing 26 − 85 = ? instead of 85 − 26 = ?) because we were solely interested in students’ choices in operations. Finally, we included students who made slight errors when copying the numbers from the word problem into their equations (e.g., writing 59 + 32 = ? instead of 59 + 34 = ?).

Of the 473 Grade 3 students with MD, 56 students met these inclusion criteria. Next, we selected a comparison sample of students without MD.
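The three inclusion criteria described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' actual procedure: the `StudentRecord` structure, the item identifiers, and the sample records are all hypothetical, and each item is assumed to have been coded already as "+", "-", or no equation.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical identifiers for the four items that contain the word "more".
MORE_ITEMS = ("diff_consistent", "change_consistent",
              "diff_inconsistent", "change_inconsistent")


@dataclass
class StudentRecord:
    student_id: str
    has_demographics: bool
    # Operation coded for each screener item: "+", "-", or None if no equation was written.
    operations: Dict[str, Optional[str]]


def meets_inclusion_criteria(student: StudentRecord) -> bool:
    """Check criteria (a) demographics, (b) four equations, (c) more than one operation."""
    wrote_all_four = all(student.operations.get(item) is not None for item in MORE_ITEMS)
    operations_used = {op for op in student.operations.values() if op is not None}
    return student.has_demographics and wrote_all_four and len(operations_used) > 1


# Hypothetical records for illustration only (four other items are labeled other_1..other_4).
mixed = StudentRecord("s1", True, {
    "diff_consistent": "+", "change_consistent": "+",
    "diff_inconsistent": "-", "change_inconsistent": "+",
    "other_1": "-", "other_2": "+", "other_3": None, "other_4": "-"})
always_adds = StudentRecord("s2", True, {
    **{item: "+" for item in MORE_ITEMS},
    "other_1": "+", "other_2": "+", "other_3": "+", "other_4": "+"})
missing_equation = StudentRecord("s3", True, {
    "diff_consistent": "+", "change_consistent": None,
    "diff_inconsistent": "-", "change_inconsistent": "-",
    "other_1": "-", "other_2": "+", "other_3": "+", "other_4": "-"})

included = [s.student_id for s in (mixed, always_adds, missing_equation)
            if meets_inclusion_criteria(s)]
print(included)
```

In this sketch, the student who only ever adds is excluded by criterion (c), and the student with a missing equation on an item of interest is excluded by criterion (b).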

Participants Without MD

To form the comparison sample of students without MD, we began by coding the assessments of students who shared the same school and teacher as those in the MD sample. This included 295 students in Cohort 1, 282 students in Cohort 2, and 155 students in Cohort 3, for a total of 732 Grade 3 students without MD. Of these, 123 students without MD met our inclusion criteria (i.e., demographic information on record, constructed an equation for all four items of interest, and used more than one operation on the assessment). We randomly selected 56 of these students to form our comparison sample of students without MD, matching the year of the study, school, and teacher with those in our MD sample when possible.

Demographic Information

Table 1 presents the demographic information for the 56 students with MD and 56 students without MD. Gender and special education status were comparable. The majority of students with and without MD were Hispanic/Latine, but the non-MD sample had a slightly smaller proportion of Hispanic/Latine students and a slightly larger proportion of White students. DLL status was also comparable, as the MD sample had 37 DLLs, and the non-MD sample had 33 DLLs.

Table 1.

Participant Demographics for Word Problem Performance Study.

Demographic	With MD (n = 56)		Without MD (n = 56)	
	n	%	n	%
Gender
Female	30	53.6%	32	57.1%
Male	26	46.4%	24	42.9%
Race/ethnicity
Hispanic/Latine	41	73.2%	34	60.7%
Black	6	10.7%	5	8.9%
Asian	3	5.4%	1	1.8%
Multiracial	3	5.4%	4	7.1%
White	1	1.8%	11	19.6%
Other	2	3.6%	1	1.8%
Special education
Not in special education	52	92.9%	52	92.9%
Receiving special education	3	5.4%	3	5.4%
Not reported	1	1.8%	1	1.8%
Dual-language status
Dual-language learner (DLL)	37	66.1%	33	58.9%
Non-DLL	19	33.9%	23	41.1%

Note. MD = mathematics difficulty.

Coding and Data Analysis

We recorded students’ equations for the four items of interest and categorized them as accurate or inaccurate based on whether solving them would result in the correct answer. If a student wrote a plus sign, but clearly subtracted, we coded these students as having intended to subtract. If a student wrote a minus sign, but clearly added, we coded these as having intended to add. After the initial coding, one of the authors double-coded by indicating agreement or disagreement. This resulted in five discrepancies (98.9% agreement), which we then resolved.
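The reported agreement can be checked with a short calculation. The sketch below assumes the denominator is every coded equation (112 students on four items each); the source does not state the denominator explicitly, so that figure is an assumption.

```python
# Assumption: agreement is computed over all coded equations (students x items).
n_students = 112
n_items = 4
n_discrepancies = 5

n_coded = n_students * n_items                     # 448 coded equations (assumed denominator)
agreement = (n_coded - n_discrepancies) / n_coded  # proportion of codes the two coders agreed on

print(f"{agreement:.1%}")  # prints 98.9%
```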

We calculated the accuracy rate for each of the four items by dividing the number of students who constructed an accurate equation by the total sample. First, we calculated the accuracy rates for students with MD compared to students without MD. Then, we calculated accuracy rates for DLLs with MD, non-DLLs with MD, DLLs without MD, and non-DLLs without MD. We also calculated the percentage of students with and without MD who constructed accurate equations across all four items, and the percentage of students with and without MD who added the numbers in the word problem across all four items.
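The accuracy-rate calculation described above amounts to a single division per item and group. In the sketch below, the function name `accuracy_rate` is hypothetical, and the counts are not stated in the source: they are back-calculated from the percentages reported in the Results for groups of 56 students and should be read as assumptions.

```python
def accuracy_rate(n_accurate: int, n_total: int) -> float:
    """Percentage of students who constructed an accurate equation, to one decimal."""
    return round(100 * n_accurate / n_total, 1)


GROUP_SIZE = 56  # students per group (with MD, without MD)

# Assumed counts of accurate equations, inferred from the reported percentages.
accurate_counts = {
    "with MD": {"consistent Difference": 49, "consistent Change": 51,
                "inconsistent Difference": 24, "inconsistent Change": 22},
    "without MD": {"consistent Difference": 51, "consistent Change": 54,
                   "inconsistent Difference": 38, "inconsistent Change": 45},
}

rates = {
    group: {item: accuracy_rate(count, GROUP_SIZE) for item, count in items.items()}
    for group, items in accurate_counts.items()
}

for group, items in rates.items():
    for item, rate in items.items():
        print(f"{group:10s} {item:24s} {rate}%")
```

The same function applies to the four DLL-by-MD subgroups by replacing the group size with the subgroup size (for example, 37 DLLs with MD).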

Results

This analysis explored how the inclusion of the word more may have affected students’ construction of equations when solving four word problems. Table 2 displays the four items, the possible accurate equations for each item, and accuracy rates for students with and without MD. We begin with the two consistent problems. On the consistent Difference problem, 87.5% of students with MD and 91.1% of students without MD constructed an accurate equation by adding the numbers in the word problem. For the consistent Change problem, 91.1% of students with MD and 96.4% of students without MD constructed an accurate equation by adding the numbers in the word problem. Next, we describe the two inconsistent problems.

Table 2.

Percentage of Accurate Equations of Students With and Without Mathematics Difficulty (MD).

Category	Items	Accurate equations for each item	Percentage of accurate equations	
			With MD	Without MD
Consistent
Difference	The library has 23 books about dinosaurs. The library has 14 more books about space. How many books about space does the library have?	23 + 14 = ?	87.5%	91.1%
Change	Alfred drove 59 miles, and then he stopped for gas. Then, Alfred drove 34 more miles before stopping for lunch. How far did Alfred drive?	59 + 34 = ?	91.1%	96.4%
Inconsistent
Difference	Mr. Jones delivers packages. He delivered 26 packages on Thursday and 85 packages on Friday. How many more packages did he deliver on Friday?	85 – 26 = ? or ? + 26 = 85	42.9%	67.9%
Change	There were some students on the school bus. Then 19 more came on. There are now 34 students. How many were there to start?	34 – 19 = ? or ? + 19 = 34	39.3%	80.4%

For the inconsistent Difference problem, only 42.9% of students with MD constructed an accurate equation. All of these students constructed the equation 85 – 26 = ?. Conversely, all of the students with MD who constructed an inaccurate equation added the numbers in the word problem together. Of the students without MD, 67.9% constructed an accurate equation with the majority of these students constructing the subtraction equation. Only one of the students without MD constructed the equation 26 + ? = 85. Similarly, all of the students without MD who constructed an inaccurate equation added the two numbers together.

For the inconsistent Change problem, 39.3% of students with MD constructed an accurate equation. The majority of these students constructed the equation 34 – 19 = ?. One student constructed the equation ? + 19 = 34. Conversely, all of the students with MD who constructed an inaccurate equation added the numbers in the word problem together. Of the students without MD, 80.4% constructed an accurate equation. Of these students without MD, most constructed an accurate equation by subtracting. Seven students without MD (12.5%) constructed the equation ? + 19 = 34. Again, all of the students without MD who constructed an inaccurate equation added the two numbers together.

Overall, only 17.9% of students with MD constructed accurate equations for all four items, compared with 53.6% of students without MD. To explore our hypothesis that many students would add due to the inclusion of the word more, we calculated the percentage of students who added across all four items. Nearly 40% of the students with MD (39.3%) added on all four items. A much smaller percentage (10.7%) of students without MD added on all four items.

The percentages of accurate equations for DLLs and non-DLLs, with and without MD, are displayed in Table 3. On the consistent Difference problem, percentages were similar regardless of DLL and MD status, with a range of 84.2%–95.7%. On the consistent Change problem, 86.5% of DLLs with MD constructed an accurate equation, compared with 100% of non-DLLs with MD. Similarly, 93.9% of DLLs without MD constructed an accurate equation, compared with 100% of non-DLLs without MD.

Table 3.

Status Comparison of Dual Language Learners (DLLs) and Non-Dual Language Learners.

Item	Percentage of accurate equations
	With MD		Without MD
	DLLs (n = 37)	Non-DLLs (n = 19)	DLLs (n = 33)	Non-DLLs (n = 23)
Consistent
Difference	89.2%	84.2%	87.9%	95.7%
Change	86.5%	100.0%	93.9%	100.0%
Inconsistent
Difference	37.8%	52.6%	66.7%	69.7%
Change	35.1%	47.4%	81.8%	78.2%

Note. MD = Mathematics difficulty.

The range between scores increased on the two inconsistent problems. On the inconsistent Difference problem, only 37.8% of DLLs with MD constructed an accurate equation, compared with 52.6% of non-DLLs with MD. Non-MD students performed comparably regardless of DLL status, with 66.7% of DLLs without MD constructing an accurate equation compared with 69.7% of non-DLLs without MD. On the inconsistent Change problem, only 35.1% of DLLs with MD constructed an accurate equation, compared with 47.4% of non-DLLs with MD. Similar to the inconsistent Difference problem, students without MD performed comparably regardless of DLL status, with a range of 78.2%–81.8%.

Discussion

To investigate the potential impact of keywords instruction, we analyzed the constructed equations of 112 Grade 3 students on four word problems that included the word more. Two of the word problems were consistent (i.e., a keyword cue works to solve the problem correctly) and two were inconsistent (i.e., a keyword cue does not work to solve the problem correctly). On the two consistent word problems, most students, regardless of MD or dual-language status, constructed accurate equations. Accuracy rates for the two consistent problems ranged from 84.2% to 100% across subgroups. Conversely, on the two inconsistent problems, all subgroups demonstrated lower accuracy rates, with a range of 35.1%–81.8%. The results of this analysis align with prior studies that suggest that inconsistent word problems are particularly difficult for students to solve.

Furthermore, this analysis suggests that many of the participants may have been relying on the ineffective keywords strategy. On both inconsistent problems that included the word more, students who constructed inaccurate equations did so by adding the numbers in the word problems together. Because we excluded students who only used one operation across all of the screener items, we can assume that these participants do not represent those who add regardless of the text of the word problem. With accuracy rates on the two consistent problems being relatively high across all subgroups and declining on the two inconsistent problems, we can presume that students may have interpreted the word more as a cue to add.

Our analysis suggests that students with MD may be more likely to use the ineffective keywords strategy than students without MD. As evidence, 39.3% of the students with MD added across all four items, compared with only 10.7% of the students without MD. In fact, we identified several students with MD who explicitly underlined or circled keywords while solving word problems. Figure 2 shows the items of interest for one of these students, who circled the word more in three of the word problems.

Figure 2.

Circling of the Word “More” by a Student With Mathematics Disability.

Finally, further disaggregating accuracy rates by dual-language status revealed a notable trend. Generally, non-DLLs constructed accurate equations more frequently than DLLs. This trend was strongest among the students with MD, particularly on the inconsistent word problems. On the inconsistent Difference problem, the accuracy rate for DLLs with MD was 14.8 percentage points lower than the rate for non-DLLs with MD (37.8% vs. 52.6%). Similarly, on the inconsistent Change problem, the accuracy rate for DLLs with MD was 12.3 percentage points lower than the rate for non-DLLs with MD (35.1% vs. 47.4%). This indicates that among students with MD, DLLs may particularly struggle with inconsistent word problems. In fact, 16 DLLs with MD added across all four items, compared with only 6 non-DLLs with MD. This suggests that DLLs with MD may be particularly vulnerable to the ineffective keywords strategy.
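The gaps reported above are differences between two accuracy rates, that is, percentage points rather than relative percentages. The short sketch below recomputes both quantities from the rates reported in the Results. The helper name `gaps` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
# Accuracy rates (percent of students with MD who constructed an accurate
# equation), copied from the Results section.
rates = {
    "inconsistent Difference": {"DLL": 37.8, "non-DLL": 52.6},
    "inconsistent Change": {"DLL": 35.1, "non-DLL": 47.4},
}


def gaps(dll: float, non_dll: float) -> tuple[float, float]:
    """Return (percentage-point gap, gap as a percent of the non-DLL rate)."""
    point_gap = non_dll - dll
    relative_gap = 100 * point_gap / non_dll
    return round(point_gap, 1), round(relative_gap, 1)


summary = {problem: gaps(r["DLL"], r["non-DLL"]) for problem, r in rates.items()}
for problem, (point_gap, relative_gap) in summary.items():
    print(f"{problem}: {point_gap} percentage points, {relative_gap}% relative")
```

Under these inputs, the gaps are 14.8 and 12.3 percentage points, which correspond to relative differences of roughly 28% and 26% of the non-DLL rates.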

In summary, the results of this analysis suggest that, due to a possible reliance on the ineffective keywords strategy, students struggle with inconsistent word problems. Students with MD, particularly those who are DLLs, may especially need support in solving inconsistent problems. Next, we describe implications for assessment, implications for instruction, and future directions for research.

Implications for Assessment

Features that influence the difficulty of a word problem include schema, position of the unknown, inclusion of irrelevant information, and relevant information presented in charts and graphs (Arsenault & Powell, 2022). This analysis demonstrated that the inclusion of a keyword (e.g., more) is also an important factor. Thus, the ratio of consistent to inconsistent word problems on a measure of word-problem solving may influence the results. A measure that includes a larger proportion of inconsistent word problems could over-identify students as having MD, particularly among DLLs.

Implications for Instruction

Crucially, these findings support prior calls for educators to avoid teaching students to associate isolated words (e.g., more, bought, share) with operations (e.g., Karp et al., 2019; Powell, Namkung, & Lin, 2022). In lieu of this ineffective strategy, several research teams have investigated how to support students in solving inconsistent word problems, and several promising strategies have emerged. The first is schema instruction, which involves teaching students to identify word problems by their schema or problem type (e.g., Total, Difference, Change) and to solve them using associated strategies. Polo-Blanco et al. (2024) and Goñi-Cervera and Jacinto (2024) implemented modified schema-based instruction (Spooner et al., 2017), which places heavy emphasis on the use of schematic diagrams. In both studies, students demonstrated significant improvement in solving inconsistent word problems.

Similarly, De Koning et al. (2022) investigated students’ use of schematic diagrams when solving inconsistent word problems. In this study, students drew and labeled bar diagrams as part of their problem-solving process. The analysis showed that students who drew accurate bar diagrams were more likely to solve inconsistent word problems successfully. In a different study, De Koning et al. (2017) implemented verbal instruction, which involved making students aware of inconsistent word problems. Students were taught to pay close attention to word problems in their entirety. They were explicitly told that, at times, word problems will include words that may imply the need to add when the required operation is subtraction (or vice versa). Students who received the verbal instruction improved in solving inconsistent word problems.

In summary, educators should not directly tie keywords to operations. It is important to expose students to both consistent and inconsistent word problems and explicitly teach them to be aware of the features of inconsistent word problems. To support students in solving both consistent and inconsistent word problems, educators may implement schema instruction and the use of bar diagrams. Although all students would likely benefit from these instructional strategies, this analysis suggests that they may be particularly important for students with MD, and especially those who are DLLs.

Future Directions for Research

This analysis demonstrated that students both with and without MD have more difficulty constructing accurate equations for inconsistent word problems than for consistent word problems. A much greater percentage of students with MD constructed inaccurate addition equations while solving word problems that included the word more. However, this was a secondary analysis, and we are unable to determine with certainty whether students chose to add because of the inclusion of the word more. Future researchers might consider embedding student interviews to gain insight into students’ thought processes.

Moreover, researchers should continue to examine the impact of schema instruction, bar models, and verbal instruction on students’ proficiency with inconsistent word problems. This research should be expanded to include DLLs, participants across a wider range of grade levels, and participants with a variety of disability statuses. Additionally, future research should involve multiple schemas and a variety of keywords.

Limitations

Although this study supports findings from previous research on inconsistent word problems (e.g., Pape, 2003; Passolunghi et al., 2022), there are some notable limitations. First, we only analyzed four word problems that included the word more. Broadening the analysis by including additional schemas and various keywords would increase the generalizability of the results. Second, we had a relatively small sample size of 112 students; a larger sample size could strengthen the validity of the results and provide for a more robust analysis. Additionally, our inclusion criteria excluded many students with MD because they either did not write an equation for all four items of interest or only used one operation (e.g., addition) across all screener items. As such, our MD sample may reflect a higher-performing sample of students with MD. Third, the problems used in the analysis were a small subset of the Texas Word Problems Brief and were not written for the purpose of this analysis. Thus, the language demands (e.g., word count, word-level decoding, vocabulary) varied among the four word problems, which may have influenced the results. Designing word problems specifically for this kind of analysis, with matched linguistic and computational complexity, would have strengthened the results.

Conclusion

Across the four items on the Texas Word Problems Brief that contained the word more, students with and without MD demonstrated difficulty constructing accurate equations for inconsistent word problems. To a greater extent, students with MD, and particularly those who were DLLs, constructed inaccurate addition equations on these problems. These results further support the notion that keywords should not be tied directly to operations (e.g., more does not always mean to add). All students, but particularly those with MD who are DLLs, may benefit from explicit instruction on solving inconsistent word problems.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

This research was supported in part by a grant from the Institute of Education Sciences in the U.S. Department of Education to the University of Texas at Austin (grant no. R324A150078). The content is solely the responsibility of the authors and does not necessarily represent the official views of the U.S. Department of Education.

ORCID iDs

Alison M. Hardy

Grace P. Douglas

Katie B. MacLean

Kathleen K. Mason

Sarah R. Powell

References

Abedi, & Lord (2001). The language factor in mathematics tests. Applied Measurement in Education, 14(3), 219–234. https://doi.org/10.1207/s15324818ame1403_2

Andersson (2008). Mathematical competencies in children with different types of learning difficulties. Journal of Educational Psychology, 100(1), 48–66. https://doi.org/10.1037/0022-0663.100.1.48

Arsenault, T. L., & Powell, S. R. (2022). Word–problem performance differences by schema: A comparison of students with and without mathematics difficulty. Learning Disabilities Research & Practice, 37(1), 37–50. https://doi.org/10.1111/ldrp.12273

Bartalis, Á., Péntek, & Zsoldos-Marchiș (2023). A pilot study on investigating primary school students’ eye movements while solving compare word problems. Open Education Studies, 5, Article 20220207. https://doi.org/10.1515/edu-2022-0207

Boonen, A. J., De Koning, B. B., Jolles, & Van der Schoot (2016). Word problem solving in contemporary math education: A plea for reading comprehension skills training. Frontiers in Psychology, 7, Article 191. https://doi.org/10.3389/fpsyg.2016.00191

Cooper, & Sweller (1987). Effects of schema acquisition and rule automation on mathematical problem-solving transfer. Journal of Educational Psychology, 79(4), 347–362. https://doi.org/10.1037/0022-0663.79.4.347

De Koning, B. B., Boonen, A. J., Jongerling, Van Wesel, & Van der Schoot (2022). Model method drawing acts as a double-edged sword for solving inconsistent word problems. Educational Studies in Mathematics, 111, 29–45. https://doi.org/10.1007/s10649-022-10150-8

De Koning, B. B., Boonen, A. J., & Van der Schoot (2017). The consistency effect in word problem solving is effectively reduced through verbal instruction. Contemporary Educational Psychology, 49, 121–129. https://doi.org/10.1016/j.cedpsych.2017.01.006

García, A. I., Jiménez, J. E., & Hess (2006). Solving arithmetic word problems: An analysis of classification as a function of difficulty in children with and without arithmetic LD. Journal of Learning Disabilities, 39(3), 270–281. https://doi.org/10.1177/00222194060390030601

Goñi-Cervera, & Jacinto (2024). Enhancing inconsistent language problem-solving in an autistic student through a modified schema-based instruction. Education 3-13: International Journal of Primary, Elementary and Early Years Education. Advance online publication. https://doi.org/10.1080/03004279.2024.2319841

Hegarty, Mayer, R. E., & Green, C. E. (1992). Comprehension of arithmetic word problems: Evidence from students’ eye fixations. Journal of Educational Psychology, 84(1), 76–84. https://doi.org/10.1037//0022-0663.84.1.76

Hegarty, Mayer, R. E., & Monk, C. A. (1995). Comprehension of arithmetic word problems: A comparison of successful and unsuccessful problem solvers. Journal of Educational Psychology, 87(1), 18–32. https://doi.org/10.1037//0022-0663.87.1.18

Jaffe, J. B., & Bolger, D. J. (2023). Cognitive processes, linguistic factors, and arithmetic word problem success: A review of behavioral studies. Educational Psychology Review, 35, Article 105. https://doi.org/10.1007/s10648-023-09821-6

Jordan, N. C., & Hanich, L. B. (2000). Mathematical thinking in second-grade children with different forms of LD. Journal of Learning Disabilities, 33(6), 567–578. https://doi.org/10.1177/002221940003300605

Jordan, N. C., Hanich, L. B., & Kaplan (2003). A longitudinal study of mathematical competencies in children with specific mathematics difficulties versus children with comorbid mathematics and reading difficulties. Child Development, 74(3), 834–850. https://doi.org/10.1111/1467-8624.00571

Karp, K. S., Bush, S. B., & Dougherty, B. J. (2019). Avoiding the ineffective keyword strategy. Teaching Children Mathematics, 25(7), 428–435. https://doi.org/10.5951/teacchilmath.25.7.0428

King, S. G., & Powell, S. R. (2023). Language proficiency and the relation to word–problem performance in emergent bilingual students with mathematics difficulties. Learning Disabilities Research & Practice, 38(4), 263–273. https://doi.org/10.1111/ldrp.12325

Lewis, A. B., & Mayer, R. E. (1987). Students’ miscomprehension of relational statements in arithmetic word problems. Journal of Educational Psychology, 79(4), 363–371. https://doi.org/10.1037//0022-0663.79.4.363

Lubin, Rossi, Lanoë, Vidal, Houdé, & Borst (2016). Expertise, inhibitory control and arithmetic word problems: A negative priming study in mathematics experts. Learning and Instruction, 45, 40–48. https://doi.org/10.1016/j.learninstruc.2016.06.004

Mabbott, D. J., & Bisanz (2008). Computational skills, working memory, and conceptual knowledge in older children with mathematics learning disabilities. Journal of Learning Disabilities, 41(1), 15–28. https://doi.org/10.1177/0022219407311003

Martin, B. N., & Fuchs, L. S. (2019). The mathematical performance of at-risk first graders as a function of limited English proficiency status. Learning Disability Quarterly, 42(4), 244–251. https://doi.org/10.1177/0731948719827489

National Center for Education Statistics (NCES). (2024a). Condition of education: English learners in public schools. U.S. Department of Education, Institute of Education Sciences. https://nces.ed.gov/programs/coe/indicator/cgf

National Center for Education Statistics (NCES). (2024b). Condition of education: Students with disabilities. U.S. Department of Education, Institute of Education Sciences. https://nces.ed.gov/programs/coe/indicator/cgg

Nelson, & Powell, S. R. (2018). A systematic review of longitudinal studies of mathematics difficulty. Journal of Learning Disabilities, 51(6), 523–539. https://doi.org/10.1177/0022219417714773

Pape, S. J. (2003). Compare word problems: Consistency hypothesis revisited. Contemporary Educational Psychology, 28(3), 396–421. https://doi.org/10.1016/s0361-476x(02)00046-2

Passolunghi, M. C., De Blas, G. D., Carretti, Gomez-Veiga, Doz, & Garcia-Madruga, J. A. (2022). The role of working memory updating, inhibition, fluid intelligence, and reading comprehension in explaining differences between consistent and inconsistent arithmetic word-problem-solving performance. Journal of Experimental Child Psychology, 224, Article 105512. https://doi.org/10.1016/j.jecp.2022.105512

Pearce, D. L., Bruun, Skinner, & Lopez-Mohler (2013). What teachers say about student difficulties solving mathematical word problems in grades 2-5. International Electronic Journal of Mathematics Education, 8(1), 3–19. https://doi.org/10.29333/iejme/271

Polo-Blanco, González López, M. J., Bruno, & González-Sánchez (2024). Teaching students with mild intellectual disability to solve word problems using schema-based instruction. Learning Disability Quarterly, 47(1), 3–15. https://doi.org/10.1177/07319487211061421

Powell, S. R., & Berry, K. A. (2015). Texas word problems. U.S. Department of Education.

Powell, S. R., Berry, K. A., & Benz, S. A. (2020). Analyzing the word-problem performance and strategies of students experiencing mathematics difficulty. The Journal of Mathematical Behavior, 58, Article 100759. https://doi.org/10.1016/j.jmathb.2020.100759

Powell, S. R., Berry, K. A., Fall, A.-M., Roberts, Fuchs, L. S., & Barnes, M. A. (2021). Alternative paths to improved word-problem performance: An advantage for embedding pre-algebraic reasoning instruction within word-problem intervention. Journal of Educational Psychology, 113(5), 898–910. https://doi.org/10.1037/edu0000513

Powell, S. R., Doabler, C. T., Akinola, O. A., Therrien, W. J., Maddox, S. A., & Hess, K. E. (2020). A synthesis of elementary mathematics interventions: Comparisons of students with mathematics difficulty with and without comorbid reading difficulty. Journal of Learning Disabilities, 53(4), 244–276. https://doi.org/10.1177/0022219419881646

Powell, S. R., Fuchs, L. S., Fuchs, Cirino, P. T., & Fletcher, J. M. (2009). Do word-problem features differentially affect problem difficulty as a function of students’ mathematics difficulty with and without reading difficulty? Journal of Learning Disabilities, 42(2), 99–110. https://doi.org/10.1177/0022219408326211

Powell, S. R., Namkung, J. M., & Lin (2022). An investigation of using keywords to solve word problems. The Elementary School Journal, 122(3), 452–473. https://doi.org/10.1086/717888

Powell, S. R., Urrutia, V. Y., Berry, K. A., & Barnes, M. A. (2022). The word-problem solving and explanations of students experiencing mathematics difficulty: A comparison based on dual-language status. Learning Disability Quarterly, 45(1), 6–18. https://doi.org/10.1177/0731948720922198

Shum, H. Y., & Chan, W. W. (2020). Young children’s inhibition of keyword heuristic in solving arithmetic word problems. Human Behaviour and Brain, 1(2), 43–48. https://doi.org/10.37716/hbab.2020010202

Spooner, Saunders, Root, & Brosh (2017). Promoting access to common core mathematics for students with severe disabilities through mathematical problem solving. Research and Practice for Persons with Severe Disabilities, 42(3), 171–186. https://doi.org/10.1177/1540796917697119

Van Dooren, De Bock, Hessels, Janssens, & Verschaffel (2005). Not everything is proportional: Effects of age and problem type on propensities for overgeneralization. Cognition and Instruction, 23(1), 57–86. https://doi.org/10.1207/s1532690xci2301_3

Lafay, Douglas, Di Lonardo Burr, LeFevre, J. A., Osana, H. P., Skwarchuk, S. L., Wylie, Simms, & Maloney, E. A. (2022). The role of mathematical language skills in arithmetic fluency and word-problem solving for first- and second-language learners. Journal of Educational Psychology, 114(3), 513–539. https://doi.org/10.1037/edu0000673

How Keywords Impact Word-Problem Performance: When “More” Does Not Mean to Add
