
Abstract

Frequently, problems can be solved in more than one way. In modern computerised environments, more ways than ever exist. Naturally, human problem solvers do not always choose the best-performing strategy available. One underlying reason might be the inability to continuously and correctly monitor each strategy’s performance. Here, we supported some of our participants’ monitoring ability by providing written feedback regarding their speed and accuracy. Specifically, participants engaged in an object comparison task, which they were asked to solve with one of two strategies: an internal strategy (mental rotation) or an extended strategy (manual rotation). After receiving no feedback (30 participants), trialwise feedback (30 participants), or blockwise feedback (30 participants) in these no-choice trials, all participants were asked to estimate their performance with both strategies and were then allowed to freely choose between strategies in choice trials. Results indicated that written feedback improves explicit performance estimates. However, results also indicated that such increased awareness does not guarantee improved strategy choice and that attending to written feedback might tamper with more adaptive ways to inform the choice. Thus, we advise against prematurely implementing written feedback. While it might support adaptive strategy choice in certain environments, it did not in the present setup. We encourage further research that improves the understanding of how we monitor the performance of different cognitive strategies. Such understanding will help create interventions that support human problem solvers in making better choices in the future.

Keywords

Performance monitoring, feedback, cognitive strategy choice, distributed cognition, cognitive offloading

Significance statement

In an increasingly technologised world, humans can support or replace internal thought with external tools (e.g., writing a shopping list to support memory and using the search function to support visual search) more than ever. Consequently, it is more important than ever to help humans find cognitive strategies that work well for them, with or without a tool. Here, we provided some of our participants with written performance feedback regarding speed and accuracy of two different strategies. Interestingly, written feedback did not help participants mix strategies in a way that improved performance. On the contrary, we found evidence that written feedback might have even harmed performance. We conclude that written performance feedback can be more harmful than beneficial and should only be employed when warranted.

Internal and extended strategies

In modern tech-infused environments, human performers can frequently decide whether to (1) exclusively rely on their mental capacities without using any external aids or to (2) reach out into the environment to solve a problem at hand. Here, we will refer to the former as relying on internal strategies (iS) and to the latter as relying on extended strategies (eS; see Clark & Chalmers, 1998). Mental arithmetic is an example of the former, and using a calculator instead is an example of the latter.

Note that, in terms of theoretical framing, we here prefer the term “using extended strategies” over what others, as well as ourselves, previously referred to as “cognitive offloading.” We acknowledge that both terms can be used interchangeably for the most part but decided on the former terminology because cognitive offloading has originally been defined as a behaviour purposefully exhibited to decrease internal demand (Risko & Gilbert, 2016), whereas “using extended strategies” is more liberal regarding the underlying reasons. Furthermore, we think that the choice between two internal strategies has lots in common with the choice between an internal and extended strategy. Such similarities might be more difficult to discuss if we call the former choice “strategy choice” but the latter “cognitive offloading.”

Monitoring the performance of internal and extended strategies

Human performers have the means to estimate their performance of an internal strategy (for a review, see Ullsperger et al., 2014). But monitoring can similarly apply to external aids like computers or fellow humans (e.g., Pavone et al., 2016; Pfister et al., 2020; van Schie et al., 2004; Weis & Wiese, 2019c). For example, when navigating with a map or smartphone, or when retrieving information from the internet, human performers gauge whether the navigation or information search was quick and yielded appropriate results. Monitoring the performances of both internal and extended strategies is essential when it comes to deciding which strategy to use for a given problem.

That humans are able to monitor the performance of extended strategies is supported by studies in which the performance or reliability of an eS was altered. Results indicate that participants adaptively adjust how frequently they use an eS based on such manipulations (e.g., Gray et al., 2006; Grinschgl et al., 2020; Storm et al., 2017; Walsh & Anderson, 2009; Weis & Wiese, 2019c). For example, in one study, participants needed to navigate a mouse cursor over a button to perform an eS. The size of the button was manipulated between experiments. Crucially, participants adaptively decreased their eS use frequency with decreasing button size, which reflects the increased time costs of executing more precise movements (Gray et al., 2006).

While such results suggest that people are usually able to monitor cognitive strategies, other results point towards faulty monitoring (e.g., Dunn et al., 2018; Gilbert et al., 2020; Risko & Dunn, 2015; Touron, 2015). For example, in a previous study, participants were asked to remember sets of two to ten letters for later report via storing them in memory or by “storing” them on a piece of paper with a pen. Participants’ performance was abysmal when trying to remember 10 letters by memory. Nevertheless, participants were shown to use memory instead of pen and paper for the sets of 10 letters more than 10% of the time, which nearly always leads to wrong answers (Risko & Dunn, 2015).

What is the origin of such a maladaptive choice? Inaccurate beliefs about a strategy’s performance are a likely contributor. In the same study, an independent sample of participants largely overestimated their performance when asked how well they thought they could remember 10 letters by memory, which could explain the maladaptive choice. Similar influences of performance-related beliefs on strategy choice, which can be independent of actual behaviour, have been reported frequently (e.g., Boldt & Gilbert, 2019; Gilbert, 2015; Touron, 2015; Weis & Wiese, 2019c, 2020). But note that maladaptive choice can exist in some individuals even with well-calibrated performance beliefs (Gilbert et al., 2020), for example, due to some sort of technology aversion or mental challenge seeking (e.g., Weis & Kunde, 2024).

Written feedback (knowledge of result) and cognitive strategy choice

How could such a maladaptive choice be prevented? We conjecture that the availability and granularity of written performance feedback (often denoted knowledge of result) for relevant strategies, such as the objective duration and accuracy of completing a task, could recalibrate inaccurate beliefs. Thus, when written feedback is available, future strategy choices could rely on the information provided in the written feedback. When no feedback is available, the choice would need to be based on other sources, for example, based on the results of resource-demanding (Corallo et al., 2008) internal time monitoring (reviewed in Allan, 1979; Mauk & Buonomano, 2004).

Conceptually, we endorse the notion that written feedback and the results of unconscious internal monitoring belong to distinct yet interconnected processes: explicit and implicit metacognition (Frith & Frith, 2022). The beauty of explicit metacognition is that it allows us to harness the knowledge that other humans—or machines—have acquired, which typically is realised via language. In the case of written feedback in the present study, the idea is that knowledge has been acquired and is then communicated by a machine. Similarly, an experimenter could also inform the participant that a specific strategy is unreliable (Weis & Wiese, 2019c) or especially well-suited for a given task (Weis & Wiese, 2022).

Implicit metacognition, on the other hand, does not rely on conscious processing in the first place, although the results of unconscious monitoring can eventually penetrate consciousness (compare Figure 1 in Frith & Frith, 2022). For example, humans can monitor their own (e.g., Ullsperger et al., 2014) or other agents’ (e.g., Pavone et al., 2016; van Schie et al., 2004) performance and the results can then be used for further processing. The point here is not that the results of implicit monitoring need to stay unconscious but that the information sampling is unconscious, which is different from digesting language-based information.

Figure 1. Extended rotation paradigm.

Skill acquisition research has shown that written feedback is frequently beneficial (cf. Salmoni et al., 1984). For example, written feedback can augment performance differences that might otherwise go unnoticed when relying on implicit metacognition alone. Thus, compared with unsupported self-observation, written feedback is thought to allow performers to select more accurately those instances of executing a skill that come with superior performance. However, written feedback’s benefits need to be compared with its downsides. Since written feedback has to be processed itself and mental processing resources are limited, conflicts with concurrent cognitive processing are likely to emerge. Written feedback may also shift motivation to perform a task from intrinsic to extrinsic (Deci et al., 1999). Moreover, providing written feedback after a response can deteriorate implicit metacognition, possibly because processing the feedback interferes with the processes that inform or constitute implicit monitoring (Swinnen et al., 1990). We conclude that written feedback has the potential to improve choice but should be implemented with care.

In that vein, Touron and Hertzog (2014) suggested that the presentation of accuracy feedback in a memory task can be used to de-bias older performers’ choices between internal and extended strategies because older adults are susceptible to eS overuse due to overly pessimistic beliefs about their memory performance. And indeed, older performers downregulated eS use when receiving trialwise accuracy feedback. Surprisingly, however, trialwise speed feedback in the same study did not downregulate eS use. This is surprising because older performers are known to have difficulties in monitoring response times (Craik & Hay, 1999) and the feedback should have revealed that eS use is substantially slower than iS use. In a related study, speed feedback was additionally provided blockwise (Hertzog et al., 2007). In that study, participants adaptively downregulated eS use in the feedback vs. no-feedback condition, which suggests that blockwise speed feedback is more digestible than its trialwise equivalent. Along these lines of research, the present study was designed as a starting point to further investigate the role of written feedback when deciding between an internal and an extended strategy to solve a problem.

Current study

To test the ramifications of written feedback, we engaged participants in an object-matching task that could be solved by relying on internal means (mental object rotation; iS) or extended means (manual object rotation; eS). The key manipulation concerned the availability of written performance feedback regarding both of these means, which was either not provided, provided trialwise (i.e., presented after each trial) or provided blockwise (i.e., averaged across various trials). The following hypotheses were evaluated:

H1: Trialwise and blockwise feedback groups estimate their performances with both iS and eS more accurately than a no feedback group.

H2: Trialwise and blockwise feedback groups outperform a no-feedback group when allowed to freely choose between iS and eS.

H3: eS use frequency in the choice block is adaptively impacted by feedback. In particular, trialwise and blockwise feedback groups rely more on whatever strategy outperformed the other strategy during forced-choice trials than a no-feedback group.

Several studies have suggested an overuse of extended in comparison to internal strategies (e.g., Gilbert et al., 2020; Virgo et al., 2017; Touron, 2015). Such bias could be partly explained by a monitoring system that is calibrated for monitoring internal rather than extended strategies. If monitoring was exclusively based on the outcome, e.g., speed and accuracy as witnessed by the eye, performance estimates should be equally accurate for both types of strategy. In contrast, if the processing of an iS itself provided cues for the monitoring system that an outsourced process of an eS cannot provide, differences should emerge. Such internal cues have been suggested as a possible reason for inaccurate monitoring of an extended memory strategy (Dunn et al., 2018). Accordingly, we hypothesise that:

H4: eS performance is estimated less accurately than iS performance.

Method

Participants

A total of 90 participants (mean age 23.9 years; age range 18–40; 1 diverse, 67 female, 22 male), that is, 30 per group after exclusions, were analysed for this manuscript. Participants were recruited from the participant pool of the university and were reimbursed at an hourly rate of €10. We stopped data collection before reaching the preregistered sample size of 70 participants per group, which had been based on an a priori power estimation conducted in G*Power (version 3.1.9.2, Faul et al., 2007). We decided to stop because, in an interim and not preregistered analysis conducted to ensure adequate data quality without technical errors, we observed that our results substantially deviated from our hypotheses. However, data quality seemed fine and, after the additional analyses reported below, we found that the restricted sample of 90 participants was sufficient to conclusively reject our main hypothesis.

The original power estimation was based on the least powerful test, i.e., the one-tailed independent t-test needed to follow up on the ANOVAs (α = .05, 1 − β = .9, d = .5). We set d to .5 in the power calculations since the effect of feedback type on strategy use in an earlier study was of similar magnitude (d = .5 in Touron & Hertzog, 2014). However, with the current sample, participants chose the more accurate strategy less frequently in both trial and block feedback conditions in comparison to the no feedback condition, M_{delta(no, trial)} = −22.0%, M_{delta(no, block)} = −12.5%. Effect sizes were negative, d_{(no, trial)} = −.67, d_{(no, block)} = −.39, and the associated bootstrapped 95% confidence intervals¹ excluded the effect size of .5 that was used for power calculations, CI_d_{(no, trial)} = [−1.3, −.1], CI_d_{(no, block)} = [−.9, .1]. Since the present sample already provided evidence that feedback was not associated with better strategy choice regarding accuracy, we perceived no benefit in running the whole sample.²
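As a rough cross-check of the planned sample size, the parameters above can be plugged into the normal approximation for a one-tailed two-sample t-test. This is only a sketch using the Python standard library, not the G*Power computation itself: G*Power works with the noncentral t distribution, so the approximation is expected to land slightly below the 70 participants per group reported above. The function name is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group_normal_approx(d: float, alpha: float, power: float) -> int:
    """Approximate per-group n for a one-tailed independent-samples t-test.

    Uses n = 2 * (z_{1-alpha} + z_{power})^2 / d^2, i.e. a normal
    approximation rather than the exact noncentral t computation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Parameters taken from the text: alpha = .05, 1 - beta = .9, d = .5
n = n_per_group_normal_approx(d=0.5, alpha=0.05, power=0.9)
print(n)
```

Because the approximation ignores the extra uncertainty from estimating the variance, it should be read as a lower bound on the exact result.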

From the initial sample of 110 participants, participants were excluded if they were (a) outside of 2.5 standard deviations around the group reaction time mean (0 participants), (b) below 75% accuracy for the internal blocks (i.e., Blocks 1 and 3 or 2 and 4, depending on the counterbalancing condition; 15 participants), or (c) did not rotate in at least 90% of the trials during the extended blocks (i.e., Blocks 1 and 3 or 2 and 4, depending on the counterbalancing condition; 5 additional participants). Thus, a total of 20 participants were excluded based on these preregistered criteria. Criteria similar to (a) and (b) have been used in an earlier study (Weis & Wiese, 2019c). Criterion (c) was necessary to ensure that participants followed instructions and, consequently, that the performance estimates for the eS were valid.
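The three exclusion criteria can be sketched as a simple filter. The record layout and field names below are hypothetical, and the sketch assumes that the "group" in criterion (a) is the feedback group and that its mean and standard deviation are computed on the full sample before any exclusion; the source does not spell out either detail.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Participant:
    pid: str
    group: str                # "none", "trialwise", or "blockwise" (assumed labels)
    mean_rt_ms: float         # mean RT across no-choice trials
    internal_accuracy: float  # proportion correct in the internal blocks
    rotation_rate: float      # proportion of extended-block trials with a rotation


def apply_exclusions(sample, sd_cutoff=2.5, min_accuracy=0.75, min_rotation=0.90):
    """Return the participants that pass criteria (a), (b), and (c)."""
    kept = []
    for p in sample:
        group_rts = [q.mean_rt_ms for q in sample if q.group == p.group]
        centre, spread = mean(group_rts), stdev(group_rts)
        rt_ok = abs(p.mean_rt_ms - centre) <= sd_cutoff * spread  # criterion (a)
        accuracy_ok = p.internal_accuracy >= min_accuracy         # criterion (b)
        rotation_ok = p.rotation_rate >= min_rotation             # criterion (c)
        if rt_ok and accuracy_ok and rotation_ok:
            kept.append(p)
    return kept


# Made-up records for illustration only
sample = [
    Participant("p1", "none", 2500.0, 0.90, 0.98),
    Participant("p2", "none", 2700.0, 0.70, 0.99),  # fails (b)
    Participant("p3", "none", 2600.0, 0.88, 0.50),  # fails (c)
    Participant("p4", "none", 2400.0, 0.95, 1.00),
]
kept_ids = [p.pid for p in apply_exclusions(sample)]
print(kept_ids)
```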

Apparatus

The experiment was presented on a computer screen (BENQ XL2411P 24-inch monitor set to a resolution of 1920 × 1080 pixels with a refresh rate of 100 Hz) positioned about 75 cm in front of participants. The experiment was programmed in and run with MATLAB version R2016a (The Mathworks, Inc., Natick, MA, United States) and the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997). Participants responded using a USB-connected standard keyboard and mouse. The stimulus rotation (see section Methods: Procedure and Task) was updated at a frequency of about 35 Hz.

Procedure and task

After being welcomed and providing informed consent, participants were asked to engage in the Extended Rotation Paradigm (Figure 1). Participants were asked to answer as quickly and accurately as possible. Specifically, participants first engaged in a practice block, followed by a no-choice block, a performance estimation block, and a choice block (Figure 2). Except for the performance estimation block, all blocks consisted of trials of the Extended Rotation Paradigm.

Figure 2. Procedure.

The practice block ensured that participants fully grasped the task at hand. Participants did not receive performance feedback during the practice block. However, participants were allowed to advance to the no-choice block only if they made at most one error throughout the last 16 trials of the practice block and hence received feedback on whether these last 16 trials (8 extended and 8 internal) as a whole were answered sufficiently correctly according to this condition. If there were at least two errors, participants were kindly requested to repeat the practice block.

Subsequently, the no-choice block allowed us to measure objective performance profiles of both internal and extended strategies. To that end, the no-choice block was subdivided into internal and extended sub-blocks. In these sub-blocks, participants were asked to rely exclusively on either their mental abilities (internal sub-block) or the keyboard (extended sub-block) to rotate the working stimulus. During the internal sub-block, pressing a rotation key (compare Figure 1) did not rotate the working stimulus. Internal and extended sub-blocks alternated, and which sub-block (i.e., internal or extended) occurred first was counterbalanced. To lessen sequential effects, four blocks of 48 trials were used instead of two blocks of 96 trials.
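The alternating, counterbalanced sub-block order described above can be sketched as follows. The function and label names are illustrative; only the block count (four) and block length (48 trials) come from the text.

```python
def no_choice_schedule(first: str, n_blocks: int = 4, trials_per_block: int = 48):
    """Alternate internal and extended sub-blocks, starting with `first`."""
    if first not in ("internal", "extended"):
        raise ValueError("first must be 'internal' or 'extended'")
    other = "extended" if first == "internal" else "internal"
    order = [first if i % 2 == 0 else other for i in range(n_blocks)]
    return [(block, trials_per_block) for block in order]


# One of the two counterbalancing conditions: internal sub-block first
schedule = no_choice_schedule("internal")
print(schedule)
```

With internal first, the internal sub-blocks are Blocks 1 and 3 and the extended sub-blocks are Blocks 2 and 4; starting with extended swaps the two.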

This experiment’s main manipulation, the feedback manipulation, was entirely realised during the no-choice block. One group of participants received no feedback at all throughout the no-choice block (no feedback), one group received feedback after each trial (trialwise feedback), and one group received summarised feedback about the performance of the last 48 trials at the end of each block (blockwise feedback: “Correct Answers: XX % [new line] Mean Response Time: X.X seconds”). Note that no feedback was given throughout practice and choice blocks.
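A minimal sketch of how the blockwise feedback message could be assembled from one block of trials is given below. The message template follows the wording quoted above; the rounding (whole percent, one decimal for seconds) is inferred from the "XX" and "X.X" placeholders and is therefore an assumption, as are the function and argument names.

```python
from typing import Sequence


def blockwise_feedback(correct: Sequence[bool], rts_s: Sequence[float]) -> str:
    """Summarise one block as accuracy in percent and mean response time in seconds."""
    accuracy_pct = 100 * sum(correct) / len(correct)
    mean_rt = sum(rts_s) / len(rts_s)
    return (
        f"Correct Answers: {accuracy_pct:.0f} %\n"
        f"Mean Response Time: {mean_rt:.1f} seconds"
    )


# Made-up block of 48 trials: 36 correct, every response taking 2.5 s
message = blockwise_feedback([True] * 36 + [False] * 12, [2.5] * 48)
print(message)
```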

Subsequently, the performance estimation block allowed us to measure the subjective performance of both internal and extended strategies. Specifically, participants were asked to provide an accuracy and speed estimate for both internal and extended no-choice trials (e.g., “Please estimate how frequently you answered correctly when exclusively rotating with your “inner eye” and without the keyboard.”; answers ranged from 60% to 100% on a visual analogue scale). Note that block feedback could potentially be rehearsed and then used for the explicit performance estimates. Also, the given performance estimates themselves could be rehearsed and used in the subsequent choice block. To avoid such rehearsals, participants had to correctly re-type an alphanumerical string consisting of 12 elements right before and right after the performance estimation block.

Subsequently, the choice block allowed us to measure how participants choose between internal and extended strategies. To this end, participants were instructed to freely choose between mental and manual keyboard-based rotation (i.e., between iS and eS). Note that in the present study, feedback is only provided during no-choice blocks because a former study suggested that, when no-choice blocks exist, monitoring during choice trials might not be the prime cause for adaptive strategy selection (Weis & Wiese, 2019a).

The experimental session concluded with questionnaires about demographic data and consciously accessible considerations contributing to voluntary choice beyond individual performance differences (e.g., “With which strategy do you think your answers were more accurate?”), and the German Need for Cognition Scale (NCS; Bless et al., 1994; Cacioppo & Petty, 1982). These measures were collected to support exploratory analyses if necessary.

Stimuli

For the Extended Rotation Paradigm (see section Methods: Procedure and Task), 48 base stimuli with 16 edges were created based on a procedure described by Attneave and Arnoult (1956) and realised using a Matlab-based script provided by Collin and McMullen (2002). We decided to use rather complex stimuli with many (i.e., 16) edges and not to alter the number of edges between stimuli because a high similarity between complex stimuli was associated with a high Angle-RT slope (Folk & Luce, 1987). Given that this slope is seen as an indicator of the thought process needed for rotation, complex and similar stimuli likely increase the signal-to-noise ratio in the present study. An example base stimulus as well as all the possible manipulations of the associated working stimulus as shown at the beginning of the stimulus interval is presented in Figure 1b.

Analyses

Data cleaning

All trials were used for analysis. We purposefully did not filter out any trials based on RT since even large RTs had been included in the trialwise as well as—more importantly—the blockwise feedback that participants received throughout the study. However, we also ran exploratory analyses without trials in which RTs deviated more than 2 SDs from the individual mean and found no substantial deviations from the results reported here.

Hypotheses testing

H1 and H4 were investigated using two mixed ANOVAs with the three-level between-participants factor feedback (no, trialwise, blockwise) and the two-level within-participants factor strategy (internal, extended) as independent variables (IVs). One ANOVA was carried out with the absolute accuracy estimation error (err_acc = |accuracy_estimate − accuracy_actual|) and one with the absolute RT estimation error (err_RT = |RT_estimate − RT_actual|) as dependent variables (DVs). Estimates were taken from the performance estimation block, and actual performances from the no-choice block (cf. Figure 2). The mixed ANOVAs were followed up by t-tests where indicated by the hypotheses.
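The absolute estimation error used as DV is simply the unsigned distance between a participant's estimate and their actual no-choice performance. A minimal sketch with made-up values (the function name and the numbers are illustrative):

```python
def absolute_estimation_error(estimate: float, actual: float) -> float:
    """Unsigned difference between estimated and actual performance."""
    return abs(estimate - actual)


# Hypothetical participant: estimated 80% accuracy and 3,000 ms,
# actual no-choice performance 87.5% and 2,600 ms
err_acc = absolute_estimation_error(estimate=0.80, actual=0.875)
err_rt = absolute_estimation_error(estimate=3000.0, actual=2600.0)
print(err_acc, err_rt)
```

Taking the absolute value means that over- and underestimation of the same size count equally, so the measure captures accuracy of the estimate rather than its direction.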

H2 was investigated with two one-way ANOVAs with feedback as IV and accuracy or RT during the choice block (cf. Figure 2), respectively, as DVs. H3 was investigated with two analogous ANOVAs. Here, the DV was the percentage of choice trials in which participants used the strategy that had outperformed the other strategy during the no-choice block, regarding accuracy or RT, respectively. Whenever a rotation key was pressed throughout a trial, we used this as an indicator for an eS trial. Accordingly, when no rotation key was pressed, we used this as an indicator for an iS trial. ANOVAs were followed up by independent t-tests where indicated by the hypotheses.
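The trial classification and the H3 dependent variable can be sketched as follows. The input format (a count of rotation key presses per choice trial) and the function names are assumptions made for illustration.

```python
def classify_strategy(rotation_key_presses: int) -> str:
    """Any rotation key press marks an eS trial; none marks an iS trial."""
    return "eS" if rotation_key_presses > 0 else "iS"


def percent_better_strategy_chosen(key_presses_per_trial, better_strategy: str) -> float:
    """Percentage of choice trials solved with the strategy that was
    better (faster or more accurate) during the no-choice block."""
    labels = [classify_strategy(k) for k in key_presses_per_trial]
    return 100 * labels.count(better_strategy) / len(labels)


# Made-up choice block of 8 trials for a participant whose iS was better
choice_trials = [0, 3, 0, 0, 5, 0, 0, 1]
pct = percent_better_strategy_chosen(choice_trials, better_strategy="iS")
print(pct)
```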

All effect size intervals were bias-corrected and accelerated and were implemented via bootstrapping 2000 samples in R with the bootES function of the bootES package (version 1.2.3). To provide the readers with an option to gauge the evidence for the alternative over the null hypotheses, we complemented t-tests with Bayes Factors that were computed in R with the ttestBF function of the BayesFactor package (version 0.9.12-4.7) and medium priors (Morey & Rouder, 2011). To improve readability, we only discuss the results of the Bayes factor analyses when conflicting with a frequentist interpretation.
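To illustrate the general idea of a bootstrapped effect size interval, the sketch below resamples two groups 2,000 times and reads off percentile bounds for Cohen's d. Note that this is a plain percentile bootstrap in Python, not the bias-corrected and accelerated interval that bootES computes in R, so it would not reproduce the reported intervals exactly. The data are made up.

```python
import random
from statistics import mean, variance


def cohens_d(x, y):
    """Cohen's d for two independent groups using the pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / pooled_var ** 0.5


def percentile_bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap CI for d; each group is resampled with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        cohens_d(rng.choices(x, k=len(x)), rng.choices(y, k=len(y)))
        for _ in range(n_boot)
    )
    lower_idx = int((1 - level) / 2 * n_boot)
    upper_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lower_idx], stats[upper_idx]


# Hypothetical RTs in seconds for two groups
group_a = [2.1, 2.4, 1.9, 2.8, 2.2, 2.6, 2.0, 2.5, 2.3, 2.7]
group_b = [2.6, 2.9, 2.4, 3.1, 2.7, 3.0, 2.5, 2.8, 3.2, 2.6]
d_point = cohens_d(group_a, group_b)
ci_low, ci_high = percentile_bootstrap_ci(group_a, group_b)
print(d_point, ci_low, ci_high)
```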

Results

For data transparency reasons, all relevant averaged raw data including individual means can be inspected in Figure 3. Note that on the group level with all 90 participants, mental and manual rotation had a highly similar performance profile (manual rotation: M_RT = 2,623 ms, M_accuracy = 89.2%, mental rotation: M_RT = 2,584 ms, M_accuracy = 87.7%). Also note that participants employed manual rotation in a meaningful manner: If manual rotation was employed in the same handedness condition, the end location of the working stimulus usually closely matched the base stimulus (Figure S2).

Figure 3. Raw data for actual performance (no choice block) and performance estimates.

Hypotheses-driven results

Feedback improved performance estimates in comparison to no feedback (H1 is confirmed)

There was an interaction between feedback and strategy for absolute RT estimates, F(2, 87) = 7.7, p < .001, η_G² = .05, as well as for absolute accuracy estimates, F(2, 87) = 6.4, p = .003, η_G² = .04; Figure 4. Specifically, trial feedback improved absolute RT estimates for mental rotation, M_Δ = −809 ms, t(58) = −7.6, p < .0001, d_Cohen = −2.0, 95% CI_d = [−2.4, −1.4], BF₁₀ = 23,494,263, as well as manual rotation, M_Δ = −404 ms, t(58) = −3.4, p = .001, d_Cohen = −.9, 95% CI_d = [−1.4, −.3], BF₁₀ = 23.8. Trial feedback also improved absolute accuracy estimates for mental rotation, M_Δ = −.046, t(58) = −3.8, p < .001, d_Cohen = −1.0, 95% CI_d = [−1.4, −.4], BF₁₀ = 65.0, but not manual rotation, M_Δ = −.010, t(58) = −.9, p = .38, d_Cohen = −.23, 95% CI_d = [−.7, .3], BF₁₀ = .4. Block feedback improved absolute RT estimates for mental rotation, M_Δ = −852 ms, t(58) = −7.6, p < .0001, d_Cohen = −2.0, 95% CI_d = [−2.6, −1.3], BF₁₀ = 21,465,593, and possibly also manual rotation, M_Δ = −343 ms, t(58) = 2.3, p = .028, d_Cohen = −.6, 95% CI_d = [−1.2, .1], BF₁₀ = 2.1, although the latter result is ambiguous when considering the low Bayes factor and the confidence interval of the effect size. Block feedback also improved absolute accuracy estimates for both mental rotation, M_Δ = −.063, t(58) = −5.2, p < .0001, d_Cohen = −1.4, 95% CI_d = [−1.8, −.9], BF₁₀ = 6,072.8, and manual rotation, M_Δ = −.030, t(58) = −2.9, p = .005, d_Cohen = −.7, 95% CI_d = [−1.2, −.3], BF₁₀ = 7.7.

Figure 4. Estimation errors.

Feedback did not improve free choice performance (H2 is rejected)

RT in the choice block was influenced by feedback, F(2, 87) = 4.6, p = .013, η_G² = .10; Figure 5a. Interestingly, the RT effect during choice trials was driven by fast responses of the trialwise feedback group, M_trialwise = 2,120 ms, M_{no feedback} = 2,438 ms, t(58) = 2.6, p = .012, d_Cohen = −.7, 95% CI_d = [−1.2, −.1], BF₁₀ = 4.2, and not by the blockwise feedback group, M_blockwise = 2,471 ms, M_{no feedback} = 2,438 ms, t(58) = −.2, p = .806, d_Cohen = .06, 95% CI_d = [−.4, .6], BF₁₀ = .3. However, since the trialwise feedback group already exhibited particularly fast responses during no-choice trials (Figure 3a), trialwise feedback seems to speed up responses in general rather than improving strategy selection in the choice block. This global speeding up of responses under trialwise speed feedback replicates an earlier observation (Touron & Hertzog, 2014). Accuracy in the choice block was not influenced by feedback, F(2, 87) = 1.2, p = .296, η_G² = .03; Figure 5b. In sum, we conclude that there is no evidence for a beneficial effect of feedback on choice performance.

Figure 5. Performance and strategy choice during choice block.

Feedback does not adaptively impact eS use frequency (H3 is rejected)

Feedback did not influence our participants’ propensity to choose the faster strategy (F(2, 87) = .3, p = .744, η_G² = .01; Figure 5c). However, feedback did influence our participants’ propensity to choose the more accurate strategy (F(2, 85) = 3.2, p = .046, η_G² = .07; Figure 5d). Unexpectedly, participants without feedback chose the more accurate strategy more rather than less frequently. Post hoc t-tests suggest that choice was less adaptive mostly in the trialwise feedback group (M_{no feedback} = 74.0%, M_trialwise = 51.9%, t(56) = 2.6, p = .013, d_Cohen = −.7, 95% CI_d = [−1.3, −.2], BF₁₀ = 3.8) and are less conclusive about the comparison to the blockwise feedback group (M_blockwise = 61.5%, t(56) = 1.5, p = .146, d_Cohen = −.4, 95% CI_d = [−.9, .1], BF₁₀ = .7). Note that two participants needed to be excluded from accuracy-related analyses since they performed exactly equally accurately with manual and mental rotation.

Without feedback, eS performance is estimated more accurately than iS performance (H4 is rejected)

Our participants’ ability to evaluate their performance with different strategies was influenced both by feedback and strategy (see results for H1). At a closer look, eS performance was estimated more precisely than iS performance when no feedback was given. This was true for RT, M_Δ = −372 ms, t(29) = −3.2, p = .004, d_Cohen = −.6, 95% CI_d = [−.9, −.2], BF₁₀ = 10.9, and accuracy, M_Δ = −.03, t(29) = −2.7, p = .013, d_Cohen = −.5, 95% CI_d = [−.9, 0], BF₁₀ = 3.6. Once any type of feedback was given, the estimated performances of iS and eS were more comparable (all p > .2; all BF₁₀ < .5 but > .2).

Exploratory results

So far, results indicated that feedback substantially improved absolute performance estimates. However, results also indicated that these improved estimates were not associated with improved choice. Why?

Descriptively, optimal strategies are chosen even without feedback

It could be that optimal strategies hardly exist because whenever one strategy is faster, the other is more accurate. If that were the case, feedback would not help participants make more adaptive choices. An exploratory look at the data indeed confirms that for most participants—62 out of 90—there is no unequivocally optimal strategy, meaning that no strategy had been better regarding both speed and accuracy during no-choice trials. However, for the remaining 28 participants, one strategy was unequivocally optimal with respect to performance during no-choice trials. Nevertheless, most participants in the no-feedback condition descriptively preferred the optimal strategy (10 out of 12 participants). Thus, optimal strategies existed for a sizable number of participants, and optimal strategies are frequently chosen even without feedback.³ Individual actual and estimated performance data and associated eS use proportions can be inspected in Figure S1.
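The notion of an unequivocally optimal strategy can be written as a simple dominance check on no-choice performance. This sketch assumes strict inequalities, so that a tie on either speed or accuracy means no strategy is treated as optimal; the source does not state how ties were handled, and the function name is illustrative.

```python
from typing import Optional


def optimal_strategy(
    rt_internal: float, acc_internal: float,
    rt_extended: float, acc_extended: float,
) -> Optional[str]:
    """Return 'iS' or 'eS' if one strategy is both faster and more accurate,
    otherwise None (speed and accuracy favour different strategies, or tie)."""
    if rt_internal < rt_extended and acc_internal > acc_extended:
        return "iS"
    if rt_extended < rt_internal and acc_extended > acc_internal:
        return "eS"
    return None


# Made-up no-choice profiles (RT in ms, accuracy as proportion)
print(optimal_strategy(2400, 0.92, 2700, 0.88))  # internal dominates
print(optimal_strategy(2400, 0.85, 2700, 0.90))  # trade-off, no optimum
```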

Without feedback, actual and estimated performance differences were unrelated

Surprisingly, actual and estimated performance differences between internal and extended strategies during no-choice trials (i.e., ΔActual Performance(internal, extended) and ΔEstimated Performance(internal, extended), calculated for each participant) were unrelated in the no-feedback group regarding accuracy, r_Pearson(28) = .14, p = .469, CI_95% = [−.24, .47], as well as RT, r_Pearson(28) = −.06, p = .759, CI_95% = [−.41, .31]. In contrast, actual and estimated performance differences during no-choice trials were substantially related in the trialwise feedback group regarding accuracy, r_Pearson(28) = .55, p = .002, CI_95% = [.24, .76], as well as RT, r_Pearson(28) = .48, p = .008, CI_95% = [.14, .72]. Similar results were obtained for the blockwise feedback group regarding accuracy, r_Pearson(28) = .83, p < .0001, CI_95% = [.66, .91], as well as RT, r_Pearson(28) = .38, p = .039, CI_95% = [.02, .65]. In sum, these exploratory analyses suggest that written feedback could be necessary to create accurate explicit representations of performance differences between the available strategies; see Figure S3 for the raw data underlying the correlations. That being said, we acknowledge that sample sizes are low and that these findings should be interpreted with care (Schönbrodt & Perugini, 2013).
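For readers who want to gauge how wide such intervals are with 30 participants per group, a confidence interval for a Pearson correlation can be sketched with the Fisher z transformation. Whether the reported intervals were computed this way is an assumption; the source does not name the method.

```python
import math
from statistics import NormalDist


def pearson_ci(r: float, n: int, level: float = 0.95):
    """Fisher z confidence interval for a Pearson correlation from n pairs."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Trialwise feedback group, accuracy: r = .55 with 30 participants (df = 28)
lo, hi = pearson_ci(r=0.55, n=30)
print(round(lo, 2), round(hi, 2))
```

For this input the bounds come out close to the interval reported for the trialwise accuracy correlation, with small deviations expected because r is only available rounded to two decimals.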

Only without feedback, actual performance differences predicted eS use proportion

Given that actual and estimated performance differences between internal and extended strategies were not correlated for the no-feedback group, we wanted to gain an understanding of how much both measures influenced eS use proportion. To this end, we conducted two logistic regressions with eS use proportion as the dependent variable (DV). For the first regression, we entered standardised actual performance differences throughout the no-choice block, i.e., ΔRT(extended-internal) and ΔAccuracy(extended-internal), as predictors. For the second regression, we analogously entered the standardised estimated performance differences that participants reported right after the no-choice block as predictors. We then tested the reduction of residual deviance against the null model for both models. Based on a .05 alpha level, results indicate that actual (p = .0002) but not estimated performance differences (p = .0578) predicted eS use proportion in the no-feedback group; compare Table 1. Interestingly, the reverse was true for both conditions with feedback. Predicted eS use proportions for both accuracy and RT for each model are depicted in Figure 6. The associated model results can be inspected in Table S1. In sum, the no-feedback group predominantly used their actual performance in the no-choice block for strategy choice. Ironically, however, both feedback groups relied little on actual performance. Instead, both feedback groups seemed to prefer relying on noisy performance estimates.

Table 1.

Logistic regression results with actual or estimated ΔRT(eS-iS) and ΔAccuracy(eS-iS) predicting eS use proportion.

Feedback group, predictor set	Step 1 (Null Model): DoF	Step 1 (Null Model): Residual deviance	Step 2 (ΔPerformance): DoF	Step 2 (ΔPerformance): Residual deviance	Deviance	p
No Feedback, Actual Performance	29	22.3	27	14.1	8.2	.0002
No Feedback, Estimated Performance	29	22.3	27	19.0	3.4	.0578
Trialwise Feedback, Actual Performance	29	18.0	27	17.0	1.0	.3525
Trialwise Feedback, Estimated Performance	29	18.0	27	12.4	5.6	.0006
Blockwise Feedback, Actual Performance	29	19.1	27	16.8	2.4	.0958
Blockwise Feedback, Estimated Performance	29	19.1	27	15.6	3.5	.0245

In Step 2, both ΔRT(eS-iS) and ΔAccuracy(eS-iS) were added as predictors. The predictor-wise results of Step 2 models are provided in the Online Supplementary Material. Deviance DoF is 2 for all model comparisons. eS = extended strategy (manual rotation); iS = internal strategy (mental rotation); DoF = degrees of freedom.
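The logic of testing a reduction in residual deviance against the null model can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the text does not specify the reference distribution or any dispersion adjustment, so the plain chi-square test used here is not expected to reproduce the p values in Table 1. The function name and the input values are hypothetical. The sketch is restricted to two added predictors, for which the chi-square survival function has the closed form exp(−x/2).

```python
import math


def deviance_reduction_test_df2(null_deviance: float, model_deviance: float) -> tuple:
    """Chi-square test of the deviance reduction for exactly two added predictors.

    Assumption: no dispersion adjustment; the reduction is referred to a
    chi-square distribution with 2 degrees of freedom.
    """
    reduction = null_deviance - model_deviance
    if reduction < 0:
        raise ValueError("the fuller model cannot have a larger deviance")
    # For df = 2, the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-reduction / 2.0)
    return reduction, p_value


# Hypothetical deviances, not taken from Table 1.
reduction, p = deviance_reduction_test_df2(30.0, 22.0)
print(f"deviance reduction = {reduction:.1f}, p = {p:.4f}")
```

A larger drop in residual deviance from Step 1 to Step 2 yields a smaller p value, which is the sense in which the two performance-difference predictors are said to predict eS use proportion.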

Figure 6.

Logistic regression estimated marginal means.

Discussion

When asked how well they performed with the mental and the manual rotation strategy, participants who did not receive written performance feedback estimated their performance poorly in comparison to participants with feedback. Although the extent to which participants needed feedback to accurately estimate their performance was somewhat surprising to us, the substantial impact of feedback was not. What did surprise us considerably is that, despite the metacognitive help that feedback clearly provided, feedback did not support participants in making more adaptive strategy choices. Quite to the contrary, our explorations suggest that feedback was ultimately more confusing than supportive. Only participants without feedback based their strategy choice on actual performance differences between the mental and manual strategies. Participants who received either trialwise or blockwise feedback did not. Instead, the strategy choice of participants with feedback was associated with their estimated performance, which was likely so noisy that it provided comparatively poor guidance.

Against the backdrop of the present findings, we want to discuss two questions. First, what is the mechanism behind adaptive choice, that is, choice guided by the actual performance of the strategies, for participants without feedback? Second, why is this adaptive choice not present for participants with either trialwise or blockwise feedback?

What is the mechanism behind adaptive choice for participants without written feedback?

When designing the experiment, we expected that participants would monitor their performance with both mental and manual rotation and eventually integrate the monitoring results into a consciously accessible representation. These consciously accessible representations could then inform deliberate strategy choices. However, the present data speak against such a scenario. The data suggest that, when no feedback was given, valid consciously accessible performance estimates were not created, and the invalid estimates that participants did report were not used to inform strategy choice. We infer that explicit metacognition was largely irrelevant for the no-feedback group, meaning that no consciously accessible performance representations were created or mediated the choice process. It is conceivable that explicit representations other than performance estimates informed our participants’ choice. We deem this somewhat unlikely, however, given the importance of performance for conscious strategy choice (e.g., Gilbert, 2015; Weis & Kunde, 2024).

Instead, the present data suggest that implicit metacognition was at work (for reviews, see Cary & Reder, 2002; Frith & Frith, 2022; Koriat, 2007). In other words, we suggest that important parts of the monitoring process were conducted without creating explicit performance representations. The process might be analogous to the one that participants employed when solving arithmetic problems in a previous study: the frequency of arithmetic problems that allowed a quick internal strategy (the parity-check strategy⁴) altered how frequently participants used that strategy. Crucially, when asked, participants in that study had no explicit representation of the existence of such a strategy (Lemaire & Reder, 1999). As a conceptual note, we want to add that it is challenging to disambiguate what we and others (see Frith & Frith, 2022) call implicit metacognition from lower-level processes like associative learning or episodic retrieval. Adaptive behaviour can emerge without an active “performance monitoring network” (Ullsperger et al., 2014), and we cannot rule out a substantial impact of such lower-level processes. What remains is that, taken together, our results support the notion that problem solvers need not become aware of the reasons that influenced their strategy choice and that this unawareness does not necessarily harm, and might even benefit, performance.

Why is feedback associated with less adaptive strategy choice?

Since we provided participants with correct performance feedback, it was counterintuitive to find no relationship between actual performance differences between the available strategies during the no-choice block and subsequent strategy choice. Because we found such a relationship for participants without written feedback, we conclude that providing performance feedback must have corrupted strategy choice either (1) indirectly via interfering with implicit monitoring or low-level processes like associative learning⁵ or (2) more directly by changing decision criteria.

Regarding the former, it is possible that the mere presence of written feedback focuses cognitive processing on the feedback, which might interfere with other, more beneficial processes. Such interference has, for example, been found with immediate as compared to delayed feedback in a motor skill task (Swinnen et al., 1990) or after error commission as compared to correct responding in a perceptual task (Buzzell et al., 2017). However, if such interference were the prime reason for the missing correlation between actual performance differences and choice in the present study, blockwise feedback should have been superior to trialwise feedback, as the number of potentially interfering feedback encounters was much lower in the former than in the latter. Yet, since performance differences between mental and manual rotation during the no-choice block did not predict subsequent choice even in the blockwise feedback condition, interference seems an unlikely explanation.

Instead, our data suggest that, when written feedback was provided, the choice was based less on information stemming from implicit metacognition or lower-level processes and more on explicit metacognition. Thus, our data support the notion that decision criteria for strategy choice were shifted when feedback was provided: the decision was based more on explicit representations of performance and less on implicit signals. Such a change in decision criteria might be analogous to a decreased reliance on what has been termed an implicit “inner” monitoring loop and an increased reliance on an “outer” monitoring loop. The existence of an inner and an outer loop was suggested as the basis for error detection in typists (Logan & Crump, 2010). Naturally, errors occur during typing. In that study, however, these errors were sometimes corrected before appearing on the screen. At other times, inserted errors appeared on the screen although the typing was correct. The study showed that inner and outer loops operate with different kinds of feedback. The inner loop relies on the implicit processing of the typist's own typing signals, and a typing error leads to a slow-down in subsequent typing irrespective of what is depicted on the screen. The outer loop relies on visual signals and affects explicit error processing: participants explicitly took blame for inserted errors and credit for corrected errors.

That inner and outer loops exist and can work independently would be in line with both the present data and an earlier study by Engeler and Gilbert (2020). Engeler and Gilbert asked participants in an experimental group to predict their performance before each forced trial and gave them feedback on their actual performance after each forced trial, for both an internal and an extended strategy. As in the present study, the intervention substantially improved how accurately participants estimated their performance with both strategies.⁶ However, in subsequent choice trials, participants in a control group without predictions and feedback made, also as in the present study, descriptively more rather than less adaptive choices between an internal and an extended strategy.

It is an open question whether monetary incentives might render feedback more helpful. However, given that previous research found that monetary incentives did not remedy maladaptive choices between an internal and an extended strategy (Gilbert et al., 2020), it is unclear why the situation should change once feedback is provided.

Finally, we deem it possible that the existence of feedback frees mental resources. The awareness that there is feedback, whether trialwise or blockwise, could potentially lead to a top-down downregulation of implicit monitoring in the spirit of effort minimisation (see Delgado et al., 2005). In other words, we speculate that feedback might free cognitive resources that are usually devoted to implicit monitoring, such that feedback might not help performance but might still have a positive impact. Clearly, further research is required to address this speculation.

Conclusion and outlook

To solve problems efficiently, a well-informed choice between internal and extended strategies is mandatory. Here, we showed that written feedback can indeed be used to improve participants’ awareness of strategy-specific performances. However, we also showed that increased awareness does not guarantee improved strategy choice and that written feedback might even tamper with more adaptive implicit mechanisms. Thus, our advice is to avoid implementing immediate written performance feedback without having good reasons to do so. While explicit metacognition might indeed be a “human superpower” (Frith & Frith, 2022, p. 1023), it was not that powerful in the present setup. Providing specific strategy advice right before a specific problem (Gilbert et al., 2020) or general strategy advice preceding several problems of the same class (Weis & Kunde, 2023) can be more helpful than feedback in that respect. A better understanding of the interplay between explicit and implicit metacognition is desirable to create interventions that help human problem solvers make better cognitive strategy choices in the future.

Supplemental Material

sj-docx-1-qjp-10.1177_17470218241282659 – Supplemental material for When feedback backfires: Knowledge of results can impair cognitive strategy choice

Supplemental material, sj-docx-1-qjp-10.1177_17470218241282659 for When feedback backfires: Knowledge of results can impair cognitive strategy choice by Patrick P Weis and Wilfried Kunde in Quarterly Journal of Experimental Psychology

Footnotes

Author Contribution

This research was conceptualised by PPW, designed by PPW and WK, experimental code was written by PPW, and data was collected, curated, and analysed by PPW. Results were validated and visualised by PPW. The original draft was written by PPW and reviewed and edited by WK. Funding was acquired by PPW. The project was administered and supervised by PPW.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 463896411.

Ethics statement

Informed consent was obtained from all individual participants included in the study. This research complied with the tenets of the Declaration of Helsinki and was approved by the Ethics Committee at a European university.

ORCID iD

Patrick P Weis

Data availability

Data, analysis, and stimulus materials for both experiments are available in an online repository [https://osf.io/4gfqc]. The study design was preregistered [].

Supplementary material

The Supplementary Material is available at: qjep.sagepub.com

References

Allan, L. G. (1979). The perception of time. Perception & Psychophysics, 26(5), 340–354.

Attneave, & Arnoult, M. D. (1956). The quantitative study of shape and pattern perception. Psychological Bulletin, 53(6), 452.

Bless, Wänke, Bohner, Fellhauer, & Schwarz (1994). Need for cognition: Eine Skala zur Erfassung von Engagement und Freude bei Denkaufgaben [Presentation and validation of a German version of the Need for Cognition Scale]. Zeitschrift Für Sozialpsychologie, 25, 147–154.

Boldt, & Gilbert, S. J. (2019). Confidence guides spontaneous cognitive offloading. Cognitive Research: Principles and Implications, 4(1), 45.

Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436. https://doi.org/10.1163/156856897x00357

Buzzell, G. A., Beatty, P. J., Paquette, N. A., Roberts, D. M., & McDonald, C. G. (2017). Error-induced blindness: Error detection leads to impaired sensory processing and lower accuracy at short response–stimulus intervals. The Journal of Neuroscience, 37(11), 2895–2903. https://doi.org/10.1523/JNEUROSCI.1202-16.2017

Cacioppo, J. T., & Petty, R. E. (1982). The need for cognition. Journal of Personality and Social Psychology, 42(1), 116.

Cary, & Reder, L. M. (2002). Metacognition in strategy selection. In Chambres, Izaute, & Marescaux, P. J. (Eds.), Metacognition (pp. 63–77). Springer.

Clark, & Chalmers (1998). The extended mind. Analysis, 58, 7–19.

Collin, C. A., & McMullen, P. A. (2002). Using Matlab to generate families of similar Attneave shapes. Behavior Research Methods, Instruments, & Computers, 34(1), 55–68.

Corallo, Sackur, Dehaene, & Sigman (2008). Limits on introspection: Distorted subjective time during the dual-task bottleneck. Psychological Science, 19(11), 1110–1117. https://doi.org/10.1111/j.1467-9280.2008.02211.x

Craik, F. I. M., & Hay, J. F. (1999). Aging and judgments of duration: Effects of task complexity and method of estimation. Perception & Psychophysics, 61(3), 549–560. https://doi.org/10.3758/BF03211972

Deci, E. L., Koestner, & Ryan, R. M. (1999). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125(6), 627.

Delgado, M. R., Frank, R. H., & Phelps, E. A. (2005). Perceptions of moral character modulate the neural systems of reward during the trust game. Nature Neuroscience, 8(11), 1611–1618. https://doi.org/10.1038/nn1575

Dunn, T. L., Gaspar, McLean, Koehler, D. J., & Risko, E. F. (2018). Distributed metacognition: Examining the metacognitions associated with retrieval from internal and external stores. https://doi.org/10.13140/RG.2.2.22919.50080

Engeler, N. C., & Gilbert, S. J. (2020). The effect of metacognitive training on confidence and strategic reminder setting. PLOS ONE, 15(10), Article e0240858. https://doi.org/10.1371/journal.pone.0240858

Faul, Erdfelder, Lang, A.-G., & Buchner (2007). G* Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191.

Folk, M. D., & Luce, R. D. (1987). Effects of stimulus complexity on mental rotation rate of polygons. Journal of Experimental Psychology: Human Perception and Performance, 13(3), 395.

Frith, C. D., & Frith (2022). The mystery of the brain–culture interface. Trends in Cognitive Sciences, 26(12), 1023–1025. https://doi.org/10.1016/j.tics.2022.08.013

Gilbert, S. J. (2015). Strategic use of reminders: Influence of both domain-general and task-specific metacognitive confidence, independent of objective memory ability. Consciousness and Cognition, 33, 245–260. https://doi.org/10.1016/j.concog.2015.01.006

Gilbert, S. J., Bird, Carpenter, J. M., Fleming, S. M., Sachdeva, & Tsai, P.-C. (2020). Optimal use of reminders: Metacognition, effort, and cognitive offloading. Journal of Experimental Psychology: General, 149(3), 501–517. https://doi.org/10.1037/xge0000652

Gray, W. D., Sims, C. R., W.-T., & Schoelles, M. J. (2006). The soft constraints hypothesis: A rational analysis approach to resource allocation for interactive behavior. Psychological Review, 113(3), 461–482. https://doi.org/10.1037/0033-295X.113.3.461

Grinschgl, Meyerhoff, H. S., & Papenmeier (2020). Interface and interaction design: How mobile touch devices foster cognitive offloading. Computers in Human Behavior, 108, 106317. https://doi.org/10.1016/j.chb.2020.106317

Hertzog, Touron, D. R., & Hines, J. C. (2007). Does a time-monitoring deficit influence older adults’ delayed retrieval shift during skill acquisition? Psychology and Aging, 22(3), 607.

Koriat (2007). Metacognition and consciousness. Cambridge University Press.

Lemaire, & Reder (1999). What affects strategy selection in arithmetic? The example of parity and five effects on product verification. Memory & Cognition, 27(2), 364–382. https://doi.org/10.3758/BF03211420

Logan, G. D., & Crump, M. J. C. (2010). Cognitive illusions of authorship reveal hierarchical error detection in skilled typists. Science, 330(6004), 683–686. https://doi.org/10.1126/science.1190483

Mauk, M. D., & Buonomano, D. V. (2004). The neural basis of temporal processing. Annual Review of Neuroscience, 27(1), 307–340. https://doi.org/10.1146/annurev.neuro.27.070203.144247

Morey, R. D., & Rouder, J. N. (2011). Bayes factor approaches for testing interval null hypotheses. Psychological Methods, 16(4), 406.

Pavone, E. F., Tieri, Rizza, Tidoni, Grisoni, & Aglioti, S. M. (2016). Embodying others in immersive virtual reality: Electro-cortical signatures of monitoring the errors in the actions of an avatar seen from a first-person perspective. Journal of Neuroscience, 36(2), 268–279. https://doi.org/10.1523/JNEUROSCI.0494-15.2016

Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10(4), 437–442. https://doi.org/10.1163/156856897X00366

Pfister, Weller, & Kunde (2020). When actions go awry: Monitoring partner errors and machine malfunctions. Journal of Experimental Psychology: General, 149(9), 1778–1787. https://doi.org/10.1037/xge0000748

Risko, E. F., & Dunn, T. L. (2015). Storing information in-the-world: Metacognition and cognitive offloading in a short-term memory task. Consciousness and Cognition, 36, 61–74.

Risko, E. F., & Gilbert, S. J. (2016). Cognitive offloading. Trends in Cognitive Sciences, 20(9), 676–688. https://doi.org/10.1016/j.tics.2016.07.002

Salmoni, A. W., Schmidt, R. A., & Walter, C. B. (1984). Knowledge of results and motor learning: A review and critical reappraisal. Psychological Bulletin, 95(3), 355.

Schönbrodt, F. D., & Perugini (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47(5), 609–612.

Storm, B. C., Stone, S. M., & Benjamin, A. S. (2017). Using the Internet to access information inflates future use of the Internet to access other information. Memory, 25(6), 717–723. https://doi.org/10.1080/09658211.2016.1210171

Swinnen, S. P., Schmidt, R. A., Nicholson, D. E., & Shapiro, D. C. (1990). Information feedback for skill acquisition: Instantaneous knowledge of results degrades learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(4), 706.

Touron, D. R. (2015). Memory avoidance by older adults: When “old dogs” won’t perform their “new tricks.” Current Directions in Psychological Science, 24(3), 170–176.

Touron, D. R., & Hertzog (2014). Accuracy and speed feedback: Global and local effects on strategy use. Experimental Aging Research, 40(3), 332–356. https://doi.org/10.1080/0361073X.2014.897150

Ullsperger, Fischer, A. G., Nigbur, & Endrass (2014). Neural mechanisms and temporal dynamics of performance monitoring. Trends in Cognitive Sciences, 18(5), 259–267. https://doi.org/10.1016/j.tics.2014.02.009

van Schie, H. T., Mars, R. B., Coles, M. G. H., & Bekkering (2004). Modulation of activity in medial frontal and motor cortices during error observation. Nature Neuroscience, 7(5), 549–554. https://doi.org/10.1038/nn1239

Virgo, Pillon, Navarro, Reynaud, & Osiurak (2017). Are you sure you’re faster when using a cognitive tool? The American Journal of Psychology, 130(4), 493–503. https://doi.org/10.5406/amerjpsyc.130.4.0493

Walsh, M. M., & Anderson, J. R. (2009). The strategic nature of changing your mind. Cognitive Psychology, 58(3), 416–440.

Weis, P. P., & Kunde (2023). Overreliance on inefficient computer-mediated information retrieval is countermanded by strategy advice that promotes memory-mediated retrieval. Cognitive Research: Principles and Implications, 8, Article 72. https://doi.org/10.1186/s41235-023-00526-6

Weis, P. P., & Kunde (2024). Perseveration on cognitive strategies. Memory & Cognition, 52, 459–475. https://doi.org/10.3758/s13421-023-01475-7

Weis, P. P., & Wiese (2018). Speed considerations can be of little concern when outsourcing thought to external devices. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 62, 14–18.

Weis, P. P., & Wiese (2019a). Investing in brain-based memory leads to decreased use of technology-based memory. Journal of Experimental Psychology: Applied. https://doi.org/10.1037/xap0000259

Weis, P. P., & Wiese (2019b). Problem solvers adjust cognitive offloading based on performance goals. Cognitive Science, 43(12), Article e12802. https://doi.org/10.1111/cogs.12802

Weis, P. P., & Wiese (2019c). Using tools to help us think: Actual but also believed reliability modulates cognitive offloading. Human Factors, 61(2), 243–254. https://doi.org/10.1177/0018720818797553

Weis, P. P., & Wiese (2022). Know your cognitive environment! Mental Models as crucial determinant of offloading preferences. Human Factors, 64(3), 499–513. https://doi.org/10.1177/0018720820956861
