Abstract
Results remain mixed regarding the effects of incubation tasks on divergent thinking, a type of creativity, generally assessed via the Unusual Uses Task (UUT). Using a within-subjects design, we compared 64 participants’ performance on the UUT, after four different incubation tasks: copy a simple painting, copy a complex painting, 0-back-task, and rest. We hypothesized that an arts-related activity during incubation (here: copy a painting) would boost subsequent creativity. Five different creativity scores were computed from the raw UUT data, and we provide a step-by-step guide for how to compute these: fluency, flexibility, originality, subjective creativity, and usefulness. Creativity was only modulated by sex; women outperformed men on creative fluency. No other variables, nor the incubations, modulated any of participants’ creativity scores. A within-group comparison showed that the unusual uses of our all-Iranian participants were more useful than unique, echoing previous work suggesting differences between Eastern and Western conceptions of creativity.
“Creativity is intelligence having fun”
Albert Einstein
The past five decades have seen an increase in the empirical assessment of creativity (Abraham, 2016; Guilford, 1967; Ward & Kennedy, 2017); what it is, how to measure it, and ultimately, how to enhance it. Extending previous work, we explore the effects of four incubation tasks to enhance creativity.
Creativity is the cognitive ability to generate solutions that are both original and useful (Paulus & Nijstad, 2003), and it plays a vital role in any field to secure innovation and progress (Adaman & Blaney, 1995; Kim, 2011; Newton & Newton, 2010; Remoli & Santos, 2017; Ritter & Mostert, 2017). Divergent thinking is the process of coming up with different solutions for a problem (Guilford, 1967). A typical example of divergent thinking is idea generation, where many different and potentially unrelated solutions are provided to a problem. The Unusual Uses Task (UUT; Guilford, 1967) is a classical task in this domain and has become the gold standard measure of divergent thinking abilities. During the UUT, participants are presented with an object (e.g., “a brick”) and are asked to generate as many unusual uses as possible for this object within a set amount of time. The number of unusual uses thus proposed is a measure of divergent thinking ability.
In a seminal article, Baird and colleagues (2012) suggested that engaging in a short unrelated incubation task, before providing the unusual uses of the UUT, could boost subsequent creativity. During an incubation task, the participant is disengaging from the UUT and is, thus, no longer consciously working on coming up with unusual uses for the object, but on the unrelated incubation task (Csikszentmihalyi & Sawyer, 1995; Dijksterhuis & Meurs, 2006; Ritter & Dijksterhuis, 2014; Segal, 2004). One meta-analysis lends support to the creativity-enhancing effects of incubation tasks (Sio & Ormerod, 2009). Yet, these effects have not always been replicable (Gall & Mendelsohn, 1967; Olton & Johnson, 1976; Vul & Pashler, 2007). Building on this previous work, we here investigate the effects on creativity of two classical incubation tasks (0-back task or rest), and two new incubation tasks (copying a simple or a complex painting). Our objective was to replicate and extend previous findings.
Our choice of art-related incubation tasks was based on a series of empirical findings. Engagement with the arts has been argued to spur creative processes (Ishiguro & Okada, 2021; Windholz, 1968). Yet, surprisingly, to our knowledge, art-related tasks have rarely been explored as incubation tasks that might boost creativity. Myszkowski et al. (2014) investigated the relation between aesthetic sensitivity and divergent thinking ability and found a relation between figural creativity and aesthetic sensitivity. In addition, engaging with creative materials (e.g., watching and rating aesthetic stimuli) prior to solving a task has been found to increase levels of felt creative inspiration on a subsequent story-writing task (Welke et al., 2023). Another study showed that copying a painting by a more experienced painter subsequently boosted participants’ creativity when they were asked to produce their own drawing from scratch (Okada & Ishibashi, 2016). Browne and Cruse (1988) showed that after copying geometric shapes as an incubation task, 40% of people solved a subsequent convergent creativity problem, while only 12% of people who focused continuously on the problem (i.e., without incubation) were able to solve the same problem. Despite these first data suggesting that creative tasks during incubation foster subsequent creativity, the empirical assessment of the usefulness of art-related copying tasks as incubation tasks is still relatively scarce.
In line with the aforementioned work using copying as an incubation task, we chose two paintings that participants were asked to copy during the incubation phases of our experiment. One simple and one complex painting were chosen because previous work has suggested that a high working memory load (e.g., a 2-back task) obstructs creative incubation, while low load (e.g., a 0-back task) facilitates it (Baird et al., 2012). We hoped to mirror these two conditions by having one simple and one complex painting to copy.
Since we were interested in exploring the potential creativity-facilitating effect of art-related tasks, we considered it possible that prior engagement and familiarity with artistic practices, along with general aesthetic responsivity, would influence creativity. Therefore, we applied a standardized screening tool for people's responsivity to aesthetic experiences, the Aesthetic Responsiveness Assessment (AReA; Schlotz et al., 2021; in Farsi: Golbabaei et al., 2023). A series of additional interindividual difference variables have been found to modulate participants’ performance on the UUT, for instance, mood, mind-wandering propensity, sex, and culture. We briefly review the literature on these in what follows, to motivate our choice of interindividual difference measures in our experiment.
Mood
Participants’ mood has been found to influence performance on creativity tasks, yet the direction of the effect is still unclear. Some studies suggest that positive mood induction, as compared to no mood induction, results in a larger number and more original creative solutions to a problem (Forgeard, 2011; Hammers, 2018; Isen et al., 1987; Ritter & Ferguson, 2017). Further, listening to happy music has been shown to enhance divergent thinking (Yamada & Nagai, 2015). Conversely, other studies have shown that positive mood inhibits creativity, while negative mood enhances it (George & Zhou, 2002; Kaufmann & Vosburg, 2002; Kaufmann, 2003). To further explore the link between mood and creativity, we asked participants to rate their mood via self-report for possible exploratory analyses.
Mind-Wandering
Mind-wandering, that is, moments where the mind drifts off, is a phenomenon with both benefits and costs for the individual (Kane et al., 2021; Killingsworth & Gilbert, 2010; Schooler et al., 2014). To explore the effect of participants’ general propensity for mind-wandering on creativity, participants filled in the Daydreaming Frequency Scale (DDFS; Giambra, 1993).
Besides, one study found that the number of mind-wandering episodes that participants had during incubation phases was positively related to their subsequent performance on the UUT (Baird et al., 2012). Another finding was that participants had more mind-wandering episodes during undemanding tasks. This has led to the idea that undemanding tasks leave a portion of an individual's mental capacities “free” to contemplate the problem “offline” (or “unconsciously”), and, thus, contribute to a solution, without the cognitive pressure of having the duty to find a solution (Förster et al., 2004). Therefore, in our study, we also measured how cognitively demanding the task was, by asking participants to count the number of mind-wandering episodes they had during the task (more mind-wandering episodes = less task demand). We asked participants the question: “How many times did you think about the object during the task that you just did?” We aimed to add this metric as a possible covariate for exploratory analyses.
Sex
While results regarding the effect of individuals’ biological sex on creativity are mixed, there are different facets of creativity, where men's and women's performance has sometimes been found to differ (Abraham, 2016; Baer & Kaufman, 2008). For instance, results fall in favor of women in terms of idea generation, fluency (Abraham, 2016), while men outperform women on measures of creative accomplishment (which is another measure of creativity; e.g., number of inventions, number of successes, etc.; Baer & Kaufman, 2008; Simonton, 1994). Due to these prior inconclusive results, we assess the impact of sex in our study.
Culture
Because our study was conducted in Iran, a Middle-Eastern country, an additional aim was to explore the effect of culture on creative solutions. While we did not compare two cultures in this experiment, we followed the conjecture by Ivancovsky et al. (2021). We compared two measures of creative performance on the UUT within this sample of Iranian participants: subjective creativity (uniqueness of response), versus the usefulness of the proposed usages. Holistic thinking, where context and the potential usefulness of a proposed solution carry more weight, is said to be a common feature of the thought processes of people from Eastern cultures, while Western cultures emphasize analytical thinking and uniqueness more (de Oliveira & Nisbett, 2017; Kitayama et al., 2009; Lechuga et al., 2011). In one experiment, Ivancovsky et al. (2021) found that the Israeli group of their experiment scored higher on Western measures of creativity (fluency, flexibility, and originality) than the Japanese group did. However, their within-culture analysis (the type of analysis we conducted here) showed that for the Japanese group, the usefulness score was higher than the other dimensions of creativity. Conversely, in the Israeli group, the usefulness score was lowest as compared to other creativity measures. Similarly, in a study by Liou and Lan (2017), the creative performance of Taiwanese and Americans was assessed in a task that involved teamwork to generate creative slogans for promoting a newly designed 3D hologram. The Taiwanese group expressed more useful ideas and rejected the original ones, while Americans expressed more original ideas and tended to suppress the useful ones. These findings are in line with previous research that illustrates how Eastern cultures focus on usefulness, rather than on originality during creative tasks (Chiu & Kwan, 2010; De Dreu, 2010; Hempel & Sue-Chan, 2010; Simonton & Ting, 2010; Zhou & Su, 2010).
Echoing this research, we computed a subjective (uniqueness) and a usefulness score of creativity in this sample, and compared the two measures within our sample of Iranian participants.
Creativity Measures
As a key contribution of this paper, we performed a review of common creativity scores to compute from raw data extracted from the UUT. We provide a step-by-step guide about how to compute each of them in the methods section, extending previous work (Ritter & Ferguson, 2017).
According to Guilford (1962, as cited in Kim, 2006), one of the pioneers of creativity research, divergent thinking is a multidimensional construct. As also stated by Torrance (1974, cited in Kim, 2006), different facets of creativity should be included in any creativity research, rather than just using a single score. Besides, collectivist cultures and individualistic cultures have different measures of excellence, potentially also, with regard to what is considered particularly “creative” (Ng, 2001). Hence, due to this multifaceted nature of creativity (see also Ivancovsky et al., 2021; Treffinger, 1985), we assess five creativity metrics that have been proposed in the literature: fluency, flexibility, originality, subjective creativity score, and usefulness score. We explored whether these creativity metrics were modulated differentially by the independent variables and covariates.
The Present Experiment
Summing up, in the present study, we aimed to assess, with a starting point in findings of previous work (Baird et al., 2012; Browne and Cruse, 1988; Myszkowski et al., 2014; Ritter et al., 2012; Welke et al., 2023; Okada & Ishibashi, 2016), the impact of two art-related copying incubation tasks on participants’ divergent thinking performance on the UUT.
To introduce a more entertaining task than geometric shapes (as in Browne and Cruse, 1988), we asked participants to copy the renderings of two paintings (as in Okada & Ishibashi, 2016) for 5 min each (incubation tasks). In addition, participants performed two classical incubation tasks, a 0-back task and a rest period for 5 min each.
Classically, participants perform only a single trial of the UUT, and different groups of participants perform the different incubation phases (between-subjects design; e.g., Baird et al., 2012). We used a statistically somewhat stronger within-subjects design in which participants passed through four trials of the UUT (with four counterbalanced incubation tasks: copy a simple painting, copy a complex painting, 0-back, and rest).
We hypothesized:
Hypothesis 1a: The painting–copying incubations would increase creative performance on the UUT, as compared to the rest condition. Aesthetic responsiveness and daydreaming propensity were hypothesized to modulate this effect as covariates.
Hypothesis 1b: In case H1a turned out significant, a subsequent model would explore the effect of working memory load (number of mind-wandering episodes during incubation) and participants’ mood (assessed on a visual analogue scale from happy to sad after the incubation task) on creativity scores.
Hypothesis 2: Participants’ sex would modulate creativity scores.
Hypothesis 3: Our sample of all-Iranian participants would provide usages that were more useful than subjectively and uniquely creative.
Method
This experiment was approved by the Ethics Committee Faculty of Shahid Beheshti University, Teheran, Iran, and the ethics code ID was IR.SBU.REC.1399.042. All participants provided informed consent before participation and all experimental procedures followed the guidelines of the Declaration of Helsinki.
Participants
Sixty-four participants (25 males) took part in this experiment (age: M = 26.88; SD = 3.58; range: 18–32 years).
Sample size was determined as follows: in this within-subjects design, we planned to perform mixed effects models. Mixed effects models are a type of regression, and based on Cohen (1988), f² = 0.02 is considered a small, f² = 0.15 a medium, and f² = 0.35 a large effect size. Using G*Power 3.1 (Faul et al., 2007) for sample size calculation for linear multiple regression with only a moderate-high effect size (deviation from zero; effect size = .25; alpha = .05; power = .95; number of tested predictors = 2), the suggested sample size was 65. For counterbalancing our four incubation task conditions (0-back, copying simple painting, copying complex painting, and rest) and four UUT objects for each of the conditions (paperclip, sheet of paper, brick, and cup), 16 different combinations of these were prepared. Sixty-four participants were tested to have the same number of participants in each of our 16 different combinations. Participant inclusion criteria were based on previous similar research: participants were 18 years or above and had no known color blindness (Baird et al., 2012; Gilhooly et al., 2013). Participants were recruited via advertisements at the university campus and via social media. The experiment was conducted on-site and in-person in a comfortable dimly lit university classroom. Participants did not receive any compensation and the session took about 35–40 min. Participants did not take breaks between conditions. See Table 1 for participant characteristics.
Demographics and Questionnaire Measures; Independent T-Tests.
Note. N = 64. AReA = Aesthetic Responsiveness Assessment; AA = Aesthetic Appreciation; IAA = Intense Aesthetic Appreciation; CE = Creative Engagement; DDFS = Daydreaming Frequency Scale.
Painting Experience is the years that participants were engaged with painting (e.g., participating in painting class).
Materials
The UUT
The UUT requires participants to generate as many unusual uses as possible for a common object, such as a brick. The name of an object is first displayed on the screen, then the participant is told that they will be asked to provide as many unusual uses for this object as possible within a set time frame. In our experiment, the display of the name of the object was followed by an “incubation task” (Baird et al., 2012). Hence, before participants were allowed to provide their answers, one of four different incubation tasks was interleaved. The incubation tasks were copying simple painting, copying complex painting, 0-back, and rest. Four different objects were used (brick, a cup, a sheet of paper, and paperclip), one for each trial. For counterbalancing, 16 different orders of the task were prepared, where the combination and order of incubation tasks and objects varied.
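The counterbalancing logic described above can be illustrated with a short sketch. The text does not state how the 16 orders were constructed, so the scheme below (crossing a cyclic Latin square of task orders with a cyclic Latin square of object orders) is only one assumed way to arrive at 16 versions. The function names and the rotation-based assignment are likewise hypothetical; in the experiment, assignment to versions was random.

```python
from collections import Counter
from itertools import product

TASKS = ["copy simple painting", "copy complex painting", "0-back", "rest"]
OBJECTS = ["paperclip", "sheet of paper", "brick", "cup"]


def cyclic_latin_square(items):
    """Return len(items) orderings in which each item occupies each position once."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def build_versions(tasks, objects):
    """Cross 4 task orders with 4 object orders: 16 versions of 4 trials each.

    Assumed construction; the source only says that 16 orders were prepared.
    """
    versions = []
    for task_order, object_order in product(
        cyclic_latin_square(tasks), cyclic_latin_square(objects)
    ):
        versions.append(list(zip(task_order, object_order)))
    return versions


def assign_participants(n_participants, n_versions):
    """Rotate through the versions so that every version gets the same cell size."""
    return {pid: pid % n_versions for pid in range(n_participants)}


versions = build_versions(TASKS, OBJECTS)
assignment = assign_participants(64, len(versions))
per_version = Counter(assignment.values())
pair_counts = Counter(pair for version in versions for pair in version)

print(len(versions), "versions;", set(per_version.values()), "participants per version")
```

Under this assumed scheme, every version contains each task and each object exactly once, and every task-object pairing occurs equally often across the 16 versions, which is the property that motivates testing a multiple of 16 participants.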
Stimuli: Paintings
Two paintings, one simple and one complex, were selected for the two copying painting incubation conditions. The selection was based on a procedure involving a list of 20 paintings that is described in Supplementary Material Section 1: “Experiment: Selection of Paintings for the Main Experiment.”
Incubation Tasks
Four distinct incubation tasks with hypothesized differential effects on creative performance were set up: copying simple painting, copying complex painting, 0-back, and rest.
In the copying simple and complex painting incubations, participants were informed that they would be asked to copy the subsequently appearing painting as completely as possible, using a sheet of paper and an assortment of color pencils that was provided to them before the session. They were told on the screen: “try to copy the painting as best as you can. You will have five minutes. It is not important to finish the painting during the given time; you should just focus on copying the painting as long as it's on the screen.” Then, the painting was displayed on the screen for 5 min. The participants copied the painting until the instructions on the screen told them to move on and provide their unusual uses for the object that had been revealed to them before starting to copy the painting.
In the 0-back incubation, participants were presented with a string of one-digit numbers and instructed to use the index and middle fingers of their dominant hand to press the right key on the keyboard if the digit that appeared on the screen was the target number, and the left key if any other digit appeared (distractor).
In the rest incubation, participants were asked to sit quietly without doing anything, and relax for 5 min.
Questionnaires
Aesthetic Responsiveness Assessment. The AReA questionnaire was developed as a coarse screening measure of research participants’ aesthetic responsiveness to a series of different artistic materials (Schlotz et al., 2021; in Farsi: Golbabaei et al., 2023). It has three subscales that each probe for a different facet of aesthetic responsiveness; Aesthetic Appreciation (AA), Intense Aesthetic Experience, and Creative Engagement. Respondents rate a total of 14 items with regard to how much the statements describe them on a Likert scale between 0 (never) and 4 (very often). The Farsi version of the AReA used in this study has previously been validated in the Iranian culture. This Farsi version of the AReA showed good internal validity. Cronbach's alpha for the overall score was .848 and varied between .64 and .81 for subscales. All subscales of this Farsi version of AReA showed a high test-retest reliability, ranging from .715 to .778 (Golbabaei et al., 2023).
Daydreaming Frequency Scale. The DDFS is one of the 28 subscales of the Imaginal Process Inventory, a 344-item questionnaire designed to assess an individual's inner mental life (Singer & Antrobus, 1963). The DDFS contains 12 items about the extent to which individuals experience daydreaming in their daily life. Respondents rate these items using a 5-point Likert scale in terms of how much each item applies to them.
A Farsi version of the DDFS was prepared by the authors, using the translation and back-translation method.
Procedure
Participants were given an information sheet and a consent form and given the opportunity to ask questions. Task instructions and the whole task were set up in the computer software Psychopy v3.1.0. and presented to the participants via an Asus computer with a 15.6-inch monitor (1366 × 768 px resolution); viewing distance was approximately 11.8 inches. Participants performed the experiment individually, seated comfortably in a dimly lit laboratory. Before starting the UUT task, participants completed the questionnaires about demographic information, painting experience, the AReA, and the DDFS.
Participants were randomly assigned to one of the 16 versions of the task. Task instructions were given to participants at the beginning of the experiment. The task included four trials of UUT, each with a different object, and a different incubation task. At the beginning of each UUT trial, instructions appeared on the screen: “You will now be given the name of an object and have 10 s to think about it. At a later point of the experiment, you will be asked to write unusual uses for this object.” Then the name of the object was displayed for 10 s. Immediately after, the 5-minute incubation phase started. After the incubation phase, participants had 2 min to write down all the unusual usages for the object they had been told before the incubation phase on a sheet of paper. See Figure 1 for an illustration.

The procedure of the experiment.
After each incubation phase, two questions appeared on the screen: “how many times did you think of the object?” (ratings by mouse click on a scale from 0 to 10), and “how do you feel right now?” (ratings by mouse click on a 9-point Likert scale ranging from 1 [extremely happy] to 9 [extremely sad]).
To familiarize the participants with the task, one shorter UUT practice trial was presented before the four main trials. The object was a sponge and the incubation task (2 min) was a simple mathematical sum and subtractions task. The practice trial was discarded from any further analysis.
Data Preparation
Five creativity metrics, that have been proposed in the literature, were computed, following standard procedures: fluency, flexibility, originality, subjective creativity score, and usefulness score (Guilford, 1962; Ivancovsky et al., 2021; Kim, 2006; Torrance, 1974; Treffinger, 1985). These are set out in Table 2.
Response Coding Step-By-Step Guide: Fluency, Flexibility, Originality, Subjective Creativity and Usefulness of Proposed Usages.
Note: Coding these creativity measures with UUT data requires substantive raw response coding of the answers given by participants, with several judges.
Step 1: Ideally, all participants’ responses (unusual uses) are first imported into a spreadsheet.
Step 2: Two judges independently classify as “discarded” those responses that are nonsensical or not unusual (report the total number of responses, the number of discarded responses, and the percentage).
Step 3: The judges classify the responses into different categories of unusual uses (e.g., tools; jewellery items; decoration/aesthetics/art; container/holder; clothing; weapons; toys; etc). When a response contains two unusual uses in one sentence, for instance, “crumple the paper (usage 1) and play with it (usage 2),” a rule of thumb is to “focus on the final use” (here: play; hence, the category is → toy).
Step 4: One of the judges compares the two spreadsheets from the two judges and highlights conflicts between the two (i.e., instances where the judges have put usages in different semantic categories). The percentage of agreement between the two judges should be computed and reported, along with the total number of conflicts (where the two judges disagreed on the classification of a response).
Step 5: Hold a mediation meeting with the first two judges and two additional judges. Each conflict between judge 1 and judge 2 should be discussed, while judges 3 and 4 listen. Judges 3 and 4 are consulted only if judges 1 and 2 cannot reach a conclusion; they can also intervene if they disagree with a discussion.
Step 6: Compute how many times judge 1 and judge 2 changed their classification based on the discussion. State how many such coding meetings were held and how long they took.
Step 7: Compute originality scores based on the final spreadsheet, and obtain subjective creativity and usefulness ratings of the final set.
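A minimal sketch of how the judge agreement and three of the scores in the guide above might be computed once responses have been coded. The scoring rules are assumptions, since Table 2 is not reproduced here: fluency is taken as the number of retained responses, flexibility as the number of distinct categories, and originality as the number of responses given by at most a chosen proportion of the sample. The 5% threshold, the function names, and the example data are all hypothetical.

```python
def percent_agreement(codes_judge1, codes_judge2):
    """Percentage of responses that both judges placed in the same category."""
    if len(codes_judge1) != len(codes_judge2):
        raise ValueError("Both judges must code the same set of responses.")
    matches = sum(a == b for a, b in zip(codes_judge1, codes_judge2))
    return 100.0 * matches / len(codes_judge1)


def fluency(categories):
    """Number of retained responses; None marks a discarded response."""
    return sum(category is not None for category in categories)


def flexibility(categories):
    """Number of distinct categories among the retained responses."""
    return len({category for category in categories if category is not None})


def originality(responses, sample_counts, n_participants, threshold=0.05):
    """Count responses given by at most `threshold` of the sample (assumed rule)."""
    return sum(
        1 for response in responses
        if sample_counts[response] / n_participants <= threshold
    )


# Hypothetical codings of one participant's five responses for "brick".
judge1 = ["tool", "toy", "weapon", None, "toy"]
judge2 = ["tool", "toy", "tool", None, "toy"]
final_codes = ["tool", "toy", "weapon", None, "toy"]  # after the mediation meeting

# Hypothetical counts of how many of 64 participants gave each response.
sample_counts = {"doorstop": 30, "paperweight": 12, "bookend": 2}

print(percent_agreement(judge1, judge2))  # 80.0
print(fluency(final_codes), flexibility(final_codes))  # 4 3
print(originality(["doorstop", "bookend"], sample_counts, 64))  # 1
```

Subjective creativity and usefulness are rated by judges rather than computed from counts, so they are not covered by this sketch.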
We computed a series of additional variables to be used in our analysis only if the main analyses turned out to show significant effects. However, since none of the analyses for the incubation tasks were significant, these variables were not included in any analysis. We merely report them here because they were collected and available if other researchers want to explore further hypotheses with the data, in accordance with Open Science initiatives (Simmons et al., 2011; Munafò et al., 2017).
Two measures were calculated to gauge the level of difficulty of the two copying painting tasks for the individual participants. Instead of asking participants directly for task difficulty, we obtained external judges’ impressions of each of the 64 participants’ drawings in the copying simple painting and copying complex painting conditions. Four judges were asked to rate how completely participants had drawn the simple and complex paintings, respectively (average rating of four judges to the question “how complete do you think the drawing of the participant is, as compared to the original?” on a scale from 1 = not complete at all; 3 = complete; Cronbach's alpha for interrater agreement for this rating task was 0.87), and how well participants had drawn the simple and complex paintings, respectively (average rating of three judges to the question “how well do you think the participant copied the drawing, as compared to the original?” on a scale from 1 = not very well; 3 = very well; Cronbach's alpha for interrater agreement for this rating task was 0.71). As outlined in the procedure section, participants were instructed to copy the painting as best as they could. The rationale for asking the external judges to rate the completeness of the copy, even if participants had not been instructed to try to complete the copy, was that we assumed that the level of difficulty (and thus, working memory load) of the copying task would be reflected in how completely the participants had managed to copy the painting within the allotted time window of 5 min.
We also measured how cognitively demanding the task was, by counting the number of mind-wandering episodes that participants had during the task (more mind-wandering episodes = less task demand). We asked participants the question: “How many times did you think about the object during the task that you just did?” This procedure was based on the proposal published by Baird et al. (2012).
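For readers who want to reproduce an interrater index of the kind reported above, the following sketch computes Cronbach's alpha with the judges treated as "items" and the drawings as cases. It uses the standard formula, alpha = k / (k − 1) × (1 − sum of judge variances / variance of summed ratings). The function name and the ratings are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(ratings_by_judge):
    """Cronbach's alpha with judges as items.

    ratings_by_judge: one list per judge, each holding that judge's rating
    of every drawing in the same order.
    """
    k = len(ratings_by_judge)
    if k < 2:
        raise ValueError("At least two judges are required.")
    n_drawings = len(ratings_by_judge[0])
    if any(len(ratings) != n_drawings for ratings in ratings_by_judge):
        raise ValueError("Every judge must rate the same drawings.")

    # Sample variance of each judge's ratings across drawings.
    judge_variances = [variance(ratings) for ratings in ratings_by_judge]
    # Sample variance of the summed rating per drawing.
    totals = [sum(column) for column in zip(*ratings_by_judge)]
    return (k / (k - 1)) * (1 - sum(judge_variances) / variance(totals))


# Hypothetical completeness ratings (1 to 3) from three judges for four drawings.
example_ratings = [
    [1, 2, 3, 3],
    [1, 2, 3, 2],
    [2, 2, 3, 3],
]
print(round(cronbach_alpha(example_ratings), 2))
```

When all judges give identical ratings, the function returns 1; lower values indicate that the judges' ratings covary less.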
Results
We performed sanity checks on our data, including whether participants’ number of errors in the 0-back task affected their UUT scores (all ps > .151), and whether there was an effect of order of the 16 different conditions (all ps > .393). Results showed no such error or condition effect. Thus, no participants were discarded, and all data were included in the subsequent analysis.
Analysis 1: Incubation Tasks (Mixed Effects Models)
We conducted five mixed effects models with the fluency score, flexibility score, originality score, subjective score, and usefulness score as the dependent variables. We report the result of the mixed effects models for fluency score and flexibility score below. The rest are reported in Supplementary materials. As effect size, we report the betas of the fitted models. As outlined by Wiley and Rapp (2019), one advantage of linear mixed effects models is that effect sizes are easily obtained; the beta coefficients from the fitted models constitute standardized effect sizes.
Fluency (Mixed Effects Model)
A mixed effects model was conducted for fluency score as the dependent variable with incubation and sex as fixed factors, their interaction, AReA and DDFS as covariates, alongside random intercepts and slopes for participants and a random intercept for objects (four objects that were used in different incubation phases: brick, cup, sheet of paper, and paperclip). Based on the mixed effects model, the fluency score was not significantly predicted by the incubation phase, b = 0.156, t(61.91) = 1.1, p = .275, DDFS score, b = 0.016, t(59.99) = 0.652, p = .517, nor AReA score, b = 0.023, t(60.01) = 0.73, p = .468. However, sex was found to have a significant effect on the fluency score, b = 1.341, t(62.79) = 2.064, p = .043, with males scoring lower (M = 4.388, SE = 0.411), than females (M = 5.54, SE = 0.325). The interaction between incubation and sex was not significant; b = −0.069, t(62.1) = −0.381, p = .704. See Table 3.
Mixed Model Effects for Fluency Score (Fixed Effects).
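Note. N = 64. b = beta coefficient; SE = standard error; AReA = Aesthetic Responsiveness Assessment; DDFS = Daydreaming Frequency Scale.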
For the subject level, the estimated variance for the intercept was 2.48, and the estimated variance for the slope was 0.04. For the object level, the estimated variance for the intercept was 0.121. The estimated variances and standard deviations for the random effects and residual variance are presented in Table 4.
Mixed Model Effects for Fluency Score (Random Effects).
Note. N = 64.
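The fluency model described above can be written out as follows. This is a sketch under stated assumptions rather than the fitted specification: incubation is treated as a single coded predictor (as the single coefficient reported for it suggests), the participant-level random slope is assumed to be the slope for incubation, and normally distributed random effects and residuals are assumed. Here i indexes participants, j indexes trials, and k[ij] denotes the object presented to participant i on trial j.

```latex
\begin{aligned}
\text{Fluency}_{ij} ={}& \beta_0 + \beta_1\,\text{Incubation}_{ij} + \beta_2\,\text{Sex}_i
  + \beta_3\,(\text{Incubation}_{ij} \times \text{Sex}_i) \\
  & + \beta_4\,\text{AReA}_i + \beta_5\,\text{DDFS}_i \\
  & + u_{0i} + u_{1i}\,\text{Incubation}_{ij} + v_{k[ij]} + \varepsilon_{ij}, \\
(u_{0i}, u_{1i})^{\top} \sim{}& \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_u), \qquad
v_k \sim \mathcal{N}(0, \sigma_v^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{aligned}
```

In this notation, the subject-level intercept and slope variances reported above are the diagonal entries of Σ_u, and the object-level intercept variance is σ_v².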
Flexibility (Mixed Effects Model)
A mixed effects model was conducted for flexibility score as the dependent variable with Incubation and sex as fixed factors, their interaction, AReA and DDFS as covariates, alongside random intercepts and slopes for participants, and a random intercept for objects (four objects that were used in different incubation phases: brick, cup, sheet of paper and paperclip).
Based on the mixed effects model, the flexibility score was not significantly predicted by the incubation phase, b = 0.038, t(61.93) = 0.439, p = .662, DDFS score, b = 0.009, t(59.97) = 0.748, p = .457, AReA score, b = −0.01, t(59.99) = −0.679, p = .499, nor sex, b = 0.621, t(58.165) = 1.758, p = .084. The interaction between incubation and sex was also not significant; b = −0.067, t(62.11) = −0.608, p = .545 (see Table 5).
Mixed Model Effects for Flexibility Score (Fixed Effects).
Note. N = 64. b = beta coefficient; SE = standard error; AReA = Aesthetic Responsiveness Assessment; DDFS = Daydreaming Frequency Scale.
For the subject level, the estimated variance for the intercept was 0.687, and the estimated variance for the slope was 0.04. For the object level, the estimated variance for the intercept was 0.045. The estimated variances and standard deviations for the random effects and residual variance are presented in Table 6.
Mixed Model Effects for Flexibility Score (Random Effects).
Note. N = 64.
Analysis 2: Comparison Usefulness Score vs Subjective Creativity Score
To explore whether our sample of all-Iranian participants provided usages that were more useful than unique and subjectively creative, a repeated measures (RM) ANOVA was conducted.
A 2 × 4 RM ANOVA was performed with the within-subjects factors Task (4 levels; copying simple painting, copying complex painting, 0-back, and rest), and Creativity Type (2 levels; Subjective Creativity and Usefulness). There was a significant main effect of Creativity Type, F(1, 63) = 32.278, p < .001, partial η² = .339. However, there was no main effect of Task, F(3, 61) = 1.557, p = .201, partial η² = .024, nor an interaction between Task and Creativity Type, F(3, 61) = 0.419, p = .739, partial η² = .007. The main effect of Creativity Type confirms that, in our sample of Iranian participants, usefulness scores were higher (M = 14.69, SD = 5.94) than subjective creativity (uniqueness) scores (M = 13.36, SD = 5.85; see Figure 2).

Figure 2. The main effect of creativity type: usefulness × subjective creativity score.
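As a quick check on the effect size reported for Creativity Type, partial η² can be recovered from an F statistic and its degrees of freedom using the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only; the function name `partial_eta_squared` is ours and does not come from the authors' analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effect of Creativity Type as reported in the text: F(1, 63) = 32.278
eta_p2 = partial_eta_squared(32.278, 1, 63)
print(round(eta_p2, 3))  # 0.339, matching the reported partial eta squared
```

The identity assumes that the F value and the effect size come from the same test. Where a multivariate F is reported alongside a univariate effect size, the two will not match.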
Discussion
Here, we tested four different types of incubation tasks in terms of their effect on participants’ subsequent creative performance on the classical divergent thinking task, the UUT. Two of our incubation tasks were arts-related (copy a simple or a complex painting), and two were classical incubation tasks (0-back task and rest).
We found no effects of the incubation tasks or of our interindividual difference measure, aesthetic responsiveness, as measured with the AReA. There was a main effect of sex: women outperformed men on creative fluency, but on none of the other measures of creativity.
The null effects regarding the incubation tasks are in accordance with the findings of Baird et al. (2012), who also failed to find differences in creative performance (fluency score) between their different incubation phases (1-back, 0-back, and rest). Similarly, Frith et al. (2019) assessed the effect of creative incubation (rest vs exercise) on creative fluency in a creativity task (the Instances Creativity Task), but found no difference between incubation conditions. Yamaoka and Yukawa (2020) also found no differences in creative performance between their three incubation phases (rest, deliberate thinking, and sudoku) using the fluency, flexibility, and originality scores. We replicate these null effects, but with a stronger within-subjects design.
Aesthetic responsiveness, surprisingly, did not impact performance on the UUT after the creative conditions (copying a simple painting or copying a complex painting), contrary to the predictions that we had made based on previous work (Myszkowski et al., 2014; Welke et al., 2021). It is possible that our arts-related incubation task of “copying” a painting did not constitute the right level of aesthetic engagement and was, therefore, not reflected in a relationship with AReA scores. An incubation task in which individuals are left free to improvise what they want to draw (e.g., as in Hetland & Winner, 2004) may be more suitable (as in Okada & Ishibashi, 2016). In fact, Weigand and Jacobsen (2021) showed that a higher working memory load is associated with fewer aesthetic experiences, and our copying task may have done precisely that: increased working memory load instead of boosting creative fluency. Future work may investigate whether it is not the copying but the self-generation process that makes the difference for creative incubation, or may use art forms that involve more full-body and creative movement, such as music-making, as creative incubation.
The sex effects that we found (women outperforming men in fluency) were in line with some previous research (Abraham, 2016; Baer & Kaufman, 2008; Simonton, 1994). Many different explanations have been given for sex effects on creativity scores. The most prevalent explanations propose cultural or environmental factors, including discrimination during socialization (Simonton, 1994) and inequality of opportunities (Baer & Kaufman, 2008).
To investigate possible cultural effects on our creativity measures, following the conjecture by Ivancovsky et al. (2021), we compared two measures of creative performance on the UUT: subjective creativity and usefulness of the proposed usages. We found the same within-group effect for our Iranian sample that Ivancovsky et al. (2021) reported for their Japanese group: the proposed usages were more useful than uniquely creative. The usefulness of a proposed usage is said to better capture conceptions of creativity in Eastern cultures, where uniqueness is not encouraged as much as in the West (de Oliveira & Nisbett, 2017; Kitayama et al., 2009; Lechuga et al., 2011). Western conceptions of creativity are likely strongly influenced by eighteenth-century conjectures of art as disinterested, that is, free from any concerns regarding its usefulness or function (Kant, 1790/1987; Mortensen, 1994; Rind, 2002; Sherman & Morrissey, 2017; Stolnitz, 1961). These effects of culture on creative fluency merit further exploration in future work.
Conclusion
As creativity may be a multifaceted phenomenon (Ivancovsky et al., 2021; Treffinger, 1985), we computed five metrics to assess participants’ creativity on the UUT: fluency, flexibility, originality, subjective creativity, and usefulness. We provided a step-by-step guide for the community for computing each of these. Importantly, automated creativity scoring systems have also recently been proposed (Beaty & Johnson, 2021; Patterson et al., 2023), which may readily replace the laborious tasks of coding and scoring these classical creativity metrics of the UUT.
In sum, we used a statistically stronger within-subjects design and replicated previous results suggesting that incubation periods do not increase creativity on the UUT. Creative fluency was modulated by sex; women outperformed men. None of the other assessed interindividual differences modulated results.
Our participants, who were all from Iran, proposed unusual usages that were more useful than uniquely creative, which is in accordance with Eastern conceptions of creativity that likely remain relatively free from Western eighteenth-century conceptions of art as disinterested. This latter finding highlights the importance of broadening the definition of creativity to better grasp the realities of non-WEIRD cultures (Henrich et al., 2010). Our results, in general, support previous results and suggest interesting avenues for future testing.
Supplemental Material
Supplemental material, sj-docx-1-art-10.1177_02762374231217638 for Some Effects of Sex and Culture on Creativity, No Effect of Incubation by Nastaran Kazemian, Khatereh Borhani, Soroosh Golbabaei and Julia F. Christensen in Empirical Studies of the Arts
Footnotes
Acknowledgments
The authors thank Raha Golestani and Marco Münzberg for their kind help in coding the responses of the Unusual Uses Task, and Prof. Dr. Winfried Menninghaus for supporting this project from beginning to end.
Author Note
This article is based on the master thesis completed by Nastaran Kazemian (2021). Nastaran Kazemian was funded by the Cognitive Sciences & Technologies Council of the Islamic Republic of Iran, and Julia F. Christensen was funded by the Max Planck Society, Germany. Data that support the findings of this study are available on
.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Max-Planck-Gesellschaft, Cognitive Sciences & Technologies Council of the Islamic Republic of Iran, (grant number NA).
