In or Out of Sync: Federal Funding and Research in Early Childhood

Abstract

Understanding the relation between federal investment and research has implications for promoting science production in early childhood, a rapidly expanding area in education research and policy. Federally funded research has shaped fundamental issues in early childhood, yet few studies have systematically examined the relation between federal investment and publication output. Our study applies topic modeling and regression analyses to a text corpus of 15,608 publication and grant abstracts in early childhood education to distill the most prominent topics and to examine the relationship between grant funding and later publications within these topics. We find that grant topics focused on health and early intervention, while publications covered a wider interdisciplinary range. A topic’s prevalence in grants, as a proxy of federal investment, was positively associated with its prevalence in publications in the following year. The study illustrates the affordances of textual analyses and contributes insights about how federal investment motivates scholarly production.

Keywords

early childhood education, research and development, science policy, topic modeling

Introduction

Investment in research and development (R&D), which amounted to over $134 billion in 2020, is a substantial component of federal expenditures in the United States (Sargent, 2020). The extent to which federal grants sustain an infrastructure for research is key to establishing research rigor and evaluating replicable practices (Burkhardt & Schoenfeld, 2003; Jacob & Lefgren, 2007). Understanding the impact of federal expenditures on research is important to identify scientific fields where funding can productively incentivize research and evidence-based practices (Jacob & Lefgren, 2007). Science policy scholars have explored this relationship between grants, research output (i.e., publications), and subsequent funding attainment, mostly in different scientific fields (Hicks, 2012; Jacob & Lefgren, 2007; Payne & Siow, 2003).

In the fields of education and child development, federally funded research has shaped fundamental issues, including our understanding of development and learning across the life cycle, teacher preparation in K–12 systems, and child health (Diamond et al., 2013; Koppich & Knapp, 1998). Yet few studies have systematically examined the relation between federal grants in education research and publication output. Prior work that traces research findings to funding sources has used a case study approach or manual review of small subsets of publications (Diamond et al., 2013; Walsh & Sanchez, 2010). However, these approaches may not scale easily or account for the highly interdisciplinary nature of development and education research.

This study explores the relation between publications and federal grants in early childhood education (ECE). We focus on ECE for several reasons. First, it represents the multidisciplinary intersection of educational and developmental inquiry. Second, this research field has experienced substantial growth in policy and scientific interest (Gormley, 2011); yet federal funding only supports a small fraction of scholarship (Walsh & Sanchez, 2010). Thus, studying the potential mismatches between federal expenditures and publications can help identify areas for funding incentives. Finally, grant support for ECE research has direct implications for educational practices, such as regulation of and resource allocation for early care programs. Effective funding can help bridge policies, research, practice, and scholarly output.

To capture the broad range of topics in ECE, we apply natural language processing (NLP) approaches to clean and categorize, as a first pass, a text corpus of 15,608 publication and grant abstracts published between 1990 and 2020. We use these data in topic modeling and regression analyses to examine the following research questions:

Research Question 1: What are the prominent topics in grants and publications in early childhood education?

Research Question 1a: To what extent is there a difference in topic prevalence between grants and publications?

Research Question 1b: To what extent is there a difference in a topic’s keywords between grants and publications?

Research Question 2: To what extent can we link topic prevalence in publications to that in prior grants?

This study has three main contributions. First, to the best of our knowledge, our study is the first to examine at-scale patterns in grants and publications in education, and ECE in particular. While prior work has focused on a small set of topics in education and development (Diamond et al., 2013; Walsh & Sanchez, 2010), our analysis covers a wide range of disciplines, from program administration to child health.

Second, we illustrate the affordances of NLP approaches, namely structural topic modeling, to derive the underlying themes of a large document corpus in a data-driven manner, without prior researcher input on which topics the corpus covers (Roberts et al., 2014). In the process, we outline the steps to establish the semantic, predictive, and hypothesis validity of the results.

Third, our study provides methodological considerations for science policy research. Researchers have commonly used bibliometrics (e.g., citation, publication counts) to study the impact of funding on research output (Boyack & Börner, 2003; Jacob & Lefgren, 2007). We use results from the structural topic models, namely the prevalence of output (grants or publications) in a given topic area as an outcome variable. We find differences in both how often a topic occurred and which words were more frequently used in grants and publications. Tapping into document content by using a data-driven approach may capture the language nuances from varied discipline-specific perspectives in funded versus published research, rather than relying on predefined keywords.

Overall, our findings reflect the interdisciplinarity of the ECE research field, with implications for scholarly output and funding allocation. We find substantive trends in ECE funding, such as grants’ focus on health interventions, compared with the wide range of topics in publications. Regression results indicate that the prevalence of a topic in grants in a given year is positively associated with its prevalence in publications in the following year. This provides some evidence that federal investment in ECE research can support research production in immediate terms. Together, findings reveal opportunities for diversifying funding to support multidisciplinary ECE research.

Background

Overview of Research Trends and Federal Investments in Early Childhood

Our review paints the broad strokes of ECE research that other scholars have comprehensively reviewed (e.g., Chaudry & Datta, 2017; Haslip & Gullo, 2018). To foreground later discussions of development in publications and grants, we note major trends in demographics, child development, education, and teacher professionalization, and discuss how they relate to funding initiatives. The review illustrates the interdisciplinary nature of ECE and the far-reaching implications of federal investments for children, parents, teachers, and administrators.

Changing Demographics

The demographics of students entering early childhood programs have become increasingly diverse in terms of family structures, language proficiencies, socioeconomic status, and school preparedness (Chaudry & Datta, 2017; Jiang et al., 2015). Bassok et al. (2016) study two nationally representative samples of kindergarteners in 1998 and 2010 and find that while the gap in parental investment in many facets of early care (e.g., increases in access to home technology, home reading practices, child’s participation in enrichment activities) has become narrower, the gap in participation in formal preschool has widened between low- and high-income children. The types of curricula and programs that children engage in (e.g., full-day vs. half-day, academic vs. social focus) may also influence children’s readiness by the time they enter the primary grades (Haslip & Gullo, 2018).

Thus, from a federal funding perspective, there has been an increasing emphasis on the scaling of effective prekindergarten programs and the development of universal versus targeted programs for students from different backgrounds (Greenberg, 2018). Researchers have increasingly called for the development and adoption of tools that meet the needs of all children, including English language learners, children with special needs, and underrepresented children of color (National Research Council, 2008).

Enhanced Understanding of Child Development

A growing understanding of children’s cognition also informs federal investments. Over the past 40 years, convergent research from psychology, education, neuroscience, and economics has demonstrated the critical importance of high-quality early care and education experiences for positive, healthy child development, especially for children from low-income families and those with special needs (Barnett, 2011; Bowman et al., 2000; Hackman & Farah, 2009; NICHD Early Child Care Research Network, 2005; Sameroff, 2010; Shonkoff & Phillips, 2000). Early childhood experiences can have long-term impacts. For example, trauma and toxic stress during early life have been linked to developmental disorders and poor health in adulthood (Shonkoff et al., 2012). This research highlighted the need for policies to comprehensively support the family and home environments as contexts for early learning (Britto et al., 2017).

Over time, these scientific discoveries gradually changed the notion that caring for young children was simply adult work support and framed early childhood services as an economic investment (Kirp, 2007; Rose, 2010). In turn, this body of work created major policy and research implications: to promote early learning programs that cultivate the foundations for school readiness, and to continue examining the link between early learning experiences and development and wellbeing (Sripada, 2012; Webster-Stratton & Herman, 2009).

Curricular Shift

The 1990s and 2000s witnessed a shift in the rhetoric of early childhood from “caregiving” to “education” (Rose, 2010). Researchers have documented the positive impact of quality pre-K on individuals beyond school entry and on society at large (Heckman, 2011). As a result, there have been growing efforts to align preschool classrooms with learning standards in language literacy, math, science, and social emotional learning, among others (Haslip & Gullo, 2018). These efforts are particularly salient following calls for incorporating child care into the K–12 system (Rose, 2010). Assessment of standard implementation, student learning outcomes, and prescriptive curriculum development have come to the forefront in evaluating the effectiveness of early childhood programs (Black et al., 2017).

The shift to “education” instead of “care” also influences early learning activities. Traditionally, early childhood settings have emphasized exploratory, play-based experiences and social interactions (Bresler, 2013; Graue et al., 2015). However, alignment with accountability standards may promote scripted activities that focus on certain disciplines and skills (e.g., language, math) in place of exploratory learning (Lewin-Benham, 2011). Additionally, emerging curricula seek to integrate arts, science, and technology content, with a growing body of work that explores the potential of such curricula for children’s learning (Aldemir & Kermani, 2017; French, 2004; Kermani & Aldemir, 2015; Wang et al., 2010). These efforts drive researchers and early care practitioners to develop activities, create assessments, and update prior theories of child development to align with the shifts in curricular focus.

Professionalization

The focus on accountability has also influenced the policy and research agenda to improve teacher learning, with multiple states adopting quality standards in public preschools (Barnett, 2011; Whitebook et al., 2012). Changes in approaches to teacher professional development also came about in part from the multiple shifts in student demographics, curricular expectations, and the growing understanding of young children’s development (Artman-Meeker et al., 2015). Thus, in recent years, teacher preparation has focused not only on developmentally appropriate practices but also on the integration of technology and social emotional learning, among other factors, into early childhood classrooms (Rosen & Jaruszewicz, 2009).

In sum, a growing recognition of diverse children and learners as well as enhanced understanding of children’s cognitive development has led to shifts in ECE program structure, professionalization, and curricula in practice. In turn, funding initiatives present opportunities for researchers to study the impacts and implications of different policies and research-based practices across diverse early childhood settings. These patterns drive our hypotheses that there is more scholarly production, in terms of publications, across topics in ECE over time, and that prevalence in publications and grants for certain topics, such as early care, would be on the rise following federal initiatives.

Federal Investments and Research Production

Federal Investments in ECE

Government institutions can serve as a mechanism to foster competition for quality research and scientific production, one channel of which is through federal investment (Whitty, 2006). Federal investment in research is based on the assumption that knowledge production and knowledge transfer are primary resources for economic growth (Geuna & Muscio, 2009; Stewart & Capital, 1997). In addition to producing basic research, researchers can collaborate with practitioners to turn knowledge into practical applications (Geuna & Muscio, 2009).

In the field of early childhood, the role of federal investments in advancing research in teacher professional development, child health, school readiness, and consequently public policies around child care and social programs has been recognized in reviews of notable literature across fields (Diamond et al., 2013; Gormley, 2011; Kilburn & Karoly, 2008; Koppich & Knapp, 1998; Lee et al., 2017). For example, a synthesis of research funded by the Institute of Education Sciences (IES) finds significant contributions of the funded work to our knowledge of effective instructional practices for promoting children’s development of language and literacy (Diamond et al., 2013). Building on past research helps the authors identify gaps in existing knowledge and propose follow-up funding opportunities. This example illustrates how federal investments can (1) motivate research and (2) influence future research through building on and broadening areas of inquiries in subsequent funding opportunities.

Despite its promise for advancing research, recent reviews of articles in major journals in ECE reveal that federal funding is the exception rather than the norm in the field (Walsh & Sanchez, 2010). A potential reason is that federal initiatives often focus on specific areas in ECE, rather than covering a broad range of topics. These areas include evidence-based research that employs experimental designs (e.g., randomized controlled trials), longitudinal studies, and meta-analyses to determine “what works” (Barnett, 2011; Engle et al., 2011; Gormley, 2011; Heckman, 2006, 2011). Evidence-based and outcome-focused research, however, may not account for the varied experiences of childcare practitioners, children, and parents (Vandenbroeck et al., 2012). Thus, researchers have called for policy and practice to make use of exploratory and responsive research that considers participants’ perspectives in early childhood settings (Cutspec, 2004; Vandenbroeck et al., 2012).

Science Policy Approaches

Tracing the alignment between funding initiatives and research development can offer insights into areas where funding could further motivate scholarly output. Beyond direct counts of funding acknowledgments in publications, there has been limited work that traces federal funding to research output in ECE. We thus turned to the literature on research policy to explore other approaches to examining the relation between governmental investment and R&D. For instance, researchers have focused on linking funding levels to measures of scientific productivity—the submission of findings in publications and grant proposals for peer review, application, and extension in research and practice.

Such measures include scholarly activity metrics (e.g., counts of publications or patents), impact (e.g., citation), and linkages between publications and grants (Boyack & Börner, 2003; Jacob & Lefgren, 2007; McAllister & Narin, 1983). For example, Jacob and Lefgren (2007) use regression analyses to estimate a small positive impact of receiving NIH funding on publication rate. Ebadi and Schiffauerova (2016) apply topic modeling approaches (i.e., latent Dirichlet allocation [LDA]) to extract title keywords and curate research domains, and then match researchers in publication records to a funding database. Their results suggest a positive association between funding and both publication quantity and citation impact.

In sum, the literature in science policy informs our hypotheses that there may exist a positive association between funding amount and subsequent publication rate or quantity. Prior research highlights the need to consider time lags (accounting for the peer review process that affects publication metrics) when studying research production (Ebadi & Schiffauerova, 2016). A limitation to this body of work, however, is that citation metrics or indexed titles may not capture the broader content in which the publications or grants are situated.

In our study, we employ topic modeling to conduct a first pass of analyzing and categorizing the data corpus in ECE before using the output from the models to study the relation between grants and publications. As an unsupervised machine learning approach that infers topics from the data, topic modeling can offer new insights into data patterns. This helps researchers to gain a deeper understanding of the data and confirm or revise existing theoretical frameworks (Blei et al., 2003). In so doing, our study applies conceptual approaches from science policy to exploring the underlying, interdisciplinary structure of a new field, early childhood, using data science methodologies.

Method

Data Sources

We first prepared the data corpus for grants and publications in one data set, by compiling a target journal list using Google Scholar’s h index. The h index of a publication is the largest number h, such that h articles in that publication were cited h times each at minimum. We obtained a “high impact” sample (highest h index) to represent journals in ECE, education, educational psychology, educational administration, and educational technology. We excluded journals that targeted different subjects (e.g., higher education). Then, we retrieved all publications related to ECE from the journal list, using the database Scopus. For journals that included publications across the age range (e.g., infant–adult), we searched in the title or abstract for the keywords: early child*, preschool, preK, pre-K, pre-kindergarten, or kindergarten. We limited publication year to 1990–2020, because a preliminary search for articles that contained “early childhood education” in their titles or abstracts returned fewer than 25 articles on Scopus per year, for publications prior to 1990. From 12,824 publications, after removing those with no available abstract, we retained 12,446 publication abstracts. The online Supplemental Table S1 lists the 72 journals and distribution of publications.
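The h index definition above can be made concrete with a short sketch. The function name and the citation counts below are hypothetical and serve only to illustrate the definition; they are not drawn from the journal list used in the study.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` items all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for five articles in one journal.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))  # 4: four articles have at least 4 citations each
```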

We searched for grants that contained any of the following keywords in their title or abstract: early child*, preschool, preK, pre-K, pre-kindergarten, or kindergarten. The grant databases included the top funding agencies for ECE in the United States, namely the IES, the National Science Foundation, and the National Institute of Child Health and Human Development (NICHD). We limited our search to grants awarded between 1990 and 2020 to match the publication corpus. In total, we retrieved 3,199 grant abstracts (2,022 from the NICHD, 756 from the IES, and 421 from the National Science Foundation). The data set included the grant’s title, abstract, award year, funding agency, and funding amount.
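As an illustration of the keyword filter described above, the following sketch applies the listed search terms to a title and abstract. The regular expression is an assumption about how the terms might be matched locally; the actual matching rules of Scopus and the grant databases (e.g., stemming or wildcard handling) may differ, and the function name is hypothetical.

```python
import re

# Search terms from the text: early child*, preschool, preK, pre-K,
# pre-kindergarten, kindergarten. Only "early child*" carries a wildcard.
KEYWORD_PATTERN = re.compile(
    r"\b(early child\w*|preschool|pre-?k|pre-?kindergarten|kindergarten)\b",
    re.IGNORECASE,
)


def mentions_ece(title, abstract):
    """Return True if the title or abstract contains any of the search terms."""
    text = f"{title} {abstract}"
    return KEYWORD_PATTERN.search(text) is not None


print(mentions_ece("Early childhood literacy", ""))             # True
print(mentions_ece("Higher education finance", "college aid"))  # False
```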

Because the number of grants prior to 1997 was small (number in each year <7; Figure 1), to avoid skewing later analyses predicting topic prevalence by publication year, we limited our corpus for grants to 1997–2020. The final data set consisted of 15,608 abstracts (12,446 publications and 3,162 grants). Figure 1 presents the descriptive statistics for abstract counts (Panel A) and average funding by year (Panel B). Omitting the grants prior to 1997 did not substantially change the overall representation of each category in the data set.

Figure 1.

Descriptive statistics by year for abstract count and average funding.

Structural Topic Model

We applied structural topic models (STM) to cluster the data into topics using an unsupervised approach (for technical details, see Roberts et al., 2016). The approach is similar to other mixed-membership models such as LDA (Blei et al., 2003): any document is defined as a mixture of topics, with each word in the document representing one topic. Two main outputs of the mixed-membership models for each document include (1) topic prevalence, or the proportion of different topics in a document ($θ$; topic prevalences range from 0 to 1, and the prevalence values of all topics in a document add up to 1), and (2) the probability of drawing a word for a particular topic in that document ($β$). For example, the current study can be represented as a mixture of several topics, such as “text analysis,” “early childhood,” and “research.” Each of these topics can be represented as a distribution over several words, with different high-frequency, representative words for each topic (e.g., “text analysis” may be associated with words like “topic model,” “text,” “validation”).
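To make the mixed-membership idea concrete, the sketch below treats one document as a mixture of three topics and computes the probability of drawing a given word. All numbers are hypothetical; `theta` stands for the document's topic prevalences and `beta` for the per-topic word probabilities, following the usual notation of these models.

```python
# Hypothetical topic prevalences for one document (they sum to 1).
theta = {"text analysis": 0.5, "early childhood": 0.3, "research": 0.2}

# Hypothetical word probabilities within each topic (each row sums to 1).
beta = {
    "text analysis":   {"topic": 0.50, "text": 0.40, "child": 0.05, "study": 0.05},
    "early childhood": {"topic": 0.05, "text": 0.05, "child": 0.70, "study": 0.20},
    "research":        {"topic": 0.10, "text": 0.10, "child": 0.20, "study": 0.60},
}


def word_probability(word, theta, beta):
    """Probability of a word in the document: sum over topics of theta_k * beta_k[word]."""
    return sum(theta[k] * beta[k].get(word, 0.0) for k in theta)


vocabulary = ["topic", "text", "child", "study"]
for word in vocabulary:
    print(word, round(word_probability(word, theta, beta), 3))
```

Because both `theta` and every row of `beta` sum to 1, the resulting word probabilities also sum to 1 over the vocabulary.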

An affordance of STM, compared with other topic modeling approaches such as LDA, is that STM incorporates covariates (e.g., information about the type of publication, years, authors, etc.) into the topic models. Including the covariates can account for structural changes that predict (1) topic prevalence, that is, how the covariates relate to the prevalence of the topics, and (2) topical content, that is, which words are frequently used in a topic in relation to a covariate. STM achieves this by allowing the topic prevalences ($θ$) to correlate, using regression models to predict topic prevalence from covariates rather than applying a global mean, and varying the frequency of words within a topic by covariates (e.g., grants may more frequently use certain words, such as “administration”). Researchers have found in simulation studies that STM can better capture covariate relationships, compared with a two-step LDA process that first estimates topics, and then relates topics to covariates (Roberts et al., 2014). In the following sections, we illustrate how including covariates in the STM can help to establish different facets of validity for the analyses, and later to answer our research questions.
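The role of covariates in topic prevalence can be sketched as follows. This is a simplified illustration, not the STM estimation procedure: it assumes that expected prevalence is a softmax of a linear function of the covariates, with the last topic as a reference category, and it omits the topic covariance and all model fitting. The coefficient values, topic interpretations, and function names are hypothetical.

```python
import math


def softmax(values):
    """Map real-valued scores to proportions that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def expected_prevalence(covariates, gamma):
    """Expected topic proportions for one document given its covariates.

    gamma holds one coefficient vector per topic except the last topic,
    whose score is fixed at 0 (reference category).
    """
    eta = [sum(c * g for c, g in zip(covariates, coefs)) for coefs in gamma]
    eta.append(0.0)
    return softmax(eta)


# Hypothetical coefficients for [intercept, is_grant, years since 1997].
gamma = [
    [0.2, 1.0, 0.00],   # topic 0: assumed more prevalent in grants
    [0.5, -0.8, 0.02],  # topic 1: assumed more prevalent in publications
]

grant_2010 = expected_prevalence([1.0, 1.0, 13.0], gamma)
publication_2010 = expected_prevalence([1.0, 0.0, 13.0], gamma)
print([round(p, 3) for p in grant_2010])
print([round(p, 3) for p in publication_2010])
```

Under these assumed coefficients, the same year yields different expected topic mixtures for a grant and a publication, which is the kind of difference the prevalence covariates are meant to capture.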

Research Question 1: Prominent Topics in Grants and Publications Over Time

The workflow to examine topic patterns in grants and publications consisted of three steps. First, we selected a topic model by running several models with different numbers of topics k (k ranging from 5 to 40) and compared the results with another topic modeling methodology (LDA). We evaluated the models based on their semantic coherence and exclusivity. Second, we used researcher judgment to examine different facets of validity for the results. Finally, we used STM to visualize and estimate topic prevalence and content, topic correlations, and differences in topic prevalence and content in relation to year and abstract type.

Model Selection

The text from the abstracts was preprocessed prior to running the STM by removing common stop words (e.g., “the,” “and”). Next, to determine the number of topics, we ran a series of topic models with the number of topics k from 5 to 40, in increments of 5. We also ran a model with k = 100. We ran each model for up to 75 iterations. All models allowed the topic prevalence to vary by year and type of abstract (grants vs. publications). The selection of these covariates was informed by prior research to examine the following hypotheses:
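A minimal sketch of the preprocessing step and the candidate grid of topic numbers follows. The stop-word list is a tiny illustrative subset, not the list used in the study, and the function name is hypothetical.

```python
# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "and", "of", "a", "in", "to"}


def remove_stop_words(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [token for token in text.lower().split() if token not in STOP_WORDS]


# Candidate numbers of topics: 5 to 40 in increments of 5, plus k = 100.
candidate_k = list(range(5, 45, 5)) + [100]

print(remove_stop_words("The child and the teacher"))  # ['child', 'teacher']
print(candidate_k)  # [5, 10, 15, 20, 25, 30, 35, 40, 100]
```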

Hypothesis 1: Prevalence in grants and publications for several topics would increase over time with growing understanding of child development.

Hypothesis 2: Grants and publications can differ in topic prevalence from one another; for example, there may be higher prevalence for grants in a narrower set of topics (Gormley, 2011).

We compared the models along several model fit diagnostics to determine k, namely held-out likelihood, residual check, exclusivity, and semantic coherence.

Held-out likelihood captures the predictive validity of the model. For a subset of documents, the algorithm holds out half of the words and evaluates the probability of the held-out words under the fitted model (Asuncion et al., 2012). Higher held-out likelihood indicates higher predictive power. We also examined the residual check, which tests the dispersion of variance within the data. Lower residuals suggest that fewer topics would be needed to account for the data variance (Roberts et al., 2014). Finally, we turned to semantic coherence to evaluate the consistency of a given topic, such that high-frequency words for that topic are likely to co-occur within documents (Mimno et al., 2011; Roberts et al., 2014).
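Semantic coherence can be illustrated with a small sketch based on the co-occurrence formulation in Mimno et al. (2011): for each pair of top words, take the log of the number of documents containing both words (plus 1), divided by the number of documents containing the higher-ranked word. The toy documents and the function name are hypothetical, and the sketch is not the implementation used in the study.

```python
import math


def semantic_coherence(top_words, documents):
    """Sum of log((D(w_m, w_l) + 1) / D(w_l)) over ordered pairs of top words."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(all(w in doc for w in words) for doc in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denominator = doc_freq(top_words[l])
            if denominator == 0:
                continue  # skip words that never occur in the corpus
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / denominator)
    return score


# Hypothetical tokenized abstracts.
docs = [
    ["child", "care", "quality"],
    ["child", "care", "subsidy"],
    ["theory", "discourse"],
    ["child", "health"],
]
coherent = semantic_coherence(["child", "care"], docs)
mixed = semantic_coherence(["child", "discourse"], docs)
print(coherent, mixed)
```

In this toy corpus, "child" and "care" co-occur in two documents while "child" and "discourse" never co-occur, so the first word pair receives the higher (less negative) coherence score.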

In our data set, generally, a higher number of topics was associated with higher predictive power (i.e., higher held-out likelihood). However, semantic coherence appeared to drop noticeably when the topic number exceeded 25 (online Supplemental Appendix A). These different evaluations revealed that the optimal number of topics was in the range from 15 to 25.

On identifying the range of topic numbers, we compared the exclusivity and semantic coherence of models with k equal to 15, 20, and 25. Exclusivity describes how unique a topic’s high-frequency words are, such that the words tend to not appear within the top words of other topics (Airoldi & Bischof, 2012; Roberts et al., 2014). To illustrate the tradeoff between semantic coherence and exclusivity, we included a model with k = 100. Our criterion for selecting the number of topics k was that topics in the selected model were both exclusive and coherent.
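A simplified view of exclusivity is the share of a word's probability mass that belongs to one topic rather than the others. The sketch below uses this share as a stand-in; it is an assumption for illustration and not necessarily the exclusivity statistic reported in the supplemental appendix. The topic names and probabilities are hypothetical.

```python
def exclusivity(word, topic, beta):
    """Share of the word's total probability across topics that falls in `topic`."""
    total = sum(row.get(word, 0.0) for row in beta.values())
    return beta[topic].get(word, 0.0) / total if total > 0 else 0.0


# Hypothetical word probabilities for two topics (each row sums to 1).
beta = {
    "general research":  {"education": 0.5, "teacher": 0.4, "inclusion": 0.1},
    "special education": {"education": 0.5, "teacher": 0.1, "inclusion": 0.4},
}

print(exclusivity("education", "general research", beta))   # shared equally by both topics
print(exclusivity("inclusion", "special education", beta))  # mostly in one topic
```

A word such as "education" that is equally probable in both topics has low exclusivity for either one, which mirrors the broad research topic described for the 15-topic model.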

The online Supplemental Appendix B reports the exclusivity and semantic coherence from different models. We noticed broader categories when the number of topics was small. Take the 15-topic model as an example. One topic in this model captured aspects of research (keywords: education, teacher, early, learn, research), but these words likely appeared in a large number of topics and thus may have low exclusivity. Although the 100-topic model had high exclusivity (online Supplemental Appendix B), several topics in this model had low semantic coherence. The result suggested k = 25 may be the optimal number of topics to achieve a balance between semantic coherence and exclusivity, with only two outliers with low exclusivity.

To validate the STM, we estimated a 25-topic model using another method (LDA), with the alpha parameter set at the default .1 and 1,000 iterations (alpha is the prior on document-topic proportions). The online Supplemental Appendix D reports the top 15 high-frequency words associated with the LDA topics. Overall, several topics in the two models appeared to overlap in high-probability words, suggesting that the 25-topic STM, and later interpretation, reflect robust features of the text corpus.
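One way to quantify overlap in top words between two models is the Jaccard index. The text does not specify how overlap was assessed, so the measure, the function names, and the word lists below are assumptions for illustration only.

```python
def jaccard(words_a, words_b):
    """Size of the intersection divided by the size of the union of two word lists."""
    a, b = set(words_a), set(words_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(stm_topics, lda_topics):
    """For each STM topic, find the LDA topic whose top words overlap the most."""
    matches = {}
    for name, words in stm_topics.items():
        best = max(lda_topics, key=lambda other: jaccard(words, lda_topics[other]))
        matches[name] = (best, jaccard(words, lda_topics[best]))
    return matches


# Hypothetical top words from each model.
stm_topics = {
    "stm_1": ["child", "care", "quality", "subsidy"],
    "stm_2": ["theory", "discourse", "identity", "notion"],
}
lda_topics = {
    "lda_a": ["theory", "discourse", "critique", "notion"],
    "lda_b": ["child", "care", "quality", "center"],
}

matches = best_matches(stm_topics, lda_topics)
print(matches)
```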

Validity Checks

We followed prior work (Quinn et al., 2010) to establish different criteria of validity, that is, to determine whether the results are credible. Semantic validity refers to the extent to which each document or topic had a coherent underlying meaning and could be related to each other meaningfully. Predictive validity is the extent to which the predicted topics correspond to external events. Finally, hypothesis validity represents the extent to which the measures can be applied to testing hypotheses.

Semantic validity

To examine semantic validity, we inspected intratopic and intertopic semantic validity (Quinn et al., 2010). For intratopic semantic validity (i.e., whether topics were semantically coherent and thus valid), we first examined the representative words and exemplar abstracts. The STM returned four different word lists: highest probability, FREX, lift, and score. Highest probability keywords are those with the highest frequency in a topic. These words are nonexclusive, which means that they may be associated with any number of topics. FREX, which is the weighted harmonic mean of the word’s rank in terms of exclusivity and frequency, suggests words that are both exclusive and frequent (Roberts et al., 2014). Lift (Taddy, 2012) is the word’s probability within a topic, divided by its probability within the corpus. Lift captures words that are generally distinct from other topics. Score, which divides the log frequency of the word in the topic by its log frequency in other topics (Chang, 2011), has similar interpretations to lift. Table 1 shows example keywords from the three most prevalent topics in our corpus, and online Supplemental Appendix C lists the remaining topics.
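The lift and FREX definitions above can be sketched in a few lines. This is an illustrative simplification under stated assumptions: ranks are taken as empirical cumulative proportions within a topic, the FREX weight is set to 0.5, every topic is assumed to cover the same vocabulary, and the probabilities are hypothetical. Library implementations may differ in detail.

```python
def lift(topic, beta, corpus_prob):
    """Word probability within the topic divided by its probability in the corpus."""
    return {w: p / corpus_prob[w] for w, p in beta[topic].items()}


def ecdf(values):
    """Map each word to the fraction of words whose value is <= its own value."""
    n = len(values)
    return {w: sum(other <= v for other in values.values()) / n
            for w, v in values.items()}


def frex(topic, beta, weight=0.5):
    """Weighted harmonic mean of a word's exclusivity rank and frequency rank."""
    freq = beta[topic]
    # Exclusivity: share of the word's probability mass that falls in this topic.
    excl = {w: p / sum(row[w] for row in beta.values()) for w, p in freq.items()}
    freq_rank, excl_rank = ecdf(freq), ecdf(excl)
    return {w: 1.0 / (weight / excl_rank[w] + (1 - weight) / freq_rank[w])
            for w in freq}


# Hypothetical word probabilities; both topics share the same vocabulary.
beta = {
    "policy": {"educ": 0.40, "ecec": 0.30, "paper": 0.20, "discourse": 0.10},
    "theory": {"educ": 0.50, "ecec": 0.05, "paper": 0.15, "discourse": 0.30},
}
# Hypothetical corpus-level word probabilities (equal topic weights assumed).
corpus_prob = {"educ": 0.45, "ecec": 0.175, "paper": 0.175, "discourse": 0.20}

lift_scores = lift("policy", beta, corpus_prob)
frex_scores = frex("policy", beta)
print(max(beta["policy"], key=beta["policy"].get))  # most probable word
print(max(frex_scores, key=frex_scores.get))        # top FREX word
```

In this toy example the most probable word ("educ") is shared with the other topic, whereas the top FREX and lift word ("ecec") is both fairly frequent and largely specific to the topic, which parallels the contrast between the word lists in Table 1.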

Table 1

Topic, Topic Prevalence, and Top Words from the 3 Most Prevalent Topics

Topic	Mean topic prevalence	Top words
Education policy/Administration	.073	Highest Prob: educ, early, paper, year, childhood, develop, profession, primary, learn, policy, study, curriculum, research, approach, practice; FREX: organize, Wale, practitioners, Irish, centre, professionalism, England, program, tactic, pupil, ASP, Australia, leadership, ECEC, Zealand; Lift: -three, ability, best, glass, intent, nature, parent, primary, schoolification, anti-discriminatory, Copenhagen, cultural-discourse, drift, eel, feminist; Score: ECEC, educ, paper, centre, profession, asp, program, leadership, tactic, policy, England, pedagogy, curriculum, organize, pedagogy, practitioner
Theory/Concept	.070	Highest Prob: article, author, practice, way, research, theory, discuss, draw, discourse, understand, perspective, within, construct, critic, analysis; FREX: discourse, notion, entangle, author, metaphor, ident, postmodern, illustrate, semiotic, sociocultural, feminist, frame, picture-book, assemblage, intra-act; Lift: human, possible, real, ableist, anthropocentric, archetype, assemblage, Bakhtinian, Barad, border-crossing, Buber, carniv, cartography, censor, chronotype; Score: article, author, discourse, discuss, argue, identity, think, pedagogy, draw, notion, theory, concept, semiotic, way, perspective
Special education policy	.070	Highest Prob: early, educ, childhood, program, special, need, service, provide, develop, research, state, public, right, inclusion, implement; FREX: inclusion, service, special, right, partnership, ECS, recommend, prepare, reserve, meet, administration, barrier, council, stakeholder, need; Lift: Charlotte-Mecklenburg, continents, data-lit, expel, family–profession, four-year, IHE, LGBT, expand, meta-synthesis, research-program, scotia, warehouse, all-insight, analysis; Score: educ, service, program, inclusion, right, childhood, early, article, policy, partnership, reserve, special, implement, profession, public

Note. Highest Prob = highest probability words; FREX = words that are frequent and unique to the topics; Lift and Score indicate unique words. Topic prevalence = proportion of the topic within a document.

In addition, we manually examined the top five representative abstracts from each topic. The authors (including an expert in ECE) examined the keywords and abstracts to create potential labels for the topics, through two iterations of discussion. The online Supplemental Appendix E shows exemplar abstracts for two related topics: early intervention and child care quality. The labels for these topics stemmed from our observation that one set of abstracts covered environmental factors (e.g., nutrition, pregnancy, etc.), while the other appeared to focus on care arrangements and impact of certain policy initiatives (e.g., subsidies) on the quality of care.

Next, we examined intertopic semantic validity (i.e., the relationship between topics) through topic correlations. Topics that tended to appear together in a document would be more highly correlated and would cluster closer together in the network graph. We aimed to understand whether the topics clustered together in meaningful ways.
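The idea of linking topics that co-occur within documents can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes plain Pearson correlations over document-topic proportions, the document-topic values are hypothetical, and `pearson` and `topic_edges` are illustrative helper names. The .08 cutoff mirrors the threshold reported for the network graph.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def topic_edges(doc_topic, topic_names, cutoff=0.08):
    """Return (topic_a, topic_b, r) for topic pairs correlated above the cutoff."""
    columns = list(zip(*doc_topic))  # one tuple of proportions per topic
    edges = []
    for i, j in combinations(range(len(topic_names)), 2):
        r = pearson(columns[i], columns[j])
        if r > cutoff:
            edges.append((topic_names[i], topic_names[j], round(r, 3)))
    return edges


# Hypothetical proportions: rows are documents, columns are topics.
topic_names = ["School Readiness", "Assessment", "Child Health"]
doc_topic = [
    [0.50, 0.40, 0.10],
    [0.45, 0.45, 0.10],
    [0.10, 0.10, 0.80],
    [0.05, 0.15, 0.80],
    [0.40, 0.30, 0.30],
]

edges = topic_edges(doc_topic, topic_names)
print(edges)
```

In this toy corpus only the School Readiness and Assessment pair clears the cutoff, so those two topics would be drawn as connected nodes while Child Health would stand apart.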

Predictive validity

Predictive validity describes the correlation between a predicted topic and an external event not included in the modeling process (Quinn et al., 2010). We examined topic prevalence over time, in relation to major grant initiatives in early childhood, such as the federal Race to the Top-Early Learning Challenge that occurred between 2011 and 2016. The program aimed to increase access to early learning programs for disadvantaged children, design and implement systems of services, and provide rigorous measurements of outcomes and progress. We would expect a change in topic prevalence in related topics, for example, classroom interventions, school readiness, education policy, and child care quality, during and following this initiative. Alternatively, we could trace topic prevalence in topics that are presumably unrelated to the federal initiative. We could gather evidence for predictive validity, if these unrelated topics did not indicate a distinct trend in prevalence in the period of the initiative.

Hypothesis validity

Finally, we examined the hypothesis validity, defined as the usefulness of a measure for evaluating theoretical and empirical hypotheses (Quinn et al., 2010). Prior work in R&D, particularly in early childhood research, has suggested that federal investment in grants tended to concentrate on outcome-focused research (e.g., intervention), in place of exploratory research (Vandenbroeck et al., 2012). We hypothesized that grants in our ECE corpus would be more prevalent in outcome-focused research, including topics with more emphasis on evaluation, assessment, and intervention. The topic model results would show validity if they could be used toward testing this hypothesis, which we explored in our research questions.

Research Question 1a: Difference in Topic Prevalence Between Grants and Publications

We turned to our research question about potential differences in topic prevalence and content between publications and grants. We first examined how the topic prevalence may differ across grants versus publications. Here, the topic prevalence is the outcome variable in a linear regression (Roberts et al., 2014). Each document is an observation, and the predictor variable is a variance-covariance matrix of user-specified predictors (i.e., abstract type, year).

Research Question 1b: Difference in Topic Content Between Grants and Publications

We then explored how the frequency of certain words from the same topic may differ across grants versus publications. We created another STM with k = 25, which was similar to our original model in the model selection step (i.e., using abstract type and year as predictors for topic prevalence), but also allowed the content words of the topics to vary by abstract type. The model used a multinomial logistic regression, where the outcome variable was a parameter of the rate of use for a word w (i.e., occurrence of a word w, divided by its distribution in the whole corpus). The predictor variables included the topic that the word w belongs to, abstract type, and the interactions between the topic and the abstract type. Intuitively, this model allows us to explore whether the rate of use of an individual word within a topic differs by abstract type. We hypothesized that differences between grant and publication topics (e.g., on evidence-based practices) would be reflected in the rate of word use between grants and publications.

Research Question 2: Predicting Topic Prevalence in Publications

We then used outcomes from the topic model to examine the extent to which grant topic prevalence, as a proxy of federal investment, was associated with topic prevalence in subsequent publications. For each of the two subsets (grants, publications), we calculated the mean prevalence of a specific topic for each year (i.e., the average value of topic prevalence in a year). The data formed a panel of topics from 1997 to 2020 for 25 topics (575 topic-year observations; as stated in the Data Sources section, observations prior to 1997 were excluded due to the small number in grants). We then estimated the following regression model to examine topic prevalence in publications as a function of lagged topic prevalence in grants (our proxy for federal R&D investment):

$$ \text{Topic prevalence (pub)}_{it} = β \, \text{Topic prevalence (grant, lag } x\text{)}_{it} + \text{Topic effect}_{it} + ε_{it} $$

where i denotes the individual topic and t denotes the annual time period. Following prior work (e.g., Brodnax & James, 2018), we used topic fixed effects to control for observable and unobservable differences between the topics, for example, for more popular fields, or differences in research methods across the highly interdisciplinary corpus. This restricts the estimated relationship between grant and publication topic prevalence to variation within a topic over time, such that $β$ represents the average of these within-topic correlations. The error term $ε_{i t}$ represents idiosyncratic variance across topics. The lag, x, was 1, 3, or 5 years, to account for the period between grant awards and output (i.e., publication). Because the grant corpus started from 1997, the 1-year lag model paired grants in 1997 with publications in 1998, the 3-year lag model paired grants in 1997 with publications in 2000, and the 5-year lag model paired grants in 1997 with publications in 2002.
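A minimal sketch of the lag alignment and the within-topic (fixed effects) slope is given below. It is an illustration under stated assumptions rather than the estimation code used in the study: the yearly prevalence values are hypothetical, `lagged_pairs` and `within_topic_slope` are illustrative names, the slope is unstandardized, and no year term is included.

```python
def lagged_pairs(grant, pub, lag):
    """Pair grant prevalence in year t - lag with publication prevalence in year t."""
    pairs = {}
    for topic, pub_by_year in pub.items():
        grant_by_year = grant.get(topic, {})
        pairs[topic] = [
            (grant_by_year[year - lag], pub_value)
            for year, pub_value in sorted(pub_by_year.items())
            if (year - lag) in grant_by_year
        ]
    return pairs


def within_topic_slope(pairs):
    """Pooled OLS slope after demeaning x and y within each topic."""
    sxy = sxx = 0.0
    for rows in pairs.values():
        if not rows:
            continue
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in rows)
        sxx += sum((x - mean_x) ** 2 for x, _ in rows)
    if sxx == 0:
        raise ValueError("no within-topic variation in the lagged predictor")
    return sxy / sxx


# Hypothetical mean yearly prevalence for two topics (not values from the study).
grant = {
    "A": {1997: 0.10, 1998: 0.12, 1999: 0.14},
    "B": {1997: 0.02, 1998: 0.06, 1999: 0.04},
}
pub = {
    "A": {1998: 0.35, 1999: 0.36, 2000: 0.37},
    "B": {1998: 0.06, 1999: 0.08, 2000: 0.07},
}

pairs = lagged_pairs(grant, pub, lag=1)
beta = within_topic_slope(pairs)
print(pairs["A"][0], round(beta, 3))
```

Demeaning within each topic removes the large difference in overall level between topics A and B, so the slope reflects only how a topic's publication prevalence moves with its own lagged grant prevalence.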

We examined the appropriateness of the fixed effect over random effect alternatives for panel data. The assumption of the random effects model is that there exists no significant correlation between the unique errors and the model’s predictors (i.e., no unobserved variables). We ran the Hausman test to detect whether the coefficients in the fixed versus random effects model were systematically different. If the null hypothesis is rejected (p < .05), we can conclude that the predictors are correlated with the error terms and the fixed effect model is preferred. The Hausman test provided evidence to reject the null hypothesis for models at 1-, 3-, and 5-year lags, respectively, and suggested that the use of fixed effects for the topics was appropriate ( $χ^{2}$ = 3.83, p = .05; $χ^{2}$ = 6.51, p = .01; $χ^{2}$ = 9.88, p = .002).
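As a quick check on how the reported test statistics map onto p values, the sketch below converts each chi-square value to an upper-tail probability. The degrees of freedom are not stated in the text; one degree of freedom is assumed here because a single slope coefficient is compared across the fixed and random effects models. `chi2_sf_df1` is an illustrative helper name.

```python
from math import erfc, sqrt


def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > stat) equals P(|Z| > sqrt(stat)) for a standard normal Z.
    """
    return erfc(sqrt(stat / 2.0))


# Lag in years -> chi-square statistic as reported in the text.
reported = {1: 3.83, 3: 6.51, 5: 9.88}
p_values = {lag: chi2_sf_df1(stat) for lag, stat in reported.items()}

for lag, p in p_values.items():
    print(f"lag {lag}: chi2 = {reported[lag]:.2f}, p = {p:.3f}")
```

Under this assumption the three statistics correspond to p values of roughly .05, .01, and .002, in line with the reported values; the 1-year lag sits essentially at the .05 threshold.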

We note that for the regression analyses, we do not consider the results to represent a causal relationship. Rather, they reflect a descriptive analysis of the relation between grants and publications, two important components of scientific production.

Findings

In this study, we explore how federal investment focus, as indicated by topic prevalence in grants, may be related to scholarly output in ECE. We report the validity of the STMs before using these results to answer our research questions.

Validity

Semantic Validity

Our results suggest evidence for intratopic and intertopic semantic validity. The STM revealed interesting patterns about the underlying topics of our data corpus. For example, the topic about Teaching and Professional Development was represented by high-frequency keywords such as “teach,” “profession,” and “classroom.” The FREX words, which listed words that were both frequent and unique to the topic, illuminated different aspects of the teaching profession, such as teacher attitude, self-efficacy, or classroom climate.

We further examine the relationships between topics. The online Supplemental Appendix F visualizes the topic correlations as connected networks with the cutoff of .08 for correlation values (i.e., small correlation). The results for the topic correlation suggest evidence of intertopic semantic validity, as the clusters generally appeared homogeneous and well-defined. Consider the large cluster of topics around school readiness at the bottom of online Supplemental Appendix F, with topics around Assessment/Measurement, School Readiness, Social/Emotional/Behavioral Learning, English learners, and Language. This cluster was connected to a smaller triad around teaching and learning (STEM; Teaching and Professional Development; Writing). Another cluster emerged for the topics related to child health intervention (early intervention; child health; and risk factors) and appeared distinct from the school readiness cluster.

Predictive Validity

Figure 2, Panel A plots the topic prevalence over time for four topics: education policy/administration, school readiness, child care quality, and assessment. The x-axis displays the years, and the y-axis shows the expected topic prevalence. The dashed lines show 95% confidence intervals. The general pattern of topic prevalence appears to confirm our conjecture: There was an increase in prevalence in school readiness, care quality, and assessment between 2011 and 2016 in grants (blue line) and publications (red) and an increase in prevalence in publications of policy/administration during this period.

Figure 2.

Topic prevalence over time, in relation to Race to the Top: Early Learning Challenge.

Another way to examine predictive validity is to examine prevalence in topics that would not be feasibly related to the Race to the Top initiative. Panel B of Figure 2 displays the same plot for an unrelated topic, Theory/Concept, which did not show corresponding patterns (i.e., increase in the same period) and provides evidence for predictive validity.

Research Question 1a: Topic Prevalence in Grants and Publications

To explore topical patterns in grants versus publications, we first examine the difference in topic prevalence by abstract type. Results confirm our hypothesis that grants would be concentrated in a narrower range of topics, and suggest potential distinctions between publication and grant focus. Table 2 presents the average topic prevalence per topic for publications and grants. For example, a topic prevalence of .08 means that on average, this topic accounted for 8% of the content in a document in the corpus. Figure 3 shows the difference in each topic’s prevalence between grants and publications, with the whiskers indicating 95% confidence intervals. Positive values toward grants indicate that this topic was more likely to appear in grants.

Table 2

Descriptive Statistics for Topic Prevalence in Publication and Grant

Topic	Publication M	Publication SD	Grant M	Grant SD
Theory/Concept	0.081	0.018	0.005	0.011
Early Intervention	0.007	0.002	0.064	0.028
Parenting Support	0.044	0.01	0.021	0.007
Developmental Psychology	0.003	0.002	0.027	0.02
Special Education Policy	0.108	0.039	0.038	0.028
Assessment and Measurement	0.041	0.011	0.024	0.01
Mathematical Development	0.037	0.013	0.083	0.067
Maternal and Child Health	0.003	0.001	0.063	0.092
Socio-Emotional-Behavioral Learning	0.028	0.004	0.021	0.007
Executive Functioning	0.052	0.016	0.01	0.006
Education Policy and Administration	0.074	0.023	0.003	0.006
School Readiness	0.046	0.006	0.032	0.011
Writing	0.043	0.013	0.012	0.006
Risk Factors	0.013	0.003	0.122	0.046
Peer Relationships	0.075	0.027	0.012	0.007
Media	0.066	0.021	0.022	0.034
Child Health	0.003	0.002	0.104	0.056
Teaching and Professional Development	0.074	0.015	0.014	0.013
Special Needs	0.012	0.005	0.076	0.046
English Learners	0.025	0.007	0.016	0.012
Classroom Interventions	0.002	0.001	0.078	0.041
STEM (Science, Technology, Engineering, and Mathematics)	0.025	0.006	0.097	0.086
Language and Reading Development	0.042	0.01	0.011	0.008
Child Care Quality	0.042	0.008	0.013	0.006
Intervention for Students with Disabilities	0.054	0.009	0.03	0.013

Note. Topic prevalence indicates the proportion of a topic within a document.
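The contrast between the two corpora can be read directly from Table 2 by subtracting the publication mean from the grant mean. The sketch below does this for a subset of topics, with values copied from the table. These are raw differences in means, not the year-adjusted regression coefficients, so they should not be expected to match the coefficient estimates.

```python
# (publication mean, grant mean) copied from Table 2 for a subset of topics.
table2 = {
    "Theory/Concept": (0.081, 0.005),
    "Education Policy and Administration": (0.074, 0.003),
    "Risk Factors": (0.013, 0.122),
    "Child Health": (0.003, 0.104),
    "Maternal and Child Health": (0.003, 0.063),
    "School Readiness": (0.046, 0.032),
}

# Positive values: the topic is more prevalent in grants than in publications.
diffs = {
    topic: round(grant_mean - pub_mean, 3)
    for topic, (pub_mean, grant_mean) in table2.items()
}
ranked = sorted(diffs.items(), key=lambda item: item[1], reverse=True)

for topic, diff in ranked:
    print(f"{topic}: {diff:+.3f}")
```

Within this subset, the health and risk topics fall at the grant end of the ranking, Theory/Concept and Education Policy and Administration fall at the publication end, and School Readiness sits near zero.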

Figure 3.

Coefficient plots (topic prevalence difference between grants and publications).

A noticeable difference is in topics related to health (e.g., maternal and child health), which was associated with a higher prevalence in grants; b = 0.26, SE = 0.06, p < .001. The average topic prevalence, or proportion of the maternal and child health topic in a grant abstract tended to be .26, or 26 percentage points higher than that in publications, after accounting for fiscal or publication year. Topics about interventions were also more likely to appear in grants, b = 0.08, SE = 0.03, p = .04. Meanwhile, the publication corpus showed a higher association with a range of topics, for example, topics related to theory/concept, policy and administration, teacher professional development, peer relationships, and use of media, among others (Figure 3). Topics such as assessment/measurement, school readiness, and math development appeared to show a less distinct difference in prevalence between the two corpora.

Research Question 1b: Topic Content in Grants and Publications

Building on prior reviews of ECE research, we hypothesized that grants would be more likely to include words related to evidence (Barnett, 2011; Engle et al., 2011; Gormley, 2011). However, across the 25 topics, we did not find consistent distinctions between grants and publications in terms of keyword frequencies. The two STMs (one using abstract type as a predictor for topical content, for Research Question 1b, and one without, for Research Question 1a) yielded similar FREX words (i.e., words that are both exclusive and frequent). This suggests that the results about the underlying topic patterns and associated keywords did not vary by abstract type.

Still, using abstract type as a predictor for a topic’s content words illuminates grants’ focus on “data,” compared with publications, in a number of topics. Figure 4 illustrates this pattern. In this figure, the bigger, bolded text suggests words that were more frequently used within each corpus (publication or grant), compared with those words’ distribution for the whole data set. Looking at abstracts related to education policy/administration, the word “data” appeared more frequently in the grant corpus than the publication corpus, indicating a focus on evidence-based work. Within the same topic, publications made use of words like “practitioners” and “profession” more frequently, suggesting a focus on practice. The same focus on “data” and “valid items” in grants can be found in topics about assessment/measurement and child care.

Figure 4.

Comparing content words in the same topic between grants and publications.

Research Question 2: Predicting Topic Prevalence in Publications

We use topic prevalence in grants to examine its association with topic prevalence in publications in subsequent years. The online Supplemental Appendix G shows topic prevalence over time for grants and publications for all topics. Table 3 presents results to predict topic prevalence in publications (Models 1–3). Overall, the models confirm our hypothesis of a positive association between prior federal investment and publication output.

Table 3

Predictors of Topic Prevalence in Publications

Predictor	Model 1	Model 2	Model 3
Topic prevalence in grant (lag = 1)	.046* (.020)
Topic prevalence in grant (lag = 3)		.020 (.014)
Topic prevalence in grant (lag = 5)			.021 (.015)
df	549	499	449
R ²	.010	.002	.003
Topic fixed effects	×	×	×
Year	×	×	×

Note. Coefficients are standardized; standard errors are in parentheses.

*p < .05. **p < .01. ***p < .001.

We find that topic prevalence in grants in the previous year was significantly associated with topic prevalence in publications, after controlling for topic. An increase of one standard deviation (SD = .047, or 4.7 percentage points) in the grant topic prevalence in the previous year was associated with an increase of .046 standard deviations (.2 percentage points), SE = .020, in publication topic prevalence, p = .02 (Table 3, Model 1). This means that an increase of roughly 5 percentage points in average topic prevalence in grants would be associated with an increase of .2 percentage points in topic prevalence in publications in the following year. Intuitively, an increase in the average proportion of grant documents that a topic made up was associated with an increase in that topic’s prevalence in publications in the subsequent year.

In general, topic prevalence in publications was not significantly associated with topic prevalence in grants at the 3-year lag, $β$ = .020, SE = .014, p = .21, or at the 5-year lag, $β$ = .021, SE = .015, p = .26. The coefficients remained positive at both lags, but the relationships were not statistically significant.

Discussion

The motivation for this research was to map out associations between grant and publication topic prevalence, as a mechanism to drive research production. Although the ECE field has witnessed growth in policies and scholarly interest, federal funding only supports a limited fraction of research (Walsh & Sanchez, 2010). This observation is reflected in our data set (Figure 1). While publication quantity has consistently increased over time, funding has not followed the same pattern of growth. Thus, studying the potential misalignments between federal investments and publications can help pinpoint areas for funding allocation.

Topic Patterns in Early Childhood Federal Grants and Publications

We observe patterns in how frequently a topic appeared in grants and publications over time, in ways that are consistent with external policy initiatives (i.e., the Race to the Top: Early Learning Challenge) and influential publications. We find that grants tended to focus on topics of health and intervention, whereas publications tended to cover a wider range of topics, from teaching and care quality to literacy. Interestingly, we also notice differences in the frequency of vocabulary use within the same topic in grants versus publications. For example, grants appeared to use words like “data” more frequently in discussing topics of policy, administration, and assessments. These patterns are consistent with trends in ECE, as revealed by smaller-scale content analyses of education grants’ focus on evaluation-based work (Barnett, 2011; Engle et al., 2011; Heckman, 2006, 2011).

We find topics with less of a difference in prevalence in grants and publications, such as assessment and measurement. The discussion on standards and evaluation can be linked to the growing amount of evidence that quality child care has long-term impacts for academic and behavioral outcomes into adolescence (Vandell et al., 2010). Teachers’ use of formative assessments is important for improving program quality because it provides teachers with information about each child’s developmental progress, allowing for individualized instruction and care (Ackerman & Coley, 2012). Indeed, the recent Race to the Top: Early Learning Challenge initiative gave priority to states that focused on strengthening their use of evidence-based assessments (Congressional Research Service, 2016). The topic prevalence in both grants and publications studying child assessments maps onto the need to develop psychometrically sound measures of children’s development and learning for diverse populations of young children, and to study the impacts of early care and its potential fadeout in later years (Russo et al., 2019; Vandell et al., 2010; Yoshikawa et al., 2013).

Importantly, the higher prevalence of topics such as professional development or theory/concept in publications, compared with grants, may reflect advances in the ECE field and areas for increased federal investment. Take, for example, the topic around teaching and professional development. The professional development topic occurs more frequently in publications, which highlights the growing attention to the preparation of teachers, observation of classroom practices and interactions with children, and effective professional development (Vitiello et al., 2018; Whitebook et al., 2012; Zaslow et al., 2010). Assessing the components of teacher performance and effective practices for early care practitioners is complicated, as there is no universal standard for high-quality practice (Early et al., 2007; Lin & Magnuson, 2018; Whitebook et al., 2012). Thus, to inform policy, gaps remain in understanding nuances in early care professional development, effective practices, and variation in teacher–child interactions (Zigler et al., 2011).

Overall, the topic prevalence results highlight areas that can benefit from focused funding attention, such as policy/administration, theory/concept, or special education policy. These topics have the highest average prevalence in our corpus, but are less present in grant abstracts. Our findings illuminate potential areas for grant reallocation with the aim of a comprehensive advancement of the ECE field.

Predicting Topic Prevalence in Publications

Aligning grant focus with publication topic is important, because prior work suggests that grants can drive subsequent publication production. Our findings largely converge with findings from prior research about the positive association between publications and grants. We find that a topic that appears more frequently in federal grants is associated with higher prevalence in publications in the subsequent year. This increase in topic prevalence may be attributable to the publications from the funded projects. The observed association may also indicate the broader ECE field’s response to the grant topics in a given year. This finding consequently suggests that grants may motivate future scientific activities by allocating funding to certain areas of inquiry. However, these regression results only reflect a descriptive analysis of the relation between grants and publications, rather than causal relations.

We contribute to understanding of science research policy, an area that is largely underexplored in education, and in early childhood in particular. Our findings suggest a more immediate follow-up effect, compared with the 5-year lag in the association between federal research funding and research output in science fields that prior research has observed (Jacob & Lefgren, 2007). Frameworks on the interaction of R&D and grants have posited a two-way mechanism, where federal investments and scholarly production are interlinked (Van Der Meulen, 1998). Science knowledge production can inform federal initiatives, and government institutions can provide incentives for scientific productivity (Whitty, 2006).

While it is encouraging that grants appeared to motivate research, we find a lack of strong alignment over longer time periods. This finding highlights the state of early childhood investments, where federal initiatives and research in ECE have historically been driven by multiple stakeholders with varied interests, as opposed to building on one agenda (Cohen-Vogel, 2005; Gormley, 2011; Nadeem et al., 2010; Rose, 2010). Our findings illustrate the need for more consistent funding schemes, and future inquiries into how funding may motivate publication over longer time periods.

Methodological Contributions

In this study, we extend prior work (Walsh & Sanchez, 2010) by applying computational approaches to review a more comprehensive corpus of work, rather than manually pulling from subsets of journals. There are two main affordances for this approach. First, we reduce the manual coding of the document content. Second, STM provides affordances for exploring potential differences in topical content by covariates (grants vs. publications), in ways that are semantically meaningful and useful toward hypothesis testing. The STM offers key insights into the different focus in the same topic, leading to new research directions. For example, a future content analysis can delve into the difference in data types proposed in grants versus publications in intervention studies, and how these focuses may shape future publications and follow-up grants.

Limitations and Future Research

Our analyses only explore the association between grants and publications, rather than establishing causality. In addition, we do not include other predictors examined in prior research on the relation between funding and R&D, such as researchers’ collaboration network, career seniority, and citation impact (Ebadi & Schiffauerova, 2016). Moreover, our publication corpus focuses on education fields, and may exclude other fields such as health. Future research should consider those variables to examine whether topics with high responsiveness between grants and publications predict more productivity in subsequent scholarly output. Researchers can examine variations across state policies and private investments, as there may exist a crowd-out effect of federal funding on publications and funding at the state and organizational levels (Lanahan et al., 2016). Finally, beyond comparing keywords and topics, future work can leverage emergent NLP techniques to examine the overlapping content structure between grants and publications.

Conclusions

In this study, we explore the topic prevalence and content in publications and grants in early childhood over time. To the best of our knowledge, this is the first study to apply NLP approaches to examine research and federal investment trends in education. We summarize our main findings in light of their implications for policy and research.

We find that grants were more likely to focus on health and intervention studies, while publications appeared to cover a wider range of topics. This finding illuminates the shifting landscape in the field to emphasize broad, systemic aspects of early childhood. Findings further indicate that higher topic prevalence in grants in a given year was related to higher prevalence in subsequent publications, echoing science policy research.

These findings have practical implications for the administration of federal resources. It is encouraging that federal investment appeared to motivate scholarly production. However, grants appeared to cover a narrower range of topics than publications did. A relevant policy recommendation is to establish centralized resources to link research findings across disciplines to inform future policies and subsequent R&D.

We illustrate the affordances of text analysis approaches, namely structural topic modeling, to explore the latent topics of a large document corpus. While prior work in science policy has relied on bibliometrics, we tap into the document content to explore the relation between funding and research. In the process, we illustrate ways to establish semantic, predictive, and hypothesis validity for the results.

In sum, findings reflect the landscape of early childhood and underlying structures of research and grants content. Understanding the relation between research and government agenda has implications for promoting publications in early childhood, a rapidly expanding area that has far-reaching impacts for children, families, educators, and society.

Supplemental Material

sj-docx-1-ero-10.1177_2332858420979568 – Supplemental material for In or Out of Sync: Federal Funding and Research in Early Childhood

Supplemental material, sj-docx-1-ero-10.1177_2332858420979568 for In or Out of Sync: Federal Funding and Research in Early Childhood by Ha Nguyen and Jade Jenkins in AERA Open

Supplemental Material

sj-xlsx-2-ero-10.1177_2332858420979568 – Supplemental material for In or Out of Sync: Federal Funding and Research in Early Childhood

Supplemental material, sj-xlsx-2-ero-10.1177_2332858420979568 for In or Out of Sync: Federal Funding and Research in Early Childhood by Ha Nguyen and Jade Jenkins in AERA Open

Footnotes

ORCID iD

Ha Nguyen

Authors

HA NGUYEN is a PhD student in STEM Teaching and Learning, School of Education, University of California-Irvine. She applies learning analytics to understand ways to promote student-centered, equitable learning in STEM through responsive digital environments.

JADE JENKINS is an assistant professor at the School of Education, University of California-Irvine. Her research interests include early childhood development, child and family policy, policy analysis and management, and program evaluation.

References

Ackerman, D. J., & Coley, R. J. (2012). State Pre-K assessment policies: Issues and status (Policy Information Report). Educational Testing Service. https://www.ets.org/Media/Research/pdf/PIC-PRE-K.pdf

Airoldi, E. M., & Bischof, J. M. (2012). A Poisson convolution model for characterizing topical content with word frequency and exclusivity (ArXiv Preprint ArXiv:1206.4631). https://arxiv.org/pdf/1206.4631.pdf

Aldemir, & Kermani (2017). Integrated STEM curriculum: Improving educational outcomes for Head Start children. Early Child Development and Care, 187(11), 1694–1706. https://doi.org/10.1080/03004430.2016.1185102

Artman-Meeker, Fettig, Barton, E. E., Penney, & Zeng (2015). Applying an evidence-based framework to the early childhood coaching literature. Topics in Early Childhood Special Education, 35(3), 183–196. https://doi.org/10.1177/0271121415595550

Asuncion, Welling, Smyth, & Teh, Y. W. (2012). On smoothing and inference for topic models (ArXiv Preprint ArXiv:1205.2662). https://arxiv.org/ftp/arxiv/papers/1205/1205.2662.pdf

Barnett, W. S. (2011). Effectiveness of early educational intervention. Science, 333(6045), 975–978. https://doi.org/10.1126/science.1204534

Bassok, Finch, J. E., Lee, Reardon, S. F., & Waldfogel (2016). Socioeconomic gaps in early childhood experiences. AERA Open, 2(3). Advance online publication. https://doi.org/10.1177/2332858416653924

Black, M. M., Walker, S. P., Fernald, L. C. H., Andersen, C. T., DiGirolamo, A. M., McCoy, D. C., Fink, Shawar, Y. R., & Shiffman (2017). Early childhood development coming of age: Science through the life course. Lancet, 389(10064), 77–90. https://doi.org/10.1016/S0140-6736(16)31389-7

Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(January), 993–1022. http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf

Bowman, B. T., Donovan, & Burns, M. S. (2000). Eager to learn: Educating our preschoolers. National Research Council, Committee on Early Childhood Pedagogy. https://nap.edu/read/9745/chapter/1

Boyack, K. W., & Börner (2003). Indicator-assisted evaluation and funding of research: Visualizing the influence of grants on the number and citation counts of research papers. Journal of the American Society for Information Science and Technology, 54(5), 447–461. https://doi.org/10.1002/asi.10230

Bresler (2013). Knowing bodies, moving minds: Towards embodied teaching and learning (Vol. 3). Springer Science & Business Media.

Britto, P. R., Lye, S. J., Proulx, Yousafzai, A. K., Matthews, S. G., Vaivada, Perez-Escamilla, Rao, Fernald, L. C. H., MacMillan, Hanson, Wachs, T. D., Yao, Yoshikawa, Cerezo, Leckman, J. F., & Bhutta, Z. A. (2017). Nurturing care: Promoting early childhood development. Lancet, 389(10064), 91–102. https://doi.org/10.1016/S0140-6736(16)31390-3

Brodnax, & James (2018, July). Topics as outcomes: Using structural topic models to measure policy diffusion. Prepared for the 35th Annual Meeting of the Society for Political Methodology, Brigham Young University. https://scholar.google.com/citations?user=HarhXYsAAAAJ&hl=en#d=gs_md_cita-d&u=%2Fcitations%3Fview_op%3Dview_citation%26hl%3Den%26user%3DHarhXYsAAAAJ%26citation_for_view%3DHarhXYsAAAAJ%3Ad1gkVwhDpl0C%26tzom%3D-330

15. Burkhardt, & Schoenfeld, A. H. (2003). Improving educational research: Toward a more useful, more influential, and better-funded enterprise. Educational Researcher, 32(9), 3–14. https://doi.org/10.3102/0013189X032009003

16. Chang (2011). lda: Collapsed Gibbs sampling methods for topic models. R. https://rdrr.io/cran/lda/#:~:text=Implements%20latent%20Dirichlet%20allocation%20(LDA,Gibbs%20sampler%20written%20in%20C

17. Chaudry, & Datta, A. R. (2017). The current landscape for public pre-kindergarten programs. In The current state of scientific knowledge on pre-kindergarten effects (pp. 5–18). https://www.brookings.edu/wp-content/uploads/2017/04/duke_prekstudy_final_4-4-17_hires.pdf

18. Cohen-Vogel (2005). Federal role in teacher quality: “Redefinition” or policy alignment? Educational Policy, 19(1), 18–43. https://doi.org/10.1177/0895904804272246

19. Congressional Research Service. (2016). Preschool Development Grants (FY2014-FY2016) and Race to the Top–Early Learning Challenge grants (FY2011-FY2013). https://www.everycrsreport.com/reports/R44008.htm

20. Cutspec, P. A. (2004). Bridging the research-to-practice gap: Evidence-based education. Centerscope: Evidence-Based Approaches to Early Childhood Development, 2(2), 1–8. https://www.cebma.org/wp-content/uploads/Cutspec-Bridging-the-research-to-practice-gap-evidence-based-education.pdf

21. Diamond, K. E., Justice, L. M., Siegler, R. S., & Snyder, P. A. (2013). Synthesis of IES research on early intervention and early childhood education (NCSER 2013-3001). National Center for Special Education Research. https://ies.ed.gov/ncser/pubs/20133001/pdf/20133001.pdf

22. Early, D. M., Maxwell, K. L., Burchinal, Alva, Bender, R. H., Bryant, Cai, Clifford, R. M., Ebanks, Griffin, J. A., Henry, G. T., Howes, Iriondo-Perez, Jeon, H.-J., Mashburn, A. J., Peisner-Feinberg, Pianta, R. C., Vandergrift, & Zill (2007). Teachers’ education, classroom quality, and young children’s academic skills: Results from seven studies of preschool programs. Child Development, 78(2), 558–580. https://doi.org/10.1111/j.1467-8624.2007.01014.x

23. Ebadi, & Schiffauerova (2016). How to boost scientific production? A statistical analysis of research funding and other influencing factors. Scientometrics, 106(3), 1093–1116. https://doi.org/10.1007/s11192-015-1825-x

24. Engle, P. L., Fernald, L. C. H., Alderman, Behrman, O’Gara, Yousafzai, de Mello, M. C., Hidrobo, Ulkuer, & Ertem (2011). Strategies for reducing inequalities and improving developmental outcomes for young children in low-income and middle-income countries. Lancet, 378(9799), 1339–1353. https://doi.org/10.1016/S0140-6736(11)60889-1

25. French (2004). Science as the center of a coherent, integrated early childhood curriculum. Early Childhood Research Quarterly, 19(1), 138–149. https://doi.org/10.1016/j.ecresq.2004.01.004

26. Geuna, & Muscio (2009). The governance of university knowledge transfer: A critical review of the literature. Minerva, 47(1), 93–114. https://doi.org/10.1007/s11024-009-9118-2

27. Gormley, W. T. (2011). From science to policy in early childhood education. Science, 333(6045), 978–981. https://doi.org/10.1126/science.1206150

28. Graue, M. E., Whyte, K. L., & Karabon, A. E. (2015). The power of improvisational teaching. Teaching and Teacher Education, 48(May), 13–21. https://doi.org/10.1016/j.tate.2015.01.014

29. Greenberg, E. H. (2018). Public preferences for targeted and universal preschool. AERA Open, 4(1). https://doi.org/10.1177/2332858417753125

30. Hackman, D. A., & Farah, M. J. (2009). Socioeconomic status and the developing brain. Trends in Cognitive Sciences, 13(2), 65–73. https://doi.org/10.1016/j.tics.2008.11.003

31. Haslip, M. J., & Gullo, D. F. (2018). The changing landscape of early childhood education: Implications for policy and practice. Early Childhood Education Journal, 46(3), 249–264. https://doi.org/10.1007/s10643-017-0865-7

32. Heckman, J. J. (2006). Skill formation and the economics of investing in disadvantaged children. Science, 312(5782), 1900–1902. https://doi.org/10.1126/science.1128898

33. Heckman, J. J. (2011). The economics of inequality: The value of early childhood education. American Educator, 35(1), 31–35. https://files.eric.ed.gov/fulltext/EJ920516.pdf

34. Hicks (2012). Performance-based university research funding systems. Research Policy, 41(2), 251–261. https://doi.org/10.1016/j.respol.2011.09.007

35. Jacob, & Lefgren (2007). The impact of research grant funding on scientific productivity. National Bureau of Economic Research. https://doi.org/10.3386/w13519

36. Jiang, Ekono, M. M., & Skinner (2015). Basic facts about low-income children, children 12 through 17 years, 2013. https://doi.org/10.7916/D87W6B11

37. Kermani, & Aldemir (2015). Preparing children for success: Integrating science, math, and technology in early childhood classroom. Early Child Development and Care, 185(9), 1504–1527. https://doi.org/10.1080/03004430.2015.1007371

38. Kilburn, M. R., & Karoly, L. A. (2008). The economics of early childhood policy: What the dismal science has to say about investing in children (Occasional paper). RAND Corporation. https://www.rand.org/content/dam/rand/pubs/occasional_papers/2008/RAND_OP227.pdf

39. Kirp, D. L. (2007). The sandbox investment: The preschool movement and kids-first politics. Harvard University Press.

40. Koppich, J. E., & Knapp, M. S. (1998). Federal research investment and the improvement of teaching: 1980-1997. https://www.researchgate.net/publication/234707491_Federal_Research_Investment_and_the_Improvement_of_Teaching_1980-1997

41. Lanahan, Graddy-Reed, & Feldman, M. P. (2016). The domino effects of federal research funding. PLoS ONE, 11(6), e0157325. https://doi.org/10.1371/journal.pone.0157325

42. Lee, Kuo, L.-J., Moody, S. M., & Chen (2017). Reviews of research funded by US Institute of Educational Sciences: A case of reading development and instruction. Cogent Education, 4(1). https://doi.org/10.1080/2331186X.2017.1401444

43. Lewin-Benham (2011). Twelve best practices for early childhood education: Integrating Reggio and other inspired approaches. Teachers College Press.

44. Lin, Y. C., & Magnuson, K. A. (2018). Classroom quality and children’s academic skills in child care centers: Understanding the role of teacher qualifications. Early Childhood Research Quarterly, 42(1st quarter), 215–227. https://doi.org/10.1016/j.ecresq.2017.10.003

45. McAllister, P. R., & Narin (1983). Characterization of the research papers of U.S. medical schools. Journal of the American Society for Information Science, 34(2), 123–131. https://doi.org/10.1002/asi.4630340205

46. Mimno, Wallach, Talley, Leenders, & McCallum (2011). Optimizing semantic coherence in topic models. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (pp. 262–272). https://www.researchgate.net/publication/221012637_Optimizing_Semantic_Coherence_in_Topic_Models

47. Nadeem, Maslak, Chacko, & Hoagwood, K. E. (2010). Aligning research and policy on social-emotional and academic competence for young children. Early Education and Development, 21(5), 765–779. https://doi.org/10.1080/10409289.2010.497452

48. National Research Council. (2008). Early childhood assessment: Why, what, and how (Vol. 1). National Academies Press.

49. NICHD Early Child Care Research Network. (2005). Early child care and children’s development in the primary grades: Follow-up results from NICHD Study of Early Child Care. American Educational Research Journal, 42(3), 537–570. https://doi.org/10.3102/00028312042003537

50. Payne, A. A., & Siow (2003). Does federal research funding increase university research output? Advances in Economic Analysis & Policy, 3(1). https://doi.org/10.2202/1538-0637.1018

51. Quinn, K. M., Monroe, B. L., Colaresi, Crespin, M. H., & Radev, D. R. (2010). How to analyze political attention with minimal assumptions and costs. American Journal of Political Science, 54(1), 209–228. https://doi.org/10.1111/j.1540-5907.2009.00427.x

52. Roberts, M. E., Stewart, B. M., & Airoldi, E. M. (2016). A model of text for experimentation in the social sciences. Journal of the American Statistical Association, 111(515), 988–1003. https://doi.org/10.1080/01621459.2016.1141684

53. Roberts, M. E., Stewart, B. M., & Tingley (2014). stm: R package for structural topic models. Journal of Statistical Software, 10(2), 1–40. https://cran.r-project.org/web/packages/stm/vignettes/stmVignette.pdf

54. Rose (2010). The promise of preschool: From Head Start to universal pre-kindergarten. Oxford University Press.

55. Rosen, D. B., & Jaruszewicz (2009). Innovations in early childhood teacher education: Reflections on practice: Developmentally appropriate technology use and early childhood teacher education. Journal of Early Childhood Teacher Education, 30(2), 162–171. https://doi.org/10.1080/10901020902886511

56. Russo, J. M., Williford, A. P., Markowitz, A. M., Vitiello, V. E., & Bassok (2019). Examining the validity of a widely-used school readiness assessment: Implications for teachers and early childhood programs. Early Childhood Research Quarterly, 48(3rd quarter), 14–25. https://doi.org/10.1016/j.ecresq.2019.02.003

57. Sargent, J. F. (2020). U.S. research and development funding and performance: Fact sheet. https://fas.org/sgp/crs/misc/R44307.pdf

58. Sameroff (2010). A unified theory of development: A dialectic integration of nature and nurture. Child Development, 81(1), 6–22. https://doi.org/10.1111/j.1467-8624.2009.01378.x

59. Shonkoff, J. P., Garner, A. S., Siegel, B. S., Dobbins, M. I., Earls, M. F., McGuinn, Pascoe, Wood, D. L., Committee on Psychosocial Aspects of Child and Family Health, & Committee on Early Childhood, Adoption, and Dependent Care. (2012). The lifelong effects of early childhood adversity and toxic stress. Pediatrics, 129(1), e232–e246. https://doi.org/10.1542/peds.2011-2663

60. Shonkoff, J. P., & Phillips (2000). From neurons to neighborhoods: The science of early childhood development. National Academies Press.

61. Sripada (2012). Neuroscience in the capital: Linking brain research and federal early childhood programs and policies. Early Education and Development, 23(1), 120–130. https://doi.org/10.1080/10409289.2012.617288

62. Stewart (1997). Intellectual capital: The new wealth of organizations. Nicholas Brealey.

63. Taddy (2012). On estimation and selection for topic models. Artificial Intelligence and Statistics, 1184–1193. http://proceedings.mlr.press/v22/taddy12/taddy12.pdf

64. Vandell, D. L., Belsky, Burchinal, Steinberg, & Vandergrift (2010). Do effects of early child care extend to age 15 years? Results from the NICHD Study of Early Child Care and Youth Development. Child Development, 81(3), 737–756. https://doi.org/10.1111/j.1467-8624.2010.01431.x

65. Vandenbroeck, Roets, & Roose (2012). Why the evidence-based paradigm in early childhood education and care is anything but evident. European Early Childhood Education Research Journal, 20(4), 537–552. https://doi.org/10.1080/1350293X.2012.737238

66. Van der Meulen (1998). Science policies as principal-agent games: Institutionalization and path dependency in the relation between government and science. Research Policy, 27(4), 397–414. https://doi.org/10.1016/S0048-7333(98)00049-3

67. Vitiello, V. E., Bassok, Hamre, B. K., Player, & Williford, A. P. (2018). Measuring the quality of teacher–child interactions at scale: Comparing research-based and state observation approaches. Early Childhood Research Quarterly, 44(3rd quarter), 161–169. https://doi.org/10.1016/j.ecresq.2018.03.003

68. Walsh, B. A., & Sanchez (2010). Reported research funding in four early childhood journals. Early Childhood Education Journal, 37(4), 289–293. https://doi.org/10.1007/s10643-009-0358-4

69. Wang, Kinzie, M. B., McGuire, & Pan (2010). Applying technology to inquiry-based learning in early childhood education. Early Childhood Education Journal, 37(5), 381–389. https://doi.org/10.1007/s10643-009-0364-6

70. Webster-Stratton, & Herman, K. C. (2009). Disseminating Incredible Years Series early-intervention programs: Integrating and sustaining services between school and home. Psychology in the Schools. https://doi.org/10.1002/pits.20450

71. Whitebook, Austin, L. J. E., Ryan, Kipnis, Almaraz, & Sakai (2012). By default or by design? Variations in higher education programs for early care and education teachers and their implications for research methodology, policy, and practice. http://www.irle.berkeley.edu/cscce/2010/no-single-ingredient/

72. Whitty (2006). Education(al) research and education policy making: Is conflict inevitable? British Educational Research Journal, 32(2), 159–176. https://doi.org/10.1080/01411920600568919

73. Yoshikawa, Weiland, Brooks-Gunn, Burchinal, M. R., Espinosa, L. M., Gormley, W. T., Ludwig, Magnuson, K. A., Phillips, & Zaslow, M. J. (2013). Investing in our future: The evidence base on preschool education. Foundation for Child Development.

74. Zaslow, M. J., Tout, Halle, Whittaker, J. V., & Lavelle (2010). Toward the identification of features of effective professional development for early childhood educators: Literature review. U.S. Department of Education, Office of Planning, Evaluation and Policy Development, Policy and Program Studies Service.

75. Zigler, E. E., Gilliam, W. S., & Barnett (2011). The pre-K debates: Current controversies and issues. Paul H. Brookes.
