Abstract
Concept maps have been used to assess knowledge acquisition, track learning, and reveal mental models. This study proposes and validates a computationally measurable coding scheme to overcome common limitations of educational assessment: time-consuming coding, inconsistency between coders, and difficulty in measuring semantic and structural complexity. The three-step coding scheme includes (i) classifying vertices and edges using a guidebook, (ii) training coders through a standardized manual, and (iii) validating reproducibility via inter-rater reliability (IRR) using Fleiss’ Kappa. Results from 22 undergraduate researchers coding six student-generated maps yielded moderate to substantial agreement (Kappa = 0.67 for vertices, 0.45 for edges), supporting both the accuracy and consistency of the scheme. Beyond enabling structural and semantic analysis of student thinking, the scheme supports automated translation of hand-drawn maps into a digital, analyzable format, laying the groundwork for scalable, real-time feedback systems in engineering education. By aligning methodological precision with reflective assessment practices, this research introduces a tool for measuring how students organize and evolve their understanding within project-based curricula. Future directions include integrating AI for automated coding, with promising applications in adaptive learning environments, formative feedback mechanisms, and long-term tracking of engineering identity in engineering programs.
Motivation: Concept Graphs as Assessment Tools
In engineering education, evaluating student learning is often constrained by traditional assessment tools such as exams, quizzes, or project reports. These instruments typically fail to capture the iterative and non-linear nature of how students develop conceptual understanding throughout a design or systems-based course. Engineering curricula increasingly emphasize inquiry, reflection, and problem-solving, demanding assessment tools that can account for growth in cognitive complexity, process thinking, and disciplinary identity formation. As a result, educators and researchers are turning to more expressive and diagnostic tools that can reveal how students organize and connect knowledge.
Concept maps can visualize a learner's mental model by representing relationships between concepts through nodes (vertices) and connecting lines (edges). They capture what a learner knows, in addition to how that knowledge is structured and interrelated. In educational research, concept maps have been used to identify misconceptions, assess conceptual depth, and track knowledge evolution over time. Their application in design education is especially promising, where systems thinking, creativity, and iterative refinement are integral learning outcomes.
Challenges with Concept Maps
Concept maps represent relationships between vertices (concepts described with words) through edges (connections between the concepts) (Patel et al., 2024). This visualization helps users make sense of complex systems and captures users' mental models (Chang, 2007). In design, the maps are used to represent systems, subsystems, and the flow of information between them (Padilla et al., 2017; Reichherzer & Leake, 2006). Concept maps are created with specific purposes in mind and are used in various experiments and studies to analyze thought processes (Ruiz-Primo et al., 2001; Watson et al., 2016). However, the challenge of coding these maps increases when numerous unique maps are generated by participants with diverse backgrounds (Lopez et al., 2014). This necessitates a robust coding scheme and a tool to streamline the coding and analysis processes, ultimately improving research outcomes when large datasets of concept maps must be coded manually (Khajeloo & Siegel, 2022; Kinchin et al., 2010).
Concept maps are effective in capturing and assessing learning by visualizing a learner's knowledge structure (Grundspenkis & Strautmane, 2010; Lopez et al., 2014; Patel et al., 2024; Tergan et al., 2006). They help educators identify misconceptions, evaluate conceptual understanding, and track knowledge evolution over time (Kamble & Tembe, 2013). In educational contexts, particularly in engineering and design, concept maps facilitate brainstorming (Besterfield-Sacre et al., 2004; Trochim & Kane, 2005), understanding complex systems (Chang, 2007; Darmofal et al., 2002; Schroeder et al., 2018), and visualizing design processes (Correia et al., 2008; Crampes et al., 2006). They also support collaboration and knowledge/project management by providing a clear framework for integrating multidisciplinary knowledge (Gamble & Blackwell, 2001; Henry, 1974). However, earlier research lacks robust guidelines for systematically encoding these maps, which can lead to inconsistencies and biases in evaluations (Carley, 1993; Mathieson & Summers, 2010; Mathieson & Summers, 2017). Specifically, this study highlights that understanding the structural complexity of concept maps has proven to be a challenging task, necessitating a rigorous training manual and guidebook to ensure uniform, accurate coding that mitigates evaluator bias.
Manual coding of many maps is time-consuming, and achieving consistency across coders is complicated by individual variations in interpreting and classifying map elements, raising concerns about inter-rater reliability (IRR). Furthermore, existing systems for coding hand-drawn concept maps often lack the computational capacity to analyze structural and semantic complexity, limiting their scalability and integration into digital learning environments. Although previous studies have proposed various scoring techniques, few have produced validated frameworks that are both reproducible and scalable. Most rely on subjective expert judgment or oversimplified rubrics, compromising analytical depth and consistency. Additionally, few coding schemes have undergone rigorous validation using statistical measures like Fleiss’ Kappa or demonstrated effectiveness in large-scale or longitudinal educational settings. These limitations highlight the need for a standardized, evidence-based coding framework that can deliver both structural and semantic insights from hand-drawn, student-generated concept maps.
To address these challenges, this research introduces the foundational principles of the proposed concept map coding scheme. The coding scheme includes defining core components as discrete units of analysis and classifying their variations through a structured typology of types, scenarios, and cases. The goal is to provide an approach that is independent of individual coders (objective) and suitable for future image-recognition support.
Concept Map Coding Scheme
The Concept Map Coding Scheme was developed as part of an ongoing study to align and measure the engineering identity of undergraduate students (Kumar & Summers, 2024a). The coding scheme introduces and standardizes three key terms (coding, rating, and scoring), each grounded in existing literature and designed to ensure consistency across evaluations (D’Antoni et al., 2009; Jackson & Trochim, 2002; McClure et al., 1999; Ruiz-Primo et al., 2001; Rye & Rubba, 2002; Schau et al., 2001; Van Zele et al., 2004; West et al., 2002; Yin et al., 2005). By using concepts or keywords (vertices) and connections between the keywords (edges), the graphs represent the conceptualized mental models of the students.
For the scope of this research, concept maps are analyzed for three things:
To minimize evaluator bias, the coding process follows a uniform method supported by a detailed training manual and guidebook. This guidebook organizes coding practices into Types, Scenarios, and Cases, enabling coders to accurately and consistently interpret the wide range of student responses that largely occur in vertices, edges, or overall maps. Each concept map is treated like a fingerprint unique to every modeler (Chang, 2007; Hamdiyati et al., 2017; Moon et al., 2018). Figure 1 provides an overview of this coding process. In the case of studying engineering identity, students are asked to model their understanding of “engineer”.

Figure 1. Concept map coding process.
Guidebook
The guidebook serves as a foundational document to assist researchers in coding concept maps. It outlines the challenges encountered due to the varied styles of expression, thought, and diagramming used by individuals. The guidebook captures these variations to promote a consistent understanding of participants’ mental models through their concept maps, which must be retraced and recreated by the coders. These challenges are itemized and illustrated through types, scenarios, cases, and examples in Table 1.
Table 1. Types, Scenarios, and Cases Capturing Complexity During Coding.
Training Manual
To illustrate the coding process, consider an example concept map for this section as shown in Figure 2.

Figure 2. Example concept map by a pre-service engineering student (Initial).
The three-step procedure to ensure accuracy and consistency in coding for this example is illustrated below:

Step 1: Checking and Classification.
Step 2: Recreate the Concept Map (Translated Concept Map).
Step 3: Code the Translated Concept Map (Sample).
Throughout these steps, emphasis is placed on maintaining consistency and accuracy to facilitate the extraction of complexity metrics for analysis, making sure that the spreadsheet files correlate with the original maps.
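To make the Step 3 translation target concrete, the minimal sketch below shows one way the recreated map might be written out as spreadsheet records, one row per vertex and per edge. The field names and CSV layout are assumptions for illustration; the study's actual spreadsheet template may differ.

```python
# Minimal sketch of the Step 3 translation target: one record per vertex and
# per edge of the recreated map, written to spreadsheet (CSV) files.
# The column layout and field names are assumptions, not the study's template.
import csv
from dataclasses import dataclass

@dataclass
class VertexRecord:
    map_id: str        # participant/map identifier so files correlate with originals
    vertex_id: str     # unique within the map
    label: str         # concept text as written by the student

@dataclass
class EdgeRecord:
    map_id: str
    source: str        # vertex_id of the originating concept
    target: str        # vertex_id of the connected concept

def write_translated_map(vertices, edges, vertex_csv, edge_csv):
    """Write coded vertices and edges so each map can be traced back to its source."""
    with open(vertex_csv, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["map_id", "vertex_id", "label"])
        w.writerows([(v.map_id, v.vertex_id, v.label) for v in vertices])
    with open(edge_csv, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["map_id", "source", "target"])
        w.writerows([(e.map_id, e.source, e.target) for e in edges])
```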
Study Deployment and Data Collection
As part of a larger study (Kumar, 2025; Kumar & Summers, 2024b, 2025a, 2025b) on engineering identity, approximately 5,000 undergraduate students at a public R1 university in a U.S. suburban metropolitan area completed two instruments, a survey and a concept map, administered at the beginning and end of the Fall 2024 semester. The survey measured engineering identity using validated items and asked students to define “engineer,” while the concept map served as a visual complement, capturing their understanding of the “engineer”. In a timed, in-class activity, students were given six minutes to hand-draw a concept map on a template sheet with “engineer” as the central node. The prompt was: “What does an engineer do? What should engineers focus on? What do you anticipate as the future for engineers? What does it mean for an engineer to be successful? What skills or characteristics does an engineer possess?” The six-minute limit was calibrated in pilot studies to ensure sufficient time for students to build a concept map of enough complexity to be of interest. These maps are reconstructed into bipartite graphs using the three-step systematic coding approach to analyze their structural complexity and gain deeper insights, ultimately enabling the mapping of students’ evolving engineering identity.
The output from this process is processed computationally. This method enables researchers to convert the tool from one format (concept maps as visual representation models) to another (computational models) using a guidebook and training manual. It is therefore important to present all aspects of the coding scheme so that the potential complexities arising through types, scenarios, and cases are captured, enabling an early mitigation or failure-modes strategy that avoids future pitfalls. To validate the objective robustness of this coding scheme, 22 undergraduate researchers were given six concept maps and asked to translate them into the spreadsheet format to capture the bipartite graphs (concepts are vertices or nodes, connecting lines are edges or relations). These bipartite graphs are then analyzed for structural complexity using 29 complexity metrics (Mathieson & Summers, 2010; Owensby & Summers, 2014). The specific vocabulary terms included in the concept maps are captured and analyzed through a separate semantic coding approach.
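As a complement to the writing sketch above, the following sketch shows how such coded spreadsheet files might be loaded back into a graph object and summarized with a few illustrative structural indicators. The column names and the particular metrics shown are assumptions; the study's 29 complexity metrics follow Mathieson & Summers (2010) and are not reproduced here.

```python
# Sketch: load a coded concept map (spreadsheet rows) into a graph and compute
# a few illustrative structural metrics. Column names and metric selection are
# assumptions; the study's 29 metrics (Mathieson & Summers, 2010) are not shown.
import csv
import networkx as nx

def load_coded_map(vertex_csv: str, edge_csv: str) -> nx.Graph:
    """Build an undirected graph from coded vertex and edge spreadsheets."""
    G = nx.Graph()
    with open(vertex_csv, newline="") as f:
        for row in csv.DictReader(f):        # assumed columns: vertex_id, label
            G.add_node(row["vertex_id"], label=row["label"])
    with open(edge_csv, newline="") as f:
        for row in csv.DictReader(f):        # assumed columns: source, target
            G.add_edge(row["source"], row["target"])
    return G

def structural_summary(G: nx.Graph) -> dict:
    """A handful of example structural-complexity indicators."""
    summary = {
        "num_vertices": G.number_of_nodes(),
        "num_edges": G.number_of_edges(),
        "density": nx.density(G),
        "max_degree": max(dict(G.degree()).values()) if G.number_of_nodes() else 0,
    }
    if G.number_of_nodes() and nx.is_connected(G):
        summary["diameter"] = nx.diameter(G)  # longest shortest path in the map
    return summary
```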
Use Case and Validation
The guidebook, training manual, videos, and other hands-on training were provided to the coders over a duration of two weeks. The coders were tested once after the training period and again after two more weeks to check whether additional time and practice on several concept maps would increase their understanding of the coding scheme and process. All researchers were given the same material, information, and time for the two inter-rater reliability (IRR) tests conducted to check the validity of the method and tools of the coding scheme, as observed in prior work (D’Antoni et al., 2009; Rye & Rubba, 2002; Watson et al., 2016; West et al., 2002). The consistency of interpretation and reconstruction of the concept graphs must be sufficient to allow parallel processing of the concept maps without loss of information; thus, inter-rater and intra-rater reliability tests are conducted to verify consistency. Because a concept map may contain multiple concepts within a single node, it is important to be able to split these consistently, so tests are run on the objectivity of the coding for vertex splitting and edge construction. Six concept maps were randomly selected for testing. The most complicated of these maps is illustrated in Figure 4.

Figure 4. Original concept map (P1) showing high complexity.
The goal of this analysis is to assess the robustness of the scheme by comparing the results coded by 22 undergraduate researchers from diverse backgrounds and different years of their undergraduate engineering programs. The six concept maps were chosen randomly using Excel's random number function. The results are presented in this section using the averages of the vertices and edges coded by the participants for all six concept maps in this use case. The analysis is used to determine the standard deviation and coding consistency among the 22 coders.
Analysis of Vertices and Edges
Figure 5 highlights the standard deviation of the average number of vertices. Maps P1 and P6 were the most complex and hence had the highest nonconformity compared to the other four, with P5 showing the least. This variability highlights the relative complexity of the vertices and edges chosen for this test. Figure 6 shows the standard deviation of the average number of edges; edges were the most difficult element of the concept maps to code, so a larger standard deviation is recorded in their coding. The largest deviation is for P2, which was coded with a standard deviation of 1.6 edges relative to its counterparts. Figure 7 shows the coding consistency for vertices and edges as a scatter plot of the 22 coders arranged alphabetically, which effectively randomizes the sequence for checking consistency among coders. The R² values of the three trendlines show that the coding was consistent for vertices (0.0942), edges (7E-05), and their average (0.0317); these small R² values indicate no increasing or decreasing trend across coders.

Figure 5. Standard deviation of vertices.

Figure 6. Standard deviation of edges.

Figure 7. Coding consistency of coders.
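For reference, the consistency statistics reported above (per-map standard deviation across coders and the R² of a linear trendline over coders) can be reproduced with a short script of the kind sketched below. The count arrays are hypothetical stand-ins, not the study's data, and the standard-deviation convention (population vs. sample) is an assumption.

```python
# Sketch of the consistency analysis: per-map standard deviation across coders
# and an R^2 check on a linear trend across coders. The count arrays are
# hypothetical placeholders, not the study's data.
import numpy as np

def per_map_std(counts_by_map: dict[str, list[int]]) -> dict[str, float]:
    """Standard deviation of coded counts across coders, per concept map (ddof=0 assumed)."""
    return {m: float(np.std(c)) for m, c in counts_by_map.items()}

def trend_r2(values: list[float]) -> float:
    """R^2 of a linear trendline over the coder sequence; near zero means no drift."""
    x = np.arange(len(values))
    y = np.asarray(values, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    y_hat = slope * x + intercept
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1 - ss_res / ss_tot) if ss_tot else 0.0

# Example with placeholder data for two maps and five coders:
vertex_counts = {"P1": [14, 14, 13, 15, 14], "P5": [9, 9, 9, 9, 10]}
print(per_map_std(vertex_counts))
print(trend_r2(vertex_counts["P1"]))
```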
It is also worth noting that the two sets of concept maps analyzed in this study were coded during distinct phases of the research timeline: P1-P3 were coded shortly after the coding training period, and P4-P6 were coded later to re-test the coding scheme. The analysis revealed that the standard deviation is the same for all three later concept maps (P4-P6), suggesting that the coders had become accustomed to the coding scheme and had settled into a consistent deviation range. This also highlights the consistency of the coders and of the coding scheme. Overall, the analysis validated the coding scheme's effectiveness in capturing the structural intricacies of the participants’ concept maps.
Inter-Rater Reliability (IRR) - Fleiss’ Kappa
The final validation of the coding scheme developed for concept maps is shown through IRR testing. This section analyzes and tests the hypothesis that the coding scheme is robust, with reasonable agreement between the coders that is not attributable to chance (the IRR null hypothesis). The testing includes IRR for the coding of vertices and of edges. Because this research involved 22 coders, Fleiss’ Kappa (FKappa) was used (Falotico & Quatto, 2015; Moons & Vandervieren, 2023), a statistical measure that evaluates agreement among multiple coders (more than two) when assigning categorical codes to a fixed set of items; it is an extension of Cohen's Kappa. FKappa is the main statistic showing agreement between coders, with values typically between 0 and 1, where 0 indicates agreement no better than chance and 1 indicates complete agreement, the statistic itself correcting for agreement expected by chance. FKappa is widely used in fields where IRR among multiple observers or evaluators must be quantified, such as healthcare, education, and the social sciences.
In Fleiss’ Kappa testing, the p-value is expected to be zero (or very close to 0), indicating that the null hypothesis (which assumes agreement by chance) can be rejected with high confidence. In other words, the observed level of agreement among coders is statistically significant and not due to random chance. The research also acknowledges that a p-value of 0 does not mean perfect agreement; it only means that the observed agreement is statistically significant, and the strength of the agreement is still determined by the FKappa value itself. A value between 0.75 and 1 indicates “Excellent” agreement, between 0.4 and 0.75 “Moderate to substantial” agreement, and between 0 and 0.4 “Slight to fair” agreement (Falotico & Quatto, 2015; Moons & Vandervieren, 2023).
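As a reference for how the agreement statistic is computed, the sketch below implements the standard Fleiss’ Kappa formula on an items-by-categories count matrix (rows are concept maps, columns are possible codes, and each cell holds the number of coders assigning that code). This is a generic implementation for illustration, not the study's analysis script.

```python
# Generic Fleiss' Kappa sketch (not the study's analysis script).
# `counts` is an N x k matrix: N items (concept maps), k categories,
# counts[i][j] = number of coders who assigned item i to category j.
# Every row must sum to the same number of coders n.
import numpy as np

def fleiss_kappa(counts) -> float:
    counts = np.asarray(counts, dtype=float)
    N, k = counts.shape
    n = counts[0].sum()                      # coders per item (assumed constant)
    p_j = counts.sum(axis=0) / (N * n)       # overall proportion of each category
    P_i = (np.square(counts).sum(axis=1) - n) / (n * (n - 1))  # per-item agreement
    P_bar = P_i.mean()                       # observed agreement
    P_e = np.square(p_j).sum()               # agreement expected by chance
    return float((P_bar - P_e) / (1 - P_e))

def interpret(kappa: float) -> str:
    """Interpretation bands used in this paper (Falotico & Quatto, 2015)."""
    if kappa >= 0.75:
        return "Excellent"
    if kappa >= 0.40:
        return "Moderate to substantial"
    return "Slight to fair"
```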
Vertices IRR
This section presents the IRR among all coders for the vertices of the concept maps. The vertices are the nodes in the concept maps that must be coded in the Excel file as closely as possible to the original map while ensuring accuracy and consistency by applying the rules in the guidebook and the training manual. Table 4 shows the calculation relevant to this section, and Table 5 shows the FKappa value for the coding of vertices, 0.6787, indicating that most of the coders are in “Substantial” agreement with each other and supporting the effectiveness of the coding scheme and process (method and tool) for vertices. The coding also shows that P1 and P6 have a greater spread of values. Considering P1 in the first row, of the 22 coders, 1 coded 13 vertices, 19 coded 14 vertices, 1 coded 15 vertices, and 1 coded 17 vertices, with each row summing to 22 coders to complete the calculation section of the table.
Table 4. FKappa IRR Calculation for Coding of Vertices.
Table 5. FKappa Final IRR Results for Coding of Vertices.
Table 5 highlights the final parameters of the FKappa test for the coding of vertices.
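A minimal usage sketch of the kappa function introduced above is shown below, seeded with the P1 vertex-count distribution described earlier (1 coder at 13 vertices, 19 at 14, 1 at 15, 1 at 17). The remaining rows and the category labels are hypothetical placeholders, so the resulting value will not reproduce the reported 0.6787.

```python
# Count matrix for vertex coding: rows are maps P1..P6, columns are assumed
# vertex-count categories (9, 13, 14, 15, 17). Only the P1 row reflects the
# distribution reported in the text; the other rows are hypothetical
# placeholders, so the result differs from the study's 0.6787.
# Reuses fleiss_kappa() and interpret() from the sketch above.
import numpy as np

categories = [9, 13, 14, 15, 17]          # assumed category labels
counts = np.array([
    [0, 1, 19, 1, 1],                     # P1: 1x13, 19x14, 1x15, 1x17 (from text)
    [20, 1, 1, 0, 0],                     # P2..P6: placeholder distributions
    [0, 21, 1, 0, 0],
    [1, 0, 20, 1, 0],
    [22, 0, 0, 0, 0],
    [0, 0, 2, 20, 0],
])

kappa = fleiss_kappa(counts)
print(round(kappa, 4), interpret(kappa))
```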
Edges IRR
As with the IRR for vertices, the research obtained an FKappa value of 0.4562 with an alpha of 0.05, along with the other values pertaining to the FKappa test shown in Table 6, indicating moderate to substantial agreement on the coding scheme applied to the edges as well. The research acknowledges that edges are the elements that create complexity in concept maps, since they convey understanding through the flow of information by connecting facts or data points. Edges become further complicated when the writing utensil or the scan of the concept map introduces issues, justifying the need for a digital tool that can eliminate such human-induced errors in data collection and codification. This FKappa value for the coding of edges further validates the coding scheme developed for concept maps.
Table 6. FKappa Final IRR Results for Coding of Edges.
Conclusion, Limitations and Future Work
As demonstrated throughout this study, the development of a consistent, scalable coding scheme for concept maps addresses a critical methodological gap in engineering education research. By grounding the framework in theoretical clarity and validating it through empirical reliability testing, the study offers a robust approach for analyzing the structural and semantic dimensions of student thinking. The following concluding section summarizes the core contributions of this work, outlines its limitations, and suggests avenues for future research and expansion.
This study developed and validated a structured, reproducible, and scalable concept map coding scheme tailored for engineering education. The research contributions can be summarized as follows:
Foundational Framework: Introduced a conceptually grounded coding architecture based on three dimensions, Type, Scenario, and Case, enabling coders to capture both structural and semantic features of hand-drawn concept maps.
Guidebook and Training Manual: Developed and implemented a comprehensive guidebook and training protocol that ensures uniformity in coder interpretation and minimizes evaluator bias.
Three-Step Coding Procedure: Operationalized the scheme through a stepwise method involving map classification, recreation, and data translation into computational formats.
Validation with Fleiss’ Kappa: Demonstrated moderate to substantial IRR with 22 undergraduate coders across six maps (Fleiss’ Kappa = 0.6787 for vertices, 0.4562 for edges), validating both the consistency and practicality of the coding process.
Application in Engineering Identity Research: Deployed the coding scheme within a longitudinal study on engineering identity, linking coded map complexity to self-reported measures such as initiative, self-efficacy, and disciplinary engagement.
Metric Extraction for Computational Analysis: Enabled the generation of 29 complexity metrics per concept map, facilitating learner-level and cohort-level analysis in design education research.
Despite its strengths, this study has several limitations. First, while the coding scheme has shown promising IRR, its application was limited to a single domain (engineering education) and to a controlled environment using hand-drawn concept maps. Second, the Fleiss’ Kappa for edge coding, though within acceptable thresholds, was lower than that for vertices, indicating that additional refinement or digital pre-processing tools may improve coder consistency in capturing complex relational structures.
This research opens several promising directions for future work. One key opportunity is to extend the coding scheme to other disciplines such as health sciences, business, or the liberal arts where visual reasoning and systems thinking are equally relevant. Applying the framework to digitally scanned concept maps could also streamline analysis by enabling direct computational processing. Further, integrating artificial intelligence to automate coding tasks such as identifying patterns in concept structures could enable real-time feedback for educators and learners. Lastly, longitudinal studies could leverage this scheme to track conceptual development over time, linking map evolution to academic performance, identity formation, and learner outcomes. Collectively, these directions highlight the coding scheme's potential for interdisciplinary impact, automation, and long-term learning analytics.
Results from the engineering identity study, which includes concept graph comparisons, engineering identity surveys, student definitions of engineering, and keyword selection to describe engineering, can be found in Kumar (2025). The concept maps helped explore the growth of individuals across the semester by comparing the complexity of the graphs and the increase in the number of words or concepts employed.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Biographies
