Abstract
This paper compares two popular R packages for structural equation modelling (SEM) – lavaan and seminr – to help researchers understand not only how they work, but also when each is most appropriate. Although both tools allow users to estimate the same structural models, they are grounded in different methodological traditions and are designed to support different research goals. Using an identical model, we estimated results with both covariance-based SEM (lavaan) and variance-based SEM (seminr) and compared their outputs, including model specification syntax, evaluation criteria, and reporting conventions. The results show that both approaches lead to substantively similar conclusions regarding the relationships between constructs, while differing in emphasis: lavaan provides richer global model-fit diagnostics, whereas seminr places greater emphasis on prediction-oriented assessment and convenient access to latent variable scores. The contribution of this study lies in its practical, hands-on demonstration rather than in a theoretical or simulation-based comparison. The findings reinforce that there is no universally “better” SEM approach; instead, methodological choice should be guided by the research objective. Researchers focussed on theory testing may benefit more from lavaan, while those prioritising prediction or exploratory analysis may find seminr more suitable. Ultimately, considering both perspectives can support more transparent, robust, and methodologically appropriate SEM applications.
Introduction
Structural equation modelling (SEM) has become one of the most powerful tools in the researcher’s toolkit, particularly in disciplines such as behavioural sciences, psychology, education, information systems, and marketing (Guenther et al., 2023; Hair et al., 2021, 2022; Sarstedt et al., 2022). Its appeal lies in its ability to examine complex theoretical models by combining measurement models – linking latent variables to their observed indicators – with structural models that map out the relationships between those latent variables (Hair et al., 2021; Kline, 2016). In other words, SEM allows researchers to go beyond simple correlations and regressions, enabling them to test entire theories in a single, cohesive framework (Almeida, 2024; Chin, 1998; Kline, 2023).
Within the ecosystem of R, an open-source software environment (R Core Team, 2025), researchers have access to several robust packages for SEM. Among them, lavaan (Rosseel, 2012; Rosseel et al., 2025) and seminr (Ray et al., 2025) stand out as two of the most widely adopted. Although they serve the same overarching purpose, their design philosophies and the problems they address differ significantly.
The lavaan package is the workhorse of covariance-based SEM (CB-SEM; Kline, 2016). It provides an extensive suite of functions for model specification, parameter estimation, model fit evaluation, and hypothesis testing. It is ideal for researchers focussed on theory confirmation and model fit assessment under the assumption of multivariate normality. On the other hand, seminr was developed to make partial least squares SEM (PLS-SEM; Hair et al., 2021), a variance-based approach, more accessible to R users. PLS-SEM is particularly useful when the goal is prediction rather than theory testing, or when researchers must deal with small sample sizes, formative measurement models, or non-normal data.
While numerous studies have discussed the conceptual differences between CB-SEM and PLS-SEM (e.g. Dash and Paul, 2021; Rigdon et al., 2017; Sarstedt et al., 2016; Schuberth et al., 2023), there is limited practical guidance on how the two approaches compare when applied to the same dataset using R. Researchers often face uncertainty about which package to choose, especially when theoretical models could be analysed using either method (Sakaria et al., 2023; Vuković, 2024). Therefore, a systematic, hands-on comparison is both timely and valuable for applied researchers seeking clarity in methodological selection.
The aim of this paper is to offer a practical, side-by-side comparison of lavaan and seminr. We demonstrate how the same model can be specified and estimated using both packages, highlighting differences in syntax, output, and interpretation. By doing so, we hope to provide researchers with clear guidance on selecting the approach that best matches their data characteristics, research objectives, and theoretical goals.
The study is guided by the following research questions:
By addressing these questions, this study aims to contribute to the methodological literature on when and how to appropriately choose between PLS-SEM and CB-SEM approaches.
The remainder of this paper is organised as follows. Section 2 reviews the theoretical foundations of CB-SEM and PLS-SEM. Section 3 describes the dataset and the comparative model, outlining the implementation steps for both lavaan and seminr while highlighting key syntactical and estimation differences. Section 4 presents and compares the empirical results, and Section 5 interprets the findings, discussing their methodological implications, practical recommendations, and directions for future research. Finally, Section 6 concludes the paper by summarising the key insights.
Literature review
Structural Equation Modelling (SEM) has evolved over several decades into one of the most versatile tools for empirical research. Its origins can be traced to the combination of path analysis (Hair et al., 2017; McDonald, 1996; Wright, 1921) and factor analysis (Spearman, 1904), resulting in a methodology that allows researchers to assess both measurement validity and structural relationships in a single statistical framework (Bollen, 1989; Bollen and Diamantopoulos, 2017; Kline, 2023; Lohmöller, 1989; Vinzi et al., 2010; Wold, 1975). Over the years, SEM has been widely adopted in disciplines such as psychology, marketing, and information systems, where researchers often deal with latent variables or constructs that cannot be measured directly.
At its core, SEM distinguishes between latent variables, which represent theoretical constructs that are not directly observable, and manifest (observed) variables, which serve as empirical indicators of those constructs. These relationships are formalised through a measurement model, which links latent variables to their observed indicators, and a structural model, which specifies the hypothesised directional relationships (paths) among latent variables (Hair et al., 2021, 2022; Kline, 2023). By integrating these two components within a single analytical framework, SEM enables researchers to simultaneously assess measurement validity and test theoretical relationships.
Covariance-based SEM (CB-SEM; Hair et al., 2017, 2025; Reinartz et al., 2009) is traditionally used for theory testing, focussing on how well a proposed model fits the observed data (Bollen, 1989; Kline, 2016). It has been the dominant approach to SEM. CB-SEM aims to reproduce the observed covariance matrix as closely as possible, placing strong emphasis on model fit and theory confirmation (Kline, 2016, 2023). Its popularity is partly due to software such as LISREL, AMOS, and Mplus, and more recently, the open-source lavaan package (Rosseel, 2012; Rosseel et al., 2025) in R. Lavaan has become the de facto standard for researchers seeking a free, powerful, and flexible implementation of CB-SEM. It supports a wide range of models – confirmatory factor analysis (CFA), mediation, moderation, multigroup analysis, and more – while offering robust options for model fit evaluation through indices such as CFI, TLI, RMSEA, and SRMR.
In contrast, variance-based SEM (PLS-SEM; Cepeda et al., 2024; Hair et al., 2022; Sarstedt et al., 2017; Schuberth et al., 2025) has grown in popularity for its predictive focus and flexibility in handling complex models, formative constructs, small sample sizes, and non-normal data (Hair et al., 2022). PLS-SEM does not attempt to reproduce the covariance matrix but rather maximises the explained variance of endogenous constructs. This makes it especially suitable for exploratory studies or prediction-oriented research (Chin, 1998; Goktas and Dirsehan, 2025; Richter and Tudoran, 2024). Traditionally, PLS-SEM has been associated with proprietary software such as SmartPLS and WarpPLS, but with the advent of seminr, researchers can now perform PLS-SEM directly within R, enabling reproducibility and seamless integration with other statistical workflows. Seminr (Ray et al., 2021, 2025) implements modern PLS-SEM algorithms and is popular for its intuitive model specification using constructs() and relationships() functions.
Recent methodological works (Gudergan et al., 2025; Hair et al., 2021, 2022; Sakaria et al., 2023) have emphasised that CB-SEM and PLS-SEM are complementary rather than competing approaches. CB-SEM is recommended when the research objective is theory confirmation and model fit evaluation, while PLS-SEM is recommended when the focus is on prediction, theory development, or when data do not meet CB-SEM assumptions (Hair and Alamer, 2022; Hair et al., 2019; Sharma et al., 2024).
Despite the availability of both lavaan and seminr, there is still limited academic discussion comparing the two tools in a systematic way. Most studies focus on either CB-SEM or PLS-SEM exclusively, rarely demonstrating how a researcher might specify and analyse the same model using both approaches. A few comparative studies (e.g. Dash and Paul, 2021; Reinartz et al., 2009; Sarstedt et al., 2024) have examined the conceptual differences between CB-SEM and PLS-SEM, but practical, hands-on guides remain scarce, especially for R users who must decide which package best suits their data characteristics and research objectives.
This gap motivates the present study. By providing a side-by-side comparison of lavaan and seminr, this paper contributes to the methodological literature by clarifying when each package is most appropriate, illustrating their syntax and outputs, and offering guidance for applied researchers.
Methodology
The purpose of this study is to provide a hands-on, side-by-side comparison of the lavaan and seminr packages in R. To achieve this, we adopt a demonstration-based research design, specifying and estimating the same structural equation model (SEM) using both packages and then comparing the syntax, estimation process, and results. This section outlines the model, data, procedures, and evaluation criteria.
Model specification
A hypothetical SEM was constructed to include both a measurement model and a structural model, reflecting typical use cases in the social sciences. The model consists of six latent variables (constructs) measured by multiple reflective indicators, along with structural paths linking the constructs to test hypothesised relationships. This design ensures that the example is sufficiently complex to demonstrate both covariance-based SEM (CB-SEM) and variance-based SEM (PLS-SEM) capabilities.
The proposed conceptual model (Figure 1) posits a series of hypothesised relationships among the study constructs. Specifically, social effects (SE) and perceived benefits (PB) are expected to jointly predict perceived value (PV); SE, PB, and PV are hypothesised to predict attitude (ATT); ATT, PB, and SE are proposed to influence behavioural intention (BI); and finally, BI, SE, and PV are anticipated to predict actual use (AU). This model was specified in both lavaan and seminr.

Figure 1. Conceptual framework of the proposed model.
lavaan syntax
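The model described above could be written in lavaan model syntax roughly as follows. This is a sketch rather than the original listing: the indicator names (SE1 to AU3) are assumed for illustration, since the variable names used in the study are not shown here. In lavaan syntax, `=~` defines a latent variable by its indicators and `~` defines a regression.

```r
# Sketch of the lavaan model syntax for the six-construct model.
# Indicator names (SE1 ... AU3) are assumptions for illustration.
model_cb <- '
  # Measurement model: three reflective indicators per construct
  SE  =~ SE1  + SE2  + SE3
  PB  =~ PB1  + PB2  + PB3
  PV  =~ PV1  + PV2  + PV3
  ATT =~ ATT1 + ATT2 + ATT3
  BI  =~ BI1  + BI2  + BI3
  AU  =~ AU1  + AU2  + AU3

  # Structural model: hypothesised paths among the constructs
  PV  ~ SE + PB
  ATT ~ SE + PB + PV
  BI  ~ ATT + PB + SE
  AU  ~ BI + SE + PV
'
```

The string can then be passed to `lavaan::sem()` together with a data frame containing the eighteen indicators.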
seminr syntax
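An equivalent specification in seminr might look like the sketch below. The indicator naming pattern (SE1 to SE3, and so on) is assumed, and the choice of `composite()` constructs with seminr's default weighting is also an assumption, since the original listing is not reproduced here.

```r
library(seminr)

# Measurement model: three indicators per construct.
# multi_items("SE", 1:3) expands to SE1, SE2, SE3 (assumed names).
measurement_model <- constructs(
  composite("SE",  multi_items("SE",  1:3)),
  composite("PB",  multi_items("PB",  1:3)),
  composite("PV",  multi_items("PV",  1:3)),
  composite("ATT", multi_items("ATT", 1:3)),
  composite("BI",  multi_items("BI",  1:3)),
  composite("AU",  multi_items("AU",  1:3))
)

# Structural model: the same paths as in the conceptual model
structural_model <- relationships(
  paths(from = c("SE", "PB"),        to = "PV"),
  paths(from = c("SE", "PB", "PV"),  to = "ATT"),
  paths(from = c("ATT", "PB", "SE"), to = "BI"),
  paths(from = c("BI", "SE", "PV"),  to = "AU")
)
```

Both objects are later passed to `estimate_pls()` along with the data.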
Data
For the purposes of demonstration, a synthetic dataset (N = 500) was generated with realistic parameter values, ensuring that model identification criteria are satisfied. Synthetic data allow full control of measurement error, construct correlations, and path strengths, and avoid concerns about data privacy or domain-specific bias. To ensure full reproducibility, the data were generated using a fixed random seed in R (version 4.5.1). The dataset was simulated using the simulateData() function in lavaan, ensuring consistency between the population model and the subsequent analyses. The dataset measures six latent variables: social effects (SE), perceived benefits (PB), perceived value (PV), attitude (ATT), behavioural intention (BI), and actual use (AU). Each construct is measured using three reflective indicators. All observed variables (indicators) were generated as continuous measures from the underlying population model and subsequently transformed into Likert-type scales for analysis. This discretisation step was applied for illustrative purposes and was not intended to evaluate estimator performance under ordinal data conditions. Because the dataset is synthetic, results are presented solely for illustrative comparison of software outputs rather than for substantive inference. The complete R code used to generate the synthetic dataset, including the random seed, is provided in Appendix A to facilitate full reproducibility.
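The data-generation procedure described above could be sketched as follows. All numeric population values, the seed, the indicator names, the five-point scale, and the quantile-based cut points are illustrative assumptions and not the values used in the study; `to_likert()` is a hypothetical helper.

```r
library(lavaan)

# Hypothetical population model: every numeric value here is an
# illustrative assumption, not a parameter from the study.
pop_model <- '
  SE  =~ 0.8*SE1  + 0.8*SE2  + 0.8*SE3
  PB  =~ 0.8*PB1  + 0.8*PB2  + 0.8*PB3
  PV  =~ 0.8*PV1  + 0.8*PV2  + 0.8*PV3
  ATT =~ 0.8*ATT1 + 0.8*ATT2 + 0.8*ATT3
  BI  =~ 0.8*BI1  + 0.8*BI2  + 0.8*BI3
  AU  =~ 0.8*AU1  + 0.8*AU2  + 0.8*AU3

  SE ~~ 0.3*PB
  PV  ~ 0.3*SE + 0.4*PB
  ATT ~ 0.1*SE + 0.2*PB + 0.6*PV
  BI  ~ 0.4*ATT + 0.2*PB + 0.2*SE
  AU  ~ 0.4*BI + 0.2*SE + 0.2*PV
'

# Continuous indicators for N = 500 with a fixed (assumed) seed
sim_cont <- simulateData(pop_model, sample.nobs = 500, seed = 2025)

# Hypothetical helper: discretise a continuous variable into a
# Likert-type scale using equal-frequency (quantile) cut points.
to_likert <- function(x, points = 5) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = points + 1))
  as.integer(cut(x, breaks = breaks, include.lowest = TRUE))
}

sim_likert <- as.data.frame(lapply(sim_cont, to_likert))
head(sim_likert)
```

Quantile cut points give roughly equal category frequencies; fixed thresholds on the continuous scale would be an equally plausible choice.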
Analysis in lavaan (CB-SEM)
The model was first estimated using the lavaan package (Rosseel, 2012; Rosseel et al., 2025), which applies maximum likelihood (ML) estimation under the assumption of multivariate normality. All analyses were conducted using lavaan (version 0.6-20) to ensure reproducibility. The following steps were performed:
1.
2.
3.
4.
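A typical lavaan workflow for an analysis of this kind (estimate the model, inspect global fit, extract standardised estimates and R²) might be sketched as follows. The helper name `run_cb_sem()` is hypothetical, and the example call uses lavaan's built-in HolzingerSwineford1939 data rather than the study's dataset.

```r
library(lavaan)

# Hypothetical helper (not from the paper): fit a CB-SEM with ML and
# collect the quantities that are typically reported.
run_cb_sem <- function(model, data) {
  fit <- sem(model, data = data, estimator = "ML")
  list(
    fit     = fit,
    # Global fit indices
    fit_idx = fitMeasures(fit, c("chisq", "df", "pvalue",
                                 "cfi", "tli", "rmsea", "srmr")),
    # Standardised loadings and paths with SEs, z values, and CIs
    std_est = standardizedSolution(fit),
    # R-squared for indicators and endogenous latent variables
    r2      = lavInspect(fit, "rsquare")
  )
}

# Example call on lavaan's built-in data (NOT the study's dataset)
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  speed   ~ visual + textual
'
res <- run_cb_sem(hs_model, HolzingerSwineford1939)
round(res$fit_idx, 3)
```

The same helper could be called with the six-construct model string and the simulated data frame.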
Analysis in seminr (PLS-SEM)
Next, the model was specified and estimated using seminr (Ray et al., 2021, 2025), which implements the Partial Least Squares (PLS) algorithm. The analyses were performed using seminr (version 2.3.7), consistent with current methodological recommendations. The steps included:
1.
2.
3.
4.
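A comparable seminr workflow (estimate the PLS path model, bootstrap it, and collect reliability, validity, and path results) might be sketched as follows. The helper name `run_pls_sem()`, the number of bootstrap resamples, and the seed are assumptions, and the example call uses seminr's bundled mobi data rather than the study's dataset.

```r
library(seminr)

# Hypothetical helper (not from the paper): estimate a PLS path model,
# bootstrap it, and collect commonly reported results.
run_pls_sem <- function(data, measurement_model, structural_model,
                        nboot = 1000, seed = 123) {
  pls <- estimate_pls(data = data,
                      measurement_model = measurement_model,
                      structural_model  = structural_model)
  boot <- bootstrap_model(seminr_model = pls, nboot = nboot,
                          cores = 1, seed = seed)
  s <- summary(pls)
  list(
    paths       = s$paths,            # path coefficients and R-squared
    reliability = s$reliability,      # alpha, rhoC, AVE, rhoA
    htmt        = s$validity$htmt,    # HTMT ratios
    boot_paths  = summary(boot)$bootstrapped_paths,  # bootstrap results
    scores      = pls$construct_scores               # latent variable scores
  )
}

# Example on seminr's bundled mobi data (NOT the study's dataset)
mm <- constructs(
  composite("Image",        multi_items("IMAG", 1:5)),
  composite("Satisfaction", multi_items("CUSA", 1:3))
)
sm <- relationships(paths(from = "Image", to = "Satisfaction"))
res <- run_pls_sem(mobi, mm, sm, nboot = 100)
res$reliability
```

A small `nboot` keeps the example fast; reported analyses usually rely on far more resamples.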
Comparison criteria
The results from lavaan and seminr were compared on the following dimensions:
The goal of this comparison is not to determine the superiority of one approach over the other, but rather to illustrate their respective strengths, trade-offs, and practical implications when applied to the same model and dataset. Accordingly, differences observed in the results are interpreted in light of the underlying estimation philosophies and output conventions of CB-SEM and PLS-SEM, rather than as evidence of methodological dominance.
Results
This section presents the outcomes of estimating the same structural equation model (SEM) using lavaan (CB-SEM) and seminr (PLS-SEM). The results are organised into three parts: measurement model assessment, structural model results, and a comparative summary.
Measurement model results
Lavaan (CB-SEM)
As shown in Table 1, the confirmatory factor analysis (CFA) results indicate that all factor loadings were statistically significant (p < 0.001) and exceeded the recommended threshold of 0.70–0.708 (Cepeda et al., 2024; Hair et al., 2014, 2019), thereby establishing strong indicator reliability. Composite reliability (CR) and average variance extracted (AVE) were not automatically generated because lavaan does not compute these metrics by default. However, they can be derived manually or obtained using supplementary packages such as semTools (Jorgensen et al., 2025). These results suggest that the measurement model fits the data well and adequately represents the underlying constructs.
Table 1. Measurement model results (lavaan).
Source: Authors’ own compilation.
Note. Std err: standard error; Z: test statistic; CI: confidence interval.
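As a minimal sketch of the manual derivation mentioned above, AVE and composite reliability can be computed directly from a construct's standardised loadings. The loading values below are hypothetical, and the formula for composite reliability assumes uncorrelated measurement errors.

```r
# Standardised loadings for one construct (hypothetical values)
lambda <- c(0.85, 0.88, 0.90)

# AVE: mean of the squared standardised loadings
ave <- mean(lambda^2)

# Composite reliability: with uncorrelated errors, each indicator's
# error variance is 1 - lambda^2
cr <- sum(lambda)^2 / (sum(lambda)^2 + sum(1 - lambda^2))

round(c(AVE = ave, CR = cr), 3)   # AVE 0.769, CR 0.909
```

In practice the loadings would be taken from the standardised lavaan solution for each construct in turn.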
Seminr (PLS-SEM)
The PLS-SEM results (Table 2) provided strong support for the reliability and validity of the measurement model. All indicator loadings exceeded the recommended 0.708 threshold, confirming indicator reliability. Multicollinearity was not a concern, as all VIF values ranged from 1.69 to 3.54, well below the conservative cutoff of 5. Internal consistency was robust, with Cronbach’s alpha values between 0.829 and 0.906 and composite reliability (rhoC) values ranging from 0.897 to 0.941. Convergent validity was also established, as all constructs achieved AVE values between 0.745 and 0.842, comfortably above the 0.50 benchmark (Hair et al., 2021, 2022). The rhoA values (0.829–0.909) further supported construct reliability.
Table 2. Measurement model results (seminr).
Source: Authors’ own compilation.
Note. VIF: variance inflation factor; alpha: Cronbach’s alpha; rhoC: composite reliability; AVE: average variance extracted; rhoA: consistent reliability coefficient.
Discriminant validity was further examined using both the HTMT criterion (Henseler et al., 2015) and the Fornell-Larcker criterion (Fornell and Larcker, 1981). As shown in Table 3, all HTMT ratios ranged between 0.380 and 0.771 – well below the conservative threshold of 0.85 (Henseler et al., 2015) – indicating strong discriminant validity. The Fornell-Larcker results (Table 4) also supported this conclusion, with the square roots of AVE (diagonal values) exceeding the correlations among constructs in every case. Together, these results confirm that each construct in the model is empirically distinct and measures a unique conceptual domain.
Table 3. Heterotrait-monotrait (HTMT) ratio.
Source: Authors’ own compilation.
Table 4. Fornell-Larcker criterion.
Source: Authors’ own compilation.
Note. Bolded values = √AVE; Unbolded values = inter-construct correlations.
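The logic behind the two discriminant validity checks can be sketched in base R. This is an illustration of the criteria under their standard definitions, not seminr's internal implementation; the function names and the toy correlation values are hypothetical.

```r
# HTMT for one pair of constructs, from an indicator correlation matrix R:
# mean between-construct correlation divided by the geometric mean of
# the mean within-construct correlations.
htmt_pair <- function(R, items_a, items_b) {
  mono_mean <- function(items) {
    block <- R[items, items]
    mean(block[upper.tri(block)])
  }
  hetero <- mean(R[items_a, items_b])
  hetero / sqrt(mono_mean(items_a) * mono_mean(items_b))
}

# Fornell-Larcker: sqrt(AVE) of each construct should exceed its largest
# absolute correlation with any other construct.
fornell_larcker_ok <- function(ave, construct_cor) {
  off_diag <- abs(construct_cor)
  diag(off_diag) <- 0
  sqrt(ave) > apply(off_diag, 1, max)
}

# Toy example: within-construct r = 0.6, between-construct r = 0.3
items_a <- c("A1", "A2", "A3")
items_b <- c("B1", "B2", "B3")
all_items <- c(items_a, items_b)
R <- matrix(0.3, 6, 6, dimnames = list(all_items, all_items))
R[items_a, items_a] <- 0.6
R[items_b, items_b] <- 0.6
diag(R) <- 1
htmt_ab <- htmt_pair(R, items_a, items_b)   # 0.3 / 0.6 = 0.5

ave <- c(A = 0.75, B = 0.80)
construct_cor <- matrix(c(1, 0.5, 0.5, 1), 2, 2,
                        dimnames = list(names(ave), names(ave)))
fl <- fornell_larcker_ok(ave, construct_cor)
```

Here the HTMT of 0.5 falls below the 0.85 threshold and both constructs pass the Fornell-Larcker check.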
Structural model results
Lavaan (CB-SEM)
Table 5 summarises the structural model estimates. Most hypothesised paths were significant and in the expected directions. Perceived value (PV) was strongly predicted by social effects (SE; β = 0.276, p < 0.001) and perceived benefits (PB; β = 0.443, p < 0.001), explaining 39.3% of its variance. Attitude (ATT) was mainly driven by PV (β = 0.669, p < 0.001), with an R² of 57.8%; the path from SE to ATT was the only non-significant relationship. Behavioural intention (BI) was significantly influenced by ATT, PB, and SE (R² = 52.5%). Actual use (AU) was explained by BI, SE, and PV (R² = 41.0%). Overall, the model demonstrates strong explanatory power across constructs.
Table 5. Hypothesis testing using lavaan.
Source: Authors’ own compilation.
Note. β: path coefficient; R²: explanatory power; SE: standard error; Z: test statistic; CI: confidence interval.
The global fit indices indicated excellent overall model fit (Table 6). Table 6 also reports the benchmark thresholds widely recommended in the SEM literature (Hair et al., 2019; Hu and Bentler, 1998; Kline, 2016). Although the Chi-square test was statistically significant, this result is expected given the sample size (N = 500), because the test becomes sensitive to trivial misfit in large samples; it is therefore interpreted alongside the other fit indices, consistent with standard CB-SEM practice. Figure 2 depicts the measurement and structural models estimated using lavaan.
Table 6. Global fit indices and recommended thresholds.
Source: Authors’ own compilation.

Figure 2. Measurement and structural model results (lavaan).
In covariance-based SEM, global fit indices such as the Chi-square test, CFI, TLI, RMSEA, and SRMR assess how well the model-implied covariance matrix reproduces the observed covariance structure, thereby supporting theory confirmation and model adequacy evaluation. In contrast, PLS-SEM does not emphasise global goodness-of-fit, as its primary objective is variance explanation and prediction rather than exact model reproduction. Consequently, model evaluation in seminr focusses on explained variance (R²), path coefficient significance, and measurement quality, reflecting its prediction-oriented philosophy.
Seminr (PLS-SEM)
The PLS-SEM results showed that almost all hypothesised paths were significant (Table 7). Social effects (SE) significantly predicted perceived value (PV), behavioural intention (BI), and actual use (AU), but not attitude (ATT). Perceived benefits (PB) strongly influenced PV, ATT, and BI, while perceived value (PV) was the strongest predictor of attitude and also significantly affected actual use. Attitude significantly predicted behavioural intention, which in turn strongly predicted actual use. The model demonstrated moderate explanatory power (R² = 0.316–0.450), with effect sizes indicating that PV, ATT, and BI were the most influential predictors. All VIF values were well below recommended thresholds, confirming no multicollinearity issues. Unlike lavaan (CB-SEM), seminr does not report global model fit indices, reflecting its prediction-oriented nature. Figure 3 depicts the measurement and structural models estimated using seminr.
Table 7. Hypothesis testing using seminr.
Source: Authors’ own compilation.
Note. β: path coefficient; R²: explanatory power; f²: effect size; VIF: variance inflation factor; B.: Bootstrap; Std Dev.: standard deviation; T Stat.: test statistic; CI: confidence interval.

Figure 3. Measurement and structural model results from seminr.
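The f² effect sizes reported for the structural model follow the usual definition: the change in R² of an endogenous construct when one predictor is omitted, relative to the unexplained variance of the full model. A minimal sketch with hypothetical R² values (not taken from the study's tables) is shown below; the function name is an assumption.

```r
# f-squared effect size of a single predictor.
# r2_included: R-squared of the endogenous construct with the predictor
# r2_excluded: R-squared after removing that predictor
f_squared <- function(r2_included, r2_excluded) {
  (r2_included - r2_excluded) / (1 - r2_included)
}

# Hypothetical values for illustration only
f2 <- f_squared(r2_included = 0.45, r2_excluded = 0.40)
round(f2, 3)   # 0.091
```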
Because PLS-SEM focusses on prediction, the PLSpredict procedure (Shmueli et al., 2019) available in seminr was used to evaluate the model’s out-of-sample predictive accuracy in addition to its in-sample fit, with a linear regression model (LM) as the benchmark. Tables 8 and 9 present the indicator-level and construct-level predictive results, respectively. Across all indicators, the PLS model showed only slightly higher RMSE and MAE values in the out-of-sample condition than in the in-sample condition, indicating minimal loss of predictive accuracy when applied to new data. Compared with the LM benchmark, PLS performed competitively, with only marginal differences across most indicators. At the construct level, the differences between in-sample (IS) and out-of-sample (OOS) metrics were very small, with overfitting values ranging from 0.009 to 0.022, which suggests that overfitting is negligible and the model generalises well. Overall, the results indicate that the PLS-SEM model provides acceptable and stable predictive power for all endogenous constructs (PV, ATT, BI, and AU).
Table 8. Indicator-level predictive performance using PLSpredict.
Source: Authors’ own compilation.
Note. PLS: partial least squares; IS: in-sample; OOS: out-of-sample; LM: linear model; RMSE: root mean squared error; MAE: mean absolute error.
Table 9. Construct-level predictive performance using PLSpredict.
Source: Authors’ own compilation.
Note. MSE: mean squared error; MAE: mean absolute error.
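The in-sample versus out-of-sample comparison underlying this kind of assessment can be illustrated with a k-fold cross-validation of a plain linear model, which plays the role of the LM benchmark. This is a base R sketch of the general idea on toy data, not the seminr implementation of PLSpredict; the data, the seed, and the number of folds are assumptions.

```r
set.seed(42)

# Toy data: one outcome and two predictors (illustrative only)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n)
dat <- data.frame(y, x1, x2)

rmse <- function(e) sqrt(mean(e^2))
mae  <- function(e) mean(abs(e))

# In-sample errors: fit and evaluate on the full data
fit_all <- lm(y ~ x1 + x2, data = dat)
is_err  <- residuals(fit_all)

# Out-of-sample errors: each case is predicted by a model that was
# estimated without the fold containing that case
k    <- 10
fold <- sample(rep(seq_len(k), length.out = n))
oos_err <- numeric(n)
for (f in seq_len(k)) {
  test  <- fold == f
  fit_f <- lm(y ~ x1 + x2, data = dat[!test, ])
  oos_err[test] <- dat$y[test] - predict(fit_f, newdata = dat[test, ])
}

metrics <- rbind(
  IS  = c(RMSE = rmse(is_err),  MAE = mae(is_err)),
  OOS = c(RMSE = rmse(oos_err), MAE = mae(oos_err))
)
round(metrics, 3)
```

A small gap between the IS and OOS rows corresponds to the low overfitting values discussed above.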
Comparative summary
To enhance methodological transparency, a comparative overview of the CB-SEM model estimated using lavaan and the PLS-SEM model estimated using seminr was compiled. Table 10 summarises key methodological, estimation, and reporting characteristics of the two modelling approaches as implemented in the present analysis. Consistent with their established methodological orientations, lavaan places greater emphasis on global model fit and theory confirmation, whereas seminr focusses on explained variance, prediction-oriented assessment, and flexibility with respect to distributional assumptions.
Table 10. Comparison of CB-SEM (lavaan) and PLS-SEM (seminr).
Source: Authors’ own compilation.
It is important to note that the distinction in distributional assumptions reflects default settings rather than a methodological limitation. The lavaan package supports robust estimators (e.g. MLR and WLSMV) and bootstrapping procedures, allowing CB-SEM models to accommodate non-normal data where required.
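As a brief sketch of how a robust estimator is requested, the `estimator` argument of `sem()` can be switched from "ML" to "MLR". The example uses lavaan's built-in HolzingerSwineford1939 data and a textbook three-factor model, not the study's model or dataset.

```r
library(lavaan)

# Illustrative model on lavaan's built-in data (NOT the study's model)
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  speed   ~ visual + textual
'

# Default maximum likelihood
fit_ml <- sem(hs_model, data = HolzingerSwineford1939, estimator = "ML")

# Robust ML: robust standard errors and a scaled test statistic
fit_mlr <- sem(hs_model, data = HolzingerSwineford1939, estimator = "MLR")

fitMeasures(fit_ml,  c("chisq", "cfi", "rmsea"))
fitMeasures(fit_mlr, c("chisq.scaled", "cfi.robust", "rmsea.robust"))

# For ordinal (Likert-type) indicators, a call along the lines of
#   sem(model, data = dat, estimator = "WLSMV", ordered = TRUE)
# treats the indicators as ordered categorical; `model` and `dat`
# are placeholders here.
```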
Comparative interpretation based on study results
Table 11 reports the coefficients of determination (R²) for the endogenous constructs obtained from both estimation approaches. In this illustrative application, the CB-SEM model yielded moderately higher R² values than the PLS-SEM model across all endogenous constructs. While PLS-SEM is often characterised as maximising explained variance, prior methodological studies have shown that CB-SEM can produce comparable or even higher R² values under conditions of well-specified reflective measurement models and favourable data characteristics (Dash and Paul, 2021; Deng and Yuan, 2023; Rožman et al., 2020; Vuković, 2024).
Table 11. Explained variance (R²) comparison across estimation methods.
Source: Authors’ own compilation.
It is important to emphasise that these differences should be interpreted within the context of the present tutorial demonstration. When indicators exhibit high reliability and the data-generating process aligns closely with the model specification, CB-SEM is capable of efficiently estimating latent covariance structures, which may be reflected in higher explained variance for endogenous constructs (Hair et al., 2021). The observed R² patterns therefore serve to illustrate how different estimation philosophies may yield slightly different summaries of model performance when applied to the same model and dataset.
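One mechanism that may contribute to such a pattern is attenuation: when constructs are represented by composites of error-laden indicators, relationships between composites tend to be weaker than those between the underlying latent variables. The base R simulation below is a sketch of that general idea only. It uses unit-weighted composites rather than PLS weights, and all values (path of 0.6, loadings of 0.8, sample size, seed) are assumptions, not the study's analysis.

```r
set.seed(1)
n <- 5000

# Latent predictor and outcome; population R-squared = 0.6^2 = 0.36
xi  <- rnorm(n)
eta <- 0.6 * xi + rnorm(n, sd = sqrt(1 - 0.6^2))

# Hypothetical helper: k reflective indicators with equal loadings
make_items <- function(latent, loading = 0.8, k = 3) {
  sapply(seq_len(k), function(i) {
    loading * latent + rnorm(length(latent), sd = sqrt(1 - loading^2))
  })
}
x_items <- make_items(xi)
y_items <- make_items(eta)

# R-squared at the latent level (what a well-specified CB-SEM targets)
r2_latent <- summary(lm(eta ~ xi))$r.squared

# R-squared between unit-weighted composites of the indicators
x_comp <- rowMeans(x_items)
y_comp <- rowMeans(y_items)
r2_composite <- summary(lm(y_comp ~ x_comp))$r.squared

round(c(latent = r2_latent, composite = r2_composite), 3)
```

With these settings each composite has a reliability of about 0.84, so the composite-level R² is expected to be near 0.36 × 0.84² ≈ 0.26, below the latent-level value.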
Summary of findings
Overall, both estimation approaches produced substantively consistent results, with similar path directions and comparable levels of statistical significance. The comparison highlights several practical distinctions relevant to applied researchers:
Taken together, these results underscore that CB-SEM and PLS-SEM serve complementary methodological purposes. In this illustrative example, CB-SEM was well suited for explanatory model evaluation, whereas PLS-SEM provided a convenient framework for prediction-oriented assessment and detailed measurement diagnostics. The comparison is intended to guide researchers in selecting an appropriate approach based on their research objectives, data characteristics, and reporting priorities, rather than to advocate the superiority of one method over the other.
Discussion and practical implications
The aim of this study was to provide a practical and accessible comparison of two widely used R packages for structural equation modelling – lavaan and seminr – rather than to benchmark competing estimation paradigms. The paper is therefore best understood as a reproducible methodological tutorial demonstrating how the same conceptual model can be specified, estimated, and interpreted within different SEM frameworks.
Within this illustrative context, the results highlight the complementary strengths of covariance-based and variance-based approaches. Although both methods yielded substantively consistent conclusions regarding the hypothesised relationships, they differ in emphasis, evaluation logic, and practical implementation. The lavaan implementation, grounded in CB-SEM, is particularly suited to confirmatory, theory-driven research where global model fit and theoretical coherence are central. Its comprehensive reporting of fit indices (e.g. CFI, TLI, RMSEA, and SRMR) allows researchers to assess how well a proposed model reproduces the observed covariance structure.
By contrast, seminr reflects the prediction-oriented logic of PLS-SEM, prioritising variance explanation, measurement diagnostics, and access to latent variable scores. Its workflow and automated reporting features make it especially attractive in exploratory, applied, and decision-oriented research contexts where predictive relevance is of primary concern (Hair and Alamer, 2022; Ringle et al., 2023). More broadly, the comparison highlights how differences between the two approaches extend beyond estimation philosophy to practical aspects such as model specification, workflow design, and interpretation of outputs within the R environment.
Although PLS-SEM is often characterised as maximising explained variance (R²), the present illustrative application produced slightly higher R² values for the CB-SEM solution. Under conditions of adequate sample size, strong indicator reliability, and well-specified reflective measurement models, such outcomes are theoretically plausible and have been observed in prior comparative work (Deng and Yuan, 2023; Vuković, 2024). In this tutorial setting, these differences serve to illustrate how data characteristics and estimation logic interact, rather than to support general claims about estimator superiority.
Taken together, the findings emphasise that methodological choices should be guided by research purpose, theoretical maturity, and measurement design rather than methodological preference alone (Hair and Alamer, 2022; Ringle et al., 2023). When applied in alignment with their underlying assumptions, both approaches can provide valuable and complementary insights.
Methodological contribution
This study contributes to the SEM literature by offering a fully reproducible, side-by-side comparison of lavaan and seminr within a single open-source environment (R). By documenting data generation, model specification, estimation steps, and software versions, the paper lowers the barrier for applied researchers, particularly those new to SEM, who seek practical guidance in selecting and implementing appropriate modelling tools.
Rather than advancing new methodological claims, the contribution lies in translating conceptual distinctions between CB-SEM and PLS-SEM into transparent and replicable analytical workflows. In doing so, the study moves beyond abstract methodological comparisons by demonstrating how these differences manifest in practice, complementing prior discussions in the methodological literature (Dash and Paul, 2021; Vuković, 2024).
Limitations and future directions
Several limitations should be acknowledged. First, the analysis relies on synthetic data with reflective indicators and a relatively simple model structure, chosen to maximise tutorial clarity rather than methodological generalisation. Second, the study does not constitute a formal simulation design comparing estimator performance under varying conditions.
Future research could extend this comparison by incorporating formative constructs, nonlinear relationships, multi-group analyses, and systematic variations in sample size or distributional assumptions. Applying the same comparative framework to empirical datasets would further assess the generalisability of the illustrative differences observed here and build upon existing comparative investigations (Dash and Paul, 2021; Deng and Yuan, 2023; Vuković, 2024).
Conclusion
This paper set out to provide a clear, reproducible, and practically oriented comparison of two widely used R packages for structural equation modelling – lavaan (CB-SEM) and seminr (PLS-SEM) – by specifying and estimating the same conceptual model using both approaches. Rather than advocating for a particular modelling paradigm, the study illustrates how similar theoretical models can be operationalised within different SEM frameworks, each grounded in distinct methodological philosophies and analytical priorities.
The results demonstrate that, while both approaches yielded substantively consistent conclusions regarding the hypothesised relationships, they differ meaningfully in terms of estimation logic, model evaluation, and reporting conventions. These differences are reflected not only in their underlying assumptions but also in how models are specified, estimated, and interpreted in practice within the R environment. In particular, lavaan provides a covariance-based framework with comprehensive global fit assessment, while seminr emphasises variance explanation, measurement diagnostics, and accessible workflows for applied research.
Importantly, the comparison does not suggest that one approach is universally superior to the other. Instead, it highlights the importance of aligning methodological choices with the primary research goal, data characteristics, and measurement design. By offering a hands-on, side-by-side demonstration within a single open-source environment, this study provides applied researchers, particularly those working in R, with practical guidance for making informed and transparent methodological decisions.
It is also important to note that the present study does not constitute a simulation-based evaluation of estimator performance, nor does it aim to establish general claims about the relative strengths of CB-SEM and PLS-SEM. The use of synthetic data serves a pedagogical purpose, enabling the modelling steps and software behaviour to be demonstrated in a controlled and reproducible manner. Future research could build on this tutorial foundation by incorporating systematic simulation designs, applying the comparison to diverse empirical datasets, or integrating predictive validation and hybrid approaches that combine SEM with machine learning techniques.
Overall, this paper aims to serve as a practical entry point for researchers seeking to understand and implement SEM in R, and to support more informed, transparent, and methodologically appropriate use of SEM tools in applied research.
Appendix A: Synthetic data generation using lavaan
Synthetic data generation for SEM demonstration.
Acknowledgements
The authors acknowledge the use of ChatGPT-5.2 for copy-editing and language refinement in specific parts of the manuscript. All conceptual, analytical, and interpretive components were solely developed by the authors.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data analysed in this study can be obtained from the corresponding author upon reasonable request.
