Abstract
Accurately predicting lower body dimensions in older women is essential for improving garment fit and wearing comfort. However, challenges in data acquisition and the morphological variability among older adults complicate this task. To address these issues, this study adopts a classification-followed-by-modeling strategy, using a small set of key anthropometric inputs to estimate critical lower-body measurements relevant to pants pattern construction. Anthropometric data were collected from 217 women aged 60–80 in Northeast China, capturing 34 lower body parameters. Principal component analysis was performed to reduce dimensionality and extract six primary body shape factors. A two-step clustering method was then applied to determine the optimal number of body types and to analyze their morphological characteristics. Subsequently, stepwise linear regression models were developed for each body type using stature, weight, and abdominal girth as predictor variables. The results show that most of the 12 predicted dimensions had prediction errors within 2 cm across the three body types. Overall, this research provides a data-driven foundation for customized pants pattern development and intelligent production of garments tailored for older women.
Introduction
With the accelerating global trend of population aging, the number of older women has increased significantly. In 2019, the global population aged 60 years and older reached 1 billion, and this figure is projected to double to 2 billion by 2050, accounting for 22% of the total global population.1,2 As a vital consumer group in the apparel industry, older women exhibit notable variability in body morphology, commonly characterized by abdominal protrusion, flattened hips, and altered lower-body proportions.3,4 These changes pose challenges to conventional sizing systems, often resulting in poor fit issues such as tight waistbands, gaping at the hips, or strain in the crotch area, thereby compromising comfort and mobility.5–7
Accurate anthropometric data serve as the foundation for apparel pattern design and are essential for constructing well-fitting garment templates.8,9 Body measurement techniques can be categorized into manual contact methods and non-contact three-dimensional (3D) scanning technologies. Traditional tools such as tape measures are heavily dependent on the operator’s skill and experience, leading to subjective and less reproducible outcomes that are difficult to scale to large populations.8,10 In contrast, 3D body scanning offers high-precision and repeatable data collection. Nevertheless, its high cost, operational complexity, and limited acceptance among older individuals constrain its widespread adoption in garment development for older adults.5,11 Thus, developing lower-body size prediction models based on a few easily accessible measurements is essential to advancing intelligent pants pattern design for older people.
Various prediction-based approaches have been proposed for estimating body dimensions. For example, Tan et al. 12 constructed multiple regression models using height, bust, and waist girth (WC) to predict 28 body dimensions in women. Li et al. 13 classified female lower-body shapes based on width–thickness ratios and derived girth estimates from key landmarks to optimize custom pants block generation. Zanwar et al. 14 applied multiple linear regression (MLR) to establish predictive mechanisms using limited parameters, thereby improving measurement efficiency. More recently, machine learning algorithms such as artificial neural networks (ANN) and generalized regression neural networks (GRNN) have been employed in anthropometric modeling with improved fitting accuracy.15–17 Other studies have used 2D image-based techniques; for instance, Gu et al. 18 extracted frontal and lateral width–depth features from photographs to predict girths.
Although these methods have shown promising results in younger or standard-bodied populations, they often underperform in older women due to complex morphology and high inter-individual variability. Even among individuals of similar height and weight, older women may exhibit substantial differences in waist, abdomen, hip shapes, and posture curvature. Therefore, integrating body type classification into the modeling process—via a classification followed by modeling approach—offers a viable strategy to enhance model accuracy and individual adaptability.12,19 This study collected lower-body 3D body scan data and manual measurement data from 217 women aged 60–80 in Northeast China and developed a corresponding prediction framework. The goal is to estimate key pants design dimensions from a limited set of easily measurable inputs, providing a data foundation for personalized pants development for older women. As shown in Figure 1, the study comprises the following steps:
(1) Representative lower body variables are extracted using a hybrid approach combining 3D body scanning and manual measurements.
(2) Dimensionality reduction via principal component analysis (PCA) is performed to identify major shape characteristics, followed by two-step clustering (TSC) for body type classification.
(3) Multiple linear regression models are developed using stature (H), weight (W), and abdominal girth (AC) to predict 12 essential lower-body dimensions for each body type, along with performance evaluation of the models.

Figure 1. Framework of the proposed prediction method.
Methods
Data collection
This study recruited 217 older women aged 60–80 from Northeast China as the research subjects. Ethical approval was obtained from the university’s institutional review board, and all participants were informed of the study’s purpose and procedures. Written informed consent was obtained from each participant prior to data collection. The study was conducted between November 2020 and January 2021. Three-dimensional body scanning was conducted using the VITUS scanner. The scanning environment was maintained at 25 ± 3°C to ensure consistency with standard nude measurement conditions. The data acquisition procedures followed the ISO 20685-1:2018 standard “3D scanning methodologies for internationally compatible anthropometric databases,” ensuring consistency and comparability of measurement results. 20
To address privacy concerns, participants wore ultra-thin, non-restrictive privacy garments during scanning. All scan data were anonymized by removing personally identifiable information, and the resulting files were securely stored in an encrypted, access-controlled system. The research team members responsible for scanning and measurement received formal training in equipment operation and research ethics, including participant privacy protection, standardized anthropometric techniques, and data confidentiality protocols.
To ensure the accuracy of certain key dimensions that are difficult to extract directly from 3D body scan data, supplementary manual measurements were taken using a flexible tape measure. All measurement definitions followed the Chinese National Standard GB/T 16160-2017. 21 A total of 34 lower-body feature variables were collected, including height, girths, widths, depths, lengths, and angles, as summarized in Table 1. Figure 2 illustrates the anatomical landmarks corresponding to each measurement variable. Each landmark is encoded with letters and numbers: letters represent specific anatomical regions, while numbers denote the horizontal plane position. Specifically, the number “1” corresponds to the right side of the body, “3” to the left side, “2” to the anterior midline, and “4” to the posterior midline.
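The landmark coding rule described above can be illustrated with a small decoder. This is only a sketch: the digit meanings come from the text, but the region letters used in the example ("W", "H") are hypothetical placeholders and not the study's actual codes.

```python
# Plane-position digits as described in the text.
PLANE_POSITION = {
    "1": "right side",
    "2": "anterior midline",
    "3": "left side",
    "4": "posterior midline",
}

def decode_landmark(code: str) -> tuple[str, str]:
    """Split a landmark code into its region letters and plane position."""
    region = code.rstrip("0123456789")   # leading letters = anatomical region
    digit = code[len(region):]           # trailing digit = plane position
    if not region or digit not in PLANE_POSITION:
        raise ValueError(f"unrecognized landmark code: {code!r}")
    return region, PLANE_POSITION[digit]

# "W2" and "H3" are hypothetical codes used only for illustration.
print(decode_landmark("W2"))
print(decode_landmark("H3"))
```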
Table 1. Details of lower-body feature variables.

Figure 2. Key measurement points and coding for lower-body analysis.
Data preprocessing
To ensure the accuracy and reliability of the sample data, a series of preprocessing and filtering procedures were performed using SPSS 26.0. The preprocessing included three main procedures: outlier detection and removal, correction of measurement errors, and assessment of data normality. Together, they ensured that the dataset used for modeling was both statistically valid and technically robust.
Boxplots were first employed to detect potential outliers and missing values. Data points lying more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile were flagged as outliers. These values were cross-validated against the original 3D body scan files. If confirmed as errors, they were either corrected or excluded from the dataset. Missing values were handled based on the source and nature of the variables. When corresponding manual measurements were available, they were used to replace missing 3D body scan data. Variables with unresolvable or irrecoverable missing data were excluded from further analysis.
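The 1.5 × IQR rule can be sketched as follows. This is an illustration under assumptions, not the SPSS procedure: the quartile convention (Python's "inclusive" method) may differ slightly from the hinges SPSS uses for boxplots, and the sample values are hypothetical.

```python
import statistics

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(values, k=1.5):
    """Return the values lying outside the fences, for manual review."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if v < lower or v > upper]

# Hypothetical abdominal girths in cm (not study data); 140.0 is a planted error.
girths = [88.0, 91.5, 94.0, 95.5, 96.0, 97.5, 99.0, 101.0, 103.5, 140.0]
print(flag_outliers(girths))
```

Flagged values are only candidates; as in the text, each would still be checked against the scan file before being corrected or dropped.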
The normality of all 34 lower-body measurement variables was evaluated using Quantile–Quantile (Q–Q) plots to verify the applicability of parametric statistical methods in subsequent analysis. In the majority of cases, the points aligned closely with the diagonal line, indicating an acceptable approximation to the normal distribution. Figure 3 presents a Q–Q plot for abdominal girth as an illustrative example. After data preprocessing, a total of 209 valid samples were retained for modeling. The Q–Q plot assessment indicated that the retained variables were approximately normally distributed, supporting their suitability for PCA and regression analysis.
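A Q–Q check of this kind can be sketched with the standard library alone. The plotting positions (i − 0.5)/n and the use of a correlation coefficient to summarize how straight the plot is are assumptions of this sketch; the study itself reports a visual assessment in SPSS.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Pair each sorted observation with its theoretical normal quantile."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    # Plotting positions (i - 0.5) / n are one common convention.
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, data))

def qq_correlation(values):
    """Pearson correlation of the Q-Q points; values near 1 suggest normality."""
    pts = qq_points(values)
    xs, ys = zip(*pts)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical girths in cm (not study data).
sample = [88.0, 91.5, 94.0, 95.5, 96.0, 97.5, 99.0, 101.0, 103.5, 106.0]
print(round(qq_correlation(sample), 3))
```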

Figure 3. Q–Q plot of abdominal circumference for the normality test.
Extraction of principal components
Principal component analysis is a multivariate statistical method that reduces data dimensionality while retaining most of the variance in the original data. It transforms multiple correlated variables into a smaller set of uncorrelated components. 22 In this study, the number of principal components retained was determined based on two criteria: (1) each component must have an eigenvalue greater than 1.0; and (2) components were selected in descending order of rotated factor loadings until the cumulative explained variance reached a sufficient threshold to represent the lower body shape information adequately.
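The eigenvalue criterion can be illustrated with a short sketch. It assumes the eigenvalues of a correlation matrix are already available and that components are ranked by eigenvalue; the eigenvalues shown are hypothetical and the function name is illustrative.

```python
def retain_components(eigenvalues, min_eigenvalue=1.0):
    """Keep components with eigenvalue > min_eigenvalue.

    Returns (eigenvalue, cumulative % of total variance) for each kept component.
    """
    total = sum(eigenvalues)
    kept, cumulative = [], 0.0
    for ev in sorted(eigenvalues, reverse=True):
        if ev <= min_eigenvalue:
            break
        cumulative += ev / total
        kept.append((ev, round(100 * cumulative, 2)))
    return kept

# Hypothetical eigenvalues for an 8-variable correlation matrix (they sum to 8).
example = [3.2, 1.9, 1.1, 0.7, 0.5, 0.3, 0.2, 0.1]
for ev, cum in retain_components(example):
    print(ev, cum)
```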
Cluster analysis
This study employed the TSC method based on the Bayesian information criterion (BIC) to enable automated classification of lower body shapes in older women. Two-step clustering is a hybrid clustering algorithm capable of handling both continuous and categorical variables, and it determines the optimal number of clusters automatically using statistical criteria (e.g. AIC or BIC) rather than relying on arbitrary selection.23–25 The TSC algorithm proceeds in two stages: the first stage uses distance measures to pre-cluster the data, and the second stage applies a probabilistic model, similar to latent class analysis, to determine the optimal subgroup structure.26,27 In this study, the TSC algorithm was implemented using IBM SPSS Statistics 26. Key body shape indicators derived from the PCA results, including horizontal girths, vertical lengths, and angular measurements, were used as input variables for clustering. The BIC values were computed for different clustering solutions, and the number of clusters in the solution with the lowest BIC was selected as optimal.
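The "lowest BIC" selection rule can be sketched as below. The BIC definition used, −2 ln L + p ln n, is the standard one; the log-likelihoods and parameter counts are hypothetical, and SPSS may refine its automatic choice with additional criteria that are not modeled here.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: -2 ln L + p ln n (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)

def best_cluster_count(bic_by_k):
    """Return the number of clusters whose solution has the lowest BIC."""
    return min(bic_by_k, key=bic_by_k.get)

# Hypothetical (log-likelihood, parameter count) pairs for k = 1..5; not study data.
candidates = {1: (-1850.0, 12), 2: (-1760.0, 25), 3: (-1690.0, 38),
              4: (-1665.0, 51), 5: (-1650.0, 64)}
bic_by_k = {k: bic(ll, p, 209) for k, (ll, p) in candidates.items()}
print(best_cluster_count(bic_by_k))
```

The penalty term p ln n is what stops the rule from always preferring more clusters: the likelihood keeps improving with k, but each extra cluster adds parameters.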
Regression modeling
It is necessary to account for variations in body shape characteristics to improve the accuracy of lower body dimension predictions. Therefore, based on the previously identified three body shape clusters, this study developed separate linear regression models for each body type. Linear regression is a well-established and extensively validated modeling approach widely adopted in apparel design and pattern drafting applications.28–30
The general mathematical form of the linear regression model is expressed as follows:
In the equation,
The predictive accuracy of the regression models was evaluated using the root mean square error (RMSE), a standard measure for assessing model performance. The RMSE is calculated using the following equation:
In the equation,
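For reference, the two expressions named in this subsection are commonly written in the textbook forms below. The symbols are assumptions of this sketch rather than the original notation: y is the dimension being predicted, x_1 to x_k are the predictors (H, W, and AC in this study), the β terms are regression coefficients, ε is the error term, n is the number of samples, and y_i and ŷ_i are the measured and predicted values for sample i.

```latex
% Multiple linear regression (assumed notation)
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon

% Root mean square error (assumed notation)
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}
```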
Results
Statistical analysis of body shape characteristics in older women
The descriptive statistics of 209 valid samples of older women aged 60–80 years from Northeast China were analyzed using SPSS to obtain the minimum, maximum, mean, range, and standard deviation of key lower body dimensions and shape indicators, as summarized in Table 2. The results indicate substantial inter-individual variability across age segments. Notably, WC (mean = 86.86 centimeters (cm), SD = 7.80), HC (mean = 97.47 cm, SD = 4.88), and AC (mean = 96.42 cm, SD = 6.69) exhibited considerable dispersion, highlighting the challenge of applying standard sizing systems to this demographic.
Descriptive statistics of anthropometric measurements (
Analysis of body shape variability
To further explore the underlying structure of lower body measurements in older women from Northeast China, an initial analysis was conducted on 34 variables. Based on each variable’s practical significance and relevance to body shape characterization, 27 variables were selected for PCA. As shown in Table 3, six principal components were extracted, explaining 24.392%, 19.118%, 16.759%, 8.928%, 7.110%, and 4.656% of the total variance, respectively. The cumulative variance contribution reached 80.963%, indicating that these six components effectively represent the overall structure of the original dataset.
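The reported shares can be checked with a few lines of arithmetic. The second part rests on an assumption: if the PCA was run on 27 standardized variables, total variance is 27 and each share corresponds to a variance of share / 100 × 27 (for rotated components this is a rotated sum of squared loadings, not the original eigenvalue).

```python
# Variance shares (%) as reported in the text for the six components.
shares = [24.392, 19.118, 16.759, 8.928, 7.11, 4.656]

cumulative = []
running = 0.0
for s in shares:
    running += s
    cumulative.append(round(running, 3))
print(cumulative)

# Assumption: 27 standardized variables, so total variance = 27.
implied = [round(s / 100 * 27, 3) for s in shares]
print(implied)
```

The running total ends at the 80.963% stated in the text, and under the stated assumption even the sixth component carries more variance than a single standardized variable.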
Table 3. Explained variance and rotated component loadings from principal component analysis.
The rotated factor loading matrix is shown in Table 4. Principal component 1 exhibits high loadings on height-related variables, including total height, WH, AH, HH, and CH, and is defined as the Height Factor. Principal component 2 demonstrates strong associations with girth-related variables, such as AC, HC, and TC, and is defined as the Horizontal Girth Factor. Principal component 3 comprises variables representing differences among the waist, abdomen, and hip regions, including ratio indices and abdominal protrusion angle. Collectively, these features describe the cross-sectional morphology of the torso region, so this component is defined as the Waist–Abdomen–Hip Cross-Sectional Shape Factor. Principal component 4 features high loadings on hip protrusion angle and torso tilt, characterizing the sagittal body profile (specifically the forward-backward curvature), and is defined as the Lateral Contour Factor. Principal component 5 consists mainly of the side hip and side abdominal angles, describing the anterior curvature of the lower torso, and is referred to as the Frontal Contour Factor. Principal component 6 includes the BA and lower hip protrusion, indicating the degree of lower body inclination, and is named the Postural Inclination Factor. Together, these six components align well with the anatomical structure of the human body and correspond closely to the 3D spatial design requirements in pants patternmaking (length, girth, and curvature), thereby establishing a robust basis for subsequent clustering analysis.
Table 4. Rotated component matrix from PCA of 27 lower-body anthropometric variables.
Cluster analysis of lower body shape in older women
Based on PCA, measurement feasibility, and variables’ representativeness for body shape description, six key morphological features were extracted from six principal components as classification variables. These features included AH, HC, HC − WC, SA, LVA, and BA. Based on the factor loadings of the principal components and the BIC, a TSC method was employed to categorize the samples into three distinct body types. The resulting model yielded a silhouette coefficient of 0.3. Although this value indicates a moderate level of cluster overlap, it is within an acceptable range for anthropometric studies, particularly given the continuous nature of body shape variation and substantial intra-group heterogeneity among older women. Further analysis of cluster centroids revealed clear distinctions among the three types in terms of HC − WC, HC, and AH, providing a solid structural basis for developing accurate body type-specific predictive models.
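To make the silhouette coefficient concrete, the sketch below computes the usual mean silhouette width with Euclidean distances. This is the textbook definition and an assumption here; SPSS may compute its silhouette measure somewhat differently, and the points are hypothetical.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette width (Euclidean distance); needs at least two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:                      # singleton cluster: s(i) = 0 by convention
            scores.append(0.0)
            continue
        # a = mean distance to the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b = mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical 2D points forming two well-separated groups.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(round(silhouette_score(pts, [0, 0, 1, 1]), 3))
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values indicate points that sit closer to another cluster than to their own.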
A summary of the clustering results is presented in Table 5. The clusters are ordered from top to bottom by descending proportion, namely Type 1, Type 2, and Type 3, which account for 45.45%, 27.75%, and 26.79% of the total sample, corresponding to 95, 58, and 56 participants, respectively. The importance of the classification variables, ranked in descending order, is as follows: HC − WC (1.00), HC (0.92), AH (0.63), SA (0.58), LVA (0.20), and BA (0.06). The first four variables contributed significantly more than the others and served as the primary discriminators of lower body shape.
Table 5. Sample distribution and centroids of clustering variables from lower-body anthropometric analysis.
To further illustrate the morphological differences between the clustered body types, Figure 4 presents the intermediate shape of each of the three body types. Comparing these shapes reveals clear differences in frontal and lateral morphology among the types. Based on the centroids of the clustering variables and the intermediate shapes, the three lower body shape types among older women can be characterized as follows:
Type 1 (Short, slender, with prominent buttocks): This group features the smallest HC and AH, but relatively large values for hip-to-waist difference, upper hip protrusion angle, and lateral abdominal angle. The BA is the highest among all groups. This body shape is petite yet curvaceous, with the buttocks projecting upward and backward, a pronounced waist indentation, and a forward-leaning posture.
Type 2 (Tall, robust, with protruding buttocks and backward tilt): This type exhibits the largest overall body dimensions. It has the highest values for HC, AH, hip-to-waist difference, and hip protrusion angle, and the smallest body axis angle. These features reflect a typical “large frame with prominent buttocks and a backward-leaning posture.”
Type 3 (Medium height, flat profile, with upright buttocks): Individuals in this group show intermediate values for AH and HC but the lowest values for hip-to-waist difference, hip protrusion angle, and lateral abdominal angle. The body contour is relatively flat, the buttocks are not prominently projected, and the standing posture is more vertically aligned.

Figure 4. Average frontal and lateral morphology of body shape types (Types 1–3).
Regression modeling of various body types
Selection of independent variables
The core objective of body size prediction is to estimate key dimensions that are difficult to measure directly by using a small number of easily accessible body measurements. This approach aims to reveal the quantitative relationships between variables. A reasonable selection of independent variables improves the model’s fitting performance and stability and enhances its practicality and operability. In this study, the selection of independent variables for regression modeling was grounded in the results of PCA, which served as the basis for dimensionality reduction and variable screening. The analysis revealed that the components related to girth and height accounted for the largest proportion of variance, indicating their dominant role in explaining lower body shape differences among older women. H and AC exhibited high loadings on their respective factors in the rotated component matrix. Given their strong explanatory power and ease of measurement in practical settings, these two variables were identified as the primary predictors in the regression models. Furthermore, W was incorporated as an additional independent variable because of its evident impact on the overall body contour. Correlation analysis confirmed that W was significantly associated with multiple lower body dimensions. As W is also easily measurable and has been repeatedly cited in the literature as a key determinant of body shape, it was included to enhance both the interpretability and predictive performance of the models. To summarize, H, AC, and W were ultimately selected as the independent variables for developing regression models tailored to each lower body shape category. These variables provide a practical and reliable foundation for predicting key dimensions related to pants pattern design in older women.
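The correlation screening mentioned above can be sketched with a plain Pearson coefficient. This is a minimal illustration: the data are hypothetical, and the significance test that a full correlation analysis would report is not included.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical weights (kg) and hip girths (cm); not study data.
weight = [52.0, 58.5, 61.0, 66.0, 70.5, 75.0]
hip_girth = [90.5, 94.0, 95.5, 99.0, 100.5, 104.0]
print(round(pearson_r(weight, hip_girth), 3))
```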
Regression modeling
Multiple linear regression models were developed for each of the three identified body shape types to enhance the prediction accuracy of critical lower body dimensions. Using H, W, and AC as predictors, regression models were constructed for 12 lower body dimensions closely related to pants pattern construction. Taking the WC model of body Type 1 as an example, the regression coefficients and associated statistics are presented in Table 6, and the regression equation is shown in equation (3):
To provide a more comprehensive presentation of the modeling results, Table 7 summarizes the regression equations for all predicted dimensions across the three body type categories. These equations highlight how H, W, and AC contribute differently to specific dimensions in each cluster, reflecting the morphological diversity among older women.
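How a per-type model of this form might be fitted can be sketched with ordinary least squares on the three predictors. Several assumptions apply: all three predictors are entered together (no stepwise selection), the fit uses the normal equations rather than SPSS, and the data and coefficients are synthetic values chosen only so the result can be checked.

```python
def fit_ols(X, y):
    """Return least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]   # prepend intercept column
    p = len(rows[0])
    # Augmented system (X'X | X'y).
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]

def predict(coefs, h, w, ac):
    """Predicted dimension from stature H (cm), weight W (kg), abdominal girth AC (cm)."""
    return coefs[0] + coefs[1] * h + coefs[2] * w + coefs[3] * ac

# Synthetic (H, W, AC) rows and hypothetical "true" coefficients; not study data.
TRUE = (10.0, 0.1, 0.2, 0.6)
X = [(150, 50, 95), (152, 62, 90), (155, 55, 102), (158, 70, 98),
     (160, 58, 88), (163, 66, 105), (166, 74, 93), (168, 60, 100)]
y = [TRUE[0] + TRUE[1] * h + TRUE[2] * w + TRUE[3] * ac for h, w, ac in X]

coefs = fit_ols(X, y)
print([round(c, 3) for c in coefs])
```

In practice one such fit would be run per body type and per target dimension, which is why the per-type equations can weight H, W, and AC differently.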
Table 7. Regression models of lower body characteristics of three lower-body types.
Significance testing of the regression equation
To evaluate the statistical significance of the constructed regression models, an analysis of variance (ANOVA) was conducted for each body type category. As an example, the ANOVA results for the WC regression model of the Type 1 body shape are presented in Table 8. The model produced an
Evaluation of prediction errors
To evaluate the fitting performance of the regression models, 20% of the samples from each body type were randomly reserved for validation. Tables 9–11 present a detailed comparison between the predicted values (P) generated by the regression models and their corresponding original measurements (O), with sample IDs indicated numerically. Specifically, 19 samples from Body Type 1 (Table 9), 12 samples from Body Type 2 (Table 10), and 11 samples from Body Type 3 (Table 11) were included in the validation sets. These tables illustrate the closeness of fit between P and O for all predicted parameters, providing direct evidence of model accuracy. The overall prediction errors were further summarized using the RMSE, which is reported in Table 12 and discussed below.
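The 20% hold-out described above can be sketched as a random split within each body type. The rounding rule (nearest integer) is an assumption chosen because it reproduces the validation counts stated in the text; the seed and sample IDs are hypothetical, and whether the models were fitted on the remaining 80% only is not stated in the text.

```python
import random

def holdout_split(sample_ids, fraction=0.2, seed=0):
    """Randomly reserve round(fraction * n) samples for validation."""
    rng = random.Random(seed)
    n_val = round(fraction * len(sample_ids))
    validation = set(rng.sample(sample_ids, n_val))
    training = [s for s in sample_ids if s not in validation]
    return training, sorted(validation)

cluster_sizes = {"Type 1": 95, "Type 2": 58, "Type 3": 56}   # from the text
val_counts = {}
for name, n in cluster_sizes.items():
    train, val = holdout_split(list(range(1, n + 1)))
    val_counts[name] = len(val)
    print(name, len(train), len(val))
```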
Table 9. Error comparison between predicted and original data for body type 1 (unit: cm).
Table 10. Error comparison between predicted and original data for body type 2 (unit: cm).
Table 11. Error comparison between predicted and original data for body type 3 (unit: cm).
Table 12. RMSE for 12 lower body dimensions across 3 body types (unit: cm).
The RMSE values between the predicted and actual measurements for the 12 pants-related dimensions across the 3 body types are presented in Table 12. As shown, Body Type 2 exhibited the most stable prediction performance, with all RMSE values below 1.6 cm. Body Type 1 showed comparable results, while Body Type 3 demonstrated the highest prediction errors for variables such as AH, HH, and CL, with the maximum RMSE reaching 1.72 cm.
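For a single dimension, the RMSE between predicted (P) and original (O) values can be computed as in the sketch below. The four value pairs are hypothetical and only illustrate the calculation.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between paired predictions and measurements."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical P and O values in cm (not study data).
predicted = [86.0, 90.5, 84.0, 92.0]
observed = [85.0, 92.5, 84.0, 91.0]
print(round(rmse(predicted, observed), 2))
```

Because the errors are squared before averaging, one large miss raises the RMSE more than several small ones, which is worth keeping in mind when comparing body types with small validation sets.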
Discussion
This study aims to establish a lower-body structural size prediction method for older women based on body shape classification. By simplifying the measurement process and enhancing model adaptability, it provides accurate support for pants pattern design. To this end, we combined PCA and TSC to classify the lower-body shapes of older women into three representative categories. Subsequently, we built separate linear regression models for each group to predict 12 key structural dimensions.
In this research, the use of TSC not only enables automatic determination of the optimal number of clusters but also handles complex situations involving both continuous and categorical variables. Compared with the
The results demonstrated that satisfactory prediction accuracy was achieved across all body shape types, although model performance varied among categories. Evaluated using RMSE as the metric, Type 2 (tall, full-hipped) demonstrated the most accurate predictions, with the lowest RMSEs across most of the 12 key dimensions. This may be attributed to this group’s higher internal shape consistency and more distinct morphological features, which favor linear regression modeling. Type 1 (short, curvy) also exhibited stable and accurate results, suggesting that linear models can effectively accommodate pronounced body contours and provide appropriate pattern designs for this body type. In contrast, Type 3 (medium height, straighter shape) showed larger prediction errors in variables such as AH, HH, and CL, with the maximum RMSE reaching 1.72 cm. This indicates that linear models have limitations when dealing with smoother contours or more complex body structures, which may result in decreased accuracy in pattern design. Moreover, the notable differences in prediction error among body types suggest that factors such as fat distribution and posture variability may influence the accuracy of size prediction. These findings are consistent with the previous research by Yan et al., 34 which also reported reduced prediction accuracy in atypical body shapes. Notably, the constructed regression models rely solely on 3 easily accessible variables (H, W, and AC) to predict 12 key structural dimensions essential for pants design. Most of the predicted dimensions achieved an RMSE below 2 cm, demonstrating a sound balance between prediction accuracy and engineering feasibility.
Despite the encouraging results, some limitations remain. First, the study sample was limited to older women from Northeast China, which may not fully represent the body characteristics of the national population or other ethnic groups, thus limiting the model’s generalizability. Second, the current regression models are linear, which may constrain their performance when predicting sizes for individuals with less distinct contours or more complex body structures. Third, although the selected input variables are easy to obtain and facilitate practical application, they fail to capture other potentially influential features of lower-body structure, such as hip width, leg length, and pelvic orientation. This omission may reduce the model’s predictive accuracy for atypical or transitional body types.
Conclusion
This study addresses the challenges of body shape diversity and poor sizing compatibility in traditional garments for older women by developing a predictive framework for pants structural dimensions based on a classification followed by modeling strategy. A total of 217 older women from Northeast China were recruited, combining 3D body scans with manual anthropometric measurements to establish a systematic pipeline from body type classification to size prediction. Six principal components were extracted through PCA, representing critical morphological factors related to height, girth, and body shape features. Based on these components, a TSC algorithm was employed to categorize the sample into three representative body types. Separate stepwise linear regression models were then constructed for each body type, using only 3 easily obtainable input variables—H, W, and AC—to predict 12 pants construction-related dimensions, including WH, HH, AH, HC, and CL.
Model evaluation showed that most predicted variables across the three body types achieved prediction errors within 2 cm. In practical apparel design, a circumference difference of 1.5–2.5 cm is generally regarded as an acceptable tolerance range. 35 The prediction errors obtained in this study therefore fall within this industry allowance, which can help reduce fit problems such as an overly tight waistband or excessive looseness at the hips and supports the practical applicability of the proposed method. The predicted dimensions can be applied directly as inputs for pants pattern construction. The proposed framework simplifies the modeling process and reduces both labor and technical barriers in pattern design. The findings also offer data support for optimizing pants structures and enabling personalized design across different body shape groups.
Future research can proceed in several directions: (1) expanding the sample size and regional coverage to include older women from a broader range of areas with more diverse body shapes, thereby enhancing the representativeness and applicability of the model; (2) exploring nonlinear modeling approaches (e.g. support vector regression (SVR), ANN, and ensemble learning), in combination with transfer learning or few-shot learning strategies, to improve the model’s performance in handling complex or transitional body shapes; and (3) integrating 3D anthropometric parameters with image recognition techniques to investigate the potential of multimodal inputs for apparel pattern modeling.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Liaoning Provincial Department of Education scientific research project (JYTMS20230401).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
