Abstract
The estimation of biological sex is a critical step in the assessment of the biological profile of an anonymous skeletonized individual. In certain recovery circumstances, the most dimorphic skeletal areas, such as the pelvis, are absent or fragmented; in that case, other bones of the skeleton, including the clavicle and scapula, can be used to predict sex. The purpose of this research is to generate new models for the estimation of sex with clavicular and scapular measurements using a study-sample of 129 individuals with clavicle (65 males and 64 females) and 112 individuals with scapula (50 males and 62 females) from the Lisbon Identified Skeletal Collection (Portugal). A decision tree classifier (C4.5) and logistic regression (LR) were employed to create univariable and multivariable sex prediction models. Accuracy under cross-validation of the classification models is high (up to 93.8%), with minimal bias (<5%), particularly in the multivariable models. The proposed LR models facilitate the probabilistic estimation of biological sex, accounting for the significant overlap in the expression of sexual dimorphism.
Introduction
The assessment of sex is a fundamental technical step to establish the biological profile of unknown individuals, along with age at death and stature.1,2 In the last decades, forensic methods to estimate biological sex have been tested extensively and the consensus is that the pelvis is the most dimorphic anatomical area of the human skeleton, conferring accuracy results close to 100% when the anatomical structure is complete.3,4 Skeletal sexual dimorphism is particularly conspicuous in the pelvic girdle due to evolutionary constraints related to parturition. Unfortunately, the pelvis is often missing or too fragmented to be morphologically evaluated or measured. For example, the key pelvic areas described in 1969 by Phenice 5 – the ventral arc, the subpubic concavity, and the medial aspect of the ischiopubic ramus – are fragile, and often damaged or not recovered in both forensic and archaeological contexts.6,7 Other anatomical regions of the skeleton, including the long bones and skull, have been successfully used to predict sex of human remains.2,8,9
Skeletal sex estimation generally relies on morphological or metric methods. Molecular methods are accurate but expensive and technically demanding.2,9 Morphological methods evaluate sexually dimorphic discrete traits in a bone, and are usually applied to the pelvis, 10 skull 11 or long bones such as the humerus. 12 Metric methods can be applied to virtually all bones of the skeleton, although with variable degrees of accuracy,13–22 and are more objective and more easily replicated. 23 Several factors can affect the degree of sexual dimorphism or bone proportions in different populations, such as genetics, climate, physical activity or mobility, and nutritional status. 24
Studies focusing on the clavicle and scapula have a long tradition in the anthropological research landscape,25–27 and these two bones of the shoulder girdle, together or autonomously, have been used to generate metric methods for sex estimation in different population samples, e.g.28–39 with accurate results. Sexual dimorphism of the scapula and clavicle was also evaluated in Portuguese reference samples, but the research focus has been mostly descriptive, methodologically constrained or not particularly concerned with sex prediction.40–44 This paper aims to fill a void in the literature by assessing the potential of the clavicle and the scapula for skeletal sex estimation in a sample of modern identified skeletons from Portugal.
Materials and methods
Sample
The sample selected for this study is part of the Lisbon Identified Skeletal Collection (also known as the Luís Lopes Collection), curated at the National Museum of Natural History and Science, Lisbon, Portugal. This collection comprises more than 1700 skeletons with documented assigned sex at birth (biological sex), place and year of death. In 678 cases, the age at death, occupation, and cause of death are also known. The context in which the collection was assembled, the occupations of male individuals, and other details such as the percentage of cases with dental treatments suggest that most individuals belonged to the middle class.45,46 A random sample, following the criteria of sex and age representation, and completeness of the bones, was selected. In total, 129 individuals with clavicles (65 males and 64 females) and 112 individuals with scapulae (50 males and 62 females) were included in the study sample. Ages at death ranged from 20 to 88 years. All individuals in this sample were born between 1849 and 1928, and died between 1898 and 1969.
Data collection
Both right and left bones with fused epiphyses were measured, but only data from almost complete bones were retrieved. Six measurements (in mm) of the clavicle were used: the maximum length (C1); circumference in the middle of the bone (C2); anterior-posterior diameter at mid-bone (C3); top-bottom diameter at mid-bone (C4); maximum breadth of the sternal end (C5); and maximum breadth of the acromial articular end (C6). Similarly, six measurements of the scapula were selected: maximum length (S1); maximum scapular breadth (S2); scapular line length (S3); infraspinous line length (S4); height of the glenoid fossa (S5); and width of the glenoid fossa (S6). Measurements are depicted in Figure 1 and defined in Table 1. All measurements were taken using a Mitutoyo© sliding caliper and a standard osteometric board. Due to differential preservation, sample sizes differed slightly for each bone, measurement, and sex group. In a random sub-sample of 20 individuals, the measurements were collected twice by the same observer (with a 2-week interval between measurements) and by two different observers. Intra- and inter-observer errors were estimated through the Technical Error of Measurement (in mm, TEM) and the Relative Technical Error of Measurement (in %, rTEM). 47 Measurements with an rTEM ≤ 5% were considered precise. 48 The data that support the results of this study are available upon request.
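As an illustration of the precision check described above, the following minimal sketch computes TEM and rTEM for two repeated measurement series, assuming the commonly used two-series formula (TEM as the square root of the sum of squared differences divided by twice the number of individuals, and rTEM as TEM relative to the grand mean). The function names and the example values are hypothetical, and the exact formulation in the cited reference may differ.

```python
import math

def tem(first, second):
    """Absolute technical error of measurement (same units as the input)."""
    if len(first) != len(second) or not first:
        raise ValueError("series must be non-empty and of equal length")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))

def relative_tem(first, second):
    """TEM expressed as a percentage of the grand mean of both series."""
    grand_mean = (sum(first) + sum(second)) / (2 * len(first))
    return 100 * tem(first, second) / grand_mean

# Hypothetical repeated measurements in mm (not data from the study)
session_1 = [150.0, 142.0, 160.0, 138.0]
session_2 = [151.0, 142.0, 159.0, 138.0]

error_mm = tem(session_1, session_2)
error_pct = relative_tem(session_1, session_2)
is_precise = error_pct <= 5.0  # precision criterion stated in the text
```

The same two functions cover both the intra-observer case (one observer, two sessions) and the inter-observer case (two observers, one session each).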

Figure 1. Representation of the scapular and clavicular measurements: maximum length of the scapula (S1); maximum scapular breadth (S2); scapular line length (S3); infraspinous line length (S4); height of the glenoid fossa (S5); width of the glenoid fossa (S6); maximum length of the clavicle (C1); circumference in the middle of the clavicle (C2); maximum breadth of the sternal end (C5); and maximum breadth of the acromial articular end (C6). The anterior-posterior diameter at mid-bone of the clavicle (C3) and top-bottom diameter at mid-bone of the clavicle (C4) are taken at the same location as C2.
Table 1. Osteometric variables of the clavicle and scapula, with their abbreviation, respective name, and definition.
Statistical analysis
Descriptive statistics were computed for continuous variables. Skewness and kurtosis values were used to assess the normality of the variables’ distributions, 49 and Levene's test was used to evaluate homoscedasticity. An independent-samples t-test was employed to test the null hypothesis that the means of the scapular and clavicular measurements were equal in both sex groups.
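The group comparison can be sketched as follows, assuming the equal-variance (pooled) form of the independent-samples t statistic; whether the authors used this or a variance-corrected variant is not stated. The function name and the values are hypothetical. The sketch stops at the test statistic and its degrees of freedom; a p-value or Levene's test would in practice come from a statistics library (for example `scipy.stats.ttest_ind` and `scipy.stats.levene`).

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# Hypothetical measurements in mm (not data from the study)
males = [152.0, 158.0, 149.0, 161.0, 155.0]
females = [138.0, 141.0, 135.0, 144.0, 137.0]

t_value, degrees_of_freedom = pooled_t_statistic(males, females)
```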
The models for sex prediction were built through logistic regression (LR) with stepwise variable selection. Both univariable and multivariable models were created. LR is a classical algorithm used for classification by modeling the probability of occurrence of one of two mutually exclusive classes of a binary dependent variable. The mathematical notation of the logistic function is of the form:
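As a sketch of the standard textbook form (the symbols below are assumed notation, not necessarily the authors'), the probability of the class coded as 1, given k measurements, can be written as:

```latex
p = P(Y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
```

Here x_1, ..., x_k stand for the osteometric measurements, \beta_0 for the intercept, and \beta_1, ..., \beta_k for the fitted coefficients; the complementary class has probability 1 - p. A univariable model is the special case k = 1.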
Additionally, sectioning points were generated with a decision tree classifier (C4.5). This algorithm iteratively visits individual decision nodes, picking the best possible divide, and constructing decision trees employing the concept of information entropy.53,54
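The way an entropy-based tree arrives at a single sectioning point can be illustrated with a simplified sketch: each midpoint between consecutive sorted values is tried as a threshold, and the one with the highest information gain is kept. This is not the full C4.5 algorithm (which also uses the gain ratio and pruning), and the function names and data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def best_sectioning_point(values, labels):
    """Return the threshold with the highest information gain, and that gain."""
    pairs = sorted(zip(values, labels))
    sorted_labels = [label for _, label in pairs]
    base = entropy(sorted_labels)
    best_gain, best_threshold = 0.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        left, right = sorted_labels[:i], sorted_labels[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best_gain:
            best_gain = gain
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_threshold, best_gain

# Hypothetical lengths in mm with documented sex (not data from the study)
lengths = [136.0, 139.0, 142.0, 145.0, 148.0, 151.0, 155.0, 158.0]
sexes = ["F", "F", "F", "F", "M", "F", "M", "M"]

threshold, gain = best_sectioning_point(lengths, sexes)
```

With these made-up values the chosen threshold falls between 145.0 and 148.0 mm, where the split leaves one side containing only females.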
The statistical classifiers were trained and evaluated with a 10-fold cross-validation procedure to guard against overfitting or underfitting, and to support the generalization of the models to unrelated data sets.
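The partitioning behind a 10-fold cross-validation can be sketched as below: the sample is shuffled and divided into ten folds, and each fold is held out once while the remaining nine are used for training. This is a plain (unstratified) version with a hypothetical function name; the software used in the study may partition the data differently, for example by stratifying on sex.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for j, fold in enumerate(folds) if j != held_out for idx in fold]
        yield train, test

# 129 is the number of individuals with clavicles reported in the text
splits = list(k_fold_indices(129, k=10))
```

Every individual appears in exactly one test fold, so each prediction used for the performance metrics comes from a model that never saw that individual during training.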
The prediction performance of the cross-validated models and sectioning points was evaluated through the overall accuracy (a measure of the total correspondence between the documented and the predicted sex, presented as a proportion), recall (the number of instances correctly classified as a certain class divided by the total number of instances present in that class; the same as sex-specific accuracy, for example, the proportion of females properly grouped), bias (the absolute difference between the recall in males and the recall in females), and the Kappa statistic (which compares the observed accuracy with an expected accuracy that takes random chance into consideration).54,55
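The four performance measures defined above can be computed from paired documented and predicted labels, as in the following sketch. It assumes that the Kappa statistic is Cohen's kappa, with the expected accuracy derived from the marginal class proportions; the function name and the example labels are hypothetical.

```python
def sex_classification_metrics(true_sex, predicted_sex):
    """Accuracy, per-sex recall, bias and Cohen's kappa for labels 'F' and 'M'."""
    pairs = list(zip(true_sex, predicted_sex))
    n = len(pairs)
    accuracy = sum(t == p for t, p in pairs) / n
    recall = {}
    for sex in ("F", "M"):
        predictions_in_class = [p for t, p in pairs if t == sex]
        recall[sex] = sum(p == sex for p in predictions_in_class) / len(predictions_in_class)
    bias = abs(recall["M"] - recall["F"])
    # Agreement expected by chance, from documented and predicted class proportions
    expected = sum(
        (sum(t == sex for t, _ in pairs) / n) * (sum(p == sex for _, p in pairs) / n)
        for sex in ("F", "M")
    )
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "recall": recall, "bias": bias, "kappa": kappa}

# Hypothetical outcome: 9 of 10 females and 8 of 10 males correctly classified
documented = ["F"] * 10 + ["M"] * 10
predicted = ["F"] * 9 + ["M"] + ["M"] * 8 + ["F"] * 2

metrics = sex_classification_metrics(documented, predicted)
```

Because bias is an absolute difference between two proportions, it is bounded between 0 and 1.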
The statistical analyses and graphical depictions were executed with WEKA (Waikato Environment for Knowledge Analysis, v. 3.8.6) 56 and R programming software. 57
Results
Intra- and inter-observer measurement errors were negligible, with all the measurements showing an rTEM ≤ 5%. Regarding intra-observer error, TEM ranged from 0.14 (S6) to 0.74 (C1), while rTEM ranged from 0.22% (S1) to 2.87% (C5). TEM varied from 0.39 (C4) to 1.51 (S4), and rTEM from 0.38% (S1) to 4.45% (C5) in the inter-observer assessment. As expected, inter-observer error was greater than the intra-observer error,48,58 and the clavicular measurements were relatively more prone to errors.
Descriptive statistics for the clavicular and scapular measurements are summarized in Table 2. Sex differences in all measurements of the clavicle and scapula are statistically significant at the 0.001 significance level. The sectioning points generated with the C4.5 classifier for each measurement of the clavicle and scapula, and their performance metrics are provided in Table 3. Accuracy under cross-validation of these dichotomous trees varies between 0.691 (C5: maximum width of the sternal end of the clavicle; sectioning point: 25.4 mm; Kappa: 0.377) and 0.832 (C1: maximum length of the clavicle; sectioning point: 146.0 mm; Kappa: 0.664) in the clavicle, and between 0.830 (S2: maximum width of the scapula; sectioning point: 95.9 mm; Kappa: 0.648) and 0.885 (S1: maximum length of the scapula; sectioning point: 147.0 mm; Kappa: 0.763) in the scapula. Bias is usually larger for the clavicular sectioning points, ranging from 0.004 (C4: top-bottom diameter at mid-bone) to 0.442 (C5: maximum width of the sternal end), while for the scapula it varies between 0.071 (S4: infraspinous line length) and 0.226 (S3: scapular line length).
Table 2. Descriptive statistics for clavicular and scapular measurements in both sexes; Lisbon Collection (LC).
SD: standard deviation; 95% CI: 95% confidence interval; N: sample size;
Table 3. Sectioning points and goodness of fit for the C4.5 tree models generated with the clavicle and the scapula.
Both univariable and multivariable models (each of these models stand for
Logistic regression models with clavicular and scapular measurements.
C: only clavicular measurements; S: only scapular measurements; CS: clavicular and scapular measurements.
Goodness of fit for the univariable and multivariable logistic regression models.
Discussion
Skeletal sex is usually the first parameter of the biological profile to be assessed, as most age at death or stature estimation methods are sex-specific.2,24 During human evolution, sexual dimorphism has become less pronounced, but in all modern human populations males are still on average taller, heavier and more robust than females. 59 Less dimorphic, but more resilient, skeletal elements have been used to predict sex in different recovery contexts, including the scapula and the clavicle. Scapular preservation is uneven: the distal part of the body is fragile, while the glenoid cavity is very resistant to taphonomic factors. The clavicle shows overall good preservation, and both bones are usually well represented in archaeological and forensic settings.6,7
Sexual dimorphism in the human scapula and clavicle is mostly size-related8,28,29,35–37 (but see Scholtz et al. 60), with males generally presenting larger bones than females. Sexual size dimorphism is a recurrent biological phenomenon in different species, with the differences in the size of males and females stemming from different evolutionary mechanisms, including sexual selection and competition for resources. 61 The results presented here agree with previous works focused on the scapula and clavicle, with significant differences between sexes in all osteometric dimensions, substantiating the potential of these bones to accurately predict the assigned sex in human skeletal remains. Nevertheless, not all of the measured variables both reliably depict patterns of sexual dimorphism in the scapula and clavicle and provide good models for the prediction of sex.
As expected, univariable models featuring sectioning-points obtained with a decision-tree algorithm do not offer suitable performances, with some models showing good accuracy but unacceptable bias (for example, Model C1: measurement = maximum length of the clavicle; sectioning point = 146.0; accuracy = 0.832; bias = 1.128) and others showing a poorer accuracy but acceptable bias (for example, Model C4: measurement = superior-inferior diameter at mid-clavicle; sectioning point = 9.0; accuracy = 0.725; bias = 0.002). Measurement S5 (height of the glenoid fossa, sectioning point = 33.7) displays the most promising results of the scapula, although bias is slightly above what is desirable in a forensic context. 62
This substandard performance of univariable models has been observed before in other sex estimation methods generated in Portuguese reference collections.58,63 Interestingly, univariable LR models are both accurate and unbiased, showing the capability of this classical classifier algorithm to optimize the separation of traits that overlap considerably between the sexes.58,64,65 Logistic regression is a simple and elegant classifier that provides accurate results in forensic research contexts, similar to or better than other statistical approaches,58,64,66 supporting a probabilistic prediction of sex that conforms with the clinal manifestations of sexual dimorphism and the rules established in the aftermath of the Daubert v Merrell Dow Pharmaceuticals Inc. judicial case.2,65 Multivariable models show even better performance metrics, with the hybrid models that employ measurements of both the clavicle and the scapula showing the best results. Bone-specific accuracies are slightly lower, with multivariable models based on scapular measurements showing a better goodness of fit than their clavicular counterparts, a pattern previously observed by other researchers.33,35 Thus, even though both bones provide good sex prediction capabilities, the scapula presents marginally better performance metrics.
As the clavicle and the scapula are both part of the shoulder girdle, being functionally related, 67 it is curious to note that, although there are several studies with a focus on the sexual dimorphism of one36,38,39,68,69 or the other,28,30,31,34,37,40,60,69,70 fewer have investigated and generated sex prediction models with the two bones simultaneously.32,33,35,71 One of the advantages of using different combinations of measurements of the two bones stems from the diverse and unpredictable nature of real-case recovery scenarios. The scapula is a somewhat fragile bone, but that is only particularly obvious in the thin and flat areas above and below the spine, while the acromion, the coracoid process, and the glenoid fossa typically preserve well. Fittingly, logistic regression models that employ measurements of the glenoid fossa, alone or in combination with other scapular or clavicular measurements, are good options for an accurate and balanced sex estimation. The relevance of the glenoid fossa measurements for the estimation of sex is also highlighted in previous studies.28,31,35,37,72
The proposed sex estimation models present an accuracy under cross-validation ranging from 0.819 to 0.938, on par with methods that employ skeletal regions that are more sexually dimorphic, including the skull, the femur, and other long bones, e.g.18,22,58,73–75. Results are also comparable with accuracy figures from sex prediction methods devised with measurements of the clavicle and the scapula in other populations, e.g.28,29,31–33,35,36,38,70,71 further validating the suitability of these bones for sexing anonymous human skeletal remains. In closing, both the scapula and the clavicle exhibit sexual dimorphism, and can be employed in both forensic and bioarchaeological contexts to predict sex accurately and with little bias when the pelvis is fragmented or missing.
Final remarks
This study provides results that substantiate previous research asserting the relevance of two bones of the shoulder girdle, the scapula and the clavicle, for sex estimation. The prediction models generated in a sample from the Lisbon Identified Skeletal Collection allow sex to be estimated with a cross-validated accuracy of up to 93.8%; thus, these bones can be used to sex unknown individuals when other more dimorphic bones are exceedingly fragmented or simply not available. Proposed data analyses include decision tree classifier sectioning points and logistic regression models; the latter allow for a probabilistic assessment of sex. The models are also varied, enabling the use of different variables of the clavicle and scapula, in combination or in isolation, thus accommodating different recovery circumstances.
Footnotes
Acknowledgements
The authors are grateful to the Research Centre for Anthropology and Health (financed with national funds by FCT – Fundação para a Ciência e Tecnologia, under the project UID/SADG/00283/2019) and the Centre for Public Administration and Public Policies (financed with national funds by FCT – Fundação para a Ciência e Tecnologia, under the project UIDB/00713/2020).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
