Abstract
Body weight, height, and other simple, noninvasive anthropometric measures are the cornerstones of epidemiological research. Body composition determinants such as fat and lean tissue masses and their distributions are better associated with metabolic conditions, such as diabetes, than anthropometrics alone. However, body composition is generally more challenging to measure. This analysis article comments on the manuscript by Cichosz et al that appeared in this issue of the Journal of Diabetes Science and Technology, where a machine-learning approach was developed to predict body composition using measured anthropometric parameters for potentially easier estimations of risk factors of metabolic diseases in the future.
Like it or not, our bodies come in different shapes and sizes. Anthropometric measurements, such as body weight, height, segmental lengths, circumferences, and skinfolds, have been performed throughout the history of medicine. They are noninvasive to the study participant, quick and inexpensive for a trained tester to perform, and relatively easy to obtain. However, they often do not paint the whole picture of what is inside the body or how it may be linked to health risks. Our bodies are mainly composed of water, fat, proteins, and minerals on a molecular level, and fat, lean, and bone on a tissue level. The right composition is important: too much fat leads to obesity; too little fat is called lipodystrophy; reduced skeletal muscle mass is seen in sarcopenia and in cachexia due to cancers; and low bone mass is linked to osteoporosis. Among the three major tissue components, fat is the most variable between and within individuals. Subcutaneous adipose tissue is the main storage site for most of the excess energy within the body. At the same time, lipids can also be found in the abdominal cavity (visceral adipose tissue) and in the liver and muscles. The latter fat depots have been related to cardiovascular diseases, 1 non-alcoholic steatohepatitis, 2 and type 2 diabetes. 3
Current research methods for body composition assessment include 1) underwater weighing, 2) bioelectrical impedance analysis (BIA), 3) stable isotope dilution, 4) dual-energy X-ray absorptiometry (DXA), 5) computed tomography (CT), and 6) magnetic resonance imaging (MRI). A DXA scan uses two energy levels of X-rays to differentiate bone, lean, and fat tissues and has excellent accuracy and precision for both total fat and lean tissues (reproducibility of 1%-2% to 0.5%-2% 4 ). Even though the cost of a new DXA scanner is considerable ($75,000-230,000), the benefits are vast and the risks are minimal. The exposure to ionizing radiation from a total body scan is low (one to four microsieverts, or about one normal day of background exposure 5 ), and the scan is quick to perform (10-15 minutes). While many body composition experts hesitate to call DXA the “gold standard” for body fat and lean tissue measurements, many researchers have gravitated toward using DXA measures as better biomarkers of metabolic risk factors compared to anthropometric and demographic parameters. However, in places where DXA is not available (e.g., rural communities) and/or for people for whom it is inappropriate (e.g., pregnant women, developing children), noninvasive and nonintrusive anthropometrics remain important health assessment tools.
In the current issue of the Journal of Diabetes Science and Technology, Cichosz et al 6 analyzed the publicly available National Health and Nutrition Examination Survey (NHANES 1999-2006) database, which contains cross-sectional DXA body composition measurements and detailed anthropometric measurements in over 18,000 heterogeneous US adults and youths. The goal was to develop algorithms that accurately predict body composition using only anthropometrics such as weight, height, and circumferences as inputs.
The concept of modeling lean and fat mass is not new. Using the adult populations from the same NHANES database, Lee et al 7 were able to explain >85% (R²) of the between-individual variability based on different combinations of anthropometric measures using multiple linear regression. Cichosz et al 6 took an innovative machine-learning approach and developed an artificial neural network (ANN) in the form of a multilayer perceptron. Each layer contained multiple nodes (or neurons), and coefficients (weights) and transfer functions were applied to the input data to regulate the strength of complex relationships between individual nodes and layers. The ANN was first trained using 70% of the measured DXA data, and its performance was then evaluated on a validation dataset comprising the remaining 30% of measured DXA data, which was not used to develop the model. Such an endeavor requires a large, cross-sectional, and heterogeneous dataset, making this an ideal use case for the NHANES database.
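As a rough illustration of the structure described above, the following Python sketch shows a 70%/30% split and a forward pass through a small multilayer perceptron. It is not the model of Cichosz et al: the layer sizes, the tanh transfer function, the random initial weights, and the synthetic inputs are all assumptions made for illustration, and the weight-fitting (training) step is omitted.

```python
import numpy as np

def split_70_30(n_rows, rng):
    """Return shuffled row indices for a 70% training / 30% validation split."""
    order = rng.permutation(n_rows)
    n_train = int(round(0.7 * n_rows))
    return order[:n_train], order[n_train:]

def init_mlp(layer_sizes, rng):
    """Create one (weights, biases) pair per pair of consecutive layers."""
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    """Propagate inputs through the network, layer by layer."""
    a = x
    for w, b in params[:-1]:
        a = np.tanh(a @ w + b)   # hidden layer: weighted sum, then transfer function (tanh assumed)
    w, b = params[-1]
    return a @ w + b             # linear output node for a continuous target such as fat mass

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                 # synthetic placeholder inputs, not NHANES data
train_idx, valid_idx = split_70_30(len(X), rng)
params = init_mlp([4, 8, 8, 1], rng)          # hypothetical topology: 4 inputs, two hidden layers, 1 output
pred_valid = forward(params, X[valid_idx])    # untrained predictions for the held-out 30%
```

In practice the weights would be fitted on the training rows only, and the held-out rows would be used solely to judge how well the fitted network generalizes.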
The success of the ANN model of Cichosz et al 6 can be seen not only in the high predictive R² values and low standard errors of estimation (SEE) for total fat and lean body mass cross-sectionally but also in the similar performance for total trunk fat mass, which may be more closely linked to diabetes. However, as with any advance, there are several unaddressed questions for future research to tackle. Compared to simple regression models, which can be readily expressed as equations for field use, the ANN is a “black box.” The current model has 14 demographic and anthropometric input parameters, ten hidden layers, and ten nodes per layer, each with unpublished weights and transfer functions. As such, it is currently a proof of concept of what can be done rather than a plug-and-play model. It is our hope that this ANN model will soon be released in a publicly available and user-friendly format (e.g., shareware or open-source software). Moreover, instead of using adjusted beta coefficients, adjusted R², and P values in multiple regression models to evaluate the strengths of different anthropometric contributions, Cichosz et al 6 demonstrated a qualitative measure of importance by adding each of the input parameters to the ANN model and evaluating SEE values. A logical future step could be to simplify the ANN topology by removing the parameters that did not influence SEE, such as arm length and circumferences, leg length, and thigh circumference, and by reducing the number of hidden layers and/or nodes, to arrive at a less complex and more generalizable model without sacrificing accuracy.
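Two points from this paragraph can be made concrete with a short Python sketch. The first helper counts the adjustable parameters implied by the reported topology, under the assumptions (not stated in the commentary) that the network is fully connected, has one bias per node, and has a single output node. The other two helpers compute R² and one common definition of SEE, the square root of the residual sum of squares divided by n − p − 1; the exact SEE formula used by Cichosz et al is not given here, and the fat mass values are made up for illustration.

```python
import math

def count_mlp_parameters(layer_sizes):
    """Weights plus one bias per node for a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

def r_squared(measured, predicted):
    """Fraction of between-individual variability explained by the predictions."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def see(measured, predicted, n_predictors):
    """Standard error of estimation, using n - p - 1 degrees of freedom (assumed definition)."""
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(ss_res / (len(measured) - n_predictors - 1))

# 14 inputs, ten hidden layers of ten nodes each, one output node (output count assumed)
layer_sizes = [14] + [10] * 10 + [1]
n_parameters = count_mlp_parameters(layer_sizes)

# Made-up fat mass values in kg, for illustration only
measured = [20.0, 25.0, 30.0, 35.0]
predicted = [21.0, 24.0, 31.0, 34.0]
fit_r2 = r_squared(measured, predicted)
fit_see = see(measured, predicted, n_predictors=1)
```

Under these assumptions the topology carries 1,151 weights and biases, which helps explain why the model cannot be handed over as a single field equation and why pruning uninformative inputs, layers, or nodes is an attractive next step.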
More importantly, future research could also determine the precision of the ANN model in estimating changes in body composition in longitudinal and interventional studies. Similar approaches should be explored to model health parameters in addition to body composition, such as blood pressure, fasting lipids, glucose, and even insulin sensitivity. With these efforts, we may be able to estimate an individual’s risk of diabetes and other metabolic diseases using field-ready anthropometric measurements where or when research-grade body composition methods are not available or applicable.
Footnotes
Acknowledgements
I would like to thank Drs. Samuel LaMunion and Amber Courville for their editorial assistance.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Z01 DK071013, Z01 DK071014.
