Abstract
This descriptive study used healthcare claims data from patients with T2DM, provided by public and private healthcare insurance companies offering services in Puerto Rico in 2013, to estimate the prevalence of comorbidities in this population. Descriptive analyses were performed by sociodemographic and type-of-service variables, using frequency and percent for categorical data and mean (±SD) or median (IQR) for continuous variables. Chi-square, Fisher's exact, or two-sample t-tests were used for comparisons. A total of 3,100,636 claims were identified from 485,866 adult patients with T2DM. Patients older than 65 years represented 48% of the study population. Most patients were women (57%) and had private health insurance (77%). The Metro Area (17%) and Caguas (16%) regions had the highest numbers of persons living with T2DM. The overall estimated prevalence of T2DM was 17.4%. The number of claims per patient ranged from 1 to 339, with a mean of 6.3 claims (SD ±9.99) and a median of 3 claims (Q1 1, Q3 8) per patient. Of the 3,100,636 claims, most (74%) were related to the diagnosis of diabetes (59%) and associated with outpatient services (88%). The most prevalent comorbidities were hypertension (48%), hyperlipidemia (41%), neuropathy (21%), renal disease (15%), and retinopathy (13%). A high prevalence and co-prevalence of comorbidities and a high use of healthcare services were identified in patients with T2DM, especially in older adults. Since most comorbidities were diabetes-related conditions, this analysis highlights the importance of early diagnosis and adequate management of T2DM to avoid preventable burden to the patient and to the healthcare system.
Keywords
Introduction
Type 2 diabetes mellitus (T2DM) is a progressive and chronic metabolic condition in which the ability to respond to the hormone insulin is impaired, resulting in abnormal metabolism of carbohydrates and elevated glucose levels (hyperglycemia), and it can be associated with significant microvascular and macrovascular complications. Because of its high prevalence and potential complications, T2DM is considered a growing public health problem. The International Diabetes Federation (IDF) estimated a global prevalence of T2DM among adults aged 18 – 99 years of 8.4% of the population and expects a global increase to 9.9% of the population by 2045. 1 Globally, about 79% of people with T2DM live in low- and middle-income countries. Regionally, the highest age-adjusted diabetes prevalence in adults was found in the North American and Caribbean Region, at 10.8%.1,2 In the National Diabetes Statistics Report of 2020, the Centers for Disease Control and Prevention estimated that, as of 2018, 34.2 million Americans (10.5 percent of the U.S. population) had diabetes, including 7.3 million who were undiagnosed. 3 A high prevalence was also found among Hispanics (10.3%). Of those, Mexicans had the highest prevalence (14.4%), followed by Puerto Ricans (12.4%). In Puerto Rico, the 2020 report of the Behavioral Risk Factor Surveillance System (BRFSS) estimated a T2DM prevalence of 15.8%. This prevalence increased significantly with age.4,5
T2DM has been associated with the development of severe complications and might co-exist with multiple chronic conditions.6-8 In Puerto Rico, there are limited real-world data describing the prevalence and co-prevalence of comorbidities in patients with T2DM. In a study of a sample of 452 persons from the San Juan Metropolitan area, 15.2% of the participants reported having T2DM. Of those, 74.4% reported hypertension, 53.7% reported hypercholesterolemia, and 39% reported hypertriglyceridemia as comorbidities. 9 Among those with a diagnosis of diabetes, 80.5% reported having three or more comorbidities. 9 These comorbidities might lead to an increased risk of mortality and functional decline, and to greater health resource utilization and healthcare expenditures in this population.9-12 Therefore, a better understanding of the comorbid conditions affecting these patients is essential to guide clinical decisions that prevent these comorbidities or facilitate their early detection and management.10-12
The primary objective of this study was to estimate the prevalence and co-prevalence of common comorbidities in patients with T2DM living in Puerto Rico (PR). We also aimed to evaluate the prevalence of the comorbidities stratified by age (18-44, 45-64, 65-74, ≥ 75 years), sex (male and female), geographical region (metropolitan, north, south, east, west), and resource utilization (hospital admissions, emergency room visits, and outpatient visits), and to quantify the number of claims associated with common comorbidities within the study population.
Methods
Study design
This study was a secondary analysis of a large database compiled by the Puerto Rico Department of Health (PRDoH), which includes information on insurance claims from the public and private sectors in Puerto Rico during the year 2013. The database includes information voluntarily provided by nine health care insurance companies which, according to the PRDoH, represented 95.9% of the insured persons in Puerto Rico at the time. This database was developed to increase the resources available to better understand the burden of important conditions in Puerto Rico. The main database consisted of 51,349,185 claims received for 2013, representing 2,524,059 people. These figures represent approximately 70.2% of the population at that time. 13
The database included claims for health services provided during 2013 and billed to one of the insurers. Each claim included sociodemographic information (sex, age, and municipality of residence) as well as the diagnoses assigned by the doctor in that visit or service. In addition, each claim categorizes the type of healthcare insurance as public or private and the type of healthcare service received as hospitalization, emergency room visit, or outpatient visit. Each medical service recorded has a service date and an associated diagnosis based on the International Classification of Disease, Ninth Revision, Clinical Modification (ICD-9) diagnosis codes. Each claim could have up to six ICD-9 diagnoses. 14
Study population
The study population was T2DM patients 18 years of age or older with at least one claim associated with a clinical encounter included in the PRDoH database between January 2013 and December 2013.
Inclusion and exclusion criteria
We included all subjects ≥ 18 years old with ICD-9 codes 250.x0 and 250.x2 (T2DM). Study subjects were considered to have T2DM if one of these ICD-9 diagnosis codes was recorded in any of the first five diagnoses of any claim associated with a clinical encounter. Patients with type 1 diabetes mellitus (T1DM; ICD-9 codes 250.x1 and 250.x3) and patients with any secondary diabetes mellitus (DM; ICD-9 code 249.x) were excluded from the analysis.
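The inclusion and exclusion rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the algorithm the study ran in Stata. The function names, the `max_positions` parameter, and the decision to exclude a patient who has any T1DM or secondary-diabetes code on any claim are assumptions made for the example.

```python
import re
from typing import List, Optional

# ICD-9 diabetes codes have the form 250.xy; the fifth digit y gives the type:
# 0 or 2 -> type 2 (T2DM), 1 or 3 -> type 1 (T1DM).
_DM_CODE = re.compile(r"^250\.\d([0-3])$")

def diabetes_type(code: str) -> Optional[str]:
    """Classify one ICD-9 code as 'T2DM', 'T1DM', 'secondary', or None."""
    code = code.strip()
    if code.startswith("249"):
        return "secondary"
    match = _DM_CODE.match(code)
    if match is None:
        return None
    return "T2DM" if match.group(1) in ("0", "2") else "T1DM"

def include_patient(claims: List[List[str]], max_positions: int = 5) -> bool:
    """claims: one list of diagnosis codes per claim, in recorded order.

    Assumption: only the first `max_positions` diagnoses of each claim are
    inspected, and any T1DM or secondary-diabetes code excludes the patient.
    """
    types = {diabetes_type(c) for claim in claims for c in claim[:max_positions]}
    return "T2DM" in types and "T1DM" not in types and "secondary" not in types
```

For example, `include_patient([["401.9", "250.00"]])` keeps the patient, while a patient with one claim coded 250.00 and another coded 250.01 is dropped under the assumption stated above.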
Study variables
Sociodemographic variables, clinical variables, and comorbidities were defined as follows: age group (18-44, 45-64, 65-74, ≥ 75 years), sex (male, female), geographic region (using the standard healthcare regions of the Puerto Rico Department of Health), health care insurance (private, public), and type of health care service received (hospital admissions, emergency room visits, outpatient visits).
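As a small illustration of how the age variable could be recoded into the four study groups, the following Python sketch maps an age in whole years to its group label. The function name `age_group` and the choice to return `None` for persons under 18 are assumptions for the example.

```python
from typing import Optional

def age_group(age_years: int) -> Optional[str]:
    """Map an age in whole years to the study's age groups.

    Returns None for persons under 18, who fall outside the study population.
    """
    if age_years < 18:
        return None
    if age_years <= 44:
        return "18-44"
    if age_years <= 64:
        return "45-64"
    if age_years <= 74:
        return "65-74"
    return ">=75"
```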
Comorbidities of interest
Although not limited to these, the comorbidities of interest in this study included hypertension, hyperlipidemia, neuropathy, renal disease, retinopathy, urinary tract infection, cardiovascular diseases, hypoglycemia, congestive heart failure, overweight or obesity, dementia, major depressive disorder, liver disease, genital mycotic infection, anxiety disorder, malignant neoplasm of respiratory and intrathoracic organs, mild cognitive impairment, and pancreatitis. These were selected based on their burden on the healthcare system and their potential for complications in patients with T2DM.
Database preparation and cleaning process
The main objective of this process was to identify potential errors in the database, including incorrect data, incomplete data, duplicates, or irrelevant information. In addition, several transformations of the database were required to conduct this analysis. The database preparation process was divided into two phases. Phase I included becoming familiar with the structure of the database, developing a codebook, finding missing values, formatting, and evaluating the quality of the data in each variable. In this phase we also created two new variables to reclassify the municipalities into the eight or six regions of the PRDoH. The first four ICD-9 diagnoses were also cleaned in this phase.
During phase II of the database preparation, we established the working definitions for the comorbidities using the ICD-9 codes and completed the cleaning process of the database using the health claim as the unit of analysis. The main steps in the data cleaning process were: description of the selected variables, identification and elimination of claims with missing values in the selected variables, identification of duplicates, validation of diagnoses by sex, validation of diagnoses by age, and validation of diagnoses by year of claim.
For this, several algorithms were developed. The initial database contained 5,317,937 claims in which a diagnosis of T2DM was recorded. Of these, 4,393 were eliminated for having missing values in the selected variables, based on the following criteria: absence of date of birth (4,160 missing values), absence of type of encounter (101 missing values), absence of date of service (1 missing value), and absence of primary diagnosis (131 missing values). After that, a total of 5,313,544 claims remained in the database.
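The counts reported for this step can be checked with simple arithmetic. The sketch below only restates the figures given above; the dictionary keys are illustrative labels, and the check assumes that each eliminated claim is counted under exactly one criterion.

```python
# Claims with a T2DM diagnosis before cleaning, as reported in the text.
initial_claims = 5_317_937

# Claims eliminated for missing values, by criterion (labels are illustrative).
missing_by_criterion = {
    "date_of_birth": 4_160,
    "type_of_encounter": 101,
    "date_of_service": 1,
    "primary_diagnosis": 131,
}

eliminated = sum(missing_by_criterion.values())   # 4,393
remaining = initial_claims - eliminated           # 5,313,544
```

The four criteria sum to 4,393 eliminated claims, and subtracting them from 5,317,937 gives the 5,313,544 claims reported as remaining.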
After eliminating the duplicates and applying the validation process and exclusion criteria, we used the patient ID variable to transform the database with the “duplicates drop” and “reshape” commands in Stata. In these steps, we kept the first observation and claim for each patient. This gave us a database for the analysis of sociodemographic variables at the patient level (patient as the unit of analysis). After applying these commands, a new patient-level database was created with a total of 485,866 patients.
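The move from a claim-level to a patient-level file can be illustrated with a short Python sketch. It is not the Stata code used in the study. The field names `patient_id` and `service_date`, the added `n_claims` field, and the choice to define the "first" claim as the one with the earliest service date are all assumptions for the example.

```python
from typing import Dict, List

def first_claim_per_patient(claims: List[Dict]) -> List[Dict]:
    """Collapse claim-level records to one record per patient.

    Assumptions: each claim is a dict with 'patient_id' and an ISO-formatted
    'service_date' (YYYY-MM-DD), and the first claim is the earliest one.
    The patient's total number of claims is kept as 'n_claims'.
    """
    first: Dict[str, Dict] = {}
    counts: Dict[str, int] = {}
    ordered = sorted(claims, key=lambda c: (c["patient_id"], c["service_date"]))
    for claim in ordered:
        pid = claim["patient_id"]
        first.setdefault(pid, claim)          # keep only the earliest claim
        counts[pid] = counts.get(pid, 0) + 1  # count every claim
    return [dict(first[pid], n_claims=counts[pid]) for pid in first]
```

Keeping a per-patient claim count alongside the first record lets the same file support both the sociodemographic tables and the claims-per-patient summaries.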
Statistical analysis
Descriptive analyses of sociodemographic, clinical, and utilization variables were performed. Continuous variables were reported as mean (and standard deviation) or median with corresponding percentiles, where appropriate. Categorical variables were summarized as frequency and percent of the total study population and by predefined subgroups. Chi-square or Fisher's exact tests for categorical variables and two-sample t-tests for continuous variables were used for comparisons. A p value < 0.05 was considered statistically significant.
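To make the descriptive summaries and the chi-square comparison concrete, the following Python sketch uses only the standard library. It is a hedged illustration rather than the study's Stata analysis: the function names are hypothetical, the quartiles follow Python's default `statistics.quantiles` method (which may differ slightly from other software), and the chi-square function covers only a 2×2 table without continuity correction.

```python
import math
import statistics
from typing import Dict, Sequence, Tuple

def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Mean, sample SD, median, and quartiles of a continuous variable."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }

def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square statistic and p value for the table [[a, b], [c, d]].

    With one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

A call such as `chi_square_2x2(10, 20, 20, 10)` on made-up counts returns the statistic and its p value, which can then be compared against the 0.05 threshold.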
Prevalence estimates by healthcare region used as denominators the Census-estimated population of persons >18 years living in the respective municipalities of Puerto Rico in 2013. All analyses and the generation of tables, listings, and data for figures were done using Stata® version 14.2 or higher (StataCorp, College Station, Texas, USA). 15
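The regional prevalence calculation described above is a ratio of identified patients to the Census-estimated adult population. A minimal Python sketch follows; the function name and the region counts in the usage note are hypothetical and are not data from the study.

```python
def prevalence_percent(cases: int, population: int) -> float:
    """Estimated prevalence (%) = patients identified / adult population * 100."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * cases / population
```

With made-up figures, a region with 1,500 identified patients among 10,000 adult residents would have `prevalence_percent(1500, 10000)` equal to 15.0.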
This study was approved by the University of Puerto Rico Medical Sciences Campus Ethics Committee (IRB approval – Protocol A3490120) and the need to obtain consent was waived due to the use of anonymized data provided by the PRDoH for this secondary analysis.
Results
Socio-demographic characteristics of patients with Diabetes Mellitus type 2 in Puerto Rico, 2013 (N = 485,866).
*p value < 0.05.

Geographical distribution of estimated prevalence of Diabetes Mellitus Type 2 by Healthcare Regions in Puerto Rico, 2013.
Prevalence of selected comorbidities in patients with Diabetes Mellitus Type 2 by age group, sex, healthcare region and type of health insurance, Puerto Rico, 2013 (N = 485,866).
*p value < 0.05.
**p value < 0.05 for all comorbidities.
HTN = Hypertension; HLip = Hyperlipidemia; NEU = Neuropathy; RD = Renal Disease; RET = Retinopathy; UTI = Urinary Tract Infection; CVD = Cardiovascular Disease; HPG = Hypoglycemia; CHF = Congestive Heart Failure; O/O = Overweight/Obesity; DMT = Dementia; MDD = Major Depressive Disorder; LD = Liver Disease; GMI = Genital Mycotic Infection; AD = Anxiety Disorder; MNR = Malignant neoplasm of respiratory and intrathoracic organs; MCI = Mild Cognitive Impairment; PAN = Pancreatitis.
Prevalence and co-prevalence of selected comorbidities in patients with T2DM, Puerto Rico, 2013 (N=485,837).
HTN = Hypertension; HLip = Hyperlipidemia; NEU = Neuropathy; RD = Renal Disease; RET = Retinopathy; UTI = Urinary Tract Infection; CVD = Cardiovascular Disease; HPG = Hypoglycemia; CHF = Congestive Heart Failure; O/O = Overweight/Obesity; DMT = Dementia; MDD = Major Depressive Disorder; LD = Liver Disease; GMI = Genital Mycotic Infection; AD = Anxiety Disorder; MNR = Malignant neoplasm of respiratory and intrathoracic organs; MCI = Mild Cognitive Impairment; PAN = Pancreatitis.
Healthcare utilization claims by selected comorbidities and type of encounter in patients with Diabetes Mellitus Type 2, Puerto Rico, 2013 (N=3,100,637).
Summary statistics of claims by Health Insurance among patients with Diabetes Mellitus Type 2, Puerto Rico, 2013 (N=485,838).
*p value < 0.05.
Figure 2 shows the distribution of claims by type of encounter and age group. Claims of younger adult patients (age groups 18-44 and 45-64 years) were associated with a larger proportion of hospitalizations and emergency room services, whereas claims of older patients (≥ 65 years) were associated with a larger proportion of outpatient services.
Distribution of type of encounter by age group among patients with Diabetes Mellitus Type 2, Puerto Rico, 2013 (N=3,100,637).
Discussion
In our study, using an administrative database of healthcare utilization developed by the PRDoH, we found a higher estimated prevalence of diabetes than estimates based on other population-based data5,16 and a high frequency of comorbidities in patients with T2DM, with more than 99% of the patients receiving services for at least one comorbidity.
The frequency and types of comorbidities identified in our study are consistent with estimates from other studies using administrative data. In most studies, a high prevalence of comorbidities was identified in the T2DM population. One large US study examining electronic medical records found that 88.5% of patients had at least two comorbidities in addition to their T2DM. 17 This was also consistent with the estimates of comorbidities identified in other studies previously done in Puerto Rico. 9 Our findings are also similar to those of other studies that used different sources of data to estimate the prevalence of comorbidities, including patient cohorts and medical record reviews.18-22
We also found that most comorbidities requiring use of health services were related to diabetes complications or cardiovascular diseases, such as hypertension, hyperlipidemia, neuropathy, and renal disease. This finding was consistent with similar studies done in different countries, including the US, Belgium, and Spain. In these studies, the same diseases were identified as important comorbidities in patients with T2DM.23-26 Of the comorbidities of T2DM patients identified in other studies, only obesity was not identified as a significant comorbidity in our study. This might be related to problems with the disease classifications in the database used. It is possible that this condition, even if present in the patient, was not recorded as a diagnosis by the service provider. Also, since this database was built from utilization data, a comorbidity might not appear in the database if the services received were not related to it. These factors, which are related to the source of data used for this database, could produce an underestimation of the prevalence of this condition among patients with T2DM.
In our analyses we found that most patients with T2DM were women, that a significant proportion of the population was older than 65 years, and that most had private health insurance. We also identified regional differences in the estimated prevalence of T2DM. The health care region of Caguas had a higher estimated prevalence of T2DM than other areas. A study by Tierney et al., 27 using data from the BRFSS, also found small geographic variations in T2DM prevalence in Puerto Rico. In their analysis, after adjustment, some municipalities in the Caguas region, including San Lorenzo, Las Piedras, Yabucoa, and Maunabo, were among the municipalities with a high prevalence of T2DM (14% or more). Nevertheless, the highest prevalence was found in the northern area of Puerto Rico. That difference could be related to the type of data used to estimate the prevalence: in our study we used healthcare utilization data and not self-reported diagnoses. The regional differences found in our study could be associated with differences in access to healthcare services rather than real differences in the prevalence of T2DM.
In our claim analysis, we found that although most claims were from patients older than 65 years, claims from younger adult patients (45-65 years) with T2DM were mostly associated with hospitalizations and emergency room services, whereas claims from older patients (>65 years) were more often associated with the use of outpatient services. These observations have been documented in previous studies. Among adults with diabetes in Minnesota, researchers found that those aged 18 to 44 were more likely than older adults to be hospitalized for diabetes as a primary cause. 28 In a recent study among U.S. adults with diabetes, the researchers found that the highest proportion of recurrent hospitalizations for severe hyperglycemia was among young adults <45 years of age (41% for <45 years compared with 37.1% for 45 – 64 years and 21.9% for ≥65 years). 29 This finding could be related to several factors. The use of hospital services in younger adult patients could indicate more severe disease or more complications associated with a more recent diagnosis. It is well described that, because T2DM onset can be asymptomatic, the disease can remain undiagnosed for years, and a recent diagnosis is associated with higher use of health care services. Another potential explanation could be poorer T2DM management or less use of preventive services in adults under 65 years, resulting in more complications and worse outcomes.
We also found significant differences in the prevalence of comorbidities and the number of claims based on the patients' type of health insurance. T2DM patients with public health insurance had a significantly lower number of claims than those with private insurance. This might be associated with several factors, including underutilization, underdiagnosis, or management problems in T2DM patients with public insurance. In addition, these differences could be related to differences in the age and sex distribution of these populations. Disparities in the management of patients with T2DM could result in an increased burden of complications in the affected populations. A study using the Diabetes Care Survey of the 2010 Medical Expenditure Panel Survey examined the association between quality of diabetes care and type of insurance coverage, race/ethnicity, and socioeconomic status. 30 Its findings suggested that insurance coverage could have the greatest impact in ensuring an equitable distribution of quality diabetes care, regardless of race/ethnicity or socioeconomic status. Overall, having health insurance has been associated with greater chances of receiving better diabetes care and meeting glycemic targets.31–33 However, results were not conclusive regarding differences in diabetes quality measures between the privately and publicly insured. 33,34 Additional research is needed to determine the optimal coverage to maximize care quality. 30
Our study has several limitations. Since we based our analysis on administrative data of healthcare utilization, we can only identify comorbidities associated with the use of services during the study year. This means that even if a patient has a comorbidity, it will not be shown in the database if the services received were not related to it. This could result in an underestimation of comorbidities that did not require frequent access to healthcare or follow-up. Another limitation is the potential for misclassification due to the use of diagnostic codes. Our working definitions for the diagnoses of T2DM and comorbidities were based on ICD-9 codes. This allowed us to standardize the criteria for participant selection and to classify the comorbidities. Although this was a standardized variable, it depends on the administrative criteria associated with the clinical encounter and not on an extensive evaluation of the patient's medical history. Therefore, the possibility of incorrect diagnosis or overdiagnosis exists if the code was used to perform a screening or diagnostic test and not necessarily for the management of the disease. In addition, an existing comorbidity might not be coded in the database. This could be the case for obesity, a common diagnosis typically associated with T2DM that was not found to be one of the main comorbidities in this study. Finally, because this database is based on healthcare utilization data, the interpretation of important sociodemographic indicators such as geographic location might be confounded, limiting its use for estimating the prevalence of the disease in these populations.
Since this study was done using administrative healthcare utilization data, careful attention was given to eliminating potential errors in the database that could lead to incorrect analyses. In the database cleaning process, we followed a rigorous procedure to identify and eliminate duplicates and to validate the diagnoses by sex, age, and year of claim in order to reduce the potential for misclassification. Given the limited sources of data and the lack of uniform databases for the study of chronic diseases, such as T2DM, in Puerto Rico, healthcare utilization data could be a potential resource to better understand the burden of these diseases on the island.
Conclusion
A total of 3,100,636 claims were identified from 485,866 adult patients with T2DM. Most patients with T2DM were women, older than 65 years, and had private health insurance, and the most prevalent comorbidities were hypertension, hyperlipidemia, and neuropathy. Patients with public health insurance had a lower number of claims than those with private insurance, suggesting underutilization, underdiagnosis, or management problems in T2DM patients with public insurance. Younger adult patients (45-65 years) with T2DM used more hospitalization and emergency room services, whereas older patients (≥ 65 years) used more outpatient services, suggesting more severe disease or poorer T2DM management in adults under 65 years.
Although this study was limited by the type of data used (utilization data), and important comorbidities frequently associated with diabetes, such as obesity and overweight, were not identified, probably due to underreporting, it provides important information regarding potential health disparities among patients with T2DM in Puerto Rico. The high prevalence of comorbidities and use of healthcare services identified in patients with T2DM, especially in older adults, calls for an evaluation of current interventions addressing this population. Since most claims were associated with diabetes-related conditions, this analysis highlights the importance of early diagnosis and of adequate management of T2DM to avoid preventable burden to the patient and to the healthcare system. It also highlights the importance of reporting comorbidities to avoid underreporting; obesity, for example, may not have been reflected because it may not be recognized as an independent diagnosis in the diabetic patient.
Footnotes
Acknowledgements
Research in this publication was supported in part by Merck Sharp & Dohme Corp., a subsidiary of Merck & Co., Inc., Rahway, NJ, USA and the National Institute on Minority Health and Health Disparities of the National Institutes of Health under Award Numbers 5S21MD000242, 5S21MD000138. The content does not necessarily represent the official views of the National Institutes of Health, the University of Puerto Rico or Merck & Co., Inc., Rahway, NJ, USA. We thank the Puerto Rico Department of Health for granting access to the utilization database.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This work was supported by Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA.
Appendix
ICD-9 codes for selected diagnoses.
Comorbidity: Code (ICD-9)
Cardiovascular disease: 430, 431, 433.01, 433.11, 433.21, 433.31, 433.81, 433.91, 434.01, 434.11, 434.91, 436, 437.1, 410.00, 410.01, 410.02, 410.10, 410.11, 410.12, 410.20, 410.21, 410.22, 410.30, 410.31, 410.32, 410.40, 410.41, 410.42, 410.50, 410.51, 410.52, 410.60, 410.61, 410.62, 410.70, 410.71, 410.72, 410.80, 410.81, 410.82, 410.90, 410.91, 410.92, 411.89, 412, 414.00, 414.01, 414.02, 414.03, 414.04, 414.05, 414.06, 414.07, 414.8, 414.9, V45.81, V45.82, 440.21, 440.22, 440.23, 440.24, 440.30, 440.31, 440.32, 440.4, 443.24, 443.89, 443.9
Congestive Heart Failure: 398.91, 402.01, 402.11, 402.91, 404.01, 404.03, 404.11, 404.13, 404.91, 404.93, 425.4, 425.5, 425.7, 425.8, 425.9, 428
Pancreatitis: 577.0, 577.1
Malignant neoplasm of respiratory and intrathoracic organs: 162, 162.0, 162.2, 162.3, 162.4, 162.5, 162.8, 162.9
Dementia: 294.20, 331.82, 290.4, 331.0
Major Depressive Disorder: 296.2
Hyperlipidemia: 272.0-272.4
Hypertension: 401.x-405.x
Liver disease: 571.x, 572.x
Mild Cognitive Impairment: 331.83
Anxiety Disorder: 300.02
Hypoglycemia: 251.0, 251.1, 251.2, 250.8 (modified algorithm by Ginde et al)
Renal Disease: 403, 404, 582, 585, V56, 586, 583.0, 583.1, 583.2, 583.4, 583.6, 583.7, 588.0, V42.0, V45.1, 250.4
Retinopathy: 250.5, 362.0, 362.1, 362.2, 362.83, 362.53, 362.81, 362.82, 379.23, 361.xx, 369.xx
Neuropathy: 250.6, 358.1, 951.0, 951.1, 951.3, 354.0, 355.9, 713.5, 357.2, 337.0, 337.1, 564.5, 536.3, 458.0, 596.54, 356
Urinary tract infection: 590.0, 590.1, 599.0, 595.x
Genital mycotic infection: 112.1, 616.1 (female); 607.1 (male)
Overweight/Obesity: 278.0, 649.1
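As an illustration of how code lists like those above could be applied to claims, the Python sketch below flags comorbidities per patient and counts single and paired (co-prevalent) conditions. It covers only four of the listed comorbidities, and it is not the algorithm used in the study. Treating each listed code as a prefix that also matches its subcodes, the function names, and the input layout are assumptions for the example.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

# Subset of the appendix, written as code prefixes (an assumption).
COMORBIDITY_PREFIXES = {
    "HTN": ("401", "402", "403", "404", "405"),
    "HLip": ("272.0", "272.1", "272.2", "272.3", "272.4"),
    "PAN": ("577.0", "577.1"),
    "O/O": ("278.0", "649.1"),
}

def comorbidities_for(codes: Iterable[str]) -> Set[str]:
    """Return the comorbidity labels matched by a patient's ICD-9 codes."""
    found = set()
    for code in codes:
        for label, prefixes in COMORBIDITY_PREFIXES.items():
            if code.strip().startswith(prefixes):
                found.add(label)
    return found

def co_prevalence(patients: Dict[str, Iterable[str]]) -> Tuple[Counter, Counter]:
    """Count patients per comorbidity and per pair of comorbidities."""
    singles, pairs = Counter(), Counter()
    for codes in patients.values():
        labels = sorted(comorbidities_for(codes))
        singles.update(labels)
        pairs.update(combinations(labels, 2))
    return singles, pairs
```

Dividing each count by the number of patients gives the prevalence of a comorbidity or the co-prevalence of a pair.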
Database cleaning process.
