Abstract
This study evaluates mobile apps using a theory-based evaluation framework to determine their applicability for patients at risk of gestational diabetes. It assessed how well the existing mobile apps on the market meet the information and tracking needs of patients with gestational diabetes and evaluated the feasibility of integrating these apps into patient care. A search was conducted in the Apple iTunes and Google Play stores for mobile apps that contained keywords related to the following concepts: nutrition, tracking, diabetes, and pregnancy. Evaluation criteria were developed to assess the mobile apps on five dimensions. Overall, the apps scored well on education and information functions and scored poorly on engagement functions. Few apps provide comprehensive evidence-based educational content, tracking tools, and integration with electronic health records. This study demonstrates the need to develop apps that have comprehensive content, tracking tools, and the ability to share data bidirectionally.
Introduction
Diabetes is one of the world’s fastest growing medical conditions affecting both adults and newborns. Approximately 425 million adults have diabetes; by 2045, this will rise to 629 million. 1 More than 21 million live births were affected by diabetes during pregnancy in 2015—one in seven births. 1 Gestational diabetes is a condition in which a woman without diabetes develops high blood sugar levels during pregnancy. 2 Gestational diabetes increases a woman’s risk of pre-eclampsia and depression.3–5 Babies born to mothers with poorly treated gestational diabetes are at increased risk of having low blood sugar after birth, jaundice and, in the long term, are at higher risk of being overweight and developing type 2 diabetes. 6
Women who have higher than normal blood sugar, but who do not meet the definition of gestational diabetes, are considered to have gestational prediabetes. Nutrition and exercise are key components in improving the health status of pregnant women at risk of developing gestational diabetes to prevent the onset of diabetes and to decrease cesarean section rates.7,8 Most women can manage their blood sugar through diet and exercise. 9 It is also possible that women at risk of gestational diabetes receive inconsistent information from various sources or fail to find appropriate support. 10
Mobile applications (apps) might be an effective way to provide education and behavior tracking tools for pregnant mothers. Mobile health (mHealth) technologies can allow users to monitor their health, encourage healthy behaviors, and provide personalized care. 11 mHealth could play a role in diabetes prevention by empowering users to make positive decisions regarding their lifestyle and chronic condition. 12 Health apps can enable individuals to change health-related behaviors. 13 A study from the Pew Research Center 14 on smartphone usage in the United States showed that 64 percent of Americans have a smartphone, 85 percent of Americans aged 19–29 years are smartphone owners, and 62 percent of smartphone users have used a smartphone to search for health information. It also showed that 15 percent of Americans have limited options for online access other than a cell phone. 14 Of Americans with a household income under US$30K, 24 percent have few data access options other than a smartphone, 19 percent have no broadband at home, and 13 percent are smartphone dependent. 14 The smartphone is increasingly becoming a primary communication medium for younger Americans. For some lower-income Americans, it is the only connection to online services.
The use of mobile technologies for health services will be increasingly important, particularly for reaching young women in low-income families who may have difficulties reaching healthcare services due to distance or cost. Vendor markets provide a varied selection of health apps: a recent report estimated that over 325,000 health apps were available on the major app stores in 2017, with an estimated 3.7 billion app downloads expected in 2017. 15 However, only a few apps have been validated and tested for effectiveness and, to date, only about 160 apps have received Food and Drug Administration (FDA) or Conformité Européenne (CE) mark as medical devices. 16 The characteristics of most of the apps on the market are largely unknown.17,18 This is challenging as most apps are not evidence based and fail to differentiate between type 1 and type 2 diabetes. 19 A recent systematic analysis of apps in the “Medical” and “Health & Fitness” categories in the US iTunes App store showed that about 6 percent of apps were related to nutrition, about 3.2 percent were in the area of Gynecology and Obstetrics, and about 2 percent were related to diabetes care. 20 What is unclear is how well the existing apps are meeting the needs of patients with gestational diabetes. Many apps in the app store are placed within the “Medical” category; yet, this label was not provided by medical professionals. 21 This may incorrectly lead app users to assume that apps labeled as “Medical” are medically effective. 21
Recently, significant efforts have been made to develop structured frameworks to enable assessment of medical apps and characterize features related to the various components of quality (e.g. evidence base, usability, trustworthiness, user engagement).11,22 Some apps used for gestational diabetes may be too general to provide useful information specific for this patient population. Some apps may have information specific for this population but may not have any tools to track health status such as blood sugar, diet, or nutrition. In this study, we examined how well the existing mobile apps for diet tracking and prevention of gestational diabetes met the information and tracking needs of target users using a structured framework for app characterization as a first step to evaluate the feasibility of how these tools might be integrated into patient care.
Framework development
We developed an evaluation framework (Table 1) using a reference architecture for the development of mHealth apps that incorporates app-centered measures of quality (e.g. validation, trustworthiness) as well as user-centered measures of quality (e.g. usability, user engagement, provision of information).22–24 The framework was customized for patients with gestational prediabetes with the consultation of an experienced clinician (K.K.) and a researcher who has experience with the gestational prediabetic population (Y.Q.). The framework uses features found in the literature for the prevention of diabetes in pregnancy and also includes behavior change techniques that are known to be effective in changing behavior along with preferred patient features, as found in the literature.13,25–27 For example, features such as personal details, information about medications, symptoms, risk factors, laboratory results are included as relevant to patient health tracking.28,29 Similarly, features such as social/community engagement, goal setting, and gamification were included as relevant to patient motivation and support.27,28 The framework also includes data that might normally be found in an electronic medical record (EMR) to assist the patient in caring for themselves.28,30 The desirable features were separated into five categories for ease of reference (Table 1): (1) features that engender credibility and trust (“Credibility,” 7 features), (2) features that educate and inform (“Information,” 10 features), (3) features that provide interactive tools and behavior tracking (“Engagement,” 11 features), (4) features that speak to usability and design methodology (“Usability,” 5 features), and (5) features that speak to integration of the app with EMRs and other health system technologies (“Integration,” 3 features), for a total of 36 features. Five of these features are numeric in nature (i.e. number of downloads, last updated date, user rating, number of ratings, and cost) and do not contribute to the overall score whereas the remaining 31 features are rated using binary code (present or not present) as reported in detail in the “App search and evaluation” section.
Table 1. Evaluation criteria for mHealth apps.
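The feature counts described above can be checked with a short sketch. The category names and counts are taken from the text; the dictionary layout and variable names are illustrative assumptions, not part of the published framework.

```python
# Feature counts per category, as stated in the text.
# The data layout is an assumption made for illustration only.
FRAMEWORK = {
    "Credibility": 7,
    "Information": 10,
    "Engagement": 11,
    "Usability": 5,
    "Integration": 3,
}

# Descriptive features that are reported as numbers and are not scored.
NUMERIC_FEATURES = [
    "number of downloads",
    "last updated date",
    "user rating",
    "number of ratings",
    "cost",
]

total_features = sum(FRAMEWORK.values())
scored_features = total_features - len(NUMERIC_FEATURES)

print(total_features)   # 36
print(scored_features)  # 31
```

The 31 scored features form the denominator used later when the percentage evaluation score is computed.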
Methods
App database
A total of 42,008 Medical (M) and 79,557 Health and Fitness (H&F) apps were identified on the US iTunes app store as of 31 May 2017. The database was created in a previous study by implementing automated software for crawling the app stores and extracting the apps’ attributes from the HyperText Markup Language (HTML) source code. 20 The database included 16 attributes extracted from each app’s webpage: app ID, name, description, version, developer’s name, developer contacts, last update date, device compatibility, iOS compatibility, number of ratings, average ratings, reviews’ content, price (in US$), size, URL, and timestamp (i.e. the date and time of webpage access by the automated software). 20 Figure 1 shows the app identification and selection process from the US iTunes app store. Of the 42,008 M and 79,557 H&F apps in the database, 11,434 M and 18,449 H&F apps had either no description (less than 14 characters) or a description in a language other than English and were removed from the set. Language detection was performed using the Google Cloud Translation API Client Library for Python tool, specifically a port of Google’s language detection library to Python. 20 Since apps in the iTunes store can be assigned up to two categories, we also searched for duplicates. We found 11,190 apps that were in both categories and after removing them, we obtained a database of 80,490 unique apps. Finally, we removed 23,558 apps that were not updated in the past 2 years (as of 25 September 2017) and obtained a final list of 56,932 Apple apps from the iTunes store. The same procedure was run on the Google Play store. From an initial number of 13,216 M and 31,301 H&F apps identified on the Google Play store, we removed 1583 M and 3533 H&F apps that had either no description or a description in a language other than English, along with 14,496 apps that had not been updated in the past 2 years. This left a final list of 24,905 Android apps from the Google Play store.
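The three filtering steps (description and language check, duplicate removal, and recency check) can be sketched as a single pass over the app records. This is a minimal illustration, not the crawler used in the study: the `AppRecord` fields, the `filter_apps` function, and the `toy_is_english` stand-in for a real language detector are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List

MIN_DESCRIPTION_CHARS = 14  # shorter descriptions count as "no description"


@dataclass
class AppRecord:
    app_id: str
    description: str
    last_updated: date


def filter_apps(
    apps: Iterable[AppRecord],
    is_english: Callable[[str], bool],
    cutoff: date,
) -> List[AppRecord]:
    """Keep apps with an English description, one copy per ID, updated on or after cutoff."""
    seen_ids = set()
    kept = []
    for app in apps:
        if len(app.description.strip()) < MIN_DESCRIPTION_CHARS:
            continue  # no usable description
        if not is_english(app.description):
            continue  # description in another language
        if app.app_id in seen_ids:
            continue  # same app listed under both categories
        seen_ids.add(app.app_id)
        if app.last_updated < cutoff:
            continue  # not updated within the 2-year window
        kept.append(app)
    return kept


def toy_is_english(text: str) -> bool:
    """Crude stand-in for a language detector (assumption, for the example only)."""
    words = text.lower().split()
    return any(w in words for w in ("the", "and", "your", "that"))


apps = [
    AppRecord("a1", "Track your meals and blood sugar during pregnancy.", date(2017, 5, 1)),
    AppRecord("a1", "Track your meals and blood sugar during pregnancy.", date(2017, 5, 1)),
    AppRecord("a2", "Too short", date(2017, 6, 1)),
    AppRecord("a3", "Diario alimentare per la gravidanza e il diabete.", date(2017, 6, 1)),
    AppRecord("a4", "A calorie counter that was abandoned long ago.", date(2014, 1, 1)),
]

# Two years before 25 September 2017, the reference date given in the text.
kept = filter_apps(apps, toy_is_english, cutoff=date(2015, 9, 25))
print([a.app_id for a in kept])  # ['a1']
```

In practice `is_english` would wrap whichever language detection library is in use; passing it in as a callable keeps the filter independent of that choice.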

Figure 1. Flow chart of apps identified from the US iTunes app store in initial search on 31 May 2017. The same procedure was applied to the Google Play store.
App search and evaluation
We searched for apps by analyzing their descriptions, as provided on the stores, and computed the number of words (the “word counter”) related to one or more of the following concepts: (1) nutrition, (2) tracking, (3) diabetes, and (4) pregnancy. The keywords used are as follows:
Nutrition: diet, nutrition, food, carbs, sugar, glucose, fat, adipose, adiposity, calories, and their variants;
Tracking: track, monitor, behavior, log, follow, manage, record, register, report, count, diary, and their variants;
Diabetes: diabetes, prediabetes, blood glucose, blood sugar, glycemia, diabetes, (impaired) glucose tolerance, (impaired) fasting blood sugar/glucose, HbA1c, fasting blood sugar/glucose, (oral) glucose tolerance test, OGTT, and their variants;
Pregnancy: gestation, pregnancy/pregnant, gravidity, obstetric, expecting mothers, mom-to-be, future mom, and their variants.
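The word counter built from these keyword lists can be sketched as follows. The stems below are an abbreviated subset of the lists above, and matching each stem as a word prefix is an assumption about how “variants” were handled; the text does not specify the matching rule.

```python
import re
from typing import Dict, List

# Abbreviated stems per concept (subset of the keyword lists in the text).
CONCEPT_STEMS: Dict[str, List[str]] = {
    "nutrition": ["diet", "nutrition", "food", "carb", "sugar", "glucose", "calorie"],
    "tracking": ["track", "monitor", "log", "record", "diary"],
    "diabetes": ["diabet", "prediabet", "glycemi", "hba1c", "ogtt"],
    "pregnancy": ["gestation", "pregnan", "obstetric", "mom-to-be"],
}


def word_counter(description: str) -> Dict[str, int]:
    """Count words in the description that start with any stem of each concept."""
    text = description.lower()
    counts = {}
    for concept, stems in CONCEPT_STEMS.items():
        # Prefix match: "track" also matches "tracking" and "tracker".
        # Caveat: short stems such as "log" also match unrelated words like "login".
        pattern = r"\b(?:" + "|".join(re.escape(s) for s in stems) + r")\w*"
        counts[concept] = len(re.findall(pattern, text))
    return counts


example = "Track your diet and blood sugar during pregnancy. Gestational diabetes diary."
print(word_counter(example))
# {'nutrition': 2, 'tracking': 2, 'diabetes': 1, 'pregnancy': 2}
```

An app whose description yields a non-zero count for a concept would be treated as matching that concept in the searches summarized below.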
Tables 2 and 3 show the search results in the iTunes and Google Play databases, respectively. For each combination of concepts (i.e. nutrition; nutrition and tracking; nutrition, tracking, and diabetes; nutrition, tracking, and pregnancy; and nutrition, tracking, pregnancy, and diabetes), the tables show the number of apps found by the keyword search and the highest, median, and average value for the word counter across the set.
Table 2. iTunes app store search.
For each combination of concepts (rows 1–2), the table shows the number of apps found by keyword search (row 3) and the highest, median, and average value for the word counter across the set (rows 4–6).
Table 3. Google Play store search.
For each combination of concepts (rows 1–2), the table shows the number of apps found by keyword search (row 3) and the highest, median, and average value for the word counter across the set (rows 4–6).
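The summary rows in these tables can be illustrated with a small sketch. It assumes that an app matches a combination when it has at least one keyword hit for every concept in it, and that the word counter for a combination is the sum of hits across those concepts; neither rule is spelled out in the text, and the example counts are invented.

```python
from statistics import mean, median
from typing import Dict, Optional, Sequence


def summarize(
    per_app_counts: Sequence[Dict[str, int]],
    concepts: Sequence[str],
) -> Optional[Dict[str, float]]:
    """Number of matching apps plus highest, median, and average word counter."""
    totals = [
        sum(counts.get(c, 0) for c in concepts)
        for counts in per_app_counts
        if all(counts.get(c, 0) > 0 for c in concepts)
    ]
    if not totals:
        return None  # no app matches every concept in the combination
    return {
        "apps": len(totals),
        "highest": max(totals),
        "median": median(totals),
        "average": mean(totals),
    }


# Invented per-app keyword counts, for illustration only.
counts = [
    {"nutrition": 2, "tracking": 2, "diabetes": 1, "pregnancy": 2},
    {"nutrition": 5, "tracking": 1, "diabetes": 0, "pregnancy": 0},
    {"nutrition": 1, "tracking": 3, "diabetes": 2, "pregnancy": 0},
]

result = summarize(counts, ["nutrition", "tracking", "diabetes"])
print(result)  # {'apps': 2, 'highest': 6, 'median': 5.5, 'average': 5.5}
```

Adding concepts to the combination can only shrink the set of matching apps, which is why the four-concept search yields the smallest candidate pool.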
From the apps which included nutrition, tracking, pregnancy, and diabetes, we removed those which were no longer available on the Apple iTunes and Google Play stores as of 1 November 2017. We focused only on apps that were in English and available in Canada, the United States, and Italy since the authors were from these regions. We also removed duplicates from each of the app stores. For each of these apps, two reviewers looked at the description to see whether it included information on nutrition relevant to diabetes in pregnant women and whether it had any tools for tracking nutrition, blood sugar, or exercise. Apps were included or excluded when both reviewers agreed; if there was a discrepancy, a third reviewer made the decision. If eligibility was unclear from the app description, we downloaded the app to verify the inclusion or exclusion. After this manual review, there were 13 apps identified from the Apple iTunes store and 4 more apps identified from the Google Play store that were not present in the Apple iTunes store, for a total of 17 apps. Table 4 lists the app store from which each app was retrieved. At the time of review, Sugar Sense Diabetes App could be found in both the Apple iTunes and Google Play stores. Definitions of each of the 31 measures were created by author consensus so that each reviewer was trained on a standardized approach to assessing the apps. Each reviewer had access to a training guide with these definitions. For example, when assessing “Health Literacy Appropriate,” the reviewers assessed whether the health concept (i.e. diabetes) is easy to understand for the average app user. Each app was screened independently against the 31 measures in the evaluation framework by two reviewers who downloaded the apps on iPhone and Android devices and reviewed their content and functionality based on the criteria shown in Table 5.
Each measure was given a binary score of 0 or 1 by two independent reviewers. Apps received a score of 1 if they included an aspect of that measure. If the two reviewers disagreed on a score, a third reviewer was asked to resolve differences. A third reviewer was asked to resolve differences in binary scores for five measures among five apps reviewed and a consensus was reached. The authors met as a group to review the final selection of apps and the criteria assessments. To create an evaluation score for each app, the sum of the measures was taken and the percentage computed over the full range of 31 (maximum possible score).
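The scoring rule can be sketched in a few lines: two reviewers' binary scores are merged, a third reviewer settles any disagreement, and the total is expressed as a percentage of 31. The function names and the example measure names are illustrative assumptions.

```python
from typing import Dict, Optional

TOTAL_MEASURES = 31  # scored (binary) measures in the framework


def reconcile(
    reviewer_1: Dict[str, int],
    reviewer_2: Dict[str, int],
    reviewer_3: Optional[Dict[str, int]] = None,
) -> Dict[str, int]:
    """Keep scores the two reviewers agree on; take the third reviewer's score otherwise."""
    final = {}
    for measure, score in reviewer_1.items():
        if score == reviewer_2[measure]:
            final[measure] = score
        elif reviewer_3 is not None and measure in reviewer_3:
            final[measure] = reviewer_3[measure]
        else:
            raise ValueError(f"unresolved disagreement on {measure!r}")
    return final


def percent_score(points: int, total: int = TOTAL_MEASURES) -> float:
    """Evaluation score as a percentage of the maximum possible score."""
    return 100.0 * points / total


# Hypothetical scores for three measures of one app.
r1 = {"goal setting": 1, "gamification": 0, "personal details": 1}
r2 = {"goal setting": 1, "gamification": 1, "personal details": 1}
r3 = {"gamification": 0}

final = reconcile(r1, r2, r3)
print(final)                        # {'goal setting': 1, 'gamification': 0, 'personal details': 1}
print(round(percent_score(17), 1))  # 54.8
```

For example, an app meeting 17 of the 31 measures receives an evaluation score of about 54.8 percent.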
Table 4. Percent evaluation score of each app and app store retrieved from.
Table 5. Ranking of functionalities and evaluation criteria (ordered by frequency).
PROMS: patient-reported outcome measures; QOL: quality of life; PHR: personal health record; EMR: electronic medical record.
Before we started reviewing, we conducted a calibration exercise with eight randomly selected apps, which were evaluated by eight reviewers. This calibration exercise was important to resolve any discrepancies of the measures and allowed for a standardized approach. All reviewers were trained with this standardized approach, and each of the 17 apps was evaluated by two independent trained reviewers.
Results
Table 4 shows the evaluation score across the 17 apps. The mean evaluation score was 38 percent across the functions of information, credibility, engagement, usability, and integration. Diabetes and Blood Glucose Tracker by MyNetDiary and BlueStar Diabetes ranked highest, each meeting 17 of the 31 criteria. My DIETist-Online Dietician Consultation had the lowest score (5 out of 31), followed by Health Calculator Pro, Perfect Pregnancy and Postpartum Pounds, HealthSoup-Personal Nutritionist, and Amerifit Nutrition Tracker, each of which met 8 of the 31 criteria.
Overall, the apps scored well on functions such as personal details, physiological measurements, health literacy appropriate (is the health concept easy to understand), use of engagement model, and goal setting (Table 5). These criteria were met through giving information to the user on their diabetes and pregnancy along with app engagement.
Apps scored poorly on giving prompts for contingent rewards, identifying barriers to goals, integration into personal health records (PHRs) and EMRs, giving graded tasks to achieve goals, gamification, location-specific health information, and being recommended by a provider. Most of these functions fall under the category of engagement.
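A frequency ranking of this kind can be produced by counting, for each measure, how many apps received a score of 1. The sketch below uses invented app names and scores; only the idea of ordering measures by frequency comes from the text.

```python
from collections import Counter
from typing import Dict, List, Tuple


def rank_measures(app_scores: Dict[str, Dict[str, int]]) -> List[Tuple[str, int]]:
    """Order measures by the number of apps that meet them (ties broken alphabetically)."""
    frequency = Counter()
    for scores in app_scores.values():
        for measure, value in scores.items():
            frequency[measure] += value  # adding 0 still registers the measure
    return sorted(frequency.items(), key=lambda item: (-item[1], item[0]))


# Invented binary scores for three hypothetical apps.
app_scores = {
    "App A": {"goal setting": 1, "gamification": 0, "personal details": 1},
    "App B": {"goal setting": 1, "gamification": 0, "personal details": 0},
    "App C": {"goal setting": 1, "gamification": 1, "personal details": 1},
}

ranking = rank_measures(app_scores)
print(ranking)
# [('goal setting', 3), ('personal details', 2), ('gamification', 1)]
```

Measures near the bottom of such a ranking correspond to the functions on which the reviewed apps scored poorly.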
Discussion
Prior to this study, it was not known how many mHealth applications were specifically developed for diabetes in pregnancy, how well these apps meet the information needs of these patients, and how much evidence-based information was available in these apps. Our study has shown that there are few apps that have comprehensive content relevant to women with gestational prediabetes. Only 17 apps had the content we had defined in our search criteria. We also found that many were missing key functionality for tracking nutrition, exercise, and prediabetes (sugar level). Our results have shown that the apps reviewed had the following characteristics, strengths, and weaknesses:
Information needs: Of 17 apps reviewed, 16 apps had partially met all the criteria regarding information needs.
Functionality for tracking nutrition: Of 17 apps reviewed, 16 apps had partially met all the criteria regarding tracking functions. The apps contain calorie- and nutrition-tracking features. They also contain prompts for self-monitoring, can take physiological measurements, and are culturally appropriate for the user.
Functionality for tracking diabetes: Of 17 apps reviewed, 16 apps partially met all the criteria related to prediabetes. It appears that these apps are to be used in silos for physiological measurements such as blood glucose readings, weight, and height. The most prominent functionality for tracking diabetes which was identified was measuring blood glucose.
Integration with healthcare information systems: Of 17 apps reviewed, only three apps partially met all the criteria related to systems integration. Systems integration can include the following: ability to export data to share with a care provider, ability to import health records into app, ability to message a healthcare provider, and the ability to seek a consult with a provider.
Mobile apps have the potential to support and empower patients to make positive health changes. 31 However, we found the current apps will need to have more relevant content for this particular population group. Apps should be designed in collaboration with health professionals and patients by utilizing characteristics of user-centered design along with rigorous evaluations to test their efficacy in the target population. 32 It also appears that the majority of apps lack the use of behavior change theories which makes it difficult for patients to achieve positive change in managing their chronic condition.33,34
A successful app for diabetes prevention would enable real-time data transfer, involve the healthcare team, and have built-in analytic capabilities to provide tailored recommendations and feedback to motivate the user for continual engagement. 12 However, our study demonstrates that, of the apps that were reviewed, few have the ability to share data with the patient’s primary care provider or a hospital EMR. While there are thousands of apps in the marketplace, very few can electronically share data with hospital medical records or primary care providers, which makes it difficult to integrate these apps into the routine care of a health system.35,36 Until apps can more easily integrate electronically with healthcare systems, it will be difficult to analyze how the usage of these apps impacts the clinical outcomes of pregnant women and their babies. A recent randomized controlled study 37 of a mobile diabetes app for adolescents with type 1 diabetes showed no impact on the adolescents’ self-management behaviors, with no changes in primary or secondary clinical outcomes. The study suggested the need to integrate the app into routine clinical care to facilitate more frequent feedback. 37
The ability to connect patients with their healthcare providers has the potential to reduce primary care office visits, which would work well in capitation-based funding models.21,38 In the future, a standardized interoperability kit for apps should be developed in collaboration with EMR vendors and hospitals to make important data for diabetes patients available, which can ultimately allow clinicians to analyze these data in clinical encounters and intervene when necessary. 28 Hospitals need to decide whether to build their apps in-house, work with an existing vendor, or prescribe apps to patients to render them effective. 23 In any case, there is a significant cost for systems integration.
Future apps need to have more abilities to export data, and healthcare systems need to have a mechanism to import data. This will require both technical adaptations and policy changes to allow external input of data. The latter poses both a legal risk in that patient consents need to be verified and recorded, and security risks when a hospital or clinic system is opened to external sources. Both can be addressed but will require a clear business case to justify the cost. Without doing pilots, it will be difficult to show the overall return on investment regarding treatment costs and healthcare outcomes.
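To make the export and import requirement concrete, the sketch below shows an app serializing readings to JSON and a receiving system accepting them only when patient consent is on record, logging every decision. It is a toy illustration under stated assumptions: the `GlucoseReading` fields, the `example-glucose-v1` schema label, and both functions are hypothetical and do not represent any real EMR interface or interoperability standard.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Set, Tuple


@dataclass
class GlucoseReading:
    patient_id: str
    taken_at: str        # ISO 8601 timestamp
    value_mmol_l: float


def export_readings(readings: List[GlucoseReading]) -> str:
    """App side: serialize readings into a JSON payload (hypothetical schema)."""
    payload = {"schema": "example-glucose-v1", "readings": [asdict(r) for r in readings]}
    return json.dumps(payload)


def import_readings(
    payload: str,
    consented_patients: Set[str],
    audit_log: List[Tuple[str, str, str]],
) -> List[GlucoseReading]:
    """Health-system side: accept only readings from patients with recorded consent."""
    accepted = []
    for item in json.loads(payload)["readings"]:
        reading = GlucoseReading(**item)
        has_consent = reading.patient_id in consented_patients
        status = "accepted" if has_consent else "rejected: no consent on record"
        audit_log.append((reading.patient_id, reading.taken_at, status))
        if has_consent:
            accepted.append(reading)
    return accepted


readings = [
    GlucoseReading("p1", "2017-11-01T08:00:00", 5.4),
    GlucoseReading("p2", "2017-11-01T08:05:00", 6.1),
]
audit_log: List[Tuple[str, str, str]] = []
accepted = import_readings(export_readings(readings), {"p1"}, audit_log)
print(len(accepted), len(audit_log))  # 1 2
```

Recording rejected items as well as accepted ones reflects the point that consent needs to be both verified and recorded; a production system would also need authentication and transport security, which are outside this sketch.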
Limitations
Our research has some limitations that should be considered. First, the criteria we identified do not include all information needs of gestational prediabetes and behavior-based methods. In addition, our search criteria do not include all the possible words related to their concepts. We tried to mitigate this by searching PubMed MeSH and an English language thesaurus. Second, the reviewers who analyzed the apps are not experts in pregnancy but did have various backgrounds in healthcare fields. Our reviewers included a physician, a biomedical engineer, researchers, and health informatics professionals. Third, we were unable to download two of the apps since they required a subscription-based service or were not available in Canada, the United States, or Italy. If we were unable to download an app, we assessed it based on the app store descriptions and screenshots of the app functionalities in the Apple or Google Play app store. The reviews for these two apps were conducted based on the written descriptions so there is a possibility that we missed some features. Written descriptions also varied in detail based on the vendor. Fourth, we recognize that a low score in our rating does not necessarily mean that the mobile app is ineffective in terms of clinical outcomes. Only a full clinical interventional assessment can determine the efficacy of the app. Fifth, we only evaluated English language apps but there are a growing number of apps in Spanish, 39 so future studies might examine apps in different languages and examine the reading level for patient education material in English and Spanish. Finally, we used expert reviewers and not patients to evaluate the apps. In future iterations of app reviews, it may be a good idea to include patients to review the apps as they may have a different perspective on the usability and appropriateness of the apps for diabetes prevention efforts.
Conclusion
Very few of the apps that we assessed provide both comprehensive evidence-based educational content and tracking tools for patients with gestational diabetes. This study demonstrates the need to develop apps that have comprehensive content, tracking tools, and the ability to share data bidirectionally with the patient’s primary care provider. This will require both technical adaptations and policy changes to allow for data sharing.
Footnotes
Acknowledgements
The authors would like to thank Afroz Sajwani for being an app reviewer both in the calibration exercise and in our downloading and reviewing of the apps from the Google Play store.
Declaration of conflicting interests
The author(s) received no financial support for this research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This review was funded in part with funding through the Agency for Healthcare Research and Quality (R01HS021495, R18HS24869).
