Abstract
Background
Smartphone photography and crowdsourcing feedback could reduce participant burden for dietary self-monitoring.
Objectives
To assess whether untrained individuals can accurately crowdsource diet quality ratings of food photos using the Traffic Light Diet (TLD) approach.
Methods
Participants were recruited via Amazon Mechanical Turk and read a one-page description of the TLD. The study examined the participant accuracy score (total number of foods correctly categorized as red, yellow, or green per person), the food accuracy score (accuracy with which each food was categorized), and whether the accuracy of ratings increased when more users were included in the crowdsourcing. For each of a range of possible crowd sizes (
Results
Participants (
Conclusions
Nutrition-novice users can be trained easily to rate foods using the TLD. Since feedback from crowdsourcing relies on the agreement of the majority, this method holds promise as a low-burden approach to providing diet-quality feedback.
Keywords
Introduction
Dietary self-monitoring is one of the key components of behavioral weight loss programs.1,2 Adherence to self-monitoring 3 and receiving personalized feedback on self-monitoring behaviors4,5 are both associated with improved weight loss. Diet apps have held promise as a way to increase self-monitoring frequency, but usage tends to decline over time.6–8 Smartphone cameras make just-in-time food recording possible, 9 and researchers have been developing ways to conduct dietary assessment of foods in photos. 10 There has also been increasing interest in finding computerized methods to make dietary assessment easier. 11 However, dietary self-monitoring differs from dietary assessment 12 in that dietary assessment is infrequent and must be highly accurate, whereas self-monitoring must occur every time something is consumed, because the more proximal the feedback is to the desired behavior, the more likely that behavior is to be sustained. 13 High degrees of accuracy, however, are not always crucial for dietary self-monitoring. This is because dietary self-monitoring is not used for research data collection and usually involves tracking only a single, general factor of interest, such as an estimate of energy intake, rather than highly detailed dietary data (e.g. mg of calcium, µg of selenium).
One option for providing dietary self-monitoring feedback is to use crowdsourcing, which utilizes the input of several users to provide feedback. Crowdsourcing can take on many roles, including collectively raising money (crowdfunding), completing tasks (crowd labor), conducting research (crowd research), and generating new products and ideas (creative crowdsourcing). 14 Crowdsourcing dietary information would be a hybrid of crowd labor and crowd research, allowing users to give quick collective feedback on food and beverages consumed, thus providing users with an overall rating of their diets. This crowdsourcing diet feedback approach also has the potential to reduce the burden and increase the gamification of self-monitoring, 14 which could help make self-monitoring more engaging and rewarding for users.
Previous research has examined the use of meal photos and crowdsourcing for dietary self-monitoring. 15 The Eatery app, which is no longer available to consumers, allowed users to take pictures of their foods and rate their meals on a sliding scale from fit (healthy) to fat (unhealthy), and then prompted them to rate the photographs of foods and beverages from other users. In addition, users received peer feedback as an average healthiness score for their own foods and beverages. This study 15 assessed how closely the crowdsourced ratings of foods and beverages contained in 450 pictures from the Eatery mobile app as rated by peer users (fellow Eatery app users) (
The present study examined the use of the Traffic Light Diet (TLD)16,17 as a diet rating method using crowdsourcing. The goal of the TLD approach is to “provide the most nutrition with the least number of calories,” 18 categorizing foods as red (eat very rarely, low-nutrient-dense, high calorie), yellow (eat in moderation), and green (low in calories, high-nutrient-dense). The TLD has been mainly used in assisting children with dietary self-monitoring to encourage the intake of low-energy-dense foods and promote weight loss.16,19 The TLD approach has also been widely used to assist adults with making healthier food point-of-purchase decisions, such as in cafeterias,20,21 at concession stands, 22 and on food labels. 23 More recently, there has been an interest in using the TLD approach for self-monitoring with adults, as the TLD can be used with low-literacy populations. 24 Previous research has also demonstrated that rating foods with a traffic light system has the potential to promote long-term changes in dietary intake 21 and can provide a salient nutrition label that triggers processes within the brain – as detected by functional magnetic resonance imaging (fMRI) – that are used by adults who are successful at making healthy diet choices. 25
The present study had five main objectives, which were to examine: 1) whether users could accurately categorize photos of foods as red, yellow, or green after receiving a brief training on the TLD; 2) whether rating accuracy differed among foods categorized as red, yellow, or green; 3) whether the accuracy of the crowdsourced food categories increased as more participants were added to the crowd; 4) which demographic characteristics, technology use, and/or nutrition knowledge factors were associated with correctly categorizing foods; and 5) how users perceived the difficulty of using various dietary self-monitoring methods.
Methods
Participants were recruited via Amazon Mechanical Turk (MTurk; www.mturk.com) to complete a survey (www.surveygizmo.com). MTurk is an online system that allows requesters to submit Human Intelligence Tasks (HITs) for online workers to complete in return for monetary compensation. 26 The demographic characteristics of MTurk workers tend to be more diverse than average internet survey populations. 26 For the present study, our sample of eligible participants was limited to US citizens over the age of 18 years, who were MTurk Masters – a group of workers who have demonstrated consistent reliability in completing HITs as determined by MTurk. Participants were paid US$0.50 for completion of the survey, which is similar to or higher than compensation rates used in previous MTurk studies.26–28 The study was approved by a university Institutional Review Board, and participants provided informed consent prior to beginning the survey.
After accepting the HIT, participants were directed to a survey that assessed demographic information, technology ownership (tablet or smartphone), use of a diet app or a physical activity app or device, and prior training in and knowledge of nutrition. The survey also presented the user with a one-page description of how to rate foods using the TLD. Ten foods were selected that represented all major food groups and were placed in random order within the survey. Participants then categorized the foods as red (potato chips, white-flour bagel, ham luncheon meat), yellow (whole-grain spaghetti with marinara sauce, fat-free plain yogurt, brown rice, black beans), or green (apple, salad, carrots). Additionally, to examine the perceived ease of use of the crowdsourcing approach in context with other potential dietary self-monitoring methods, participants were asked to rate, on a 9-point Likert scale, how easy (1) or difficult (9) each of the following would be for dietary self-monitoring: using a photo-taking/crowdsourcing method, using a mobile diet app, using a book to find calorie values of foods/beverages and calculate energy intake, or wearing a Bite Counter 29 to provide an automated estimate of caloric intake. The Bite Counter is worn like a watch and tracks wrist motion in three planes using a microelectromechanical systems gyroscope. 30 When a pattern of wrist roll motion is detected, a counter is activated to track the number of bites taken at each eating or drinking event, tracking bite frequency but not bite size. Calculations based on the Mifflin-St Jeor formula for resting metabolic rate 31 have been used previously to estimate an individual’s kilocalories per bite (KPB) based on demographic variables. These equations have been tested and refined both by using dietary data from 24-hour recalls as a gold standard 32 and by observing 273 individuals eating a meal in a cafeteria, 33 and were found to estimate calories consumed in an individual meal to within ±50 kcal.
Each dietary self-monitoring method included a detailed description to provide participants with a clear overview of what each method would entail.
Two different accuracy scores were calculated. Each
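As an illustration only, the two accuracy scores can be sketched in Python. The correct colors are those listed for the ten study foods above; the function names, the expression of the food accuracy score as a fraction of raters, the majority-vote helper, and the three example raters are assumptions made for this sketch and are not taken from the study's analysis or data.

```python
from collections import Counter

# Correct TLD colors for the ten study foods, as listed in the Methods.
CORRECT = {
    "potato chips": "red",
    "white-flour bagel": "red",
    "ham luncheon meat": "red",
    "whole-grain spaghetti with marinara sauce": "yellow",
    "fat-free plain yogurt": "yellow",
    "brown rice": "yellow",
    "black beans": "yellow",
    "apple": "green",
    "salad": "green",
    "carrots": "green",
}


def participant_accuracy(ratings):
    """Number of foods one participant categorized correctly (0 to 10)."""
    return sum(1 for food, color in ratings.items() if CORRECT.get(food) == color)


def food_accuracy(all_ratings, food):
    """Fraction of raters who gave one food its correct color (assumed scale)."""
    votes = [r[food] for r in all_ratings if food in r]
    return sum(v == CORRECT[food] for v in votes) / len(votes)


def crowd_color(all_ratings, food):
    """Most frequently chosen color for a food (hypothetical majority-vote helper)."""
    votes = [r[food] for r in all_ratings if food in r]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical ratings from three raters (illustrative only, not study data).
rater_a = dict(CORRECT)
rater_b = {**CORRECT, "brown rice": "green"}
rater_c = {**CORRECT, "brown rice": "green", "ham luncheon meat": "yellow"}
crowd = [rater_a, rater_b, rater_c]

print([participant_accuracy(r) for r in crowd])
print(food_accuracy(crowd, "ham luncheon meat"), crowd_color(crowd, "ham luncheon meat"))
```

In this toy crowd, one rater miscategorizes the ham luncheon meat, yet the majority color is still correct, which is the property that majority-based feedback depends on.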
To date, no research has been conducted on the number of participants needed to crowdsource dietary information accurately. It is not known whether only a few users (e.g. 15 users) are needed to provide feedback or whether more users (e.g. 45 users) are required to reach majority agreement on the dietary feedback. Therefore, this study also sought to examine whether mean participant accuracy scores increased as more participants were included in the crowdsourcing of the food ratings. To achieve this, a random list of numbers from 1 to 75 was generated and assigned to participants. Participants were then sorted in this random order from 1 to 75. Five groups of participants with their corresponding participant accuracy scores were then created: 1) first random 15 participants (
Statistical analysis
Descriptive statistics were used to characterize the sample. Means (± standard deviations (SDs)) were calculated for food accuracy scores, participant accuracy scores, and scores reflecting methods and features that would motivate users to consistently engage in dietary self-monitoring. For each of a range of possible crowd sizes (e.g. the five groups of different crowd sizes described in the previous paragraph), 10,000 bootstrap samples were drawn, and a 95% confidence interval (CI) for accuracy was constructed using the 2.5th and 97.5th percentiles. CI width describes the uncertainty with which a given crowd size will produce results similar to the study sample. General linear models were used to test whether demographic characteristics (model 1) or technology use and nutrition knowledge (model 2) were associated with participant accuracy score. Frequency distributions were calculated to examine which methods and features participants endorsed as motivating them the most to self-monitor regularly. Analyses were conducted using SAS V9.4 software with a
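The percentile bootstrap described in this section can be sketched as follows. The study's analyses were run in SAS; this Python version is a swapped-in illustration, not the authors' code. The function name, the seed, the choice to resample participant accuracy scores with replacement, and the synthetic scores are all assumptions made for the sketch.

```python
import random
import statistics


def bootstrap_ci(scores, crowd_size, n_boot=10_000, seed=0):
    """Percentile 95% CI for the mean accuracy of a crowd of a given size.

    Each bootstrap sample draws `crowd_size` scores with replacement from
    `scores` and records their mean; the CI is the 2.5th to 97.5th percentile
    of those means.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.choices(scores, k=crowd_size))
        for _ in range(n_boot)
    ]
    # quantiles(n=40) returns 39 cut points spaced 2.5% apart, so the first
    # and last cut points are the 2.5th and 97.5th percentiles.
    cuts = statistics.quantiles(means, n=40)
    return cuts[0], cuts[-1]


# Synthetic participant accuracy scores for 75 raters (NOT study data).
demo_rng = random.Random(42)
demo_scores = [demo_rng.randint(5, 10) for _ in range(75)]

for size in (15, 75):
    low, high = bootstrap_ci(demo_scores, size)
    print(f"crowd size {size}: 95% CI width = {high - low:.2f}")
```

Comparing the interval widths across crowd sizes shows how much additional certainty a larger crowd buys; a narrower interval means that crowds of that size tend to produce mean ratings closer to those of the full sample.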
Results
Demographic characteristics of Amazon Mechanical Turk participants completing crowdsourcing data collection survey.
SD: standard deviation; BMI: body mass index.
Food accuracy score ratings for each food item rated by participants (
Indicates correct categorization of Traffic Light Diet color.
This study also examined the extent to which including a smaller number of users in the crowdsourcing led to greater variability in the mean ratings obtained. Large crowd sizes such as 75 produced mean ratings falling mostly in a narrow range, with 95% of means in the bootstrap analysis with
Two separate general linear models were used to test whether demographic characteristics (model 1) or technology use and nutrition knowledge (model 2) were associated with participant accuracy score. In model 1, race (
Participants were also asked to rate, on a scale of 1 (easy) to 9 (difficult), how easy or difficult they felt it would be to use four different dietary self-monitoring methods. Participants rated using a Bite Counter 29 that would automatically track calories as the easiest (3.2 ± 2.2), followed by using the photo crowdsourcing approach (3.5 ± 1.9), using a standard diet tracking app (3.8 ± 2.0), and using a calorie book (5.8 ± 2.3).
Discussion
The present study found that users without extensive formal nutrition training could be quickly trained to provide somewhat accurate dietary feedback based on the TLD, 18 with the majority of ratings for each food matching the correct color. Because users frequently categorized each food correctly, users receiving feedback on their food choices through this method would likely receive accurate feedback on their diet (e.g. number of red, yellow, or green foods). The study also found that even a very small crowd size tends to produce ratings within ranges similar to those produced by larger crowd sizes. Crowdsourcing holds promise as an inexpensive, low-burden method to address public health issues. 34 Although crowdsourcing has been used in some areas of health, such as radiology, 35 pathology, 35 and dermatology, 36 it has not been widely used in public health. 34 Dietary self-monitoring requires daily recording of meals, and individuals often find the process to be burdensome, time-consuming, and tedious.37,38 Although research has shown that accuracy is not as important as frequency of self-monitoring for weight loss 39 and that crowdsourcing has demonstrated high accuracy levels in other areas of public health,35,40 there has been little research assessing the accuracy of dietary intake feedback provided via crowdsourcing.
One such study that examined the accuracy of crowdsourced dietary data looked at the use of the diet tracking Eatery™ app. 15 As discussed previously, the Eatery app used crowdsourcing to provide very rudimentary dietary feedback to users. Researchers found that compared to trained raters, users of the app could provide highly accurate feedback on the foods in the photos. 15 In addition, an app (PlateMate) was developed by researchers that used food photography and crowdsourcing to provide feedback on calories. Estimated calories by crowdsourcing (MTurk) were comparable to those by three expert raters (registered dietitians), with the error rate for the trained raters averaging 172 kcals or 28.7% per photograph versus 198 kcals or 33.2% per photograph for crowdsourced feedback. 41
The present paper has several strengths. The TLD is an evidence-based way to categorize meals to assist with weight loss, 16 and this study is the first examination using the TLD for crowdsourced feedback. Study participants had equal representation of males and females. More than half of participants reported they were actively attempting weight loss, which is higher than what has been reported for normal-weight populations but similar to what is seen in overweight populations.42,43 There were also limitations with this study. Participants were mostly white. Only single food items, and no beverages or mixed dishes (besides spaghetti and sauce), were included in each photo. This study also did not assess engagement over time with this type of dietary assessment. Others who have examined mobile app approaches,6,8 as well as crowdsourcing photo approaches, 44 have found that engagement with self-monitoring tends to decline over time.
Implications for research and practice
There is a need to make dietary self-monitoring more engaging and less burdensome. 45 Because feedback from crowdsourcing relies on the agreement of the majority, this method holds promise as a low-burden approach to providing diet-quality feedback, while also building in gamification and social networking, supporting aspects that may make this approach to dietary self-monitoring more engaging. In addition, using food photography and crowdsourcing for dietary self-monitoring may be an appealing approach for registered dietitians to use with their patients and clients. A practitioner interface could allow nutritionists to view foods consumed and numbers of red, yellow, and green foods eaten each day. Future research should examine if a more detailed training on the TLD would improve accuracy and also explore if this approach can increase engagement in dietary self-monitoring over time, improve dietary quality, and assist individuals with achieving a healthy body weight.
Footnotes
Acknowledgements
The authors would like to thank the Amazon Mechanical Turk participants for completing this study.
Contributorship
GTM, DSM, ERM, and AWH conceived the project. Data acquisition and interpretation were conducted by GTM and ATK. GTM and BEH performed the statistical analyses and implemented required custom software. All authors wrote the manuscript, and were responsible for the research concept and design as well as critical revision of the manuscript, and approved the final version.
Declaration of Conflicting Interests
Authors ERM and AWH have formed the company “Bite Technologies” to market and sell a bite counting device. Clemson University owns a US patent for intellectual property known as “The Weight Watch”, USA, Patent No. 8310368, filed January 2009, granted November 13, 2012. Bite Technologies has licensed the method from Clemson University. ERM and AWH receive royalty payments from bite counting device sales. No other authors have any conflicts to declare.
Ethical approval
The Institutional Review Board of the University of South Carolina approved this study.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Cancer Institute of the National Institutes of Health (award number R21CA18792901A1). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Guarantor
GTM.
Peer review
This manuscript was reviewed by Melanie Warziski Turk and Jing Wang, University of Pittsburgh.
