Abstract
Background:
Breastfeeding behaviors and experiences exist on a continuum. What differentiates normal from dysfunctional is defined by frequency and severity. No current validated tool addresses the subjective experience of dyads with a predictive score that can be followed over time.
Research Aim:
To create and validate a self-report tool to assess breastfeeding and evaluate its ability to predict risk of breastfeeding dysfunction.
Methods:
This study used a cross-sectional design to determine the validity of a novel instrument to assess breastfeeding dysfunction. We gave the initial questionnaire to 2085 breastfeeding dyads. We assessed content validity by comparison with other tools. We used exploratory factor analysis with varimax rotation for concept identification and Cronbach’s alpha for internal consistency. We employed logistic regression to assess the tool’s ability to differentiate between normal breastfeeding and breastfeeding dysfunction.
Results:
Factor analysis mapped 17 questions to four concepts to create a score (FLIP; flow, latch, injury [to the nipple], and post-feed behavior). Internal consistency and reliability of the scores in these concepts were acceptable (Cronbach’s alpha ≥ 0.87 for all measures). A logistic regression model that controlled for infant age, with a breastfeeding dysfunction risk classification threshold of 60%, yielded a correct classification of 88.7%, with 93.1% sensitivity, 64.6% specificity, and a 6.5% false positive rate.
Conclusions:
The FLIP score was determined to be a valid and reliable instrument for quantifying the severity of breastfeeding dysfunction in children under 1 year old. Further studies will assess its usefulness in the management of breastfeeding dysfunction.
Key Messages
No validated self-report tool exists for the identification of breastfeeding dysfunction that addresses the symptomology of both partners in the breastfeeding dyad.
FLIP exhibits high content validity and predictive performance to identify the risk of breastfeeding dysfunction in dyads with infants under 1 year of age.
FLIP can be used to monitor breastfeeding dysfunction over time and can serve as a tool to assess the effectiveness of a broad range of interventions to improve the breastfeeding experience.
Background
Breastfeeding, defined as the child receiving breastmilk either directly from the breast or expressed (World Health Organization [WHO], 2008), is associated with benefits to both infant and breastfeeding parent (Couto et al., 2020; Horta, 2019; North et al., 2022). Both parental and infant experiences determine its continuation and longevity. These experiences can include both factors external to the breastfeeding dyad and physiological factors in both parent and infant.
External factors contribute to early cessation of breastfeeding. Socioeconomic status, racial disparities in the initiation of breastfeeding, social stigma around breastfeeding in public, lack of access to support systems for continuing breastfeeding upon return to work, and limited access to culturally competent lactation resources have all been found to affect both whether and how long parents breastfeed (Apanga et al., 2022; Beauregard et al., 2019; Gianni et al., 2019; Hamner et al., 2021; Li et al., 2019; Whipps et al., 2019). The perception of external recommendations and social support from partners and parental grandparents also influence breastfeeding patterns (Whipps et al., 2019; Zhang et al., 2023).
Among physiological factors, a constellation of symptoms indicative of breastfeeding problems, including nipple pain and poor latch, has been implicated as a cause of earlier than desired cessation of breastfeeding (Gasparin et al., 2019; Gianni et al., 2019; Morrison et al., 2019). These symptoms may be associated with functional impairment in sucking or milk transfer that is characterized as “breastfeeding dysfunction.” Breastfeeding dysfunction can have several causes and includes both parental and infant factors that stem from anatomical or physiological issues; partial ankyloglossia is an example. Partial ankyloglossia, commonly referred to as ankyloglossia or tongue-tie, refers to a lingual frenulum that is short or thick. It is associated with restricted lingual mobility and impaired breastfeeding (Cordray et al., 2023; Fraga et al., 2020; LeFort et al., 2021). Thus, partial ankyloglossia severe enough to warrant treatment can serve as a marker for known breastfeeding dysfunction.
Because breastfeeding behaviors span a spectrum from normal to dysfunctional, various tools including the Mother–Baby Assessment (MBA; Mulford, 1992), the Bristol Breastfeeding Assessment Tool (BBAT; Ingram et al., 2015), the Infant Breastfeeding Assessment Tool (IBFAT; Matthews, 1988), and LATCH (Latch, Audible swallowing, nipple Type, Comfort, Holding; Jensen et al., 1994) are used to evaluate full-term infants for evidence of breastfeeding difficulties. These tools are filled out by the healthcare provider, who must elicit the information from the parent, and depend on the provider’s examination, which requires extra training. While these tools have been applied in the context of identifying dyads at risk for breastfeeding difficulties, they were neither designed nor validated for the identification of breastfeeding dysfunction. LATCH was originally designed as a charting aid (Jensen et al., 1994). MBA was designed to assess breastfeeding sessions in much the same way APGAR (Appearance, Pulse, Grimace, Activity, Respiration) scores are used to assess newborn health (Mulford, 1992). BBAT was designed to assist midwives, lactation consultants, and other healthcare professionals in assessing breastfeeding proficiency (Ingram et al., 2015). IBFAT was constructed to allow parents, midwives, and nurses to assess infant performance during feeding (Matthews, 1988).
In addition to tools that assess behavioral indicators of breastfeeding difficulties, there are tools that focus on evaluating infant anatomy and function, including the Frenectomy Decision Rule for Breastfeeding Infants (FDRBI; Srinivasan et al., 2006) and the Lingual Frenulum Protocol with Scores for Infants (Lopes De Castro Martinelli et al., 2012). The FDRBI is an assessment performed by a clinician to guide intervention. The clinician evaluates whether there is nipple pain and/or trauma and difficulty latching, along with the range of motion of the tongue. Clinicians using the Lingual Frenulum Protocol with Scores for Infants assess breastfeeding difficulty by taking a history of maternal pain, length of feeding, and infant behaviors (e.g., fatigue or short duration). The protocol then requires an evaluation of a feed as well as of infant oral anatomy.
No single tool assesses the complete symptom spectrum of breastfeeding dyads. The FDRBI and the Lingual Frenulum Protocol with Scores for Infants focus primarily on infant behaviors with little consideration given to maternal factors that influence breastfeeding success. LATCH and IBFAT are centered on maternal pain and infant latch, but fail to elicit information about other behaviors associated with breastfeeding dysfunction. BBAT focuses primarily on the infant, assessing positioning, attachment, sucking, and swallowing. MBA documents steps in the breastfeeding process, and both parent and infant are scored based on whether they successfully complete a step. MBA fails to document specific behaviors or other indicators of breastfeeding dysfunction (e.g., nipple pain). In addition, only BBAT and LATCH have been assessed for both reliability and validity. IBFAT and MBA have only been evaluated for reliability (Bickell et al., 2018), and the FDRBI and Lingual Frenulum Protocol with Scores for Infants have not been validated.
Symptom severity is a subjective observation that depends largely on parental perception, and often can be misleading, particularly if assessed longitudinally. Symptom frequency, on the other hand, has better stability over time (Krabbe & Forkmann, 2012). There is a need for a tool that does not require extensive training on the part of the provider to implement and can quantify symptom severity in terms of symptom frequency to differentiate between normal breastfeeding behaviors and breastfeeding dysfunction. The purpose of this study was to create and validate a self-report tool to assess breastfeeding and evaluate its ability to predict risk of breastfeeding dysfunction.
Methods
Research Design
This study used a cross-sectional design to evaluate the validity of a novel instrument to assess breastfeeding dysfunction. Because the instrument is new, responses were evaluated for content validity, internal consistency, and known groups validity, in order to facilitate its use both clinically and in further research studies assessing intervention effectiveness. The study was approved by the Western Institutional Review Board (IRB number 20181213).
Setting and Relevant Context
Paper surveys were administered in English as part of routine medical office paperwork in a large suburban pediatric practice in the Phoenix, Arizona metropolitan area. Socioeconomic factors (e.g., family income, education, race, and ethnicity), a lack of access to lactation support, and social stigma all exacerbate the difficulties associated with breastfeeding dysfunction and contribute to early cessation of breastfeeding (Beauregard et al., 2019; Li et al., 2019). The communities surrounding the practice locations had median annual incomes ranging from $60,499 to $97,409 (N = 2,228,368), poverty rates ranging from 6.7% to 17.3% (N = 2,228,368), and college education rates ranging from 22.1% to 60.2% (N = 2,228,368; U.S. Census Bureau, 2023). Minority populations (Black, Asian, Native American, Hawaiian or Pacific Islander, and Hispanic) in these communities ranged from 21.3% to 58.6% (N = 2,228,368; U.S. Census Bureau, 2023). The 2018 to 2019 breastfeeding initiation rate for Maricopa County, where the Phoenix metropolitan area is located, was 89.0% (Centers for Disease Control and Prevention [CDC], 2021). The Arizona Department of Health hosts a website with information about breastfeeding resources in the state for parents and employers. The state also provides a 24-hour breastfeeding hotline (Arizona Department of Health Services [AZDHS], 2022). Access to an International Board Certified Lactation Consultant (IBCLC) was part of the routine care offered in the practice. The CDC periodically assessed public opinions about breastfeeding through the SummerStyles Survey. In 2021, the majority of respondents (69.02%) agreed with the statement “I believe women should have the right to breast feed in public spaces” (CDC, 2023), and Arizona state law protects the right of mothers to breastfeed in public (Arizona Revised Statutes—Title 41, 2006).
Sample
The target population was healthy breastfeeding parents over the age of 18 and healthy full-term infants under a year old, with or without symptoms of breastfeeding dysfunction, seen during routine clinical care. Participants were either primary patients of the practice or had been referred to the practice from outside for lactation consultation. Participants from a variety of socioeconomic backgrounds were included (e.g., those who qualified for the Special Supplemental Nutrition Program for Women, Infants, and Children [WIC]). However, specific information about socioeconomic background, race, ethnicity, and parental age was not collected. Medical record review was used to verify age and breastfeeding status, to determine whether and when, relative to the survey date, the infant had a frenectomy, and to identify confounding conditions. We excluded infants with prematurity or physical (e.g., clefting), neurological, cardiac, or genetic problems that could contribute to breastfeeding dysfunction. Since symptoms of breastfeeding dysfunction overlap with normal breastfeeding behaviors and can be caused by suboptimal latch, most dyads reporting breastfeeding difficulties had already been evaluated by an IBCLC before their appointment to rule out suboptimal latch, and all dyads were offered IBCLC support after the evaluation. Participants were not compensated.
The cross-sectional sampling scheme is illustrated in Figure 1. Following survey administration, completed surveys were identified. Note that a completed survey only means that some portion of the survey questions were answered and does not imply that all survey questions were answered. Medical record review was carried out and participants were divided into two groups. Participants who had evidence of partial ankyloglossia severe enough to have progressed to frenectomy were classified as the known breastfeeding dysfunction group (

Study Design.
The 2,085 dyads provided a more than sufficient sample size for the subsequent analyses. Kaiser’s measure of sampling adequacy for the exploratory factor analysis (EFA; overall MSA = 0.893) indicated an adequate sample size (Shrestha, 2021). A post-hoc power analysis of the logistic regression model for classification revealed a power of > 0.99 for the FLIP sub-scores Flow, Latch, and Injury, and 0.91 for Post-feed Behavior. A post-hoc power analysis of the logistic regression on the training data set alone gave similar results, with a power of > 0.99 for Flow, Latch, and Injury, and 0.80 for Post-feed Behavior. The lower, but still statistically acceptable, power for Post-feed Behavior was a consequence of that sub-score consisting of fewer questions than the other sub-scores.
Measurement
The demographic variables evaluated were assigned sex and infant age at survey. Ages were computed in days based on the date of the survey and the birthdate of the infant obtained from the medical record. Ages were grouped as follows: 0–14 days were assigned to the 0–2 weeks category; 15–30 days were assigned to the 2–4 weeks category; 31–61 days were assigned to the 1–2 month category; 62–91 days were assigned to the 2–3 month category; 92–183 days were assigned to the 3–6 month category; 184–273 days were assigned to the 6–9 month category; and 274–365 days were assigned to the 9+ months category. We chose these categories as they align with the recommended timing of well-child checks.
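The age grouping described above can be sketched as a small lookup. This is a minimal illustration and not the study's own analysis code; the function names and the plain-text category labels are assumptions made for the example, while the day boundaries are those listed in the text.

```python
from bisect import bisect_left
from datetime import date

# Inclusive upper bound (in days) of each age category, as listed in the text.
AGE_UPPER_BOUNDS = [14, 30, 61, 91, 183, 273, 365]
# Illustrative spellings of the category names (assumption for this sketch).
AGE_LABELS = [
    "0-2 weeks", "2-4 weeks", "1-2 months", "2-3 months",
    "3-6 months", "6-9 months", "9+ months",
]


def age_in_days(birth_date: date, survey_date: date) -> int:
    """Age at survey in days, from the birthdate and the survey date."""
    return (survey_date - birth_date).days


def age_category(age_days: int) -> str:
    """Map an age in days to its well-child-check category."""
    if not 0 <= age_days <= 365:
        raise ValueError("age outside the 0-365 day range used in the study")
    # bisect_left returns the index of the first upper bound >= age_days.
    return AGE_LABELS[bisect_left(AGE_UPPER_BOUNDS, age_days)]
```

For instance, an infant born on 1 January 2019 and surveyed on 15 January 2019 is 14 days old and falls in the first category.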
The most common parental and infant symptoms were identified through a literature survey and formed the concepts guiding the construction of the initial questionnaire (IQ). The IQ can be found in Supplemental Table 1 (see the online supplemental material). The frequency of symptoms (i.e., how often they occur) rather than the severity of symptoms was chosen for assessment to mitigate bias, as reported severity was more variable and subjective than frequency, depending on the immediate psychological state of the parent (Krabbe & Forkmann, 2012). Except for the standard pain question, which rated pain between 0 and 10 (Hawker et al., 2011), and two questions that asked for specific amounts of time, questions asked the respondent to indicate how often a symptom or behavior occurred: never, less than 25% of the time, 50% of the time, greater than 75% of the time, or 100% of the time. This was re-coded to a 5-point Likert-like scale (Jebb et al., 2021), which precludes the existence of outliers. Numeric responses reported as ranges were recoded as the arithmetic average of the end points of the supplied range (e.g., if the patient stated 1–3, a score of 2 was recorded) and used in the subsequent analyses.
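The recoding steps in this paragraph can be illustrated as follows. This is a hedged sketch: the numeric codes 1 to 5 for the five frequency options are an assumption (the text states only that a 5-point Likert-like scale was used), and the function names are hypothetical.

```python
import re

# Assumed numeric codes for the five response options quoted in the text.
FREQUENCY_CODES = {
    "never": 1,
    "less than 25% of the time": 2,
    "50% of the time": 3,
    "greater than 75% of the time": 4,
    "100% of the time": 5,
}

# Matches "low-high" written with a hyphen or an en dash, e.g. "1-3".
_RANGE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*[-\u2013]\s*(\d+(?:\.\d+)?)\s*$")


def recode_frequency(response: str) -> int:
    """Convert a frequency answer to its (assumed) 1-5 code."""
    return FREQUENCY_CODES[response.strip().lower()]


def recode_numeric(response: str) -> float:
    """Return a number, replacing a range by the mean of its end points."""
    match = _RANGE.match(response)
    if match:
        low, high = (float(group) for group in match.groups())
        return (low + high) / 2
    return float(response)
```

With this sketch, a response of "1-3" is recorded as 2.0, matching the example given in the text.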
For the purposes of future practical use and rapid clinical implementation, data for four questions were reformatted to maintain a consistent scale across all questions. The standard pain question (originally on a 0–10 scale) was divided by 2 to remain consistent with the other questions on a 0–5 scale. The question “Do you feel the baby empties your breasts?” was reverse scaled as higher scores were associated with improving rather than worsening symptoms. Responses from the questions “Average time baby is on breast/bottle per feed?” and “How many times did you breastfeed, last 24 hours?” were normalized so that responses in the middle 20% would warrant a score of 1. More extreme observations received higher scores, resulting in the responses within the top 10% and bottom 10% receiving a 5.
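The rescaling rules above can be sketched in a few lines. Several details are assumptions made only for illustration: reverse scoring is shown as mirroring a 1 to 5 scale, and the intermediate scores of 2, 3, and 4 are assigned to successive 10% bands on either side of the middle 20%, which the text does not specify (it fixes only the middle 20% at 1 and the outer 10% tails at 5).

```python
def halve_pain(pain_0_to_10: float) -> float:
    """Rescale the 0-10 pain rating by dividing by 2, as described."""
    return pain_0_to_10 / 2


def reverse_score(score: int, low: int = 1, high: int = 5) -> int:
    """Mirror a score so that higher values mean worse symptoms.

    Assumes the item is coded on a low..high scale (1-5 by default).
    """
    return low + high - score


def score_from_percentile(percentile: float) -> int:
    """Score a feed-duration or feed-count response by its percentile rank.

    percentile is a fraction in [0, 1]. The middle 20% scores 1 and the
    top and bottom 10% score 5 (from the text); the 10%-wide bands for
    scores 2-4 are an assumption of this sketch.
    """
    if not 0.0 <= percentile <= 1.0:
        raise ValueError("percentile must lie between 0 and 1")
    distance = abs(percentile - 0.5)  # distance from the median
    for score, limit in enumerate((0.10, 0.20, 0.30, 0.40), start=1):
        if distance <= limit:
            return score
    return 5
```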
Parental Symptoms
Questions on parental symptoms focused on nipple pain and blanching. As nipple pain is a top reason for early breastfeeding cessation (Morrison et al., 2019), we included a pain scale as well as a frequency scale to best quantify it. All other questions related to parental symptoms assessed frequency of occurrence.
Infant Symptoms and Feeding Behavior
Symptoms resulting from infant anatomy included shallow latch, clicking during feeding, curling the upper lip under while feeding, and leaking from the corners of the mouth. We also asked about infant gassiness and spit-up. All questions asked about the frequency of symptoms.
Infant feeding behavior referred to symptoms present during or directly after feeding. These included latching on and off, choking or gagging, chewing or clamping down on the nipple, fussiness immediately after feeding, and the presence of milk coming out of the nose during feeding. The frequency of symptoms was assessed.
Milk Production
Milk production was addressed through several questions. We asked about the frequency with which the parent felt that the infant emptied the breasts. We also asked how many ounces of milk or formula were given via bottle. Finally, we elicited information about the frequency and duration of feeds.
Data Collection
Paper surveys were administered from January 2016 to September 2019 by healthcare providers at the point of care. The breastfeeding parent was asked to fill out the survey as part of the in-office paperwork for the visit. Informed consent was obtained verbally by the individual administering the questionnaire. Surveys were not used by providers to inform patient care. Data were stored separately from patient medical records and securely on-site following all HIPAA regulations. Surveys were de-identified prior to data analysis to maintain confidentiality.
Data Analysis
Descriptive statistics (counts and percentage of total) were computed for assigned sex and age variables for the infants in both groups, as well as for items and four key concept scales (mean and standard deviation). Missing data were omitted for summative statistics and only complete cases (n = 1529) were used for subsequent analyses. All data analysis was conducted using SAS software (Version 9.4). The Kruskal-Wallis test for the association of sub-score with misclassification was calculated in R (Version 4.1.1), using the stats package.
Tool Validation
Content validity was assessed to determine how the content of the tool compared with other tools. Validation then proceeded with an exploratory factor analysis to identify consistent underlying concepts addressed by the questions and to associate questions with concepts. Questions that did not map to concepts were dropped from the IQ as they were not measuring the content they were intended to measure. Internal consistency was evaluated using Cronbach’s alpha (American Educational Research Association et al., 2014). Known-group analysis was used to evaluate the predictive power of the tool.
Content Validity Analysis
The IQ was compared to other similar tools: LATCH: A breastfeeding charting system and documentation tool (Jensen et al., 1994), Frenectomy Decision Rule for Breastfeeding Infants (FDRBI; Srinivasan et al., 2006), Bristol Breastfeeding Assessment Tool (BBAT; Ingram et al., 2015), and Lingual Frenulum Protocol with Scores for Infants (Lopes De Castro Martinelli et al., 2012) to identify areas of overlap with other validated tools.
Factor Analysis
The IQ contained 28 items (questions) to be evaluated (see Supplemental Table 1 in the online supplemental material). We performed an exploratory factor analysis (EFA) with a varimax rotation on all patient responses, regardless of group. We verified the appropriateness of the sample size (Lawley & Maxwell, 1971) and method (Bartlett’s test of sphericity: 10346.5268, p < .001). Questions were deemed to be related to a construct when their factor loading exceeded 0.35. Items unrelated to any construct were removed from the analysis.
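The item-retention rule can be illustrated with a short sketch that takes an already rotated loading matrix as input. It does not perform the factor analysis itself. Treating the cut-off as inclusive, using absolute loadings, and dropping items that meet the cut-off on more than one factor are assumptions of this example, and the item names and loadings in the test are invented.

```python
LOADING_CUTOFF = 0.35  # cut-off stated in the text


def assign_items(loadings, cutoff=LOADING_CUTOFF):
    """Assign each item to the single factor on which it loads.

    loadings maps an item name to its list of rotated factor loadings.
    Returns (assigned, dropped): assigned maps item -> factor index,
    dropped lists items that load on no factor or on several factors.
    """
    assigned, dropped = {}, []
    for item, row in loadings.items():
        hits = [index for index, value in enumerate(row) if abs(value) >= cutoff]
        if len(hits) == 1:
            assigned[item] = hits[0]
        else:
            dropped.append(item)
    return assigned, dropped
```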
Internal Consistency Analysis
We used Cronbach’s alpha (Cronbach, 1951) to demonstrate the level of consistency in the responses across the entire question set. The sub-scores identified by the factor analysis were also analyzed for internal consistency. A value ≥ 0.6 was considered sufficient.
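For readers unfamiliar with the statistic, Cronbach's alpha for k items is k/(k − 1) multiplied by (1 − the sum of the item variances divided by the variance of the total score). The sketch below computes it with the Python standard library; it is an illustration on complete cases only and is not the SAS procedure used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows is a list of equal-length sequences, one per respondent,
    with no missing values. Sample variances are used throughout.
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def is_sufficient(alpha, minimum=0.6):
    """Apply the sufficiency criterion stated in the text (alpha >= 0.6)."""
    return alpha >= minimum
```

When every item moves in lockstep across respondents, the function returns 1; weaker agreement between items lowers the value.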
Known-Groups Validity Analysis
The IQ’s intended use was to identify the presence and severity of breastfeeding dysfunction. The diagnosis of partial ankyloglossia severe enough to warrant frenectomy identifies a known group with breastfeeding dysfunction. The data were split into a training group and a test group. The survey responses for 70% (n = 1234) of patients with breastfeeding dysfunction and 70% (n = 224) of the patients with no breastfeeding dysfunction were used to develop a logistic regression model to estimate the likelihood of breastfeeding dysfunction based solely on the constructs, controlling for the age of the infant by modeling it as a categorical variable. The 70% were selected from the overall sample rather than in proportion to the group sizes. The accuracy of the model was then tested on the remaining 30% of the patients in each group (known breastfeeding dysfunction n = 503 and no breastfeeding dysfunction n = 124) with the threshold of breastfeeding risk set at 60%. Only cases without missing data were used in the analysis. We assessed overall classification rate, area under the curve, sensitivity, specificity, false positive rate, and false negative rate.
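The thresholding and scoring step can be sketched as below. This is an illustration only: it takes predicted probabilities from an already fitted model as input, uses the conventional definitions of accuracy, sensitivity, and specificity, and does not attempt to reproduce the study's false positive and false negative rate calculations, whose exact definitions are not given here. Function names are hypothetical.

```python
def confusion_counts(probabilities, has_dysfunction, threshold=0.60):
    """Count true/false positives and negatives at a risk threshold.

    A dyad is classified as having breastfeeding dysfunction when its
    predicted probability is at or above the threshold (60% in the text).
    """
    tp = fp = tn = fn = 0
    for probability, truth in zip(probabilities, has_dysfunction):
        predicted = probability >= threshold
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def performance(tp, fp, tn, fn):
    """Conventional summary measures from the confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```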
Results
Characteristics of the Sample
The infant participants ranged in age from 0 to 264 days. Out of the 2085 infant participants, 993 (47.63%) were assigned female and 1152 (55.25%) were assigned male. The details of age and assigned sex by group classification of the infant participants are given in Table 1.
Demographics of Infants Included in the Study by Group (N = 2085).
Note. BF = breastfeeding.
Content Validity
We identified the questions in our questionnaire that overlapped with concepts covered in similar tools. As seen in Table 2, the IQ covered similar questions in the areas of parental pain and infant feeding behavior to currently validated tools. The correspondence in content ranged from one question (BBAT) to six questions (Lingual Frenulum Protocol with Scores for Infants).
Correspondence of Questions to Existing Tools.
Construct Identification
The IQ consisted of 28 initial questions. Factor analysis identified four key constructs that were related to one another and significantly contributed to the presence of breastfeeding dysfunction. The constructs were defined by question-factor loading of at least 0.35 onto one unique factor. The unique factors could be clinically classified into the following constructs: Flow, Latch, Injury, and Post-Feed Behavior. Questions that did not load onto any of the four factors were excluded from the final validated tool. Based on this analysis and general data inconsistencies, 11 questions were removed from the FLIP Tool (see Supplemental Table 2). These included questions related to milk production. The final tool (see Supplemental Table 3) comprised 17 questions grouped into four sub-scores corresponding to the identified constructs. The FLIP tool was scored by summing the scaled score of all questions in a sub-group to generate the sub-score. Sub-scores were then added to generate a total score. The sub-scores had the following potential ranges: Flow: 5–25, Latch: 5–25, Injury: 4–20, and Post-Feed Behavior: 3–15. The total FLIP score could range from 17 to 85. The questions that contributed to each construct, along with the mean and standard deviations of the responses of the total sample and broken down by normal and known breastfeeding dysfunction groups, are presented in Table 3 and Supplemental Table 4.
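The scoring arithmetic can be sketched as follows. The item counts per sub-score (5, 5, 4, and 3) are inferred from the stated ranges under the assumption that each item is scored from 1 to 5; the data layout and function name are hypothetical.

```python
# Items per sub-score, inferred from the stated ranges (e.g., Flow 5-25
# implies five items scored 1-5). This inference is an assumption.
ITEMS_PER_SUBSCORE = {
    "Flow": 5,
    "Latch": 5,
    "Injury": 4,
    "Post-Feed Behavior": 3,
}


def flip_score(responses):
    """Sum scaled item scores into sub-scores and a total FLIP score.

    responses maps each sub-score name to its list of scaled item scores.
    Returns (sub_scores, total).
    """
    sub_scores = {}
    for name, expected in ITEMS_PER_SUBSCORE.items():
        items = responses[name]
        if len(items) != expected:
            raise ValueError(f"{name} expects {expected} items, got {len(items)}")
        sub_scores[name] = sum(items)
    return sub_scores, sum(sub_scores.values())
```

Answering every item with the lowest score gives a total of 17, and answering every item with the highest score gives 85, which matches the range stated above.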
Mean and Standard Deviation of Responses on the FLIP Tool by Group.
Note. SD = standard deviation.
Internal Consistency and Reliability
We evaluated the internal consistency and reliability of the remaining 17 questions as a whole and within constructs by computing the correlation of each question with the total score as well as Cronbach’s alpha. The overall Cronbach’s alpha for raw variables was 0.884. The standardization of variables did not change this. All four constructs had high internal consistency reliability: Flow (Cronbach’s alpha = 0.880), Latch (Cronbach’s alpha = 0.876), Injury (Cronbach’s alpha = 0.880), and Post-Feed Behavior (Cronbach’s alpha = 0.880). Detailed information about the questions in each construct can be found in Table 4 and the details for the sub-scores can be found in Supplemental Table 5.
Internal Consistency and Reliability of Individual Questions.
Known-Groups Comparison
We assessed whether the questionnaire could distinguish between participants with known breastfeeding dysfunction and those who had no evidence of breastfeeding dysfunction. Descriptive statistics for the four sub-scores for the total sample and broken down by group are given in Supplemental Table 4. We built a logistic regression model to predict the probability of breastfeeding dysfunction, accounting for infant age, using 70% (n = 1458) of the data and then tested the classification on the remaining 30% (n = 627). We assessed the model’s performance in classifying dyads as having breastfeeding dysfunction with a probability of 60% or higher. The model correctly classified 88.7% of the participants. Analysis of area under the curve (AUC) for the prediction is shown in Figure 2. Sensitivity was 93.1% and specificity was 64.6%. The false positive rate was 6.5%, and the false negative rate was 36.8%.

Classification Performance of the FLIP (Flow, Latch, Injury and Post-Feed behavior) Tool With a Breastfeeding Dysfunction Predicted Probability of 60% or Greater.
Misclassification was more pronounced among the no breastfeeding dysfunction participants compared to participants with known breastfeeding dysfunction (no breastfeeding dysfunction: n = 70, N = 121; known breastfeeding dysfunction: n = 18, N = 505). While all sub-scores were significantly associated with misclassification (Kruskal-Wallis Chi-Square, Supplemental Table 6), participant misclassification was driven by the Latch sub-score (Figure 3). For the “No breastfeeding dysfunction” dyads, 68.6% (n = 48) of the misclassified participants had Latch sub-scores > 10, whereas 76.4% (n = 39) of the correctly classified participants had scores ≤ 9. For the “Known breastfeeding dysfunction” participants, 100% (n = 18) of misclassified participants had Latch sub-scores ≤ 10, while 74.9% (n = 365) of correctly classified participants had scores ≥ 15.

Sub-Score Distribution by Known Groups and Classification Outcome for Flow, Latch, Injury, and Post-Feed Behavior (FLIP).
The model predicted the risk of breastfeeding dysfunction. However, quick clinical implementation required translating the risk cut-off of 60% into a score on the tool. We evaluated the relationship between predicted risk, FLIP score, and the Latch sub-score since the Latch sub-score was most strongly associated with misclassification (Figure 4). Dyads with a FLIP score > 47 had a predicted risk of greater than 90% of having breastfeeding dysfunction with Latch sub-score above 15. A score < 28 resulted in most participants having a risk of breastfeeding dysfunction under 60%. Scores between 28 and 47 constituted a group of participants where the risk of breastfeeding dysfunction was predicted to be in the 50%–80% range, with Latch sub-scores suggestive of breastfeeding dysfunction.

The Relationship Between Predicted Risk of Breastfeeding Dysfunction, FLIP Score, and Latch Sub-Score.
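The score bands described in this section can be expressed as a simple lookup. The sketch below encodes only the total-score boundaries reported above (below 28, 28 to 47, above 47); the band labels are hypothetical, and the function is an illustration rather than a clinical decision rule, since the text also points to the Latch sub-score when interpreting risk.

```python
def flip_band(total_score: int) -> str:
    """Place a total FLIP score (17-85) into one of three reported bands."""
    if not 17 <= total_score <= 85:
        raise ValueError("total FLIP score must lie between 17 and 85")
    if total_score > 47:
        return "high"          # predicted risk reported as greater than 90%
    if total_score < 28:
        return "low"           # most participants under the 60% risk cut-off
    return "intermediate"      # predicted risk reported in the 50%-80% range
```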
Discussion
Our study described the development, validation, and discriminative performance of a 17-item self-report questionnaire to evaluate the risk of breastfeeding dysfunction in breastfeeding dyads with infants under 1 year old. The risk model accounted for infant age and allowed us to translate risk into a FLIP score without needing separate scales for different age groups. The FLIP tool could risk-stratify and triage breastfeeding dyads into those with low risk of breastfeeding dysfunction, those who should be followed closely to determine if symptoms resolve, and those with sufficiently severe breastfeeding dysfunction to warrant an evaluation of underlying causes. The bias of the misclassification to the “No breastfeeding dysfunction” group and the false positive rate of the FLIP score reflected the likely inclusion in the no breastfeeding dysfunction group of individuals who had breastfeeding dysfunction but did not have the biomarker we used to identify known breastfeeding dysfunction. These individuals generally had Latch sub-scores greater than 10. The false negative rate was driven by individuals who had Latch sub-scores of less than 10. This suggested that in addition to paying attention to the total score, particular attention should be given to the Latch sub-score to further refine risk.
FLIP scores translated a dyad’s breastfeeding experiences into objective data that can facilitate longitudinal communication between families, IBCLCs, and other healthcare providers. In contrast to many other tools used in the context of identifying breastfeeding difficulties, the FLIP score can be filled out by parents prior to being seen by the provider, improving efficiency in-office, especially in high-volume practices where providers have short windows of time for providing care. This also provides opportunities for follow-up that do not require an in-person visit to the clinic. Other tools including the MBA (Mulford, 1992), BBAT (Ingram et al., 2015), IBFAT (Matthews, 1988), and LATCH (Jensen et al., 1994) require the provider to observe one or more feeding sessions which limits their usefulness in high-volume practices and requires in-person follow-ups to assess the usefulness of any suggested interventions.
The FLIP tool differentiates between normal symptoms associated with breastfeeding that reflect the learning curve of both parent and infant and those resulting from treatable causes of breastfeeding dysfunction that could affect the longevity of breastfeeding. The total FLIP score and the sub-scores can determine the severity of breastfeeding dysfunction, and the inclusion of both parental and infant symptomology makes it a comprehensive evaluation of the breastfeeding dyad, which, outside of the score, provides information to the clinician to pinpoint and respond to difficulties. For example, a parent could indicate that they always have pain when feeding, but no other symptoms. This information could guide the care given to that parent to help resolve the underlying issue. The quantitative nature and psychometric profile of the FLIP score and sub-scores make the tool suited to the evaluation of interventions to improve breastfeeding dysfunction, including the timing of intervention. While one candidate for such studies would be the evaluation of the effectiveness of surgical intervention for anatomical issues like partial ankyloglossia, FLIP could also be used to evaluate pedagogy for patient education in breastfeeding or the effectiveness of methods to correct positioning to improve latch.
Certain symptoms, while clinically important, were removed from the present study due to issues with variability in the data or too few responses. Milk production is important to consider in a dyad with breastfeeding difficulties, as a feedback loop exists between low milk production and difficulty in milk extraction. However, milk production is highly individualized and difficult to quantify without 24-hr weight testing, and the perception and the reality of milk supply do not always match (Westerfield et al., 2018). Thus, questions on milk production were removed from the present study.
Another important set of questions that was removed concerned the length of feedings and the time between feeds. In our clinical experience, when taken into consideration with other feeding symptoms, excessively short or long feeds may indicate breastfeeding dysfunction. Excessively long feeds or a short time between feeds may be due to poor milk transfer, whereas excessively short feeds and a long time between feeds may be due to infant fatigue and breast aversion caused by feeding difficulties. Parental milk production and age of the infant also influence the average length of a feed, with older infants taking less time to feed than younger infants (Kent et al., 2013). Supplementation with human-milk substitutes and use of a nipple shield are also confounding factors. This variation led to these questions being removed from the final questionnaire.
Limitations
The limitations of this study include the wording of the scaled values. As evaluated in this study, the wording could confuse a parent who is rating something they feel happens at a frequency in between two choices. This could explain the range responses we encountered, which we then accounted for by taking the arithmetic mean of the range endpoints. This wording might also introduce self-report and social desirability bias. Future iterations of the tool should use wording that indicates distinct categories: never, rarely (no more than 25% of the time), sometimes (25–75% of the time), often (more than 75% of the time), and always. Additionally, while the risk model accounts for patient age, the number of dyads with children older than 6 months was limited. Further studies using FLIP should assess its validity in other populations, with a focus on older infants. Next, the tool was only administered in English in the context of a large pediatrics practice with significant lactation support services. Future studies should evaluate the tool in a wider variety of practice settings. Finally, while all dyads seen during the survey period were offered the survey, only completed surveys were included in the analysis. This may have introduced unintended bias into the participant population, as we do not know whether the demographics of those dyads who chose not to complete the survey differ from those who did complete the survey.
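The handling of range responses described above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the analysis code used in this study: the 0 to 4 numeric coding of the five frequency categories, the name `FREQUENCY_SCALE`, and the function `resolve_response` are hypothetical and chosen only for illustration. It shows a single answer being mapped to its value and an answer spanning two choices being replaced by the arithmetic mean of its endpoints.

```python
import re

# Hypothetical numeric coding of the five frequency categories.
# The values actually assigned by the FLIP questionnaire are not stated
# in this section; 0-4 is an assumption made for illustration only.
FREQUENCY_SCALE = {
    "never": 0,
    "rarely": 1,
    "sometimes": 2,
    "often": 3,
    "always": 4,
}


def resolve_response(raw: str) -> float:
    """Convert one questionnaire answer to a numeric item score.

    A single choice maps directly to its value. A range such as
    "rarely-sometimes" is replaced by the arithmetic mean of the
    two range endpoints.
    """
    # Split on a hyphen, an en dash, or a slash; drop empty fragments.
    parts = [p.strip().lower() for p in re.split(r"[-–/]", raw) if p.strip()]
    if not 1 <= len(parts) <= 2:
        raise ValueError(f"expected one choice or a two-ended range, got {raw!r}")
    # Unknown labels raise KeyError rather than being guessed at.
    values = [FREQUENCY_SCALE[p] for p in parts]
    return sum(values) / len(values)


if __name__ == "__main__":
    for answer in ["sometimes", "rarely-sometimes", "Often – Always"]:
        print(answer, "->", resolve_response(answer))
```

Averaging the endpoints keeps every response on the same numeric scale, at the cost of producing half-point values that the original answer choices did not offer. Distinct, non-overlapping category wording in future versions of the tool would make this step unnecessary.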
Conclusions
We have developed and validated a 17-question self-reported questionnaire, FLIP, that can diagnose breastfeeding dysfunction in breastfeeding dyads. Based on the total score, this tool grouped participants into three categories: no breastfeeding dysfunction, wait and watch, and breastfeeding dysfunction. The four sub-scores, Flow, Latch, Injury, and Post-Feed Behavior, can be used to further refine clinical management. This is particularly true for the Latch sub-score, as it had the strongest association with misclassification in the underlying risk model. Future studies will assess the tool’s usefulness in managing breastfeeding dysfunction.
Supplemental Material
Supplemental material, sj-pdf-1-jhl-10.1177_08903344231209306 for New Validated Tool to Diagnose Breastfeeding Dysfunction by Rajeev Agarwal, Mars Eddis-Finbow, Jodie Tam, Jennifer Broatch and Kimberly J. Bussey in Journal of Human Lactation
Footnotes
Acknowledgements
We thank Drs Michelle Mancenido, Abhay Vats, and Laurie B. Jones for their discussions of the data. We also acknowledge Alexandra Lopez’s help with data collection, Jilma Joy’s help with data entry, Alex Warthen’s help with content validation, Samuel Hanson’s help with data clean-up and validation, and Qidi Xu’s help with data collection, clean-up, and analysis.
Author Contributions
Rajeev Agarwal: Conceptualization; Data curation; Investigation; Supervision; Writing – review & editing.
Mars Eddis-Finbow: Data curation; Investigation; Project administration; Writing – review & editing.
Jodie Tam: Data curation; Writing – original draft; Writing – review & editing.
Jennifer Broatch: Data curation; Formal analysis; Methodology; Visualization; Writing – review & editing.
Kimberly J. Bussey: Data curation; Formal analysis; Visualization; Writing – original draft; Writing – review & editing.
Disclosures and Conflicts of Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
References