Consumer Segmentation for Vitamin Purchases: A Machine Learning Approach

Abstract

This study applies machine learning methods to Nielsen Homescan Panel data to understand the segmentation of U.S. vitamin consumers. Consumer segmentation is crucial for understanding purchasing behavior and optimizing strategies to maximize profits. We employ the RFM (Recency–Frequency–Monetary) framework with demographic attributes, and apply the K-means algorithm to classify households into distinct segments. Considering COVID-19 as an exogenous economic shock, we then compare segment composition and purchasing behavior before and during the pandemic and estimate segment-specific price elasticities. We provide targeted marketing recommendations tied to each segment, offering guidance for businesses to enhance consumer engagement in the post-pandemic dietary supplement market.

Keywords

consumer segmentation, RFM framework, vitamin market, COVID-19

Introduction

Consumer segmentation plays a crucial role in retail marketing and decision-making, enabling firms to target specific consumer groups and allocate resources efficiently. Segmentation is commonly based on a combination of key dimensions: geographic factors, which classify consumers by location; demographic attributes, such as age, gender, income level, education, and household size; behavioral factors, which capture purchasing patterns, brand loyalty, purchase frequency, and product usage; and psychographic factors, which focus on lifestyle choices, values, interests, and personality traits (Tynan & Drayton, 1987). The core idea behind market segmentation is that consumers grouped in the same segment exhibit similar characteristics, which translate into aligned buying behavior and demand patterns. Moreover, firms need to understand how to identify the segments that contain potential new customers in order to expand their market and strengthen their competitive position.

Machine learning (ML) techniques have become increasingly popular in customer segmentation. These techniques aid in analyzing purchase behavior, classifying customers, and designing effective marketing strategies. For example, Qadadeh and Abdallah (2018) applied multiple clustering algorithms to segment customers using an insurance company dataset, enabling businesses to identify customer characteristics and implement more effective marketing strategies. Similarly, Abdulhafedh (2021) combined K-means and hierarchical clustering to categorize customers based on their transaction data, helping credit card companies refine their strategic planning. Swenson et al. (2016) took a different approach by applying both unsupervised and supervised learning to segment the healthcare market. Using patient medical records and demographic data, their study aimed to improve efficiency in value-based healthcare services.

Several studies have focused on applying ML methods in the agriculture and food market. Harding and Lovenheim (2017) used the K-Median algorithm to group products based on nutritional levels and examined how nutrition taxes impact food consumption using Nielsen Homescan data. Varela (2013) explored consumer preferences in the orange juice market. The study applied multiple ML techniques to analyze how consumers responded to different flavors, providing valuable insights for product development. Bargoni et al. (2022) used K-means clustering to study strategies adopted by agri-food businesses in Italy during COVID-19. The key contribution of their study is a qualitative segmentation that provides firms with insights into emerging strategic approaches for maintaining competitive advantage.

A common approach in ML-based consumer segmentation is the use of RFM (Recency–Frequency–Monetary) attributes derived from transaction data (Cheng & Chen, 2009; Khajvand et al., 2011; Rahim et al., 2021; Rungruang et al., 2024). Sarvari et al. (2016) found that integrating RFM attributes with demographic information leads to more accurate customer segmentation. Since the Nielsen Homescan data includes both consumer purchase histories and demographics, we utilized this dataset to construct a market segmentation model for vitamin C purchases with the K-means algorithm, incorporating RFM attributes and demographic factors. We analyze the characteristics of each segment to provide insights for firms developing business strategies. Since our objective is to understand segment evolution under economic shocks, we exploit COVID-19 as an exogenous demand shock and compare purchasing patterns in the pre- and during-pandemic periods. This setting is relevant because demand for vitamin C rose notably during the COVID-19 pandemic, given its well-known immune-boosting properties (Ahmed et al., 2023).

This study contributes to market research practice in three ways. First, we use a machine learning approach to categorize vitamin C consumers into segments, which, to our knowledge, has not been previously studied in the literature for similar products. Second, we link the clustering to segment-specific own-price elasticities, connecting who the segments are to how they respond to price and providing data-driven guidance for targeted pricing and promotion. Third, we examine COVID-19 as an economic shock to evaluate how consumer segment composition differs before and during the disruption. Using the centroid distance metric, we develop a practical tool that managers can adopt to monitor and respond to changes in consumer behavior under other market disturbances.

Empirical Analysis

Methodology

K-means clustering is an unsupervised machine learning method that divides $n$ observations into $k$ distinct, non-overlapping clusters. The algorithm aims to minimize the within-cluster variation, typically using the squared Euclidean distance as a measure of dissimilarity. The process begins with an initial set of $k$ centroids, after which each observation is assigned to the closest cluster based on this similarity measure. The objective function is as follows:

$$\arg\min_{S} \sum_{i=1}^{k} \sum_{x \in S_{i}} \| x - \mu_{i} \|^{2} \tag{1}$$

where $\mu_{i}$ is the centroid of cluster $S_{i}$, $x$ represents the data points in cluster $S_{i}$, and $k$ is the number of clusters.

K-means clustering is applicable when the features are either continuous or binary variables. Therefore, categorical variables are converted into dummy variables and standardized using z-scores. The algorithm follows an iterative two-step process. It starts by randomly distributing households into distinct clusters and computing the centroid of each cluster as the average of the household characteristics within that cluster. Then, each household is reassigned to the cluster with the nearest centroid, and the centroids are updated based on the latest cluster assignments. These two steps repeat until the centroid positions no longer change significantly. At the final stage, each household is assigned to a unique cluster, ensuring exclusivity and preventing overlapping memberships.
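The iterative procedure described above can be illustrated with a minimal, standard-library sketch. It is not the implementation used in this study: the helper names (`zscore`, `kmeans`), the rule for re-seeding an empty cluster, and the toy household data are assumptions made for illustration only.

```python
import random


def zscore(rows):
    """Standardize each feature column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [(sum((v - m) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points, k, max_iter=100, seed=0):
    """K-means starting from a random partition (hypothetical helper)."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]  # random initial assignment
    centroids = []
    for _ in range(max_iter):
        # Update step: each centroid is the mean of its current members.
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Assumption: an empty cluster is re-seeded with a random observation.
            centroids.append(mean_vector(members) if members else list(rng.choice(points)))
        # Assignment step: each household joins the cluster with the nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # no household changed cluster: converged
            break
        labels = new_labels
    wcss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, wcss


# Hypothetical households: [monetary score, frequency score, low-income dummy].
households = [[1, 2, 1], [2, 1, 1], [1, 1, 0], [5, 4, 0], [4, 5, 0], [5, 5, 1]]
labels, centroids, wcss = kmeans(zscore(households), k=2)
print(labels, round(wcss, 3))
```

Because the starting partition is random, different seeds can lead to different local optima; in practice several restarts are usually compared on the within-cluster sum of squares.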

To analyze potential changes in the profile of each market segment before and during COVID-19, we applied K-means separately to the pre-COVID and during-COVID datasets, using March 11, 2020 (the WHO pandemic declaration) as the cut-off date. Each cluster has a centroid, the mean vector of its member observations across all features. Cluster labels are ordered within each period by monetary and then frequency scores; they are not intended to imply one-to-one continuity across periods. We present side-by-side centroid profiles and segment shares for each period. As a descriptive summary, we report an order-based centroid distance:

$$\tilde{d}_{k} = \left\| C_{[k]}^{\text{pre}} - C_{[k]}^{\text{during}} \right\|_{2}, \quad k = 1, \dots, K \tag{2}$$

where $C_{[k]}^{\text{pre}}$ and $C_{[k]}^{\text{during}}$ are the centroids of the $k$th ordered cluster in the pre- and during-COVID periods, respectively.
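As an illustration of Equation (2), the sketch below orders each period's centroids by monetary and then frequency score (ascending, which is an assumption about the ordering direction) and computes the Euclidean distance between centroids of the same rank. The function names and the two-cluster centroid values are hypothetical and are not taken from Table 1.

```python
import math


def order_centroids(centroids, monetary_idx, frequency_idx):
    """Sort centroids by monetary score, then by frequency score (ascending)."""
    return sorted(centroids, key=lambda c: (c[monetary_idx], c[frequency_idx]))


def ordered_centroid_distances(pre, during, monetary_idx=1, frequency_idx=2):
    """Euclidean distance between the k-th ordered centroids of the two periods."""
    pre_sorted = order_centroids(pre, monetary_idx, frequency_idx)
    during_sorted = order_centroids(during, monetary_idx, frequency_idx)
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(p, d)))
            for p, d in zip(pre_sorted, during_sorted)]


# Hypothetical centroids as (recency, monetary, frequency) triples.
pre_centroids = [(1.0, 2.0, 2.0), (4.0, 1.0, 3.0)]
during_centroids = [(4.0, 1.0, 3.0), (1.0, 2.0, 5.0)]
print(ordered_centroid_distances(pre_centroids, during_centroids))
```

The distance compares clusters that share a rank, not clusters that share households, so it should be read as a descriptive summary only.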

To obtain segment-specific price elasticities, we estimate a log–log specification with brand tier fixed effects. For each period $t \in \{\text{Pre}, \text{During}\}$, we estimate:

$$\log Q_{it} = \alpha + \sum_{k=1}^{K} \beta_{k}^{(t)} D_{ik} \log P_{it} + \gamma_{\text{tier}(i)} + \theta_{q(t)} + u_{it} \tag{3}$$

where $Q_{it}$ is quantity (mg) for transaction $i$ in period $t$; $P_{it}$ is the unit price per mg (net spend per purchase divided by total milligrams purchased); $D_{ik}$ equals 1 if the household making transaction $i$ belongs to segment $k$ and 0 otherwise; $\beta_{k}^{(t)}$ is the own-price elasticity for segment $k$ in period $t$ in this log–log specification; $\gamma_{\text{tier}(i)}$ are brand tier fixed effects (national brand versus private label); $\theta_{q(t)}$ are year-quarter fixed effects; and $u_{it}$ is the error term. Standard errors are clustered at the household level.
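To show why the slope in a log–log regression can be read as an elasticity, the following deliberately simplified sketch fits a separate ordinary least squares line of log quantity on log price for each segment. It omits the brand tier and year-quarter fixed effects, the common intercept, and the household-clustered standard errors of Equation (3), so it is not the estimator used in this study; the function names and the synthetic transactions are assumptions.

```python
import math


def loglog_slope(prices, quantities):
    """OLS slope of log(quantity) on log(price), i.e. a constant-elasticity estimate."""
    x = [math.log(p) for p in prices]
    y = [math.log(q) for q in quantities]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


def elasticity_by_segment(transactions):
    """transactions: iterable of (segment, unit_price, quantity) tuples."""
    grouped = {}
    for segment, price, quantity in transactions:
        prices, quantities = grouped.setdefault(segment, ([], []))
        prices.append(price)
        quantities.append(quantity)
    return {seg: loglog_slope(ps, qs) for seg, (ps, qs) in grouped.items()}


# Synthetic transactions generated with known elasticities (-0.8 and -0.5).
unit_prices = [0.01, 0.02, 0.04, 0.08]  # hypothetical dollars per mg
synthetic = [("A", p, 1000 * p ** -0.8) for p in unit_prices]
synthetic += [("B", p, 500 * p ** -0.5) for p in unit_prices]
print(elasticity_by_segment(synthetic))
```

On data generated from an exact constant-elasticity demand curve, the fitted slope recovers the elasticity used to generate it, which is the property the full specification relies on.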

With the estimated elasticities, the revenue change for segment $k$ is

$$\Delta R_{k} \approx R_{k} (1 + \varepsilon_{k}) \frac{\Delta P}{P} \tag{4}$$

where $R_{k}$ is baseline expenditure for segment $k$, $\varepsilon_{k}$ is the segment-specific own-price elasticity, and $\Delta P / P$ is the planned price change (constant-elasticity approximation).
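The approximation in Equation (4) follows from writing segment revenue as price times quantity. A short derivation is sketched below; the elasticity of $-0.8$ in the closing example is purely illustrative and is not an estimate from this study.

```latex
\begin{align*}
R_{k} &= P \, Q_{k}
  && \text{(segment revenue)} \\
\frac{\Delta R_{k}}{R_{k}} &\approx \frac{\Delta P}{P} + \frac{\Delta Q_{k}}{Q_{k}}
  && \text{(first-order approximation)} \\
\frac{\Delta Q_{k}}{Q_{k}} &\approx \varepsilon_{k} \, \frac{\Delta P}{P}
  && \text{(own-price elasticity)} \\
\Delta R_{k} &\approx R_{k} \, (1 + \varepsilon_{k}) \, \frac{\Delta P}{P}
  && \text{(substituting)}
\end{align*}

% Illustrative example: elasticity -0.8 and a 10% price cut.
\[
\varepsilon_{k} = -0.8, \quad \frac{\Delta P}{P} = -0.10
\quad \Longrightarrow \quad
\frac{\Delta R_{k}}{R_{k}} \approx (1 - 0.8)(-0.10) = -0.02
\]
```

Whenever $\varepsilon_{k} > -1$ (inelastic demand), the factor $1 + \varepsilon_{k}$ is positive, so revenue moves in the same direction as price.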

Data

Our study uses Nielsen Consumer Panel data from 2018 to 2022, including consumer purchase histories and demographic information. The dataset comprises a total of 22,025 purchase records, which have then been aggregated for household-level analysis. The demographic variables incorporated in this market segmentation analysis are race, household size, female head education level, and income level. Information on race, household size, and female head education level is provided directly in the dataset. However, Nielsen Consumer Panel data does not provide exact income values for each household, but instead offers income ranges. We used the midpoint of each range as a proxy for the actual income. For instance, for households in the income range between $70,000 and $99,999, we used $84,999.5 as their representative income. To classify households into income groups, we follow the federal poverty level (FPL) guidelines issued annually by the U.S. Department of Health and Human Services (HHS), which adjust thresholds by household size (Creamer et al., 2022). Households with income above 400% of FPL are defined as “high”, those between 200% and 400% as “middle”, and those below 200% as the “low” income group.
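The income classification can be sketched as follows. The guideline figures (`base` and `per_additional_person`) are placeholder values chosen for illustration; the applicable guideline varies by year and location and is not reported here. The treatment of households exactly at 200% or 400% of FPL is also an assumption, since the text does not specify it.

```python
def bracket_midpoint(low, high):
    """Representative income for a reported income range."""
    return (low + high) / 2


def poverty_guideline(household_size, base, per_additional_person):
    """Guideline income for a household of the given size (linear in size)."""
    return base + per_additional_person * (household_size - 1)


def income_group(income, household_size, base, per_additional_person):
    """Classify a household as low, middle, or high income relative to FPL."""
    ratio = income / poverty_guideline(household_size, base, per_additional_person)
    if ratio > 4.0:       # above 400% of FPL
        return "high"
    if ratio >= 2.0:      # assumption: the boundaries fall in the middle group
        return "middle"
    return "low"


# Placeholder guideline parameters (illustrative only).
BASE = 12880
PER_ADDITIONAL_PERSON = 4540

income = bracket_midpoint(70000, 99999)
for size in (2, 4, 6):
    print(size, income_group(income, size, BASE, PER_ADDITIONAL_PERSON))
```

The example shows why household size matters: the same representative income can fall into different groups as the guideline rises with the number of household members.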

In addition, we extracted the recency, frequency, and monetary (RFM) metrics from the customer purchase records. Recency (R) measures the time elapsed since a consumer’s most recent purchase, Frequency (F) is the number of transactions within a specific period, and Monetary (M) is the total amount spent by a consumer. We then normalized the RFM attributes and assigned RFM scores ranging from 1 to 5. A higher recency score indicates a longer time since the last purchase, a higher frequency score reflects more frequent purchasing behavior, and a higher monetary score represents greater spending levels (Hughes, 2005). Variables used in the price elasticity estimation include quantity purchased, unit price paid, brand type, and date of purchase (see Supplementary Material for estimation details).
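One way to turn purchase records into 1 to 5 RFM scores is rank-based quintile scoring, sketched below. The text does not state the exact normalization used, so the quintile rule, the handling of ties, the function names, and the sample purchases are all assumptions. The direction of each score follows the convention above: a longer time since the last purchase gives a higher recency score.

```python
from datetime import date


def rfm_table(purchases, as_of):
    """Aggregate (household_id, purchase_date, amount) records into raw R, F, M values."""
    summary = {}
    for household, day, amount in purchases:
        rec = summary.setdefault(household, {"last": day, "n": 0, "spend": 0.0})
        rec["last"] = max(rec["last"], day)
        rec["n"] += 1
        rec["spend"] += amount
    # Recency in days since the last purchase, frequency as a count, monetary as total spend.
    return {h: ((as_of - r["last"]).days, r["n"], r["spend"]) for h, r in summary.items()}


def quintile_scores(values):
    """Map values to scores 1-5 by rank; larger values receive larger scores.

    Simplification: tied values are separated by their input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = rank * 5 // n + 1
    return scores


# Hypothetical purchase records.
purchases = [
    ("h1", date(2019, 1, 5), 12.0), ("h1", date(2019, 3, 1), 8.0),
    ("h2", date(2019, 2, 1), 30.0),
    ("h3", date(2019, 3, 20), 5.0), ("h3", date(2019, 3, 25), 6.0), ("h3", date(2019, 3, 30), 7.0),
    ("h4", date(2019, 1, 10), 15.0),
    ("h5", date(2019, 2, 20), 22.0),
]
table = rfm_table(purchases, as_of=date(2019, 3, 31))
households = sorted(table)
for name, idx in (("recency", 0), ("frequency", 1), ("monetary", 2)):
    print(name, dict(zip(households, quintile_scores([table[h][idx] for h in households]))))
```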

Empirical Results

To determine the optimal number of clusters applied in the K-means algorithm, the elbow method was employed. This technique is broadly used in clustering analysis, including consumer segmentation (Kansal et al., 2018; Syakur et al., 2018). It aims to identify the point at which adding more clusters no longer significantly reduces the within-cluster sum of squares (WCSS). As shown in Figure 1, the elbow plots suggest that three or four clusters are optimal: the WCSS declines sharply up to that point and flattens thereafter. To ensure that the clusters are comparable and meaningful for both periods, we selected four clusters as the final segmentation.
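The elbow heuristic can be illustrated by computing the WCSS for increasing numbers of clusters on a small example. This is a sketch, not the procedure behind Figure 1: it uses a deterministic farthest-point initialization (an assumption made so that the example is reproducible), hypothetical function names, and toy data with three well-separated groups.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of points, returned as a tuple."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans_wcss(points, k, max_iter=100):
    """Within-cluster sum of squares after K-means with farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point farthest from the centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean_point(cl) if cl else centroids[j] for j, cl in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def elbow_curve(points, k_max):
    """WCSS for K = 1, ..., k_max."""
    return [kmeans_wcss(points, k) for k in range(1, k_max + 1)]


# Toy data: three tight groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
for k, wcss in enumerate(elbow_curve(toy, 5), start=1):
    print(k, round(wcss, 2))
```

On data like this the WCSS drops steeply until the number of clusters matches the number of groups and changes little afterwards, which is the pattern the elbow plot is read for.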

Figure 1.

Elbow Plots (WCSS vs. $K$) for the Two Periods

Table 1 presents the RFM and demographic characteristics for each cluster before and during COVID-19. Pre-COVID, Cluster 1 showed high recency, low monetary, and moderate frequency scores. During COVID, K-means captured consumers with slightly more frequent purchases, though still relatively low spending. The profile during COVID also shows a greater representation of low-income households and female heads with a high school education. Taken together, Cluster 1 remains a low-spending, relatively infrequent segment across periods. Cluster 2 was broadly stable behaviorally and demographically. The monetary score increased only slightly and the frequency remained low, with a persistent concentration of White and Black households and a high share of low-income consumers. The rise in recency may indicate this group consisted of newer or one-time buyers, potentially motivated by external shocks, such as supply disruptions. Therefore, this group likely represents occasional or reactive shoppers.

Table 1.

Description of Demographic and RFM Variables by Cluster Pre- and During-COVID

Period	Description	Cluster 1	Cluster 2	Cluster 3	Cluster 4
Pre-COVID	Recency score	4.19	1.43	3.80	2.88
	Monetary score	1.65	2.12	3.89	4.27
	Frequency score	2.49	2.42	1.80	4.25
	Household size	2.49	2.48	2.39	2.36
	White	0.77	0.83	0.82	0.84
	Black	0.13	0.09	0.09	0.08
	Asian	0.04	0.03	0.04	0.04
	Other	0.06	0.05	0.06	0.05
	Low income	0.42	0.42	0.43	0.40
	Middle income	0.32	0.28	0.28	0.33
	High income	0.26	0.30	0.29	0.30
	Female head (Under college)	0.28	0.31	0.30	0.31
	Female head (college)	0.73	0.69	0.70	0.69
During-COVID	Recency score	1.52	4.02	3.97	1.49
	Monetary score	2.03	2.26	4.20	4.28
	Frequency score	2.28	2.09	4.17	4.36
	Household size	2.43	2.36	2.30	2.34
	White	0.79	0.82	0.79	0.82
	Black	0.13	0.10	0.10	0.09
	Asian	0.03	0.04	0.06	0.03
	Other	0.04	0.04	0.05	0.06
	Low income	0.41	0.46	0.43	0.43
	Middle income	0.34	0.27	0.30	0.33
	High income	0.25	0.27	0.27	0.24
	Female head (Under college)	0.36	0.29	0.32	0.32
	Female head (college)	0.65	0.71	0.68	0.68

Households per segment: Pre-COVID $[434, 639, 906, 903]$; During-COVID $[530, 821, 460, 819]$.

Cluster 3 underwent the most notable shift in profile. Pre-COVID, it combined very low frequency with a relatively high monetary score. During COVID, Cluster 3 appears to capture a new group of high-frequency, high-spending shoppers, whose characteristics may reflect more engaged or health-motivated supplement buying during the pandemic. Cluster 4 remained the most stable in its behavioral profile: frequency remained high and the monetary score was consistently elevated. Pre-COVID, it included a high proportion of White households and college-educated female heads and the lowest representation of low-income households among the clusters. During COVID, households assigned to Cluster 4 exhibited similarly high spending and frequency, representing a behaviorally consistent, high-value consumer segment. Hence, the segment likely comprises routine or brand-loyal supplement shoppers whose purchasing patterns were less affected by external disruptions.

The centroid difference reflects how the average profile of each segment has changed between the pre- and during-COVID periods. Clustering is estimated separately by period, so each solution reflects contemporaneous purchasing patterns. This measure is descriptive and does not imply one-to-one tracking of households or segments. It summarizes how far apart the cluster centers are in the RFM and demographic feature space. Cluster 3 exhibits the largest centroid difference (2.74), indicating a substantial transformation in the characteristics of the households assigned to this cluster during COVID, consistent with the higher purchase frequency and spending reported earlier. Cluster 4 also shows a notable difference (1.89), reflecting moderate changes in purchasing intensity and a slightly broader demographic composition. By contrast, Clusters 1 and 2 display relatively small centroid differences (0.66 and 0.48), aligning with their more stable RFM and demographic profiles. Overall, the more substantial profile changes during the pandemic are concentrated in the higher spending segments.

We estimated price sensitivities separately by period, with labels ordered within each period by monetary and then frequency scores. Elasticities come from a log–log specification with brand tier (national vs. private label) and year-quarter fixed effects, which helps mitigate bias arising from shifts in assortment composition. While we considered including finer brand-level controls, sample size constraints at the segment level limited estimation feasibility. As shown in Table 2, demand is inelastic across all segments in both the pre- and during-COVID periods. In the during-COVID solution, the lower-engagement segments (Clusters 1–2) display more negative elasticities, consistent with greater budget orientation under elevated economic uncertainty. By contrast, the higher-engagement segment (Cluster 3) exhibits a modestly less negative elasticity alongside higher purchase frequency, consistent with more routinized or health-motivated buying, and Cluster 4 remains comparatively stable with low price sensitivity, consistent with a loyal, high-value segment whose demand is less affected by price changes. Taken together, these patterns suggest that uniform discounts are unlikely to raise revenue; if price changes are used, they should be targeted to segments with the largest absolute elasticities. These findings align with prior literature showing that occasional shoppers tend to be more price-sensitive (Gazquez-Abad & Sanchez-Perez, 2009), whereas more engaged, value-oriented, or health-conscious consumers exhibit lower price sensitivity (Prasad et al., 2008).

Table 2.

Estimated Price Elasticities by Cluster: Pre-COVID vs. During-COVID

Cluster label	Time period	Price Elasticity (SE)
Cluster 1	Pre-COVID	$-0.713$ (0.024)
Cluster 2	Pre-COVID	$-0.753$ (0.025)
Cluster 3	Pre-COVID	$-0.827$ (0.025)
Cluster 4	Pre-COVID	$-0.794$ (0.025)
Cluster 1	During-COVID	$-0.826$ (0.032)
Cluster 2	During-COVID	$-0.845$ (0.033)
Cluster 3	During-COVID	$-0.770$ (0.034)
Cluster 4	During-COVID	$-0.787$ (0.032)

Notes. Clusters 1–4 are derived separately across periods; labels do not imply direct equivalence.

Under the constant-elasticity (log–log) specification, a uniform 10% reduction in list price lowers revenue in every segment, whereas a uniform 5% increase raises it. Because the estimated elasticities are inelastic but fairly close to $-1$, the implied segment-level effects are small: revenue falls by roughly 1.6% to 2.9% under the 10% cut and rises by about 0.8% to 1.4% under the 5% increase. From a managerial perspective, inelastic demand argues against across-the-board discounting. If price changes are pursued, modest list-price increases are expected to raise revenue, and any promotional activity should be short-lived and tightly targeted to the segment with the largest absolute elasticity within the relevant period.
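The ranges quoted above can be checked by applying the constant-elasticity approximation, percent change in revenue ≈ (1 + elasticity) × percent change in price, to the point estimates in Table 2. The sketch below does only that arithmetic; the function and variable names are illustrative, and standard errors are ignored.

```python
# Point estimates from Table 2, in cluster order 1-4.
ELASTICITIES = {
    "Pre-COVID": [-0.713, -0.753, -0.827, -0.794],
    "During-COVID": [-0.826, -0.845, -0.770, -0.787],
}


def revenue_change_pct(elasticity, price_change):
    """Approximate percent change in revenue for a proportional price change."""
    return (1 + elasticity) * price_change * 100


def revenue_range(price_change):
    """Smallest and largest segment-level revenue change across both periods."""
    changes = [revenue_change_pct(e, price_change)
               for values in ELASTICITIES.values() for e in values]
    return min(changes), max(changes)


for label, move in (("10% price cut", -0.10), ("5% price increase", 0.05)):
    low, high = revenue_range(move)
    print(f"{label}: {low:.2f}% to {high:.2f}%")
```

Segments whose elasticity is closest to $-1$ show the smallest revenue response, which is why the effects of uniform price moves are modest in every segment.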

Discussion and Conclusion

This study applies K-means clustering to segment U.S. vitamin C buyers. The analysis reveals distinct shopping patterns, with clear differences in price sensitivity across four segments. In addition, we show that economic disruption (e.g., the COVID-19 pandemic) is associated with within-period shifts in expenditure composition and modest changes in price responsiveness. We link behavioral clustering to segment-specific own-price elasticities by period and translate those elasticities into expected revenue changes under common price moves, so recommendations are anchored in financial relevance. Because market conditions differ across periods, segments and elasticities are estimated separately for the pre- and during-COVID windows, providing a shock-aware template that can be reused for other disruptions without assuming one-to-one segment continuity. The results offer actionable guidance for firms in the dietary supplement market.

Having established the segment structure and price responsiveness, we translate these findings into managerial actions. For lower-engagement or occasional segments (limited spending; infrequent purchases), the objective is reactivation and habit formation. Effective tactics to stimulate frequency include win-back outreach based on loyalty histories, targeted coupons, limited-time offers, and bundle or threshold promotions (e.g., buy-one-get-one). Where feasible, memberships or subscriptions (auto-replenishment, reorder reminders) can nudge repeat purchasing and reduce reliance on deep discounts (Jayaraman et al., 2013; Meyer-Waarden, 2008; Peker et al., 2017).

For higher-value segments (those concentrating spend within a period), the priority is to protect margin while reinforcing perceived value. Pair price decisions with value drivers beyond price, such as reliable assortment, service benefits, and clear quality or efficacy cues (Grewal et al., 2011; Steenkamp et al., 2010). Personalized recognition can help sustain frequency without conditioning the segment on deep discounts (Peker et al., 2017). Among habitual, health-oriented buyers, premium presentation, product innovation, and relevant wellness content support willingness to pay (Camanzi et al., 2024; Valls et al., 2012). Given inelastic demand across all segments, broad discounts are not expected to raise revenue; if prices are adjusted, modest list-price increases or narrowly targeted, limited-duration promotions should be directed to segments with larger absolute elasticities and meaningful expenditure shares within each period.

This study also addresses the lack of structured consumer segmentation evidence in the vitamin category by combining RFM- and demographic-based clustering with segment-specific elasticities to inform pricing and promotion. A limitation of this study is temporal coverage: only Nielsen Homescan data through 2022 had been integrated when the analysis was finalized; consequently, the study captures behavior during and immediately after the pandemic but not longer-run post-pandemic dynamics. Additionally, while we account for brand-type differences using brand-tier fixed effects, our ability to estimate more granular brand-level effects was limited by sample size within each segment. Future work could track patterns beyond 2022, incorporate larger panel subsets, examine brand loyalty and segment migration, and consider alternative weighting schemes (e.g., WRFM) when recency is less informative during shocks (Vaidya & Kumar, 2006).

Footnotes

ORCID iDs

Kuo Liu

Yu Yvette Zhang

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

Abdulhafedh (2021). Incorporating K-means, hierarchical clustering and PCA in customer segmentation. Journal of City and Development, 3(1), 12–30.

Ahmed, Hossain, Chakrabortty, Arafat, K. I., Hosen, M. J., & Khan, M. M. R. (2023). Impacts of vitamin C and D supplement on COVID-19 treatment: Possible patho-mechanisms and evidence from different countries. The Egyptian Journal of Bronchology, 17(1), 13. https://doi.org/10.1186/s43168-023-00186-6

Bargoni, Bertoldi, Giachino, & Santoro (2022). Competitive strategies in the agri-food industry in Italy during the COVID-19 pandemic: An application of K-means cluster analysis. British Food Journal, 124(12), 4782–4799. https://doi.org/10.1108/bfj-07-2021-0738

Camanzi, Ahmadi, K. S., Prosperi, Collewet, El Khechen, Michailidis, A. C., Charatsari, Lioutas, E. D., De Rosa, & Francescone (2024). Value seeking, health-conscious or sustainability-concerned? Profiling fruit and vegetable consumers in Euro-Mediterranean countries. British Food Journal, 126(13), 303–331. https://doi.org/10.1108/bfj-12-2023-1151

Cheng, C. H., & Chen, Y. S. (2009). Classifying the segmentation of customer value via RFM model and RS theory. Expert Systems with Applications, 36(3), 4176–4184. https://doi.org/10.1016/j.eswa.2008.04.003

Creamer, Shrider, E. A., Burns, & Chen (2022). Poverty in the United States: 2021. US Census Bureau.

Gazquez-Abad, J. C., & Sanchez-Perez (2009). Characterising the deal-proneness of consumers by analysis of price sensitivity and brand loyalty: An analysis in the retail environment. International Review of Retail Distribution & Consumer Research, 19(1), 1–28. https://doi.org/10.1080/09593960902780922

Grewal, Ailawadi, K. L., Gauri, Hall, Kopalle, & Robertson, J. R. (2011). Innovations in retail pricing and promotions. Journal of Retailing, 87(1), S43–S52. https://doi.org/10.1016/j.jretai.2011.04.008

Harding & Lovenheim (2017). The effect of prices on nutrition: Comparing the impact of product- and nutrient-specific taxes. Journal of Health Economics, 53(1), 53–71. https://doi.org/10.1016/j.jhealeco.2017.02.003

Hughes, A. M. (2005). Strategic database marketing. McGraw-Hill Pub. Co.

Jayaraman, Iranmanesh, Kaur, M. D., & Haron (2013). Consumer reflections on “buy one get one free” (BOGO) promotion scheme—an empirical study in Malaysia. Research Journal of Applied Sciences, Engineering and Technology, 5(9), 2740–2747. https://doi.org/10.19026/rjaset.5.4800

Kansal, Bahuguna, Singh, & Choudhury (2018). Customer segmentation using K-means clustering. In 2018 International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS) (pp. 135–139). IEEE.

Khajvand, Zolfaghar, Ashoori, & Alizadeh (2011). Estimating customer lifetime value based on RFM analysis of customer purchase behavior: Case study. Procedia Computer Science, 3(1), 57–63. https://doi.org/10.1016/j.procs.2010.12.011

Meyer-Waarden (2008). The influence of loyalty programme membership on customer purchase behaviour. European Journal of Marketing, 42(1/2), 87–114.

Peker, Kocyigit, & Eren, P. E. (2017). LRFMP model for customer segmentation in the grocery retail industry: A case study. Marketing Intelligence & Planning, 35(4), 544–559. https://doi.org/10.1108/mip-11-2016-0210

Prasad, Strijnev, & Zhang (2008). What can grocery basket data tell us about health consciousness? International Journal of Research in Marketing, 25(4), 301–309. https://doi.org/10.1016/j.ijresmar.2008.05.001

Qadadeh & Abdallah (2018). Customers segmentation in the insurance company (TIC) dataset. Procedia Computer Science, 144(1), 277–290. https://doi.org/10.1016/j.procs.2018.10.529

Rahim, M. A., Mushafiq, Khan, & Arain, Z. A. (2021). RFM-based repurchase behavior for customer classification and segmentation. Journal of Retailing and Consumer Services, 61, Article 102566. https://doi.org/10.1016/j.jretconser.2021.102566

Rungruang, Riyapan, Intarasit, Chuarkham, & Muangprathub (2024). RFM model customer segmentation based on hierarchical approach using FCA. Expert Systems with Applications, 237, Article 121449. https://doi.org/10.1016/j.eswa.2023.121449

Sarvari, P. A., Ustundag, & Takci (2016). Performance evaluation of different customer segmentation approaches based on RFM and demographics analysis. Kybernetes, 45(7), 1129–1157. https://doi.org/10.1108/k-07-2015-0180

Steenkamp, J. B. E., Van Heerde, H. J., & Geyskens (2010). What makes consumers willing to pay a price premium for national brands over private labels? Journal of Marketing Research, 47(6), 1011–1024. https://doi.org/10.1509/jmkr.47.6.1011

Swenson, E. R., Bastian, N. D., & Nembhard, H. B. (2016). Data analytics in health promotion: Health market segmentation and classification of total joint replacement surgery patients. Expert Systems with Applications, 60(1), 118–129. https://doi.org/10.1016/j.eswa.2016.05.006

Syakur, M. A., Khotimah, B. K., Rochman, & Satoto, B. D. (2018). Integration K-means clustering method and elbow method for identification of the best customer profile cluster. IOP Conference Series: Materials Science and Engineering, 336, Article 012017. IOP Publishing. https://doi.org/10.1088/1757-899x/336/1/012017

Tynan, A. C., & Drayton (1987). Market segmentation. Journal of Marketing Management, 2(3), 301–335. https://doi.org/10.1080/0267257x.1987.9964020

Vaidya, O. S., & Kumar (2006). Analytic hierarchy process: An overview of applications. European Journal of Operational Research, 169(1), 1–29. https://doi.org/10.1016/j.ejor.2004.04.028

Valls, J. F., Sureda, & Andrade, M. J. (2012). Consumers and increasing price sensibility. Innovative Marketing, 8(1), 52–63.

Varela (2013). Application of multivariate statistical methods during new product development: Case study—Application of principal component analysis and hierarchical cluster analysis on consumer liking data of orange juices. In Mathematical and Statistical Methods in Food Science and Technology (187). Wiley. https://doi.org/10.1002/9781118434635.ch11