Abstract
Objective
Telemedicine platforms played a crucial role during the COVID-19 pandemic, alleviating issues related to the shortage and unequal distribution of healthcare resources. The purpose of this study is to identify key factors affecting the service quality of telemedicine platforms in China, with the dual objectives of advancing patient wellbeing and informing evidence-based service innovations for industry stakeholders.
Methods
To quantitatively assess the impact of these key factors on health and wellbeing from the perspective of healthcare consumers, a total of 25,499 valid online reviews were collected from telemedicine platforms. To establish a service quality evaluation framework, this study proposes a novel approach that combines the Servqual quality assessment model with a CNN-BiLSTM deep learning model enhanced by an attention mechanism.
Results
Analysis of the full sample shows that healthcare consumers are most concerned about the quality of services provided by telemedicine platforms, with the most important being the professional competence of doctors, a critical factor for promoting consumer health and wellbeing. The proposed hybrid deep learning approach demonstrates superior performance in sentiment classification accuracy, outperforming conventional methods by 11.11 percentage points. This methodological innovation enables more precise identification of consumer sentiment patterns across service dimensions.
Conclusion
The novel quality assessment framework introduced here provides actionable insights for advancing telemedicine platforms, driving progress toward precision healthcare and consumer-centric wellbeing. Furthermore, it enables healthcare consumers to select telemedicine services aligned with their personalized needs.
Introduction
With the rapid development of the internet and the increasing demand for medical services, internet-based healthcare services and health communities are becoming more prevalent, disrupting traditional methods of accessing health information and disease treatment.1,2 Many countries have elevated telemedicine to a national strategy, and China's development in this area began in the mid-1980s, gradually establishing a national telemedicine network. By the end of 2001, there were 300 online hospitals in the country. Since 2010, telemedicine has seen widespread application in eastern cities such as Beijing, while western regions like Guizhou have also begun to focus on this field. Finally, China successfully achieved telemedicine coverage for 13,000 medical institutions and all impoverished counties, significantly improving the accessibility of healthcare services. 3 Bolstered by emerging technologies such as 5G, artificial intelligence, and big data, China's telemedicine services have demonstrated capabilities that are on par with, and even surpass, internationally advanced standards. Focusing on the developmental challenges of China's telemedicine platforms can provide valuable insights for other countries in deploying telemedicine.
Especially during the COVID-19 pandemic, residents increasingly accessed health information through the internet and mobile applications rather than relying solely on hospital visits.4–6 On the one hand, online consultations on telemedicine platforms reduce face-to-face contact between doctors and healthcare consumers, helping to limit the spread of infectious diseases. 7 On the other hand, the cross-regional service characteristics of telemedicine platforms broaden the service population, enabling healthcare consumers to contact doctors anytime and anywhere to consult on health issues and disease treatments.8–11 Despite these significant contributions, issues including user privacy breaches, physician diagnostic errors, and cumbersome platform operations not only affect user loyalty and stickiness but also attract widespread social attention.12–15 Assessing telemedicine platforms based on user reviews and promptly improving services and functionalities remain major challenges, yet these efforts are necessary to evaluate quality indicators and achieve long-term sustainable development of the platforms. 16
Service quality is the overall assessment of the superiority of a service by users. Due to the intangibility of services, regulating service quality is more challenging than regulating tangible products.17,18 Therefore, with the rapid development of telemedicine platforms, the academic community is increasingly focusing on service quality issues.
Previous studies have largely focused on subjective social survey methods such as questionnaires and grounded theory. However, these subjective social survey methodologies are contingent upon the voluntary participation of respondents, which may result in inadequate sample representativeness. Respondents might obscure their genuine sentiments to align with societal norms or the anticipations of others, thereby undermining the objectivity of the data. Furthermore, the static nature of questionnaire design hampers the dynamic capture of evolving user needs, and the protracted data collection period impedes the real-time identification of service quality issues. As the number of online comments about specific service platforms increases and becomes more accessible, research has shifted focus. Studies based on online comments and other objective approaches are growing.19,20 For example, Liu et al. 21 investigated the relationship between the tone of doctors’ voices during online health consultations and healthcare consumer satisfaction, finding that healthcare consumer satisfaction was influenced by the speed of the doctor's speech during consultation. Currently, most studies on telemedicine platforms rarely develop a multidimensional model to analyze service quality from aspects such as needs fulfillment, security, responsiveness, and user interface. Although the focus of related research varies, overall, it tends to limit practicality and effectiveness. Therefore, it is necessary to further study the service quality of telemedicine platforms through online user reviews, to provide references for optimizing the service quality and sustainable development of these platforms.
To quantitatively assess the impact of these key factors from the perspective of healthcare consumers, we collected 25,499 healthcare consumer reviews from the telemedicine platforms “Good Doctor Online” (GDO) and “Doctor Dingxiang” (DDX). We analyzed the collected comments with a topic generation model and deep learning-based sentiment analysis, comparing the impact of different dimensions on the quality of telemedicine platforms from a real healthcare consumer perspective. Moreover, through the topic relevance matrix 22 and the Servqual quality assessment model,23,24 we sorted out the different aspects that healthcare consumers focus on and constructed a comprehensive and reasonable quality assessment index system for telemedicine platforms. We propose a hybrid CNN-BiLSTM-MHA model, which integrates the local feature extraction capability of convolutional neural networks, the temporal modeling strength of bidirectional long short-term memory networks, and the multilevel semantic focus enabled by multihead attention mechanisms. This model enhances the accuracy of sentiment classification, achieving an improvement of approximately 11% compared to single models, which in turn makes the quality evaluation method presented in this study more reliable. The specific research questions addressed in this paper are as follows:
RQ1: How can the factors that affect the quality of telemedicine platforms be identified from healthcare consumer comments?
RQ2: How can topics generated from healthcare consumer reviews be aggregated using the Servqual model?
RQ3: How can the service quality of telemedicine platforms be evaluated through sentiment analysis?
This article covers several parts, as follows: the second section describes the literature review. The third section details the proposed methodology for quality assessment of telemedicine platforms. The fourth section applies the new method to an actual case analysis, evaluating its effectiveness and practicality. Lastly, the discussions and conclusions are presented in the fifth and sixth sections, respectively.
Literature review
Role and benefits of telemedicine platforms in modern healthcare
The telemedicine platforms are extensions of traditional doctor–patient relationships on the internet, enabling patients to consult with healthcare professionals anytime and anywhere regarding health issues and disease treatments. 25 Healthcare consumer consultations conducted through telemedicine platforms represent an innovative approach to meeting the growing demand for medical services, allowing users to overcome time and geographic constraints and providing more options for both doctors and patients.26,27 The primary benefits include reducing infection risks and improving efficiency.
In terms of infection control, telemedicine platforms avoid face-to-face contact between healthcare providers and healthcare consumers, thereby lowering the risk of transmitting infectious diseases such as COVID-19, which can easily spread among close contacts.28,29
On the other hand, telemedicine platforms significantly enhance consultation efficiency. 30 For example, during the COVID-19 pandemic, shortages of medical equipment and overwhelmed hospitals were major issues. However, telemedicine platforms not only saved substantial costs and time spent on lockdown measures but also enabled excellent physicians to diagnose diseases quickly and conveniently.13,31
Importance of online reviews in evaluating telemedicine platforms
Online reviews are a valuable resource for revealing user opinions on service quality, and telemedicine platform-related online reviews can effectively assist other consumers in making choices. 32 The quantity, quality, content type, and even the length of review content influence users. 33 These online reviews cover aspects ranging from healthcare service quality, doctor professionalism, and staff attitude to platform usability, and some users even engage in conversations through reviews. Classifying and conducting sentiment analysis on these online reviews can help us effectively understand the focal points and issues of user concern. 34 Sudirjo et al. 35 analyzed online customer reviews on Tokopedia to identify factors influencing purchase decisions on Tokopedia online stores. Li et al. 36 used a machine learning-based conditional survival forest model to categorize online restaurant reviews from two popular tourist destinations into five categories: location, taste, price, service, and ambiance, to predict restaurant survival rates and determine which online review features are the best indicators of restaurant survival.
Materials and methods
The methodological framework of this study is illustrated in Figure 1 and consists of the following steps. First, the collected healthcare consumer comments are introduced and preprocessed. Subsequently, Term Frequency-Inverse Document Frequency (TF-IDF) 37 and Latent Dirichlet Allocation (LDA) topic models 38 are used to extract the service quality indicators of telemedicine platforms that healthcare consumers are concerned about. After the sentiment tendency of the healthcare consumer comments is determined through sentiment analysis techniques, the quality of the telemedicine platforms is assessed based on the extracted service indicators.

Figure 1. Overview of the service quality assessment research, comprising three main modules: (1) preliminary work and LDA topic clustering, (2) deep learning-based sentiment analysis of comments, and (3) service quality evaluation based on comment sentiment analysis. LDA: Latent Dirichlet Allocation.
Data collection and preprocessing
This article collects user information and comments from two well-established online health communities in China: GDO and DDX. According to the 2023 China Internet Healthcare Development Report, GDO and DDX rank among the top three platforms in terms of user activity on China's telemedicine services, collectively covering over 80% of the online consultation user base and thereby demonstrating broad representativeness. Specifically, GDO primarily offers comprehensive diagnostic and treatment services across multiple departments, including internal medicine, surgery, and others, while DDX focuses on health education and chronic disease management. The combination of these two platforms effectively encompasses the core service modalities of telemedicine.
The data includes information such as healthcare consumer name, healthcare consumer comments and time of healthcare consumer comments. This study strictly adhered to data privacy protection protocols. Direct identifiers, including user aliases and contact information, were systematically removed from the original user-generated comment dataset. Furthermore, potential quasi-identifiers that might enable user identification through data linkage or inference were anonymized using generalization techniques. All data are securely encrypted and stored on controlled servers, with access restricted to authorized members of the research team. Partial data obtained from four different channels are presented in Table 1.
Table 1. Composition of review datasets from two telemedicine platforms.
DDX: Doctor Dingxiang; GDO: Good Doctor Online.
Through manual classification, each comment is categorized based on its sentiment orientation, labeled as positive, neutral, or negative. In order to ensure the accuracy of manual tagging, the study in this article adopts multiperson joint tagging to make multiple judgments on controversial comments, retaining comments with the same tagging by three people, and finally obtaining more than 10,000 comments from the two platforms as the training corpus (Table 2).
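The agreement filter described above can be sketched as follows. This is a minimal illustration, assuming each comment carries a list of annotator labels; the function name `keep_unanimous` and the sample comments are hypothetical and are not taken from the study's dataset.

```python
from typing import Dict, List, Tuple

LABELS = {"positive", "neutral", "negative"}


def keep_unanimous(annotations: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (comment, label) pairs on which all annotators agree."""
    corpus = []
    for comment, labels in annotations.items():
        # Skip comments with no labels or labels outside the three classes.
        if not labels or any(label not in LABELS for label in labels):
            continue
        # Keep the comment only when every annotator chose the same label.
        if len(set(labels)) == 1:
            corpus.append((comment, labels[0]))
    return corpus


# Hypothetical example: three annotators per comment.
sample = {
    "The doctor explained everything clearly.": ["positive", "positive", "positive"],
    "Reply was fast but the advice was vague.": ["neutral", "negative", "neutral"],
    "The app crashed during payment.": ["negative", "negative", "negative"],
}
training_corpus = keep_unanimous(sample)
```

Comments with any disagreement are dropped rather than resolved by majority vote, which trades corpus size for label reliability.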
Table 2. Examples of manual tagging of comments for emotional tendencies.
Capture topics of Servqual service quality assessment model
To obtain the service quality topics that healthcare consumers are concerned about, which serve as the feature dimensions of telemedicine platforms, the study mines topics from healthcare consumer comments using the Latent Dirichlet Allocation (LDA) topic model. Because medical information is highly specialized, the LDA topic clusters are further classified manually using general medical knowledge.
After tokenizing and otherwise preprocessing the review data, this study used the doc2bow function from the Gensim library to convert the review texts into a bag-of-words representation for text vectorization. Subsequently, the perplexity (a metric that measures the model's ability to predict unseen data) of models with varying topic counts was evaluated to determine the optimal number of topics. Finally, three medical professionals independently reviewed and validated the generated topics, confirming that the optimal number of topics for our model is 14.
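To make the vectorization and model-selection steps concrete, the sketch below shows what a bag-of-words conversion produces and how perplexity is derived from a log-likelihood. It is a pure-Python stand-in written for illustration, not the Gensim pipeline used in the study: the helper names `build_vocab`, `to_bow`, and `perplexity` are hypothetical, and the token ids it assigns will generally differ from those assigned by Gensim.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def build_vocab(docs: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct token, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for doc in docs:
        for token in doc:
            vocab.setdefault(token, len(vocab))
    return vocab


def to_bow(doc: List[str], vocab: Dict[str, int]) -> List[Tuple[int, int]]:
    """Convert a tokenized review into sorted (token_id, count) pairs."""
    counts = Counter(vocab[t] for t in doc if t in vocab)
    return sorted(counts.items())


def perplexity(total_log_likelihood: float, n_tokens: int) -> float:
    """Perplexity = exp(-log-likelihood per token); lower is better."""
    return math.exp(-total_log_likelihood / n_tokens)


# Hypothetical tokenized reviews.
docs = [["doctor", "reply", "fast"], ["doctor", "patient", "doctor"]]
vocab = build_vocab(docs)
bows = [to_bow(d, vocab) for d in docs]
```

In practice, perplexity would be computed on held-out reviews for each candidate topic count, and the count at which it stops improving is then checked against expert judgment, as the study does.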
By explicitly setting the number of topics to be identified and implementing the LDA model, this study conducted a cluster analysis of healthcare consumer comments. Upon completing the clustering process and conducting an exhaustive analysis of the elicited topics, we observed intersections among some topics.
As shown in Table 3, to enhance the discriminability between topics, the identified topics were further refined and appropriately named with the assistance of experts in the field of telemedicine. This study, by comprehensively considering the intertopic correlation and the Servqual service quality assessment model, endeavors to reconstruct and refine the core indicator system for evaluating the service quality of telemedicine platforms. Given the fundamental differences between telemedicine and conventional services, the study revises the definitions of each dimension of the evaluation model to ensure a better alignment with the unique aspects of telemedicine services.
Table 3. Results of the LDA clustering of healthcare consumer comments.
LDA: Latent Dirichlet Allocation.
Sentiment analysis model
For online reviews, sentiment analysis can be employed to determine healthcare consumers’ sentiments toward various issues.39–41 Identifying the polarity of text within sentences or documents, determining whether expressions are neutral, positive, or negative, constitutes a primary objective of sentiment analysis. Addressing this aim, this paper introduces a sentiment analysis model based on a hybrid CNN-BiLSTM-MHA architecture. Compared to single models, the hybrid model holds the potential to enhance the accuracy of sentiment analysis. As illustrated in Figure 2, the proposed model comprises five main components: an input layer, a CNN module, a Bidirectional Long Short-term Memory Network (BiLSTM) layer, a multihead attention layer, and an output layer.

Figure 2. Structure of the proposed hybrid deep learning model.
The role of the input layer is to accept normalized healthcare consumer comment data. In the convolutional layer, the feature extraction module automatically identifies key local features in comments. The BiLSTM layer processes temporal sequence information, analyzing the logical relationship between the earlier and later parts of a sentence to accurately understand complex sentence structures such as “although the response is fast, the diagnosis is not accurate.” In this study, the BiLSTM model was built with TensorFlow, employing the Adam optimizer and the tanh activation function. The BiLSTM layer processes the feature vectors extracted by the CNN layer through bidirectional learning, capturing contextual information in the sequence.
The structure of the BiLSTM model can be expressed by the following formula:
In the equations,
Meanwhile, the attention layer serves to capture the inherent correlations between input healthcare consumer comments and related data, thereby enhancing the classification accuracy of healthcare consumer comment polarity.
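For orientation, the standard textbook form of the BiLSTM and multihead attention layers described above is given below. The notation is an assumption made here for illustration and may differ from the notation used in the study: x_t is the CNN feature vector at position t, h_t the hidden state, H the matrix of stacked hidden states, W_i^Q, W_i^K, W_i^V, and W^O learned projection matrices, d_k the key dimension, and n the number of heads.

```latex
\begin{align}
\overrightarrow{h}_t &= \mathrm{LSTM}\bigl(x_t, \overrightarrow{h}_{t-1}\bigr), \\
\overleftarrow{h}_t &= \mathrm{LSTM}\bigl(x_t, \overleftarrow{h}_{t+1}\bigr), \\
h_t &= \bigl[\overrightarrow{h}_t ; \overleftarrow{h}_t\bigr], \\
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V, \\
\mathrm{head}_i &= \mathrm{Attention}\bigl(H W_i^{Q}, H W_i^{K}, H W_i^{V}\bigr), \\
\mathrm{MHA}(H) &= \bigl[\mathrm{head}_1 ; \dots ; \mathrm{head}_n\bigr] W^{O}.
\end{align}
```

The forward and backward passes read the comment in opposite directions, so each h_t carries context from both sides of position t, and each attention head can weight a different part of the comment before the heads are concatenated.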
Statistical analysis
The statistical analysis in this study employed a comprehensive approach to evaluate telemedicine platform quality. Descriptive statistics were calculated for sentiment scores across service quality dimensions, with means and standard deviations reported to quantify central tendency and variability. The TF-IDF algorithm was utilized to determine indicator weights, while LDA topic modeling extracted key service quality themes from 25,499 user reviews. Sentiment analysis performance metrics (accuracy, precision, recall, and F1-score) for the proposed hybrid deep learning model and comparison models were expressed as mean ± standard deviation across multiple validation runs. Quality assessment scores for platforms were calculated using weighted sentiment values with 95% confidence intervals. Statistical significance of indicator weights was verified through hypothesis testing (p < 0.001 for all secondary indicators). All text processing and deep learning implementations were conducted using Python's natural language processing libraries and TensorFlow framework.
Analysis and results
Analysis of service quality evaluation indicators
As services possess intangible attributes, establishing a scientifically robust system for evaluating service quality necessitates the consideration of various factors influencing service standards. Simultaneously, these indicators should be representative, striving for both brevity and inclusiveness to assess the quality of telemedicine platforms from diverse perspectives. Figure 3 illustrates the correlations between topics obtained through LDA extraction in the previous section, and

Figure 3. Correlation matrix between topics of the healthcare consumer comments obtained through LDA. LDA: Latent Dirichlet Allocation.
Tangibility involves factors such as the physical environment, equipment, and personnel image provided by the service provider. In the context of remote healthcare, which utilizes the internet to deliver services, this study considers the esthetic appeal of the remote healthcare platform's interface, system stability, and ease of operation as indicators of the platform's tangibility dimension. Reliability refers to the service provider's ability to fulfill service commitments punctually and accurately. For example, it entails timely completion of promised tasks and the ability to deliver services as promised. In this study, the professional competence of physicians, diversity of platform functionalities, and level of customer service are regarded as indicators of the platform's reliability dimension. Responsiveness refers to the speed and proactiveness of the service provider's response to customer requests, inquiries, and issues. In the context of remote healthcare, service speed such as the interaction speed of physicians and the promptness of postconsultation medication delivery can serve as indicators of the platform's responsiveness dimension. Assurance pertains to the service provider's knowledge, skills, reputation, and how they establish trust in service quality with customers. In this study, the platform's information accuracy, usefulness, and the ethical conduct of physicians are considered as indicators of the platform's assurance dimension. Empathy denotes the level of care and understanding shown by service providers toward customers, including the attention paid to and fulfillment of customers’ individual needs and expectations. In the process of telemedicine consultations, the demeanor of physicians during interactions, the platform's commitment to privacy protection, and the discount of price are indicative of the platform's concern and understanding toward patients, thus serving as dimensions of the platform's empathy.
Following the identification of primary and secondary indicators for assessing the quality of telemedicine platform services, the weights associated with these indicators were determined through the TF-IDF algorithm's analysis of feature words. This culminated in the formulation of the indicators for evaluating the quality of telemedicine platforms, as delineated in Table 4. As shown in Table 4, the weights of the five indicators of Quality of System, Quality of Service, Speed of Service, Quality of Information, and Attitude of Service are 0.1330, 0.3412, 0.2538, 0.1199, and 0.1521, respectively. It can be seen that for healthcare consumers, the quality of services provided by the platform is the most important, especially the professional competence of doctors.
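The study does not spell out how TF-IDF scores of feature words are turned into indicator weights, so the sketch below shows one plausible scheme under stated assumptions: each indicator's raw weight is the summed TF-IDF mass of its feature words, and raw weights are normalized to sum to 1. The function names, the sample reviews, and the indicator-to-word mapping are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_totals(docs: List[List[str]]) -> Dict[str, float]:
    """Sum each term's TF-IDF score over all (non-empty) tokenized documents."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    totals: Dict[str, float] = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            # The +1 keeps terms that occur in every document from scoring zero.
            idf = math.log(n_docs / df[term]) + 1.0
            totals[term] = totals.get(term, 0.0) + (count / len(doc)) * idf
    return totals


def indicator_weights(
    totals: Dict[str, float], feature_words: Dict[str, List[str]]
) -> Dict[str, float]:
    """Normalize per-indicator TF-IDF mass so that the weights sum to 1."""
    raw = {
        name: sum(totals.get(word, 0.0) for word in words)
        for name, words in feature_words.items()
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical tokenized reviews and indicator-to-feature-word mapping.
docs = [
    ["doctor", "professional", "fast"],
    ["app", "crash", "slow"],
    ["doctor", "patient", "fast", "reply"],
]
feature_words = {
    "Quality of Service": ["doctor", "professional"],
    "Speed of Service": ["fast", "slow", "reply"],
    "Quality of System": ["app", "crash"],
}
weights = indicator_weights(tfidf_totals(docs), feature_words)
```

Normalizing to a unit sum matches the reported primary weights, which add up to 1 (0.1330 + 0.3412 + 0.2538 + 0.1199 + 0.1521).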
Table 4. Weights of primary and secondary service quality indicators.
Table 4 demonstrates that physicians’ professional competence is the cornerstone of service quality. The platform should enhance the verification of physician credentials and provide ongoing training to ensure that physicians possess advanced medical skills and effective communication abilities. Furthermore, expanding services, such as online prescription issuance and health record management, can diversify the platform's functionalities to meet the varied needs of users.
Regarding service speed, the platform could introduce an intelligent triage system or optimize physician scheduling mechanisms to reduce user wait times. In addition, improving postsale pharmaceutical delivery by partnering with logistics companies to establish expedited channels will ensure timely medication distribution.
Concerning system quality, the platform should conduct regular stress tests and optimize compatibility across mobile and web interfaces. Engaging user experience designers for interface iteration can also reduce advertising interference and enhance navigational logic.
For information quality, strengthening the review mechanism for medical content is essential. Establishing an expert team to periodically audit platform content will help ensure that the information provided is both scientifically sound and practical. To improve the usefulness of the information, a personalized health knowledge push feature should be developed to offer customized recommendations based on users’ consultation histories.
The three secondary indicators of service attitude carry similar weights; hence, the platform should implement a physician service attitude evaluation system that integrates user feedback into performance assessments. Simultaneously, reinforcing the confidentiality of personal information—by clearly communicating the scope of data usage and offering privacy settings—will enhance user trust. Finally, to support low-income groups, the platform should introduce public welfare consultations subsidized jointly by the government and the platform.
Table 5 presents the final assessment results obtained from applying the quality evaluation method developed in this study to the platforms “Good Doctor Online” and “Doctor Dingxiang”. The quality assessment calculation is derived from the emotional score of the theme and the weight of its indicators. The quality assessment calculation formulas are shown as follows:
Table 5. Results of the quality assessment of the telemedicine platforms.
Where
Each comment is classified for sentiment polarity using a hybrid deep learning model, and a comprehensive score for each service dimension is calculated based on topic weights. On this basis, the variance and covariance matrix of each indicator is computed using the Bootstrap resampling technique, ultimately yielding the 95% confidence interval (CI) for each dimension. Numerical calculations are primarily performed using Python's scipy.stats module.
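The aggregation and interval estimation described above can be sketched as follows. This is an illustrative standard-library version, not the scipy.stats code used in the study: it assumes the overall score is a weighted average of dimension scores and uses a simple percentile bootstrap for the mean. The primary weights are those reported in Table 4, while the dimension scores and per-comment scores are hypothetical placeholders.

```python
import random
import statistics
from typing import Dict, List, Tuple


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of dimension scores (weights need not sum to 1)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def bootstrap_ci(
    values: List[float], n_resamples: int = 2000, alpha: float = 0.05, seed: int = 0
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


weights = {  # primary indicator weights reported in Table 4
    "Quality of System": 0.1330,
    "Quality of Service": 0.3412,
    "Speed of Service": 0.2538,
    "Quality of Information": 0.1199,
    "Attitude of Service": 0.1521,
}
scores = {  # hypothetical dimension scores, for illustration only
    "Quality of System": 4.0,
    "Quality of Service": 4.2,
    "Speed of Service": 3.3,
    "Quality of Information": 3.5,
    "Attitude of Service": 3.6,
}
overall = weighted_score(scores, weights)

comment_scores = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]  # hypothetical per-comment scores
low, high = bootstrap_ci(comment_scores)
```

With real data, `comment_scores` would hold the sentiment-derived scores of all comments assigned to one dimension, and the interval would be reported alongside that dimension's score.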
Table 5 shows that the overall platform quality score of GDO is 3.812, marginally higher than DDX's score of 3.577. In three aspects, Quality of System, Quality of Service, and Speed of Service, GDO scores 4.161, 4.199, and 3.340, respectively, outperforming DDX and showcasing its advantages in platform stability, customer response, and operational efficiency. However, in terms of information quality, DDX surpasses GDO, indicating that it provides more accurate and comprehensive medical information. The performance of the two platforms is comparable in terms of service attitude, with deficiencies observed in professionalism and empathy. Overall, both platforms exhibit certain shortcomings in service speed, information quality, and service attitude, and neither satisfies healthcare consumers in terms of postpurchase medication delivery and price discounts.
Sentiment classification based on mixed deep learning model
The sentiment analysis algorithm is the core of the quality assessment method proposed in this article, so its performance determines the effectiveness of the method. To further verify the performance of the proposed methodology, several other deep learning algorithms were run on the same dataset in comparison experiments, and the results are shown in Table 6.
Table 6. Prediction results of different classification models (mean value ± standard deviation).
BiLSTM: Bidirectional Long Short-term Memory Network; CNN: Convolutional Neural Network; RNN: Recurrent Neural Network.
The performance comparison of different models is presented as mean ± standard deviation and covers four core metrics: accuracy, precision, recall, and F1-score. The results in Table 6 show that the precision and F1-score of the methodology proposed in this study are 91.28% and 91.25%, respectively, higher than the 89.91% and 89.33% of the CNN-BiLSTM model and the 88.77% and 87.69% of the BiLSTM-Att model.
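For readers unfamiliar with the four metrics, the sketch below computes them for a three-class sentiment task. It assumes macro averaging (an unweighted mean over the positive, neutral, and negative classes); the study does not state which averaging scheme it used, and the function name `macro_metrics` and the sample labels are hypothetical.

```python
from typing import Dict, List


def macro_metrics(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Guard against empty denominators for classes never predicted or seen.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical gold labels and model predictions.
y_true = ["pos", "pos", "neu", "neg", "neg", "neg"]
y_pred = ["pos", "neu", "neu", "neg", "neg", "pos"]
metrics = macro_metrics(y_true, y_pred)
```

Repeating this computation over several validation runs and taking the mean and standard deviation of each metric gives values in the mean ± standard deviation form reported for the compared models.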
The CNN extracts local textual features through convolutional kernels but lacks the capability to model long-range semantic dependencies, resulting in insufficient recognition of complex emotional expressions. The RNN, while modeling temporal sequences, suffers from the vanishing gradient problem, making it difficult to capture long-text dependency relationships. The BiLSTM effectively captures contextual information but demonstrates insufficient sensitivity to local critical features. The BiLSTM-Att enhances key positional weighting through an attention mechanism, yet its single-head attention fails to comprehensively cover multidimensional semantic correlations. Although the CNN-BiLSTM integrates CNN's local feature extraction with BiLSTM's sequential modeling, it lacks mechanisms to establish interconnections between features across different hierarchical levels or spatial positions. From the experimental results, the sentiment analysis algorithm proposed in this article greatly assists in evaluating the service quality of telemedicine platforms. By combining the strengths of CNN and BiLSTM, and introducing a multihead self-attention mechanism, the CNN-BiLSTM model with the multihead self-attention mechanism surpasses other models in critical performance metrics like accuracy, precision, recall, and F1-Score. The multihead attention mechanism can analyze online comments content more comprehensively, simulating the human ability to focus on key points during reading, while capturing multidimensional information such as service speed and doctors’ professional competence. With the parallel processing capacity of multiple heads, the model can perform comprehensive analyses from different perspectives, resulting in a more complete and rich data representation.
Discussion
The main purpose of this study was to establish a new quality assessment method for telemedicine platforms. To better analyze the emotions of healthcare consumers during the use of telemedicine services, we collected 25,499 online reviews from the telemedicine platforms. This study focuses on analyzing the data from these comments to extract and summarize the factors influencing the evaluation of service quality on telemedicine platforms. Additionally, employing methods such as deep learning, sentiment analysis is conducted on the textual content of comments. Subsequently, a model for evaluating the service quality of telemedicine platforms is constructed, providing an assessment of the service quality offered by telemedicine platforms.
Jonkisz et al. investigated the application of the Servqual model in assessing healthcare quality in Asia, demonstrating its cross-domain applicability. 24 This study expands the boundaries of traditional service quality evaluation by integrating the Servqual model with deep learning techniques. In contrast to the classical Servqual framework proposed by Parasuraman et al., 17 this research restructures the dimensions specifically for telemedicine characteristics, aligning it more closely with the practical needs of digital healthcare services. Furthermore, it transcends the dependence on subjective data inherent in traditional questionnaire-based surveys. 23
As services possess intangible attributes, establishing a scientifically robust system for evaluating service quality necessitates the consideration of various factors influencing service standards. Simultaneously, these indicators should be representative, striving for both brevity and inclusiveness to assess the quality of telemedicine platforms from diverse perspectives. This study, by comprehensively considering the inter-topic correlation and the Servqual service quality assessment model, endeavors to reconstruct and refine the core indicator system for evaluating the service quality of telemedicine platforms. Given the fundamental differences between telemedicine and conventional services, the study revises the definitions of each dimension of the evaluation model to ensure a better alignment with the unique aspects of telemedicine services. In this study, we collected healthcare consumer comments from two telemedicine platforms. By using LDA topic clustering and the Servqual model's five quality dimensions, we identified that healthcare consumer comments predominantly revolve around aspects such as Quality of System, Quality of Service, Attitude of Service, Quality of information, and Speed of Service. These identified comments topics are considered as service features that healthcare consumers are concerned about and are used to evaluate the quality of services provided by the platforms.
This study adopted a hybrid deep learning model, CNN-BiLSTM-MHA, for sentiment analysis of healthcare consumer comments. Comparative experiments show that the model performed well against traditional deep learning methods, reaching a precision of 91.28% and an F1-Score of 91.25%, approximately 11 percentage points higher than the single model. By introducing multiple attention heads, the model can focus on different parts of the comment text from various perspectives, capturing richer semantic information. This multihead attention mechanism enables the model to achieve better accuracy and generalization in sentiment classification tasks. The experimental results indicate that the proposed sentiment analysis algorithm substantially assists in evaluating the service quality of telemedicine platforms. Combining CNN with BiLSTM and adding a multihead self-attention mechanism yields a model that surpasses the other models on critical performance metrics, namely accuracy, precision, recall, and F1-Score. The multihead attention mechanism can analyze online comment content more comprehensively, enhancing the model's ability to understand healthcare consumers’ needs and service experiences and thereby supporting more targeted improvement suggestions for the platform. Because the heads operate in parallel, the model can analyze the text from several perspectives at once, resulting in a more complete and richer data representation.
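As a rough illustration of the multihead idea described above, the following minimal Python sketch applies scaled dot-product self-attention independently to slices of each token vector and then concatenates the results. It is not the CNN-BiLSTM-MHA implementation used in this study: the learned query, key, value, and output projections of a trained model are omitted, and the function names and toy input are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Each query attends over all keys; returns one weighted value vector per query."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return outputs


def multihead_self_attention(tokens, num_heads):
    """Split each token vector into num_heads slices, attend per slice, concatenate.

    Assumption: learned projections are left out, so every head uses its raw
    slice of the input as query, key, and value.
    """
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = dim // num_heads
    per_head = []
    for h in range(num_heads):
        part = [tok[h * head_dim:(h + 1) * head_dim] for tok in tokens]
        per_head.append(scaled_dot_product_attention(part, part, part))
    # Concatenate the head outputs back into one vector per token.
    return [[x for head in per_head for x in head[t]] for t in range(len(tokens))]


# Hypothetical toy input: three tokens with four features each.
features = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
attended = multihead_self_attention(features, num_heads=2)
```

Each head computes its own attention weights, so two heads can weight the same tokens differently. In a trained model the per-head projections are learned rather than taken as fixed slices.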
Limitations
The data samples utilized in this study originate from the GDO and DDX platforms, which may introduce certain limitations. Users of both platforms are predominantly distributed in eastern developed cities, while western regions exhibit lower user proportions. This geographical imbalance may result in underestimation of remote areas’ sensitivity to logistics efficiency (such as medication delivery delays) and information accuracy (including dialect-related communication barriers). Additionally, the platforms’ primary user base consists of young and middle-aged individuals (20–45 years old), with relatively low representation of elderly users (>60 years). Consequently, older demographics’ requirements for operational simplicity (e.g. interface complexity) may not be adequately reflected. Regarding platform characteristics, DDX specializes in health science popularization whereas GDO emphasizes online consultations. These distinct service models could lead to differentiated user priorities, with DDX users potentially emphasizing information quality and GDO users prioritizing response speed.
Conclusion
This study introduces a novel method for evaluating the quality of telemedicine platforms, using the Servqual quality assessment model in conjunction with the CNN-BiLSTM-MHA deep learning model. The swift proliferation of such platforms during the COVID-19 crisis has provided substantial assistance to China in addressing issues of medical resource scarcity and imbalances in supply and demand, while also supporting healthcare consumers’ health and wellbeing. However, uncontrolled and aggressive expansion could potentially lead to future challenges, including pricing irregularities and fraudulent marketing practices within the telemedicine sector. Consequently, assessing platform quality through healthcare consumer comments, which authentically capture their sentiments on aspects influencing their health and wellbeing, emerges as indispensable for the sustainable evolution of telemedicine services.
The evaluation framework and hybrid deep learning model developed in this study are not only applicable to telemedicine platforms in China, but also provide a viable methodology for global scenarios with similar characteristics. The urban–rural service disparities revealed in this research show parallels to the pain points of telemedicine implementation in developing regions such as India and Brazil. Through word embedding transfer learning techniques, the proposed model can be effectively transferred to analyze user reviews in other languages, thereby offering a cross-cultural service quality assessment tool for multilingual regions. The framework enables dynamic adjustment of weight coefficients across different evaluation dimensions to accommodate varying policy requirements and regulatory intensities in different national contexts.
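The dynamic adjustment of weight coefficients can be sketched as a weighted average over the five dimensions. The following Python sketch is illustrative only: the function names, the dimension scores, and both weight profiles are hypothetical placeholders and are not values reported in this study.

```python
DIMENSIONS = (
    "Quality of System",
    "Quality of Service",
    "Attitude of Service",
    "Quality of Information",
    "Speed of Service",
)


def normalize_weights(weights):
    """Rescale non-negative weights so that they sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: w / total for name, w in weights.items()}


def composite_score(dimension_scores, weights):
    """Weighted average of per-dimension scores under the given weights."""
    normalized = normalize_weights(weights)
    return sum(normalized[name] * dimension_scores[name] for name in normalized)


# Hypothetical per-dimension scores in [0, 1]; not results from the study.
example_scores = dict(zip(DIMENSIONS, (0.70, 0.80, 0.75, 0.65, 0.60)))

# Two hypothetical weight profiles: equal weights, and one that stresses speed.
baseline_weights = dict.fromkeys(DIMENSIONS, 1.0)
speed_focused_weights = {**baseline_weights, "Speed of Service": 3.0}

baseline = composite_score(example_scores, baseline_weights)
speed_focused = composite_score(example_scores, speed_focused_weights)
```

Because the weights are normalized inside `composite_score`, a regulator or platform operator could raise the weight of a single dimension without rebalancing the others by hand.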
Our findings have important practical implications for participants in telemedicine platforms. This study found that healthcare consumers’ evaluation of the quality of telemedicine platform services can be divided into five main dimensions, the most important being Quality of Service, a critical factor for promoting consumer health and wellbeing. For practitioners of telemedicine platforms, it is essential to standardize and control service pricing while rigorously ensuring service quality to promote both health and wellbeing. As healthcare needs expand, the new methodology proposed in this study for assessing the quality of telemedicine services can provide valuable insights into the development of healthcare services. Based on the weights of the service quality dimensions and the identified score deficiencies, it is recommended that platforms implement phased improvements according to the priority levels outlined in Table 7 below. This article validates the proposed method through examples, but the collected data may suffer from sample bias. In future work, the method can be further validated on data collected through additional channels.
Table 7. The schedule for telemedicine platform improvement tasks.
AI: artificial intelligence.
Footnotes
Ethical considerations
Ethical approval is not applicable to this study as no human participants or animals are used.
Consent to participate
Written informed consent was obtained from all participants prior to their inclusion in the research. The consent process explicitly outlined the study's purpose, procedures, potential risks, and benefits, as well as participants’ right to withdraw at any time without penalty. All data were anonymized prior to analysis to ensure confidentiality. Participants were informed that anonymized findings may be published in an open-access format, freely accessible to the public. No personally identifiable information (e.g. images, names, medical records) is included in this manuscript.
Author contributions
XJ and YY were involved in conceptualization; XJ in methodology, data curation, and writing (original draft preparation); CC in software and investigation; XW in validation; YY in formal analysis, resources, supervision, project administration, and funding acquisition; XT in writing (review and editing); ZL in visualization. All authors have read and agreed to the published version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Social Science Fund of China (grant number 23BXW011).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
