Abstract
Healthcare analytics has been a rapidly emerging research domain in recent years. In general, healthcare solution design studies focus on developing analytic solutions that enhance product, process and practice values for clinical and non-clinical decision support. The objective of this study is to explore the scope of healthcare analytics research and in particular its utilisation of design and development methodologies. Using six prominent electronic databases, qualifying articles between 2010 and mid-2018 were sourced and categorised. A total of 52 articles on healthcare analytics solutions were selected for relevant content on public healthcare. The research team scrutinised the articles, using established content analysis protocols. Analysis identified that various methodologies have been used for developing analytics solutions, such as prototyping, traditional software engineering, agile approaches and others, but despite its clear advantages, few show the use of design science. Key topic areas are also identified throughout the content analysis suggesting topical research priorities in the field.
Introduction
Healthcare analytics is an emerging application development area in the medical informatics field. Typically, analytic solutions utilise computer-based analysis techniques for supporting both clinical and non-clinical decision-making. Electronic healthcare applications can maximise service quality by producing insights from data while minimising cost, or optimise operational health decision-making outcomes, and such systematic solutions have already been extensively adopted. 1 Big volumes of internal and external data, diversified medical data sources and reporting requirements have driven a push to utilise robust analytic system solutions for electronic health records, clinical decision support and personal or hospital data management. 2 Such systems provide support for managerial decision-making both for clinical care and for effective hospital operations, while supporting evidence creation by native data for health care decisions within hospital or organisational contexts.
Healthcare analytics solutions require thoughtful design and validation, especially when patient privacy, quality of care, security and evidence-based treatments are directly affected. Although solution design and development projects have been implemented using established development methodologies, more impact would result from attention to how an appropriate abstraction of the design artefact makes a more general contribution to knowledge and to analytic solution innovation, with researchers exhorted to report in ways that contribute both to research and to professional practice. Traditional solution design methodologies have failed to make provision for new knowledge development and innovation. For example, Augustine 3 introduced in-house methodologies for developing a Hadoop-based analytic solution so that organisations can gain better insights into their businesses from their big data and so increase their income and profitability. Blount et al. 4 used a prototyping method for developing a healthcare analytics system for real-time analysis. These solution development methodologies focused on specific aspects of solution development rather than on knowledge contribution and solution innovation. Despite much useful work, piecemeal examples do not clarify transferable principles. A systematic literature analysis both enhances knowledge for researchers and practitioners and suggests future directions for research and development. In healthcare analytics, however, systematic analysis is patchy, with most reviews focusing either on domain-specific problems or on specific technological interventions (e.g. Bates et al. 5 for clinical analytics innovations; Belle et al. 6 for identifying major obstacles; Mehta and Pandit 7 for outlining the scope of big data analytics). Thus, it is important to conduct a systematic analysis that may bring insights regarding the use of design methodologies and the relevant requirements for future approaches.
Methodologies used for healthcare analytics solution design often offer guidance, particularly for analysing problems or data samples so that appropriate solution models can be designed for implementing system solutions. 8 However, most healthcare analytics studies use methodologies specifically to meet the requirements of data collection, noise minimisation, data classification and processing. For instance, Sindhu and Hegde 9 designed an analytic solution framework for handling data velocity problems in the healthcare sector, utilising an in-house methodology that consisted of various data classification steps. Blount et al. 4 proposed a healthcare analytics approach, designed using a prototyping method, for addressing the issue of patient identification. Although the study evaluated the solution with potential user groups, it did not define new knowledge that might add further value for other similar designs. Mellor et al. 10 used an agile development methodology for conducting a healthcare analytics design study through the four main phases of introducing concepts, testing potential, testing efficacy and assessing operational value. In many cases, these methodologies support healthcare analytics design by guiding the design work and identifying the data that need to be acquired to meet the particular demands of the design.
Beyond the capacity of such traditional methodologies, design science research (DSR) has gained momentum in information systems (ISs) for designing contemporary solutions since Nunamaker et al. 11 first introduced this paradigm as an effective design methodology. Hevner et al. 12 described how DSR is particularly relevant for modern-day IS research, because it helps IS researchers confront two of the major long-term issues within IS design: (1) the absence of rigour in designing innovative artefacts and (2) the nature of IS research outputs, many of which produce irrelevant knowledge that is not practically applicable to real-world problem solutions. 13
As a contemporary IS method, DSR provides explicit approaches by reinforcing – not only IS design innovation (e.g. product and process perspectives) but also methodological innovation. 12 DSR requires both a rigorous contribution to knowledge and a development of knowledge relevant to stakeholders’ practices. Nevertheless, to date, relatively few studies have explicitly applied the DSR approach in designing healthcare analytics. We therefore wished to explore in more systematic detail how analytics innovations are being developed, and the value of design science approaches within this domain.
Studies to date lack focus on methodological insights guiding further IS research and development. In providing an analysis of trends in healthcare analytics research, we aim to widen the scope of previous reviews by promoting a suitable basis for the use of IS design methodologies in this field. Extending previous analyses, we classify relevant healthcare analytics literature from 2010 to mid-2018 using content analysis. Rather than selecting journal sites, we used six multi-disciplinary academic databases to source our sample more comprehensively.
Study background
Hospital and healthcare organisations or service providers face issues of sourcing appropriate evidence for their decision-making. Service providers also meet many decision-making situations (operational, strategic or non-urgent) where the utilisation of computer-based data analysis may be paramount for offering appropriate insights. Healthcare analytics applications show promise in addressing these challenges. Khalifa and Zabani 14 (p. 411) view health analytics as ‘decision support systems . . . enabling knowledge professionals (physicians, nurses and health administrators, health policy makers and pharmacists) to gain vision and make more effective and efficient evidence-based healthcare decisions’. Analytics with large, varied datasets is enabled by technologies such as Hadoop, NoSQL databases and Spark. 15 Many studies report improvements in technologies or techniques, for example, simulations, machine learning and statistical modelling. For instance, Qureshi 16 described an analytics architecture that combines cloud technologies for centralised storing and sharing of records with predictive analytics techniques using data mining algorithms. Other articles focus on the potential of hospital data to generate new knowledge suggesting innovative and actionable insights. 17 Accordingly, here, we represent healthcare analytics as comprising technologies of business intelligence, predictive analytics, knowledge discovery and big data underpinning applications designed for hospital or organisational healthcare decision support.
Healthcare analytics research is relatively new and shows continuing growth in the health informatics systems (HISs) field. The term HIS refers to the collection, storage, management, processing and transmission of information within the health sector. 18 The field of HIS is growing rapidly as analytics offers applications for gaining facts and insights towards making informed healthcare or clinical decisions. Although the field is still evolving, many recent analytics solutions show robust potential. One such is real-time predictive analytics, used in various healthcare application areas such as disease monitoring. 17 For designing effective analytics solutions, a comprehensive understanding of the problem domains, analytics solution approaches and their types and scope will offer useful knowledge for both academics and health research practitioners. Without examining the existing trends of research, it is impossible to articulate this understanding to drive an agenda for future research, and as healthcare analytics grows, it brings a need for fundamental methodological knowledge and effective theoretical constructs.
Systematic literature reviews in healthcare analytics research
Systematic literature review (SLR) studies have been limited in the healthcare analytics area. A systematic literature analysis can not only provide future directions for research but can also enhance methodological knowledge for researchers and practitioners. However, there has been a lack of such systematic analysis in this particular field. The majority of published academic literature review articles focus either on pathology-specific problems or on specific technological interventions. For instance, De la Torre Diez et al. 19 reviewed 46 articles published between 2005 and January 2016 in the area of big data research in healthcare analytics. Their study confirmed the rapid growth of analytics research due to the massive increase in digital data in the health sector. The study also reinforces the importance of addressing issues of massive healthcare data growth, especially in designing methods and analysis tools, interpreting analytical results and determining potential issues using analytical results. Apart from this, while other reviews exist, these do not cover generalised knowledge contributions from an IS design perspective. Table 1 illustrates some key literature review studies.
Although these studies provide understanding on targeted aspects of healthcare analytics, they lack a focus on developing methodological insights. This study attempts not only to provide a fuller landscape of current trends in healthcare analytics research, but also to widen the scope of the previous reviews by developing a suitable frame of reference that promotes the use of modernised design methodologies in the field of healthcare analytics. We used a multi-disciplinary, bottom-up approach for collecting our sample articles by involving the top six academic databases to provide more comprehensive sourcing than looking only at selected journal sites as defined in the section below.
Table 1. Details of some example literature review studies.
Methodological issues of healthcare analytics design
Traditional system development and prototyping provide little support for healthcare analytics design theory or for its evaluation. Typical issues include a lack of relevance and knowledge enhancement, and a lack of articulated, reusable design principles. DSR methodologies enhance the relevance and rigour of analytics solutions, and their results are subsequently used to form new design knowledge for developing and evaluating future solutions, by articulating domain-specific concepts or in-house practices. Hevner and Chatterjee 24 described how design science supports a ‘pragmatic research paradigm that calls for the creation of innovative artifacts to solve real-world problems. Thus, design science research combines a focus on the IT artifact with a high priority on relevance in the application domain’ (p. 9). Extending from Simon’s 25 conceptualisation, such work has established design research as a legitimate way of doing IS research.
As a contemporary methodology, DSR can offer benefits to the design of particular IS artefact innovations (such as decision support systems 12 ). We therefore contend that an analytics designer may need to employ approaches that differ from traditional methods, because contemporary development requires knowledge-based activities as well as knowledge building that are significantly different from the traditional IS approach. 26 DSR naturally has much in common with traditional development, and its component techniques apply across many methodologies, especially in software engineering approaches: the essential difference is its explicit attention both to rigour and to relevance, and its requirement for guided reporting of an articulated knowledge contribution of general value. Jackson et al. 27 described the ‘Agile software development’ practices that became an industry standard for analytics application design in healthcare. This methodology supports efficient collaboration and effective ways of conducting analytics solution design. However, it is suggested that while it is important to adapt agile to support this hybrid development domain successfully, the typical time allowed for a research data science project conflicts sharply with the standard agile development cycle. 27 Software engineering methods are used for their helpful steps in eliciting data analytics requirements as well as for specifying functional requirements for using data to improve business and clinical outcomes within healthcare organisations. 6 The software engineering methodology is mainly based on the traditional waterfall approach, which proceeds through phases of application design and implementation. Some of the in-house methodologies also used qualitative methods to ensure a participatory design, for instance, so that practitioners can be engaged in the development or evaluation of the healthcare prototype. 28
The purpose of the study was to explore methodological insights within the healthcare analytics literature and to assess the design methods used in the practical problem context of healthcare. First, we explored the development literature to reveal the presence of DSR as a research methodology. For this, we selected relevant criteria and applied a qualitative content analysis in order to generate themes inductively to match the DSR components. The findings are presented through the seven guidelines of DSR proposed by Hevner et al., 12 described below.
Study methodology
An SLR can reveal generalisable insights and patterns for guidance and improvement of a field’s body of knowledge. 29 For instance, the results of an SLR can help indicate fruitful directions for the growth of IS research and practice. SLRs not only organise cognate studies but can also sustain the evolution of evidence-based guidelines for practitioners. 30 Here, we use SLR to derive deeper understanding of IS design in health analytics following Brereton et al.’s 30 three phases of Planning (defining the research question and protocol), Conducting (identifying and assessing relevant studies) and finally Document Review, where the synthesis of findings is written up.
Our main purpose is to gain insights related to the use of design methodologies in developing health analytics solutions for the service providers. Our scope includes health (healthcare) analytics research covering design or development issues, application design and theory design and evaluation studies. Our bottom-up approach for collecting sample articles took a multi-disciplinary perspective. Rather than identifying particular journals, major online databases were selected to source the articles. Figure 1 illustrates the entire process of sample collection and analysis.

Figure 1. Proposed review methodology for sample collection and analysis.
We initially classified the total of 133 sample articles into four groups: application design (52), use of analytics (62), design issues of analytics (7) and literature review (12). We selected the 52 application design articles to serve the purpose of this study. The overall sample analysis showed a growth trend in healthcare analytics studies (Figure 2 below). The trend analysis curve shows a cumulative progression of healthcare analytics research from 2010 to mid-2018 (these data reflect the number of articles collected up to June 2018: it was anticipated that the total number for 2018 would exceed that of the previous year). This trend is statistically significant, with an adjusted coefficient of determination (R²) of 80 per cent.

Figure 2. Research growth in healthcare analytics over the past 8 years.
Figure 2 shows each year’s percentage of journal and conference publications. The first row of the X-axis indicates publication years, the second row shows the number of published articles found in each year and the third bar indicates the total articles per year. Against these, the Y-axis represents the percentage of journal and conference articles in the relevant years. Overall, Figure 2 indicates that the number of published articles has increased exponentially over the past years. More specifically, the trend line illustrates a continuous increase for both, with journals trending relatively higher, now accounting for a larger percentage of outputs, having started from a lower base. Since conference outputs are also increasing in absolute terms, and if considered as a leading indicator of future journal numbers, continued growth can be expected in the medium term, and the relative percentage of journals confirms the field’s maturation as an emerging discipline.
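The trend statistic reported for Figure 2 can be illustrated with a small calculation. The following is a minimal sketch, assuming a simple linear trend fitted by ordinary least squares; the text does not state which trend model was actually fitted, and the yearly counts below are hypothetical placeholders, not the study data. The function names are ours.

```python
from typing import Sequence


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs: Sequence[float], ys: Sequence[float],
              slope: float, intercept: float) -> float:
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2: float, n: int, predictors: int = 1) -> float:
    """Penalise R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - predictors - 1)


# HYPOTHETICAL cumulative article counts for 2010..2018 (illustration only).
years = list(range(2010, 2019))
cumulative_articles = [1, 3, 6, 10, 17, 26, 38, 52, 60]

slope, intercept = linear_fit(years, cumulative_articles)
r2 = r_squared(years, cumulative_articles, slope, intercept)
adj_r2 = adjusted_r_squared(r2, n=len(years))
print(f"slope={slope:.2f} articles/year, R2={r2:.3f}, adjusted R2={adj_r2:.3f}")
```

With only nine yearly observations the adjustment is noticeable, which is one reason to report the adjusted value rather than the raw one.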
As indicated in Figure 1, the search was performed across six electronic databases identifying sample articles between 2010 and mid-2018. Simultaneously, we also searched ‘Health Analytics’ and ‘Healthcare Analytics’ in a specific journal database from 2010 to mid-2018 to satisfy PRISMA (PRISMA provides guidance on evidence-based minimum sets of items required for conducting SLRs. PRISMA targets healthcare meta-analyses, but can also be used as a basis for reporting review findings) conditions. Search strings defined subjectively 31 were applied to each database, comprising the terms {‘health’ or ‘healthcare’} and {‘analytics’ or ‘intelligence’ or ‘predictive’}, aiming to achieve the greatest volume of relevant articles. However, we excluded articles not in English and also book chapters, newspaper articles, unpublished articles and non-scientific articles. Figure 3 illustrates the percentage of articles sourced from each database.

Figure 3. Database sources for healthcare analytics research articles.
Initially, we assessed titles and applied the pre-defined inclusion and exclusion criteria to each article, obtaining a set of potentially relevant articles. Following this, the full text of each article was obtained and its contents were critically evaluated by team members manually reading each article. Out of the 373 (shown in Figure 1) freely available full articles from 6 databases, 50 duplicates were eliminated, as were another 79 articles that did not satisfy our research focus or lacked target details, and another 10 articles that were newspaper or unreviewed online content. Another 82 articles were then eliminated as unrelated after manual review of title, keywords, abstract and full text. Finally, after our classification coding into four groups, we excluded (out of 133 articles) another 82 articles that were inappropriate for addressing our research question. As a result, we ended up with 52 articles that were selected for the review analysis.
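The Boolean search string described above can be expressed as a simple local filter. This is a minimal sketch, assuming whole-word, case-insensitive matching against a title or abstract; each database has its own query syntax, which is not reproduced here, and the example titles are hypothetical.

```python
import re

# Term groups taken from the search string reported in the text.
SUBJECT_TERMS = ("health", "healthcare")
METHOD_TERMS = ("analytics", "intelligence", "predictive")


def _contains_any(text: str, terms: tuple[str, ...]) -> bool:
    """True if any term occurs as a whole word in text (case-insensitive)."""
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches_search_string(text: str) -> bool:
    """{health OR healthcare} AND {analytics OR intelligence OR predictive}."""
    return _contains_any(text, SUBJECT_TERMS) and _contains_any(text, METHOD_TERMS)


# HYPOTHETICAL titles, used only to exercise the filter.
titles = [
    "Predictive models for healthcare operations",
    "Analytics for retail supply chains",
    "Health policy in rural regions",
]
hits = [t for t in titles if matches_search_string(t)]
print(hits)
```

A filter this broad favours recall over precision, which is consistent with the stated aim of retrieving the greatest volume of relevant articles before manual screening.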
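The four-group classification of the 133 articles can be tallied as follows. This is a minimal sketch using only the group sizes reported in the text; the variable and function names are ours.

```python
# Group sizes as reported in the text for the 133 classified articles.
GROUP_COUNTS = {
    "application design": 52,
    "use of analytics": 62,
    "design issues of analytics": 7,
    "literature review": 12,
}


def group_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each group's share of the whole sample as a percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_classified = sum(GROUP_COUNTS.values())
selected = GROUP_COUNTS["application design"]  # the group analysed in this study
shares = group_shares(GROUP_COUNTS)

for name, share in shares.items():
    print(f"{name}: {GROUP_COUNTS[name]} articles ({share}%)")
```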
Using qualitative and quantitative research techniques for analysing data, content analysis applies to exploring content directly from any human interaction process, verbal, visual and written documents. 32 A qualitative content analysis provides a summary of the original information, and while both deductive and inductive approaches are widely implemented, inductive analysis is appropriate when ‘there are no previous studies dealing with the phenomenon or when it is fragmented’. 33 We analyse the 52 solution application design articles using an inductive method to classify and categorise attributes.
The sample was chosen to gain insights on issues and themes as well as design methods, and although Elo and Kyngäs 33 note that no exact analysis rules are prescribed, we were guided by their three phases: preparing, organising and reporting. In the preparation phase, the act of categorising aims to form groupings based on related and common characteristics. Elo and Kyngäs 33 describe categorisation as including the interpretation process that informs the grouping of categories used to describe the phenomenon that has been analysed. We grouped the sample into five groups: model, method, construct, instantiation and design theory, as March and Smith 34 described these five artefact types for IS solutions design. 35
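The grouping by artefact type lends itself to a small coding structure. The sketch below assumes, for simplicity, that each article is assigned exactly one of the five artefact types; the article identifiers and their codes are hypothetical and do not come from the study sample.

```python
from collections import Counter
from enum import Enum


class ArtefactType(Enum):
    """The five artefact types used for grouping the sample."""
    CONSTRUCT = "construct"
    MODEL = "model"
    METHOD = "method"
    INSTANTIATION = "instantiation"
    DESIGN_THEORY = "design theory"


def tally_artefacts(coded: dict[str, ArtefactType]) -> dict[ArtefactType, int]:
    """Count articles per artefact type, keeping types with zero articles."""
    counts = Counter(coded.values())
    return {artefact: counts[artefact] for artefact in ArtefactType}


# HYPOTHETICAL coding of four articles (illustration only).
coded_sample = {
    "A01": ArtefactType.INSTANTIATION,
    "A02": ArtefactType.MODEL,
    "A03": ArtefactType.INSTANTIATION,
    "A04": ArtefactType.METHOD,
}
tally = tally_artefacts(coded_sample)
for artefact, n in tally.items():
    print(f"{artefact.value}: {n}")
```

Keeping the zero-count types in the output makes gaps visible, for example a sample in which no article contributes a design theory.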
In the organising phase, after identifying 52 articles, team members went through each to agree the issues, key themes and how the design was conducted. We looked for their design process description, evaluation methodologies and rigorously described processes, as these components relate to the explicit DSR guidelines of Hevner et al. 12 Hevner et al. provided seven design guidelines as the design research protocols for conducting DSR studies. The DSR protocols enabled analysis of existing design research, identifying various aspects such as the type of solution artefacts. We adopted a qualitative approach for codifying the findings using the review protocol by Arnott and Pervan 36 (see Supplemental Appendix C for details).
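Coding an article against the seven guidelines can be sketched as a simple checklist. The guideline names follow Hevner et al.; the present/absent coding scheme, the class and the example article are our assumptions for illustration, since the actual protocol is given in Supplemental Appendix C and is not reproduced here.

```python
from dataclasses import dataclass, field

# The seven DSR guidelines named by Hevner et al.
HEVNER_GUIDELINES = (
    "Design as an artifact",
    "Problem relevance",
    "Design evaluation",
    "Research contributions",
    "Research rigor",
    "Design as a search process",
    "Communication of research",
)


@dataclass
class ArticleCoding:
    """Records which guidelines a reviewer found evidence for in one article."""
    article_id: str
    evidenced: set[str] = field(default_factory=set)

    def mark(self, guideline: str) -> None:
        """Mark a guideline as evidenced; reject names outside the checklist."""
        if guideline not in HEVNER_GUIDELINES:
            raise ValueError(f"unknown guideline: {guideline}")
        self.evidenced.add(guideline)

    def coverage(self) -> float:
        """Fraction of the seven guidelines with evidence in this article."""
        return len(self.evidenced) / len(HEVNER_GUIDELINES)

    def missing(self) -> list[str]:
        """Guidelines without evidence, in checklist order."""
        return [g for g in HEVNER_GUIDELINES if g not in self.evidenced]


# HYPOTHETICAL article, used only to show the checklist in use.
coding = ArticleCoding("A01")
coding.mark("Problem relevance")
coding.mark("Design evaluation")
print(f"{coding.article_id}: {coding.coverage():.0%} covered; missing {coding.missing()}")
```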
Findings
Our analysis identified eight major foci in healthcare analytics studies, both clinical and non-clinical (Figure 4). The eight major foci in existing studies were prioritised according to a multiple response analysis, for example, by prioritising emphases, by identifying common analytics application areas for the public healthcare problem domain and by considering both clinical and non-clinical emergent aspects of decision-making purposes. Our view followed the objectivist paradigm so that we could produce insights that were reinforced by existing studies. Real-time monitoring was most common, being suited to automatic assessment from large data sets, perhaps collected from wearable devices. Establishing a common computing platform was also well represented, since different types of data, structured and unstructured, are relevant in a comprehensive patient record. General enhancements to care processes, for example, safety, service quality and visualisation of patterns for supporting clinical decisions, were also typical motivations, including discovery of patterns leading to clinical insight.

Figure 4. Eight key purposes of healthcare analytics studies.
Our analysis, however, mainly focused on the use of methodologies in healthcare analytics research. Figure 5 below characterises the types of design methodologies, showing their areas of application. The findings suggest that the majority of the studies adopted software engineering methodologies, as they are diversified and many standards are fulfilled using software development methodologies. Surprisingly, we found little use of agile methodologies in designing healthcare analytics solutions. One reason for the low number of cases that used agile methodologies may be that we searched for sample articles with ‘Healthcare’ and ‘analytics’ in the keywords or title. Second, our content analysis could not find relevant keywords and sub-process steps related to agile methodology, while most of the existing studies used in-house methodologies that combine traditional development and agile methodologies to suit a specific context. However, our analysis was based on manually checking each sample article to find out which methodology steps it used, and how, in designing the analytics solution, in order to classify it. In practice, software engineering methodologies, analytics and in-house methodologies were widely used by the designers in the literature. Figure 5 illustrates six vital methodologies (including DSR) identified in existing healthcare analytics studies.
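A multiple response analysis of the kind used to prioritise the foci can be sketched as below. We assume each article may be coded with several foci and that each focus is counted at most once per article; the focus labels and article codes are hypothetical, chosen only to echo the themes named in the text.

```python
from collections import Counter


def multiple_response_table(
    coded: dict[str, list[str]],
) -> list[tuple[str, int, float, float]]:
    """Return (focus, count, % of responses, % of cases), most frequent first."""
    n_cases = len(coded)
    # set() ensures a focus is counted once per article even if coded twice.
    counts = Counter(focus for foci in coded.values() for focus in set(foci))
    n_responses = sum(counts.values())
    return [
        (focus, n, 100 * n / n_responses, 100 * n / n_cases)
        for focus, n in counts.most_common()
    ]


# HYPOTHETICAL coding of three articles (illustration only).
coded_foci = {
    "A01": ["real-time monitoring", "common computing platform"],
    "A02": ["real-time monitoring"],
    "A03": ["care process enhancement"],
}
for focus, n, pct_responses, pct_cases in multiple_response_table(coded_foci):
    print(f"{focus}: n={n}, {pct_responses:.1f}% of responses, {pct_cases:.1f}% of cases")
```

Because articles can carry more than one focus, the percentages of cases may sum to more than 100, whereas the percentages of responses always sum to 100.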

Figure 5. Six vital methodologies (including DSR) identified in healthcare analytics studies.
Healthcare analytics solutions typically support decision-making using collected data sets, so it is important to utilise systematic procedures for incorporating realistic contextual understandings into the design. Before beginning the content analysis, we identified some studies where DSR was the chosen research method, although, as we found throughout the analysis, only a few studies used DSR methodologies. Table 2 below shows the types of healthcare analytics artefacts in the public healthcare context.
Table 2. Example healthcare analytics research that used DSR as methodology.
DSR: design science research; BI: business intelligence; HBIS: hospital-based business intelligence system; EMR: electronic medical record; IT: information technology.
DSR embodies guidelines and supporting activities that lead to an effective problem formulation for designing a solution artefact but also offers guidance for communicating its knowledge contribution, improving research value.35,41 The existing research presented in Table 2 used established DSR approaches, typically defining target problems, with impacts, scope and foundation of a solution and engaging representative users through techniques such as interviews, focus groups, observations, prototyping and workshops, where business practices and requirements are identified. DSR’s real-world focus ensures design and development results in useful and valid functionality. Hevner 38 argued that research relevance should be maintained through defining important environmental requirements before design of a solution, and suggested that any of several evaluation methods (observational, descriptive, analytical, testing or experimental) should be used rigorously to evaluate a solution artefact. 12 Evaluation techniques include case study, simulations, scenario analysis and field studies to investigate the design artefact’s utility, efficacy or usability. The above-mentioned research used such methods for artefact evaluation within its practical business environment (e.g. Mazor et al. 40 used a simulation model; Ahangama and Poo 39 used interviews, while others used Hevner’s 38 design cycles and use-cases (e.g. Figure 6)).

Figure 6. Example DSR use by Kakhki et al. 42 (p. 737) for developing healthcare analytics.
Thus, knowledge can be gained with problem relevance and converted into analytics solution designs enabling innovative features. DSR can offer a comprehensive methodological support for better addressing application design needs (illustrated in Figure 6). Our findings, however, showed that the vast preponderance of studies have used traditional approaches of limited academic value and general relevance. Figure 7 below illustrates the types of healthcare analytics artefacts in the literature that indicate more insightful details.

Figure 7. Types of healthcare analytics solutions as artefacts.
Discussion and conclusion
Our main objective was to reveal methodological emphases in healthcare analytics solution design research from the extant literature. Any SLR sample is delimited by choice and by the application at which it is targeted, but we believe ours is representative for its purpose. We found several areas of focus, and a key set of methodologies in use. We argue, however, that healthcare analytics solution design studies require modernised methodologies, as the traditional approaches do not clearly support innovation and new knowledge creation relevant to clinical and non-clinical practitioners.
As our recommendation, we believe that DSR, as a modern design methodology, offers improvements over traditional methodologies in designing IS artefacts, providing methodologies with roots in engineering and the artificial sciences. DSR is thus particularly relevant for innovative analytics solution designs because it better supports designers/researchers in establishing grounding knowledge and in embedding practical aspects into the design of artefacts to solve real-world problems. For effective healthcare analytics artefact design, we believe that DSR methods can provide support for (1) articulating problem definition, (2) suggesting suitable solution approaches, (3) validating the solution process, (4) guiding relevant evaluation and (5) requiring dissemination to specify usable knowledge. DSR also requires consideration of an artefact’s mutability, and continuing relevance in dynamic and evolving contexts. In analysing the IS design methods, although relevant information technology (IT) artefact development and dissemination was present implicitly in many articles using more traditional or less formal approaches to design and development, only a handful of articles explicitly applied DSR. For the subfield of healthcare analytics to reify its topical scope and to advance its theoretical base, we suggest that attention to the methodological imperatives of DSR will be of continuing benefit.
This study aimed to produce a better understanding of methodological insights for healthcare analytics solutions by examining existing design trends. The findings imply that healthcare analytics is an emerging solution design field which DSR is particularly well suited to support with design guidelines. We anticipated that DSR would be a more effective methodology, as our analysis, presented overall in Supplemental Appendix C, indicates. The seven design guidelines are considered key supporting pillars for conducting the effective design of healthcare analytics solutions. Our view is objective in that we relied on the evidence of the analysis for the outcome.
One delimitation of the literature review was the keyword-based searching used initially to collect the sample articles on healthcare analytics. As our focus concerns IS design methodologies in healthcare analytics solution design, we separated out articles with a ‘big data analysis technique’ focus, emphasising those more directly relevant to healthcare solutions. For example, Mehta and Pandit 7 provided a review of 58 articles specifically covering big data analytics in healthcare. Following this, we searched using keywords such as ‘big data’ or/and ‘healthcare’; ‘health and big data’ and ‘analytics’ or ‘big data analytics in healthcare’, and found 1264 articles (789 journal and 475 conference articles, from 2010 to 30 June 2019). We could have included this large sample for a better outcome, but to keep our focus on healthcare analytics in general, we reserved it for a separate future analysis. Second, we analysed the selected sample articles manually; this was time-consuming and created the possibility of human error, which poses potential risks to the validity of this review study. We mitigated this issue by carefully adopting the protocols of content analysis.
Supplemental Material
Supplemental material, Appendix_14_June_2019-JHI for Methodologies for designing healthcare analytics solutions: A literature analysis by Shah J Miah, John Gammack and Najmul Hasan in Health Informatics Journal
Footnotes
Acknowledgements
Authors are thankful to reviewers and editor for providing constructive feedback to improve the article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References