Abstract
Medical care services can be organized into a network. Understanding the structure of this network can not only help analyze common clinical protocols but also reveal previously unknown patterns of care. The objective of this research is to introduce the concept and methods for constructing and analyzing the network of medical care services. We start by demonstrating how to build the network itself and then develop algorithms, based on principal component analysis and social network analysis, to detect communities of services. Finally, we propose novel graphical techniques for representing and assessing patterns of care. We demonstrate the application of our algorithms using data from an Emergency Department in New York State. One of the implications of our research is that clinical experts could use our algorithms to detect deviations from either existing protocols of care or administrative norms.
Introduction
Medical care services (MCS) can be organized into a network that, if analyzed appropriately, can reveal common medical practices, such as clinical pathways. In this article, we introduce methods for constructing such a network and demonstrate how to explore the resulting communities to detect organized patterns of care. Clinical experts could utilize our techniques to confirm adherence to existing clinical protocols or identify atypical correlations of MCS.
A model of the delivery of MCS
Figure 1 portrays a basic model of the delivery of MCS where a physician, or a group of physicians, evaluates the patient and then recommends appropriate services.

A basic model of the delivery of MCS.
From the model in Figure 1, we appreciate two possible relationships in the delivery of MCS:
The existing research about the relations in the delivery of MCS has focused on
A simulation of a small network of MCS
In Figure 2, we attempt to simulate a few scenarios of medical care delivery to demonstrate how a network of shared MCS can arise. We start by imagining a small health care clinic that only provides five services ⓐ, ⓑ, ⓒ, ⓓ, and ⓔ. In addition, we assume that there are four different patient visits. We do not make a distinction as to whether the same or a different patient visited, or whether the patient(s) saw the same or a different physician.

A simulation of a small network of MCS: (a) Visit 1, services ⓐ and ⓑ delivered. (b) Visit 2, services ⓐ and ⓓ delivered. (c) Visit 3, services ⓐ, ⓑ, and ⓒ delivered. (d) Visit 4, service ⓒ delivered twice.
In Figure 2(a), a patient comes in and receives services ⓐ and ⓑ. In Figure 2(b), a patient receives services ⓐ and ⓓ. In Figure 2(c), a patient receives services ⓐ, ⓑ, and ⓔ. In Figure 2(d), a patient receives service ⓒ twice. By the fourth visit, we can already see that a network has formed. Each node is a distinct type of service, and the number of links or edges between two nodes signifies how strongly the corresponding services are related. To analyze the network of MCS, we propose algorithms based on social network analysis (SNA) and principal component analysis (PCA).
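The four simulated visits can be turned into a weighted edge list in a few lines of Python. The sketch below is only an illustration, not the construction developed in the "Methods" section. It assumes that every pair of services delivered in the same visit adds one unit of weight to the edge between them, and that a service delivered twice in one visit produces a self-loop. The visit contents follow the description in the paragraph above, and the names `visits` and `co_occurrence_edges` are ours.

```python
from collections import Counter
from itertools import combinations

# The four simulated visits, as described in the text (letters stand for services).
visits = [
    ["a", "b"],       # Visit 1
    ["a", "d"],       # Visit 2
    ["a", "b", "e"],  # Visit 3
    ["c", "c"],       # Visit 4: the same service delivered twice
]


def co_occurrence_edges(visits):
    """Count how often each pair of services appears in the same visit.

    Assumption: one unit of edge weight per pair per visit; a repeated
    service yields a self-loop such as ("c", "c").
    """
    edges = Counter()
    for services in visits:
        for u, v in combinations(sorted(services), 2):
            edges[(u, v)] += 1
    return edges


edges = co_occurrence_edges(visits)
for (u, v), weight in sorted(edges.items()):
    print(u, v, weight)
```

Under these assumptions, the edge between services a and b carries the largest weight because the two services are delivered together in two of the four visits.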
Literature review
SNA related to the network of providers
Landon et al. 1 apply SNA methods to study the properties of patient-sharing networks among physicians. In their model, physicians are nodes, and patients are edges. A more heavily weighted link indicates a higher number of shared patients. In a different study, Landon et al. 2 use a similar model to evaluate the effects of the network of physicians on patient outcomes and health care spending. Wang et al. 3 also use SNA techniques to construct a collaboration network of physicians and then evaluate the network effects on costs and quality outcomes. In this study, physicians are also nodes, but edges represent patient admissions. Similar patient-sharing network models are studied by several authors.4–6 Creswick et al. 7 study a slightly different network of providers. In their study, the authors attempt to understand a communication network among clinical and administrative staff in an emergency department (ED). The authors find that communities are formed around different professions. However, they also notice that other communities form for other reasons, such as problem-solving, medication advice, and basic socialization. Boyer et al. 8 utilize a similar model to analyze the properties of a network of communication among health care professionals in a French public hospital. Tang and Yang 9 study the centrality of users in an online health care social network. The authors develop a userRank algorithm to assess the influence of users based on the nature of links and the content. Meltzer et al. 10 apply SNA methods to model effective quality improvement teams in health care. In this study, the authors recommend a strategic selection of team members based on their centrality in the network. For example, the authors indicate that a team member with a higher “net degree” will potentially receive more influence from outside, a desired outcome but also one that may dampen the cohesion of the team. Poghosyan et al. 11 conduct a systematic review of studies that apply SNA methods to social networks in health care. One of their highlights is that professional and personal networks do have the potential to affect the delivery of medical care, such as in the adoption of new drugs. Chambers et al. 12 also conduct a systematic review of applications of SNA methods in health care. From their investigation, the authors recommend that researchers focus on SNA-interventions-based models.
SNA related to the network of MCS
There is limited literature on the application of SNA methods to the network of MCS. The work that we are aware of includes Carroll and Richardson, 13 where the authors discuss the use of SNA methods to model the dynamics of health care services. Specifically, the authors emphasize the application of SNA models to evaluate a careflow network. The suggested network includes nodes of patients, carers, health care administrators, social workers, and other relevant actors. The links in the network express the exchange of information and resources among the nodes. The authors indicate that using SNA to study the structure of such a network could help reveal potential opportunities to improve patient safety and quality outcomes. Niyirora and Klimek-Yingling 14 also apply SNA methods to analyze connections in health care services. The authors demonstrate how to use SNA centrality measures to identify the most important services in an ED. In this article, we extend the work of these authors by proposing SNA- and PCA-based algorithms to evaluate the communities of MCS and thus help clinicians confirm adherence to existing clinical protocols or detect atypical connections.
SNA in medical care
SNA, which relates to graph theory,15,16 is routinely applied to network problems in medical care. For example, in neuroscience, SNA is commonly utilized in the construction and analysis of graph theoretical models of the brain. 17 Sanz-Arigita et al. 18 and Supekar et al. 19 use graph theoretical models to identify the biomarkers of Alzheimer’s disease. Similar approaches are undertaken in Bassett et al. 20 to analyze the biomarkers of schizophrenia. Bath et al. 21 apply graph-theoretical models to study patterns and clusters of illness using public-health data.
PCA and other clustering techniques
PCA is a statistical technique that is often employed for purposes such as spectral decomposition, dimension reduction, factor analysis, and spectral clustering.22,23 Several researchers have applied PCA methods to a variety of health care problems, including attempting to identify the phenotypes of acute mountain sickness. 24 In Arai et al., 25 PCA is used to predict strains of a bacterial infection in septic patients. In Platts-Mills et al., 26 PCA is used for the dimension reduction of data related to motor vehicle accidents in an ED. In Motin et al., 27 PCA is used in the ensemble empirical mode decomposition technique to extract respiratory and heart rates from photoplethysmographic signals. In Widjaja et al., 28 PCA is applied to electrocardiogram data to obtain respiratory signals, whereas in Sharma et al., 29 PCA is used to compress the same type of data. In Charkhkar et al., 30 PCA-based clustering methods are applied to diagnose acute coronary syndrome. Nilashi et al. 31 apply soft computing techniques to cluster diabetes mellitus disease into groups. The methods the authors use include expectation maximization, PCA, and support vector machines.
Besides PCA, other statistical and clustering algorithms that are commonly employed to classify health care data include
Contributions
In this research, we do not introduce any new mathematics. We build on existing PCA and SNA theories to develop algorithms to classify communities of MCS and allow for the detection of the patterns of care. Our main contributions are as follows:
We introduce an SNA-based algorithm to create a neighborhood graph of MCS and identify relevant communities of MCS.
We introduce a PCA-based algorithm to develop a loading matrix to allow for the detection of communities of MCS based on the coefficients of correlation.
To detect the pattern of care, we suggest introducing secondary nodes into the network and then inferring the context of each community based on the meaning of these new nodes.
We propose using radar charts and scatter plots to present and visualize communities of MCS.
We provide a detailed example of our methods using data from an ED in New York State.
The rest of the article is organized as follows. In the “Methods” section, we introduce our SNA and PCA algorithms. In the “Application” section, we highlight the performance of our methods using the ED data. We end the article with concluding remarks and plans for future work. Appendix 1 contains a detailed example of how to implement the proposed algorithms.
Methods
We propose applying SNA and PCA–based algorithms to analyze the network of MCS. We begin with the construction of an adjacency matrix that we later use to create a neighborhood graph and a loading matrix. Subsequently, we attempt to find and interpret the pattern of care in the communities of the network.
Constructing an adjacency matrix
We suppose that there is a population of patient visits in a particular clinical setting. From this population, we take a representative random sample of
where for a visit
Next, we use the indicator matrix
where ’ signifies transpose and the expanded version of matrix
where the ≡ symbol implies equivalence. It follows that
which implies that
for
where
Here,
where
Searching for MCS communities using SNA concepts
From matrix
where
We propose Algorithm 1 to find and analyze the communities of MCS using SNA techniques.
We should note that, besides the Modularity option in Gephi, several other software packages, such as Cytoscape 38 and NetworkX 39, can be used to detect communities in a graph. For a theoretical discussion of various community detection methods, see Newman. 15
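To make the idea of modularity-based community detection more concrete, the sketch below computes Newman's modularity score Q for a candidate partition of a small weighted graph. It is not the procedure that Gephi, Cytoscape, or NetworkX run internally. Those tools search for a partition with a high Q, whereas this example only evaluates the score of a partition supplied by hand. The toy graph (two triangles joined by one edge) and the function name `modularity` are our own, and the code assumes an undirected graph without self-loops.

```python
def modularity(adj, communities):
    """Newman's modularity Q for a partition of an undirected weighted graph.

    adj: symmetric dict-of-dicts of edge weights (assumed to have no self-loops).
    communities: iterable of node collections that together cover every node once.
    """
    two_m = sum(w for nbrs in adj.values() for w in nbrs.values())
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    q = 0.0
    for community in communities:
        for u in community:
            for v in community:
                # Observed weight minus the weight expected under random wiring.
                q += adj[u].get(v, 0.0) - degree[u] * degree[v] / two_m
    return q / two_m


def build_adjacency(edge_list):
    """Symmetric dict-of-dicts from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edge_list:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


# Toy graph: two triangles joined by a single bridge edge.
toy_edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
adj = build_adjacency(toy_edges)

q_split = modularity(adj, [{0, 1, 2}, {3, 4, 5}])
q_merged = modularity(adj, [{0, 1, 2, 3, 4, 5}])
print(q_split, q_merged)
```

Splitting the graph into its two triangles yields a higher score than treating it as one community, which is the kind of comparison a modularity-maximizing method makes when it proposes communities.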
Searching for MCS communities using PCA concepts
Besides SNA techniques, we can also apply the PCA concept to attempt to find communities or clusters of similar MCS. We follow the existing PCA theory22,23,40 and decompose
where
since
where
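As a rough illustration of how a loading matrix can be obtained from a correlation matrix, the sketch below extracts the first two principal components with power iteration and deflation. Several points are our assumptions rather than the article's definitions: the 3 × 3 matrix `R` is made up, power iteration stands in for a full eigen-decomposition routine, and each loading is taken to be an eigenvector entry scaled by the square root of its eigenvalue, which is a common PCA convention.

```python
import math


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        vec = [x / norm for x in product]
    product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vec, product))  # Rayleigh quotient
    return eigenvalue, vec


def deflate(matrix, eigenvalue, vec):
    """Remove an extracted component so the next-largest one becomes dominant."""
    n = len(matrix)
    return [[matrix[i][j] - eigenvalue * vec[i] * vec[j] for j in range(n)]
            for i in range(n)]


def loading_matrix(corr, n_components=2):
    """Rows are variables (services); columns are principal components."""
    work = [row[:] for row in corr]
    columns = []
    for _ in range(n_components):
        eigenvalue, vec = leading_eigenpair(work)
        columns.append([math.sqrt(eigenvalue) * v for v in vec])
        work = deflate(work, eigenvalue, vec)
    return [list(row) for row in zip(*columns)]


# Hypothetical correlation matrix: services 0 and 1 are strongly correlated,
# service 2 is uncorrelated with both.
R = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
loadings = loading_matrix(R)
for row in loadings:
    print([round(x, 3) for x in row])
```

In this toy case, the two correlated services load heavily on the first component and the isolated service loads on the second, which is the pattern one would read as two separate communities.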
Searching for the pattern of care
To search for the pattern of care in a given community of MCS, we can request an expert’s opinion or employ topic modeling techniques such as the latent Dirichlet allocation. 41
To get started, we suggest utilizing simple tools like a
Visualizing the community
In this article, we propose using radar charts to visualize communities of services in matrix
Scatter plots tend to show more distinct communities when matrix
The values of the first eigenvector of matrix
The second eigenvector of matrix
We will demonstrate how to apply these visualization tools in the “Application” section and Appendix 1 of this article.
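For readers who want to see what a radar chart of one component encodes, the sketch below converts a list of loadings into the vertices of the radar polygon. It assumes, as our own simplification, that each service occupies one evenly spaced spoke and that the distance from the center is the absolute value of the loading. The service labels and loading values are invented for illustration, and any plotting library could draw the resulting polygon.

```python
import math


def radar_vertices(loadings):
    """Cartesian vertices of a radar polygon, one evenly spaced spoke per service.

    Assumption: the radius on each spoke is the absolute value of the loading.
    """
    n = len(loadings)
    vertices = []
    for k, value in enumerate(loadings):
        angle = 2.0 * math.pi * k / n
        radius = abs(value)
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices


# Hypothetical loadings of four services on one component.
services = ["36", "80", "81", "99"]
pc1 = [0.9, 0.8, 0.7, 0.1]

polygon = radar_vertices(pc1)
for label, (x, y) in zip(services, polygon):
    print(label, round(x, 3), round(y, 3))
```

Overlaying one such polygon per component on the same set of spokes gives the multi-component view that a two-dimensional scatter plot cannot show at once.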
Other useful techniques for classifying MCS
Other useful techniques for classifying the communities of MCS include the centroid method (a hierarchical clustering technique) and
Application
In this section, we demonstrate how to apply our proposed network algorithms using the data from an ED of a hospital in New York State. We show more details in Appendix 1.
Our data
We obtained a sample of de-identified patient visits
Nodes of the network
We use CPT codes as the main nodes in the network and ICD-9-CM codes as the secondary nodes that we employ to infer the pattern of care. For the reader not familiar with CPT, this coding system is arranged into distinct sections of medical services by physician specialties, body system, or other well-defined criteria. Within each section, codes are arranged in numerical or alphanumerical order. A valid CPT code is five characters long. 45 Typically, the first two characters of such a code are enough to reveal a broader meaning of the type of service that was provided. To minimize the sparsity and noise in the data, we only use the first two characters of a CPT code to label nodes in our network. For example, CPT code
Creating an indicator matrix
We create the indicator matrix
The setup of matrix
The first column of Table 1 identifies the patient visit number. The remaining columns represent the nodes in the network. The two-character numerical labels in the header of the table relate to CPT codes and represent primary nodes in the network. The single-character labels in the header relate to ICD-9-CM codes and represent secondary nodes in the network.
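The layout just described can be sketched in code as follows. The visit records below are invented for illustration and are not drawn from our data. The sketch labels each CPT code by its first two characters and each ICD-9-CM code by its first character, builds a binary indicator matrix with one row per visit, and then forms a weighted adjacency matrix as the cross-product of the indicator matrix with itself. The cross-product step reflects our reading of the transpose-based construction in the “Methods” section and should be treated as an assumption, as should the helper name `node_label`.

```python
def node_label(system, code):
    """Two-character prefix for CPT codes, one-character prefix for ICD-9-CM codes."""
    return code[:2] if system == "CPT" else code[:1]


# Hypothetical visit records: (coding system, code) pairs, invented for illustration.
visits = {
    1: [("CPT", "99283"), ("CPT", "71020"), ("ICD", "78650")],
    2: [("CPT", "99283"), ("CPT", "80053"), ("ICD", "5990")],
    3: [("CPT", "99284"), ("CPT", "21010"), ("ICD", "E8859")],
}

labels = sorted({node_label(s, c) for records in visits.values() for s, c in records})
column = {label: j for j, label in enumerate(labels)}

# Indicator matrix: one row per visit, one column per node label.
indicator = []
for visit_id in sorted(visits):
    row = [0] * len(labels)
    for system, code in visits[visit_id]:
        row[column[node_label(system, code)]] = 1
    indicator.append(row)

# Cross-product X'X: entry (i, j) counts the visits in which labels i and j
# co-occur; the diagonal holds how often each label appears.
n = len(labels)
adjacency = [[sum(row[i] * row[j] for row in indicator) for j in range(n)]
             for i in range(n)]

print(labels)
for row in adjacency:
    print(row)
```

Because CPT labels have two characters and ICD-9-CM labels have one, the two kinds of nodes cannot collide in the column index.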
Creating matrices
We create matrix
We search for the context of a given community by evaluating the pattern of words in the descriptions of CPT and ICD-9-CM labels. Whenever possible, we suggest that clinical experts be involved in helping identify unusual communities of services or those communities that relate to known medical protocols. In this article, we relied on our background in health information management to describe communities resulting from our analysis.
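A simple way to start evaluating the pattern of words is to count how often each word occurs across the descriptions of the labels in a community. The sketch below does this with a small stop-word list. The descriptions are placeholders that we wrote for illustration, not official CPT or ICD-9-CM descriptors, and the function name `word_counts` and the stop-word list are our own choices.

```python
from collections import Counter

# Small, hypothetical stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"a", "and", "for", "of", "or", "the", "with"}


def word_counts(descriptions):
    """Count non-stop-words across the descriptions of one community's labels."""
    counts = Counter()
    for text in descriptions:
        for word in text.lower().split():
            word = word.strip(",.;:()")
            if word and word not in STOPWORDS:
                counts[word] += 1
    return counts


# Placeholder descriptions for the labels of one community (invented text).
community = {
    "36": "blood draw for laboratory test",
    "80": "laboratory panel test",
    "81": "laboratory urinalysis test",
}

counts = word_counts(community.values())
print(counts.most_common(3))
```

The most frequent words suggest a candidate context for the community, which a clinical expert can then confirm or reject.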
Results obtained by applying Algorithm 1
In Figure 3, we present an

A network of MCS when only CPT nodes are considered.

A network of MCS when both CPT and ICD-9-CM nodes are used. CPT nodes, labeled using two numerical digits, represent MCS. ICD-9-CM nodes, labeled using a single alphanumeric character, broadly represent medical diagnoses under treatment.
In both Figures 3 and 4, a different color identifies a different community. The bigger the node, the more frequent the label. Also, the thicker the edge, the more related the linked nodes are. To interpret each community, we apply the

Pattern of care related to the network in Figure 3—(a) {36, 80, 81, . . .}: Lab tests. (b) {73, 12, 29, . . .}: Care of injuries. (c) {43, 49, 88}: Gastrointestinal care. (d) {99, 27, 23}: Care of shoulder conditions. (e) {50, 75}: Nephrostomy care. (f) {31, 92}: Care of cardiac arrest.
To interpret the communities in Figure 4, we can also apply the option of the
Smaller communities in Figure 4, with two or three linked nodes, do not have any ICD-9-CM nodes associated with them. So, we can apply the
Results obtained by applying Algorithm 2
In this section, we present results from our PCA-based algorithm. In typical applications of PCA, a scatter plot of the components of the principal score matrix

A scatter plot of the normalized PC1 and PC2 of matrix

A scatter plot of the normalized PC1 and PC2 of matrix R, where matrix
A similar analysis can be carried out to detect the pattern of care in the communities of Figure 7.
From our experience, scatter plots from matrix
In addition to attempting to detect communities of MCS using scatter plots, we also suggest employing radar charts. To demonstrate this idea, we have plotted a few components of matrix

Radar charts portraying a few components of matrix

Radar charts portraying a few components of matrix
To visualize the communities in the radar charts, we color-code the components. Each distinct color represents a particular community. When the components overlap, the highest loading takes precedence. The interpretation of the pattern of care would be as before. For example, regarding Figure 8, after applying the
Regarding Figure 9, we can conclude that PC1 incorporates MCS for treating or diagnosing
We want to emphasize that the components of Figure 9 (e.g. PC1) are not related in any way to the components in Figure 8 (e.g. PC1) since different data sets were used to derive them. They simply share the same generic names, as discussed earlier.
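The color-coding rule used in the radar charts, where the highest loading takes precedence when components overlap, can be sketched as a simple assignment step. The loading values and the cutoff of 0.3 below are hypothetical, and the function name `assign_communities` is ours. A service whose strongest loading falls below the cutoff is left unassigned.

```python
def assign_communities(loadings, threshold=0.3):
    """Assign each service to the component on which it loads most strongly.

    loadings: dict mapping a service label to its loadings on PC1, PC2, ...
    threshold: hypothetical cutoff below which a service is left unassigned.
    """
    assignment = {}
    for service, row in loadings.items():
        best = max(range(len(row)), key=lambda k: abs(row[k]))
        assignment[service] = best + 1 if abs(row[best]) >= threshold else None
    return assignment


# Hypothetical loadings of four services on two components.
example_loadings = {
    "80": [0.85, 0.10],
    "81": [0.70, 0.40],  # overlaps both components; PC1 takes precedence
    "73": [0.05, 0.90],
    "50": [0.10, 0.20],  # below the cutoff on every component
}

communities = assign_communities(example_loadings)
print(communities)
```

Each component number can then be mapped to one color, so that every service is drawn once even when its loadings on several components are noticeable.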
Remark 1
Unlike in traditional scatter diagrams, where only two components are typically visualized at a time, radar charts allow for the visualization of several components at once. Indeed, we can plot all qualifying components of matrix
Practical implications of our research
Our research builds on existing SNA and PCA mathematical theories and introduces new algorithms to help health care practitioners visualize communities of MCS and assess the pattern of care. One practical implication is that by knowing typical patterns of care, health care managers may be able to use this information to better coordinate and schedule resources to provide needed services. The latter effort may contribute to the efficiency of medical care delivery. Another practical implication is that clinical experts may be able to confirm adherence to existing protocols of medical care. For example, if services
Besides the application in health care settings, our algorithms could also be applied to the creation and analysis of any network using an indicator or incidence matrix of interest. We particularly believe that our approach of generating matrix
Limitations and future research
Our research was limited to a network of MCS from a single medical facility. As a result, we are unable to generalize our findings regarding particular communities of services. For future research, we would like to extend our work to multiple facilities and assess similarities in the clinical protocols. In addition, we intend to evaluate the effects of the communities of MCS on quality outcomes and medical costs. On the technical level, we plan to investigate the parity between Algorithm 2 and spectral clustering algorithms that use
For the researcher interested in expanding this work, a convenient problem to work on is how converting adjacency matrices of MCS into distance matrices affects relationships in the network. From the experiments that we ran with the limited data available to us, we have found that relationships do change, but we have not yet established theoretical reasons for this or assessed the practical implications of the resulting communities.
Conclusion
In this research, we introduced techniques for constructing a network of MCS by creating an indicator matrix and then converting it into a weighted adjacency matrix. From there, we developed algorithms to generate a neighborhood graph and a loading matrix to allow for the detection of communities of services. To interpret the context of each community, we proposed applying a
Footnotes
Appendix 1
In this appendix, we present a detailed example of how to apply our proposed SNA and PCA algorithms. We still use the same data from an ED of a hospital in New York State, but this time consider a smaller network to generate matrices that can fit on a single page. Accordingly, we only select the first character of CPT codes to label main nodes. As before, we select the first character of ICD-9-CM codes to label secondary nodes. We pad CPT nodes with zeros to distinguish them from ICD-9-CM nodes. For example, the first character of a CPT code 21010 is 2, and we pad it as
From this matrix
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
