Hybrid recommendation–based quality of service prediction for sensor services

Abstract

Wireless sensor networks are the focus of several research and application domains, and the concept of sensing-as-a-service is on the rise in wireless sensor networks. Large service repositories comprising ever more services and functionalities make it harder for users to identify their preferred services and may incur higher costs. Service recommendation systems have therefore become important and integral tools of service models for providing personalized products to consumers. However, many existing methods for sensor services focus only on service discovery. To this end, this article proposes a novel hybrid recommendation method, named the new hybrid recommendation (NHR) method. First, the latent Dirichlet allocation model is used to compute the similarity of the latent topics of the services, and the user’s latent semantic themes are used to extract services of potential interest. Moreover, the relevance of neighbourhood services is considered, which improves the accuracy of quality of service prediction. Experiments conducted on real datasets demonstrate that the proposed method is more accurate than existing methods of service recommendation.

Keywords

Quality of service prediction, service recommendation, sensor services

Introduction

Given the rapid development of service computing in recent years, computing services are now offered in various forms such as infrastructure as a service (IaaS),¹ software as a service (SaaS)¹ and platform as a service (PaaS).¹ Service computing has attracted much study,^2–8 including research on wireless sensor network (WSN) environments.^9–12 WSNs are typically composed of energy-constrained sensor nodes and a higher performing sink node. Several concepts of service offering have previously been proposed for WSNs. One is sensing-as-a-service, in which the sensor devices of the Internet of things (IoT) are treated as service providers and applications are treated as clients of such services. In a service-oriented architecture, each node shares some services, and each service consists of a set of functions.¹³ Users access published services or data generated by sensors through the Internet, for example to monitor temperature, humidity and air quality (air quality index (AQI)). Sensor services are the focus of a wide range of research works.^9,10,13–19 Existing research mostly focuses on service discovery, for instance, efficient search methods that exploit the geographical properties of devices. However, recommendation, which aims to return semantically equivalent services in the discovery process, has not been studied adequately. Furthermore, service discovery on its own may not be effective for obtaining suitable services, owing to the deployment of a large number of distributed and similar services. In addition, the increasing heterogeneity of user demands and requirements makes effective service discovery more challenging. To this end, recommending relevant services forms an essential component of answering user requests.
Most of the existing methods exploit information about services, such as detailed service descriptions or quality of service (QoS) information, to provide relevant services to users. However, obtaining the real QoS values of services is often difficult. To this end, this article proposes a recommendation method based on predicting the QoS values of user requests in order to help clients select the services most relevant to their needs.

In general, a recommendation system can be defined as a system that actively recommends items to users depending on the specific application area.² The recommendation system is an important tool for providing personalized products to consumers, and the success of a service offering largely depends on the effectiveness of the recommendation system. An excellent personalized recommendation system should recommend specific items in response to user requests, in order to meet the user’s personalized needs.

Service recommendation based on QoS can facilitate an automated process of discovering and recommending a number of services to clients by the way of ranking the QoS properties of the services and behaviours of the client. However, due to the complexity of services, recommendation for services still faces various challenges. An efficient service recommendation should necessarily include the following properties.

Accuracy

A recommendation system should recommend desired services and avoid unpopular services for users, in particular when there is minimal QoS information for services. QoS attributes are usually associated with the server, network conditions, location, time and other factors.

Diversity

Not only the popular services but also the unpopular yet relevant services in accordance with the user’s preferences should be recommended to users. Returning unfamiliar yet relevant services can help users to find their desired services effectively.

Cold-start problem

The cold-start problem often occurs in memory-based recommendation systems when no rating record, or only a small number of rating records, is available for some new items or users in the system. The recommendation system should still be able to make recommendations for new users and new services.

Data sparsity

Data sparsity occurs when each user invokes only a marginal portion of the available services, resulting in very sparse QoS data. In reality, the sparsity of datasets can reach 99.24%.²⁰ Data sparsity strongly influences the accuracy and effectiveness of the recommendation system, especially for highly sparse data. A sparse data matrix of services contains a large number of zero entries, so it is very important to resolve the problem of data sparsity.

QoS data usually represents the non-functional features of the services. QoS data of services can be exploited to provide users with their preferred non-functional features among services with similar functionalities. Wu et al.^21,22 use QoS data to analyse the performance of WSNs. However, under normal circumstances, the actual QoS values of the services are often implicit. Kodali and Malothu²³ consider QoS as one of the two critical issues in any WSN to measure the performance parameters. Thereby, predicting the QoS values can further enhance the effectiveness of the service recommendation methods.

With this in mind, this article proposes a novel method of QoS-aware hybrid recommendation for services to improve the prediction accuracy of missing QoS values. The proposed model not only considers the relevance of neighbourhood services but also uses the latent Dirichlet allocation (LDA)²⁴ topic model to mine the implicit semantic similarity of services. First, the relationship between neighbouring services is calculated based on the list of users who invoked common services, and the value is used to generate a correlation matrix of services. Second, the similarities among the services in terms of latent topic probabilities are computed using the LDA topic model. In this way, QoS prediction accuracy can be improved and the latent topics of services can effectively satisfy user requirements.

The main contributions of this article are listed as follows:

A novel hybrid approach for service recommendation has been proposed that combines correlation-based and content-based recommendation of services.

Unlike other conventional service recommendation approaches, our approach considers the correlation between services based on the co-invocation of similar services by different consumers and further exploits the LDA model that considers the similarities of users and content of services to extract the latent topic of services.

Extensive experiments have been conducted based on real-world services to verify the effectiveness of the proposed method. The experimental results demonstrate that our approach achieves better recommendation performance than the conventional correlation-based and content-based methods.

The remainder of this article is structured as follows: section ‘Related work’ reviews the related works, and section ‘Hybrid recommendation for SWS’ details our proposed model. Section ‘Experiments’ presents and discusses the conducted experiments, and section ‘Conclusion and future work’ concludes this article along with outlining our future research plans.

Related work

A Web Service can be defined as a remote method that is accessed through the Internet. Sensor Web Services (SWS) therefore describe the implementation of a Web Service in a WSN, where the centre node works as a service that provides a tool for data exchange with the nodes.⁹ Recently, research on Web Service recommendation^1,2,4,25 has been increasing in popularity. This research can be applied to recommendation for SWS. The current methods of Web Service recommendation are mainly divided into the following categories:

The correlation-based service recommendation method uses the association between services, derived from the relationships between them. Its advantage is that it can easily recommend high-quality services and diversify the recommendations. However, it struggles to make recommendations for new items or users, and it is susceptible to data sparsity.

The content-based service recommendation method uses the content similarity between services. The advantage of this method is that it can overcome the start-up issues of new items and users.

The hybrid service recommendation method combines the features and advantages of the above two methods. It aims not only to resolve the cold-start problem but also to achieve diversity, and such a combination can be very effective in recommendation systems. The remaining issue is how to balance the trade-off between the two underlying recommendation methods effectively.

Zheng et al.²⁵ exploited the information of similar Web consumers and services to predict the missing QoS parameters. Xie et al.² considered the relationship of SWS based on the number of services that have been invoked by each pair of Web Service users. UPCC, IPCC and WSRec⁴ were proposed by Zheng et al.; they are, respectively, a user-based collaborative filtering method, an item-based collaborative filtering method and a QoS-aware hybrid Web Service recommendation approach that combines UPCC and IPCC with confidence weights. BiasSVD,²⁶ a latent factor model, uses the singular value decomposition (SVD) method and adds additional biases to users and items. GM²⁷ is a greedy method for ordering items. CloudRank2⁵ is a cloud service ranking method which uses the confidence levels of different preference values. 2RHyRec⁶ is a ranking-oriented hybrid approach which combines collaborative filtering and latent factors. CloudPred²⁸ is a neighbourhood-based approach enhanced by feature modelling on both users and components. A few learning techniques have been adopted in the model-based approaches, such as clustering models,²⁹ neural networks³⁰ and latent semantic models.^31,32

Most of the existing research on Web Service recommendation^1–4,9,33 considers only the neighbourhood-based relationship between Web Services to predict the QoS attributes and ignores the user’s latent interest in Web Services. As a result, the existing recommendation methods fall short of user requirements even when the recommended services have high QoS values. Moreover, in the extreme case, such methods cannot make recommendations at all when new users and new services arrive.

The mining of services that interest users is mainly based on topic models, a type of unsupervised generative probabilistic modelling. Several models have been proposed; Ahmed et al.³⁴ proposed a novel hybrid approach to make recommendations by combining the approach in Xie et al.² with the LDA topic model to obtain the user-service probability matrix. The advantage of this method is that it considers both the correlation between services and the content of services. Thus, it can successfully recommend new items to users and also improves the accuracy of QoS prediction. However, because this approach treats users as words, it lacks interpretability.

Several models of service offering have recently been proposed in the context of sensor service discovery. Notable research works in this area include IrisNet, a wide-area architecture for pervasive sensing services in distributed and heterogeneous environments;¹⁵ discovery and on-demand provisioning of services for IoT-based business applications;¹⁶ IoT-based service discovery using a distributed hash table;¹⁷ and geographical indexing.¹⁸ However, the problem of ranking sensor services has not received much attention and has not been well investigated.

Existing research mostly focuses on service discovery, ignoring the recommendation of relevant services. Incorporating recommendation system in SWS can provide better sensor services up to the satisfaction of the consumer’s requirements, which is yet to be comprehensively investigated.

To this end, a hybrid approach of recommendation for SWS has been proposed in this article as an extension to our previous work.³⁴ The novelty beyond the state-of-the-art techniques lies in the fact that our model not only considers the relevance of neighbourhood services but also uses the LDA²⁴ topic model to mine the implicit semantic similarity between sensor services. The similarity between SWS in the probability of latent topics is computed using the LDA topic model. With the similarity between SWS in the probability of latent topics being computed, the diversity of recommendation for SWS has been improved, and the latent topics of SWS with higher QoS values are exploited to satisfy the user requirements in terms of both content and quality.

Hybrid recommendation for SWS

The proposed hybrid recommendation method for SWS comprises the following cascaded steps. First, a correlation matrix between services is extracted using the correlation-based Top K SWS recommendation method, based on the number of SWS co-invoked by different consumers, and the missing QoS values are predicted from this correlation matrix. Next, based on the relationship between the user’s latent content interests and the hidden topics of SWS, the LDA-based recommendation method obtains the topic probability distribution of the services and uses it to predict the missing QoS values. Finally, the weighted linear module combines the two predictions to make the final recommendation.

The application of the LDA topic model is effective in enhancing the prediction accuracy of QoS values. Moreover, the semantic association mined among the services is exploited to overcome the cold-start problem for new items. To recommend services to new users, a popularity-based recommendation method is adopted, which recommends services with higher ratings or a larger number of invocations.

Correlation-based recommendation for SWS

Our previous work² proposed a correlation-based recommendation method for Web Services, which uses the number of SWS co-invoked by different users to calculate the correlation degree between services, then applies the propagation and attenuation of the PageRank³⁵ method to rate services and finally predicts the missing QoS values.

First, the one-dimensional (1D) vector ${n_{u}}_{i, j}$ of users who have QoS values for both Web Services i and j is formed, and the correlation degree $C_{i, j}$ between services i and j is calculated from it, as shown in equations (1) and (2)

\begin{matrix} {n_{u}}_{i, j} = \begin{cases} \{ u_{k} : (r_{k, i} \neq \emptyset) \land (r_{k, j} \neq \emptyset) \} & (i \neq j) \\ \emptyset & (i = j) \end{cases} \\ i = 1, \dots, N; \; j = 1, \dots, N; \; k = 1, \dots, M \\ Nu_{i, j} = | {n_{u}}_{i, j} |, \; i = 1, \dots, N, \; j = 1, \dots, N \end{matrix}

(1)

where M and N are the number of service users and services, respectively. $r_{k, i}$ and $r_{k, j}$ are the vectors of QoS values of service i and service j observed by service user $u_{k}$ , respectively

\begin{matrix} C_{i, j} = \begin{cases} \frac{Nu_{i, j}}{\sum_{l = 1}^{N} Nu_{l, j}} & (i \neq j) \\ 0 & (i = j) \end{cases} \\ i = 1, \dots, N, \; j = 1, \dots, N \end{matrix}

(2)

where $C_{i, j}$ represents the ratio of the number of common users of services i and j to the total number of users that service j shares with all services. The entries with $C_{i, j} > 0$ form the edges of the correlation graph.
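Equations (1) and (2) can be illustrated with a minimal sketch. The toy QoS matrix and the function names `co_invocation_counts` and `correlation_degrees` are assumptions made for illustration and do not come from the original work; missing QoS values are represented by `None`.

```python
from typing import List, Optional

# Hypothetical toy data: qos[k][i] is the QoS value user k observed for
# service i, or None when user k never invoked service i.
Matrix = List[List[Optional[float]]]


def co_invocation_counts(qos: Matrix) -> List[List[int]]:
    """Nu[i][j]: number of users who invoked both services i and j (0 when i == j)."""
    n_services = len(qos[0])
    counts = [[0] * n_services for _ in range(n_services)]
    for row in qos:
        invoked = [i for i, value in enumerate(row) if value is not None]
        for i in invoked:
            for j in invoked:
                if i != j:
                    counts[i][j] += 1
    return counts


def correlation_degrees(counts: List[List[int]]) -> List[List[float]]:
    """C[i][j] = Nu[i][j] / sum over l of Nu[l][j], with C[i][i] = 0 (equation (2))."""
    n = len(counts)
    column_totals = [sum(counts[l][j] for l in range(n)) for j in range(n)]
    return [
        [
            counts[i][j] / column_totals[j] if i != j and column_totals[j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


qos = [
    [0.3, 1.2, None],   # user 0 invoked services 0 and 1
    [0.4, None, 2.0],   # user 1 invoked services 0 and 2
    [0.5, 1.0, 1.8],    # user 2 invoked all three services
]
counts = co_invocation_counts(qos)
C = correlation_degrees(counts)
```

In this sketch every column of `C` that has at least one co-invocation sums to 1, which follows directly from the normalization in equation (2).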

Then, according to the PageRank³⁵ method, the rating matrix SR(n) of the user u to the Web Service n is constructed and calculated as follows

SR (n) = a \cdot \sum_{q \in E (n)} SR (q) C (q, n) + (1 - a) \cdot d_{u}

(3)

where $d_{u}$ is the ratio of user k’s rating of service j to the sum of that user’s ratings of all services. $d_{u}$ is computed using equation (4)

\begin{array}{l} d_{u_{k}} = \frac{\tilde{d}_{u_{k, j}}}{\sum_{i \in N (u_{k})} \tilde{d}_{u_{k, i}}}, \quad k = 1, \dots, M, \; j = 1, \dots, N \\ \tilde{d}_{u_{k, j}} = \begin{cases} 0, & r_{u_{k, j}} = \emptyset \\ r_{u_{k, j}}, & r_{u_{k, j}} \neq \emptyset \end{cases} \end{array}

(4)

Finally, the predicted QoS value is obtained. From the service rating matrix (SRM), the set S(K) of the Top K highest-rated services is selected. The absolute deviation Dev between the target service and each service in S(K) is then computed, which gives the deviation set S′(K).

The absolute deviation between services j and i is calculated using equation (5):

Dev (j, i) = \sum_{u \in U (j) \cap U (i)} \frac{(r_{u, j} - r_{u, i})}{| U (j) \cap U (i) |}

(5)

where $U (j) \cap U (i)$ is the set of service users who invoked both service i and service j, and $| U (j) \cap U (i) |$ is the number of such users.

Calculation of SRM is presented in equation (6)

\begin{matrix} SRM_{k, j} = SR (n_{j}, u_{k}) \\ = a \cdot \sum_{q \in E (n_{j})} SR (q) C (q, n_{j}) + (1 - a) \cdot d_{u_{k}} \end{matrix}

(6)

where $SRM_{k, j}$ is the rank value of service j by user k

S' (K) = \{ Dev (j, i) : i \in SRM (\mathrm{top}\text{-}K) \}

(7)

where $S' (K)$ represents the deviation values of the Top K Web Services with the highest ranks in SRM.

Calculation of QoS prediction matrix is presented in equation (8)

PreMatrix (r_{u_{k, j}}) = \bar{r}_{u_{k}} + \frac{\sum_{i \in S' (K)} S' (K_{i})}{| S' (K) |}

(8)

where $\bar{r}_{u_{k}}$ is the average QoS value of the different services invoked by service user $u_{k}$, $S' (K_{i})$ is the deviation value Dev(j, i) between service j and service i, and $| S' (K) |$ is the number of elements in $S' (K)$.
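The prediction step in equations (5), (7) and (8) can be sketched as follows. The sketch assumes that the Top K set is taken from a per-user rating vector (one row of SRM), that the target service is excluded from its own Top K set, and that service pairs without common users are skipped; all names and toy values are hypothetical.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def deviation(qos: Matrix, j: int, i: int) -> Optional[float]:
    """Dev(j, i) in equation (5): mean of r[u][j] - r[u][i] over users who invoked both."""
    diffs = [row[j] - row[i] for row in qos if row[j] is not None and row[i] is not None]
    return sum(diffs) / len(diffs) if diffs else None


def predict_qos(qos: Matrix, ratings: List[float], user: int, j: int, top_k: int) -> Optional[float]:
    """Equation (8): the user's mean observed QoS plus the mean deviation to the Top K services."""
    observed = [value for value in qos[user] if value is not None]
    if not observed:
        return None  # no history for this user; a cold-start strategy is needed instead
    user_mean = sum(observed) / len(observed)
    # Rank the other services by their rating for this user and keep the K best.
    ranked = sorted(
        (i for i in range(len(ratings)) if i != j),
        key=lambda i: ratings[i],
        reverse=True,
    )
    deviations = [d for d in (deviation(qos, j, i) for i in ranked[:top_k]) if d is not None]
    if not deviations:
        return user_mean
    return user_mean + sum(deviations) / len(deviations)


qos = [
    [0.3, 1.2, None],
    [0.4, None, 2.0],
    [0.5, 1.0, 1.8],
]
ratings_for_user_0 = [0.5, 0.3, 0.2]  # hypothetical row of SRM for user 0
prediction = predict_qos(qos, ratings_for_user_0, user=0, j=2, top_k=2)
```

Here Dev(2, 0) = 1.45 and Dev(2, 1) = 0.8, so the prediction for user 0 and service 2 is the user mean 0.75 plus the mean deviation 1.125.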

Content-based recommendation for SWS

The LDA topic model, proposed in 2003,²⁴ is a typical bag-of-words model: a document is treated as a group of words whose order is disregarded. A document can contain multiple topics, and each word in the document is generated from one of the topics.

We use the intermediate outcome θ of the graphical model (shown in Figure 1) to compute the topic similarity of the services and then to predict the QoS values. Finally, a service set with similar functions and QoS values is obtained. The procedure of content-based recommendation is described as follows.

Figure 1.

LDA probabilistic graphical model.

First, the service-topic probability distribution, that is, the topic probability composition of each service description document, is computed using the LDA topic model. In the experiment, the LDA topic model is estimated using Gibbs sampling.
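As an illustration of how a service-topic distribution θ can be estimated, the following is a minimal collapsed Gibbs sampler for LDA. It is a generic textbook sketch under stated assumptions, not the implementation used in the experiments: documents are lists of integer word identifiers, symmetric Dirichlet priors are used, and the function name, toy corpus, sweep count and seed are hypothetical.

```python
import random
from typing import List


def lda_gibbs(docs: List[List[int]], n_topics: int, vocab_size: int,
              alpha: float, beta: float, sweeps: int = 200, seed: int = 0) -> List[List[float]]:
    """Collapsed Gibbs sampling for LDA; returns theta, one topic distribution per document."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                 # topic counts per document
    topic_word = [[0] * vocab_size for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                               # total words per topic
    assignments = []
    for d, doc in enumerate(docs):                             # random initial assignment
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for pos, w in enumerate(doc):
                z = assignments[d][pos]
                # Remove the current assignment before resampling this word's topic.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][pos] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


# Toy corpus: three service descriptions encoded as word ids from a 6-word vocabulary.
docs = [[0, 1, 0, 2], [3, 4, 3, 5], [0, 2, 1, 1]]
n_topics = 2
theta = lda_gibbs(docs, n_topics, vocab_size=6, alpha=50 / n_topics, beta=0.1)
```

With α = 50/T and documents this short, the prior dominates and the rows of `theta` stay close to uniform; on real service descriptions the documents are much longer, so the word counts carry more weight.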

Then, given the existing topic probability distribution of the services, the cosine similarity between the topics of SWS can be computed using equation (9)

T (x, y) = \frac{x \cdot y}{\| x \|_{2} \times \| y \|_{2}} = \frac{\sum_{i} x_{i} y_{i}}{\sqrt{\sum_{i} {x_{i}}^{2}} \sqrt{\sum_{i} {y_{i}}^{2}}}

(9)

Table 1 shows a part of the topic similarity between services, in which each entry represents the similarity of topics between services i and j.

Table 1.

Similarity between Web Services (a selection).

Services	$s_{1}$	$s_{2}$	$s_{3}$	$s_{4}$	$s_{5}$
$s_{1}$	1.0000	0.3476	0.2602	0.1239	0.2983
$s_{2}$	0.3476	1.0000	0.3314	0.1570	0.3801
$s_{3}$	0.2602	0.3314	1.0000	0.1931	0.3581
$s_{4}$	0.1239	0.1570	0.1931	1.0000	0.1673
$s_{5}$	0.2983	0.3801	0.3581	0.1673	1.0000
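The topic similarity of equation (9), which produces matrices such as Table 1, can be sketched as follows. The topic distributions below are hypothetical toy values, not those behind Table 1.

```python
import math
from typing import List, Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between two topic-probability vectors (equation (9))."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


def similarity_matrix(theta: List[Sequence[float]]) -> List[List[float]]:
    """Pairwise topic similarity between all services."""
    return [[cosine_similarity(a, b) for b in theta] for a in theta]


# Hypothetical service-topic distributions over three topics.
theta = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
]
sims = similarity_matrix(theta)
```

The resulting matrix is symmetric with ones on the diagonal, as in Table 1, and services whose descriptions share dominant topics (here the first and third) receive the highest similarity.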

Before calculating the predicted QoS values, the existing QoS values are normalized using equation (10)

{r'}_{u_{l}, i} = \frac{r_{u_{l}, i} - \min (r_{u_{l}})}{\max (r_{u_{l}}) - \min (r_{u_{l}})}, \quad r_{u_{l}, i} \neq \emptyset

(10)

where $r_{u_{l}, i}$ is the QoS value of service i observed by user l, and $\max (r_{u_{l}})$ and $\min (r_{u_{l}})$ are the maximum and minimum QoS values of the services invoked by user l, respectively.

The normalized user rating matrix for the services can be computed using equation (11)

R'_{u, i} = \frac{\sum_{j \in N} T (s_{i}, s_{j}) \cdot {r'}_{u, j}}{\sum_{j \in N} | T (s_{i}, s_{j}) |}

(11)

where $T (s_{i}, s_{j})$ is the cosine similarity between the topics of services i and j. Finally, the user-service QoS matrix is constructed using the content-based recommendation.
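The content-based prediction can be sketched as min-max normalization (equation (10)) followed by a similarity-weighted average. The sketch takes one reading of equation (11): the sum runs over the services the user has actually invoked, the target service itself is excluded, and the denominator is the sum of the absolute similarities. These choices, the function names and the user row are assumptions; the similarity values are those of Table 1.

```python
from typing import List, Optional


def normalise_row(row: List[Optional[float]]) -> List[Optional[float]]:
    """Equation (10): min-max normalization of one user's observed QoS values."""
    observed = [value for value in row if value is not None]
    if not observed:
        return list(row)
    low, high = min(observed), max(observed)
    span = high - low
    return [None if v is None else ((v - low) / span if span else 0.0) for v in row]


def content_based_prediction(norm_row: List[Optional[float]],
                             sims: List[List[float]], i: int) -> Optional[float]:
    """Similarity-weighted mean of the user's normalized values over invoked services."""
    numerator = 0.0
    denominator = 0.0
    for j, value in enumerate(norm_row):
        if value is None or j == i:
            continue
        numerator += sims[i][j] * value
        denominator += abs(sims[i][j])
    return numerator / denominator if denominator else None


# Topic similarities between services s1..s5, taken from Table 1.
sims = [
    [1.0000, 0.3476, 0.2602, 0.1239, 0.2983],
    [0.3476, 1.0000, 0.3314, 0.1570, 0.3801],
    [0.2602, 0.3314, 1.0000, 0.1931, 0.3581],
    [0.1239, 0.1570, 0.1931, 1.0000, 0.1673],
    [0.2983, 0.3801, 0.3581, 0.1673, 1.0000],
]
user_row = [0.2, None, 1.0, 0.6, None]   # hypothetical observations for one user
norm_row = normalise_row(user_row)
prediction_s2 = content_based_prediction(norm_row, sims, i=1)
```

The prediction is on the normalized scale of equation (10), so it has to be mapped back to the user's QoS range before it is compared with raw QoS values.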

Hybrid recommendation for SWS

The hybrid recommendation method for SWS integrates the correlation of services with the content of services to predict the QoS values, balancing the trade-off between correlation and content. By predicting the QoS values that represent the quality attributes of the services, this model recommends the services with better QoS values to users who are more concerned with particular aspects of QoS properties. Eventually, the hybrid recommender system returns the Top K candidate list of services to the users, depending on how well the QoS attributes of the services fit and satisfy the user requirements. Figure 2 illustrates the process of the hybrid recommendation method.

Figure 2.

The process for the hybrid recommendation method.

The hybrid recommendation method can not only accurately associate services using the correlation-based recommendation method but also mine the potential similarity among the content of services. Moreover, it avoids the cold-start problem of items and users. When new items are added to the system, the hybrid recommendation model can quickly recommend these new items to the users who have similar topic preferences, according to the similarity of the content of services. For new users entering the system, popularity-based recommendation methods can also be applied to provide a list of diversified services with better QoS. The model of the hybrid recommendation method is expressed in equation (12)

r_{u_{k, j}} = a \cdot r_{\mathrm{correlation}_{u_{k, j}}} + (1 - a) \cdot r_{\mathrm{content}_{u_{k, j}}}

(12)
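Equation (12) is a weighted linear combination, which the following sketch makes explicit. The default weight a = 0.5 and the fallback to a single prediction when the other one is unavailable (for example, for a new service without co-invocation records) are assumptions for illustration; both predictions are assumed to be on the same scale.

```python
from typing import Optional


def hybrid_prediction(r_correlation: Optional[float],
                      r_content: Optional[float],
                      a: float = 0.5) -> Optional[float]:
    """Equation (12): a * correlation-based value + (1 - a) * content-based value."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("the weight a must lie in [0, 1]")
    # Assumed fallback: use whichever prediction exists when the other is missing.
    if r_correlation is None:
        return r_content
    if r_content is None:
        return r_correlation
    return a * r_correlation + (1 - a) * r_content


combined = hybrid_prediction(2.0, 1.0, a=0.25)
cold_start_item = hybrid_prediction(None, 1.0, a=0.25)
```

In practice the weight a would be tuned on held-out data, since it controls the trade-off between the correlation-based and content-based predictions.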

Experiments

In this section, experiments based on a real-world QoS dataset are conducted to evaluate the efficiency and accuracy of the proposed approach. We compare our approach with seven other well-known approaches, which include the following:

UPCC:⁴ a user-based collaborative filtering method that uses the Pearson correlation coefficient to measure the similarity between users.

IPCC:⁴ an item-based collaborative filtering method that uses the Pearson correlation coefficient to measure the similarity between items.

WSRec:⁴ a QoS-aware hybrid Web Service recommendation approach that combines UPCC and IPCC with confidence weights.

Hybrid:³⁴ a hybrid Web Service recommendation approach that uses the three-layer service-topics-users model.

The rest of this section describes the dataset, experimental steps and setups, metrics, the performance comparison and result analysis.

Dataset

In the experiments, we used the real-world QoS dataset collected by Zheng et al.,^4,25 comprising more than 1,700,000 QoS values from 339 users and 5825 Web Services distributed in 30 different countries. The Web Services Description Language (WSDL) files of the Web Services are first crawled, and a rating matrix of 339 users and 908 Web Services containing over 300,000 QoS values is then built. Every entry $r_{i, j}$ in the matrix represents the rating of user i for Web Service j.

Steps and settings

We run the experiments with the following steps. First, the missing QoS values are predicted using the number of Web Services co-invoked by different consumers. Then, we extract the stem words from the WSDL files crawled from the uniform resource identifiers (URIs), and stop words that do not help to extract semantics are removed from the word lists. After that, Gibbs sampling³⁶ is applied to calculate the topic probability distribution of the Web Services. Cosine similarity is then used to compute the topic similarity between services, from which a second set of predicted QoS values is obtained. Finally, we combine the QoS values predicted by the correlation-based and the semantic content-based recommendation.

To evaluate the prediction performance, we randomly divide the entries of the 339 × 908 rating matrix into two parts: a randomly selected subset of entries forms the training dataset, and the remaining entries constitute the test dataset. Because users usually invoke only a small number of services, we vary the density of the training matrix from 0.01 to 0.05 in order to mimic a real-world dataset distribution. The other parameter settings in the performance comparison are c = 0.85, k = 30, α = 50/T and β = 0.1, where c represents the sparsity of the training data, k is the Top K number of correlation degrees of a service, and α and β are the parameters of the Dirichlet distributions. Higher α results in centralized topics of documents, and higher β results in centralized words of a topic. Furthermore, we study the impact of Top K on the accuracy of the hybrid model by varying the value of Top K from 5 to 60.
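The random split can be sketched as below. The sketch assumes that the density is the fraction of all matrix cells kept for training and that the held-out observed entries form the test set; the function name, seed and toy matrix are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Matrix = List[List[Optional[float]]]


def split_by_density(qos: Matrix, density: float,
                     seed: int = 0) -> Tuple[Matrix, List[Tuple[int, int, float]]]:
    """Keep a random `density` fraction of the matrix cells for training; hold out the rest."""
    rng = random.Random(seed)
    observed = [
        (k, i) for k, row in enumerate(qos) for i, value in enumerate(row) if value is not None
    ]
    total_cells = len(qos) * len(qos[0])
    n_train = min(len(observed), round(density * total_cells))
    train_cells = set(rng.sample(observed, n_train))
    # Training matrix: held-out entries are masked with None.
    train = [
        [value if (k, i) in train_cells else None for i, value in enumerate(row)]
        for k, row in enumerate(qos)
    ]
    # Test set: (user, service, true QoS value) for every held-out observed entry.
    test = [(k, i, qos[k][i]) for (k, i) in observed if (k, i) not in train_cells]
    return train, test


# Toy 10 x 20 matrix in which every entry is observed.
full = [[float(k + i) for i in range(20)] for k in range(10)]
train, test = split_by_density(full, density=0.05)
```

Fixing the seed makes a split reproducible, so that different methods can be compared on exactly the same training and test entries.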

Metrics

The precision of the recommendation for Web Services is measured by mean absolute error (MAE) and normalized mean absolute error (NMAE) metrics.^3,4 Lower values of MAE and NMAE represent higher prediction accuracy of corresponding approaches. The definitions of the metrics are presented in equations (13) and (14)

MAE = \frac{\sum_{u_{k}, i} | \hat{r}_{u_{k}, i} - r_{u_{k}, i} |}{L}

(13)

NMAE = \frac{MAE}{\sum_{u_{k}, i} r_{u_{k}, i} / L}

(14)

where $\hat{r}_{u_{k}, i}$ is the predicted QoS value of service i invoked by service user $u_{k}$, and $r_{u_{k}, i}$ is the real QoS value of service i invoked by service user $u_{k}$. In addition, L is the number of predicted values.
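The two metrics in equations (13) and (14) can be computed as in the following sketch; the function names and the sample values are hypothetical.

```python
from typing import Sequence


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Equation (13): mean absolute difference between predicted and real QoS values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)


def nmae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Equation (14): MAE divided by the mean of the real QoS values."""
    return mae(predicted, actual) / (sum(actual) / len(actual))


predicted = [1.0, 2.0, 4.0]
actual = [1.5, 2.0, 3.0]
mae_value = mae(predicted, actual)     # (0.5 + 0.0 + 1.0) / 3
nmae_value = nmae(predicted, actual)   # MAE divided by the mean real value 6.5 / 3
```

Because NMAE divides by the mean real value, it allows errors to be compared across QoS attributes with different value ranges.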

Performance comparison

Tables 2 and 3 present the MAE and NMAE values, respectively, of the evaluated recommendation approaches based on response time. As shown in Tables 2 and 3, our approach achieves lower MAE and NMAE values, and hence higher prediction accuracy, than the other methods. This is because our approach effectively balances the trade-off between the correlation of users and the semantic similarities of the Web Services.

Table 2.

MAE values for performance comparison of different methods.

Methods	Density
Methods	0.01	0.02	0.03	0.04	0.05
UPCC	1.1061	1.0231	0.9930	0.9602	0.9299
IPCC	1.0896	0.9907	0.9625	0.9364	0.9180
CTRe	0.7754	0.7399	0.6977	0.6761	0.6590
Hybrid	0.6806	0.6602	0.6522	0.6422	0.6337
NHR	0.6620	0.6472	0.6313	0.6168	0.6021

MAE: mean absolute error

Table 3.

NMAE values for performance comparison of different methods.

Methods	Density
Methods	0.01	0.02	0.03	0.04	0.05
UPCC	1.3808	1.2755	1.2300	1.1904	1.1492
IPCC	1.3369	1.2161	1.1817	1.1502	1.1267
CTRe	0.9503	0.9084	0.8561	0.8299	0.8086
Hybrid	0.8356	0.8105	0.8004	0.794	0.7733
NHR	0.8124	0.7945	0.7747	0.7571	0.7388

NMAE: normalized mean absolute error.

For all the methods, the prediction accuracy improves as the matrix density increases from 0.01 to 0.05. This observation shows that increasing the density of the training matrix has a positive influence on prediction accuracy.

Effects of matrix density

In reality, the rating matrix is usually highly sparse³⁷ because users invoke only a small number of services. Hence, investigating the effect of the matrix density is necessary. As shown in Figures 3 and 4, all the evaluated approaches exhibit better prediction accuracy in terms of MAE and NMAE, respectively, when the density increases from 0.01 to 0.05. This observation shows that denser data yields higher prediction accuracy.

Figure 3.

Effects of matrix density for MAE.

Figure 4.

Effects of matrix density for NMAE.

Effects of Top K neighbours

The number of Top K neighbours is an important factor in the achievable accuracy. The models are therefore simulated with different Top K values to study the corresponding impact on accuracy. As shown in Figures 5 and 6, the accuracy of the predicted QoS values increases by around 17% when the Top K value is increased from 10 to 60. Figures 5 and 6 also show that the recommendation system has more difficulty predicting the QoS values when the Top K value is small. In the case of highly sparse data, increasing the Top K value does not improve the prediction accuracy; on the contrary, it slows down the computation. When the density is greater than or equal to 0.02, increasing the Top K value gradually decreases the prediction error, and the prediction accuracy saturates at a certain K value. This is because, when the Top K value is small, similar services might be missed. However, a very large Top K value may increase the prediction error because some less similar or even unrelated services are included. Thus, selecting the most appropriate Top K value is important to improve the accuracy of the prediction.

Figure 5.

Effects of Top K neighbours for MAE based on different matrix densities.

Figure 6.

Effects of Top K neighbours for NMAE based on different matrix densities.

Conclusion and future work

This article proposed a hybrid approach for recommendation systems that incorporates the complementary advantages of correlation-based and content-based recommendation models in order to improve the prediction accuracy of missing QoS values in a sparse matrix. Correlation-based recommendation for Web Services calculates the correlation degree between services based on the number of Web Services co-invoked by different users. Meanwhile, the LDA topic model calculates the content similarity of SWS. The proposed approach considers not only the user’s correlation but also the content of services, which effectively mitigates the data sparsity found in real-world datasets and improves the accuracy of prediction.

Experiments conducted on real-world datasets to evaluate the performance of the proposed hybrid recommendation model demonstrate that the proposed approach accurately predicts the QoS values even when the dataset is highly sparse.

As future work, we intend to conduct more experiments on large-scale datasets and plan to include other attributes of services, such as location and networks, to study the performance and scalability of the proposed approach at a larger scale.

Footnotes

Handling Editor: Wenbing Zhao

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the National Natural Science Funds of China under grants nos 61502209 and 61502207, the Natural Science Foundation of Jiangsu Province under grant BK20170069 and UK–China Knowledge Economy Education Partnership.

References

1. Vasileiadou, Ullrich, Tamm. Cloud computing definitions and approaches – levels of abstraction: IAAS, PAAS, SAAS. Berlin: Asperado GmbH & SRH Hochschule, 2011.

2. Xie, Zheng, Liu, et al. Correlation-based Top-k recommendation for Web Services. In: Proceedings of the IEEE international conference on computer and information technology; ubiquitous computing and communications; dependable, autonomic and secure computing; pervasive intelligence and computing, Liverpool, 26–28 October 2015, pp.1903–1909. New York: IEEE.

3. Chen, Zheng, Liu, et al. Personalized QoS-aware web service recommendation and visualization. IEEE T Serv Comput 2013; 6: 35–47.

4. Zheng, Lyu, et al. QoS-aware web service recommendation by collaborative filtering. IEEE T Serv Comput 2011; 4(2): 140–152.

5. Zheng, Zhang. QoS ranking prediction for cloud services. IEEE T Parall Distr 2013; 24(6): 1213–1222.

6. Chen, et al. A ranking-oriented hybrid approach to QoS-aware web service recommendation. In: Proceedings of the IEEE international conference on services computing, New York, 27 June–2 July 2015, pp.578–585. New York: IEEE.

7. Guo, Liu, et al. Interest-aware content discovery in peer-to-peer social networks. ACM T Internet Techn 2018; in press.

8. Liu, Antonopoulos, Zheng, et al. A socioecological model for advanced service discovery in machine-to-machine communication networks. ACM T Embed Comput S 2016; 15(2): 38.

9. Asad, Erol-Kantarci, Mouftah. A survey of Sensor Web Services for the smart grid. J Sens Actuator Netw 2013; 2(1): 98–108.

10. Wang, Yao, et al. A ranking method for sensor services based on estimation of service access cost. Inform Sciences 2015; 319: 1–17.

11. Yan, Liu, et al. An adaptive multilevel indexing method for disaster service discovery. IEEE T Comput 2015; 64(9): 2447–2459.

12. Huang, Yin, Min, et al. Energy-aware dual-path geographic routing to bypass routing holes in wireless sensor networks. IEEE T Mobile Comput. Epub ahead of print 9 November 2017. DOI: 10.1109/TMC.2017.2771424.

13. Liu, Antonopoulos, et al. Adaptive service discovery on service-oriented and spontaneous sensor systems. Ad Hoc Sens Wirel Ne 2012; 14(1–2): 107–132.

14. Ponmagal. An efficient and extensible service based sensor access architecture. Int J Power Control Signal Comput 2012; 4(2): 117–128.

15. Gibbons, Karp, et al. IrisNet: an architecture for a worldwide sensor web. IEEE Pervas Comput 2003; 2: 22–33.

16. Guinard, Trifa, Karnouskos, et al. Interacting with the SOA-based internet of things: discovery, query, selection, and on-demand provisioning of Web Services. IEEE T Serv Comput 2010; 3: 223–235.

17. Paganelli, Parlanti. A DHT-based discovery service for the internet of things. J Comput Network Comm 2012; 2012: 107041.

18. Wang, Cassar, et al. An experimental study on geospatial indexing for sensor service discovery. Expert Syst Appl 2015; 42: 3528–3538.

19. Miao, Liu, et al. An efficient indexing model for the fog layer of industrial internet of things. IEEE T Ind Inform. Epub ahead of print 30 January 2018. DOI: 10.1109/TII.2018.2799598.

20. Huang, Wang, Cui, et al. Web service QoS prediction based on auxiliary features. Comput Syst Appl 2016; 25: 154–161.

21. Min, Yang. Performance analysis of hybrid wireless networks under bursty and correlated traffic. IEEE T Veh Technol 2013; 62(1): 449–454.

22. Min, Al-Dubai. A new analytical model for multi-hop cognitive radio networks. IEEE T Wirel Commun 2012; 11(5): 1643–1648.

23. Kodali, Malothu. MIXIM framework simulation of WSN with QoS. In: Proceedings of the international conference on advanced communication control and computing technologies, Ramanathapuram, India, 25–27 May 2016, pp.128–131. New York: IEEE.

24. Blei, Jordan. Latent Dirichlet allocation. J Mach Learn Res 2003; 3: 993–1022.

25. Zheng, Lyu, et al. WSRec: a collaborative filtering based web service recommender system. In: Proceedings of the IEEE international conference on web services, Los Angeles, CA, 6–10 July 2009, pp.437–444. New York: IEEE.

26. Paterek. Improving regularized singular value decomposition for collaborative filtering. In: Proceedings of the KDD cup & workshop, San Jose, CA, 12 August 2007. New York: ACM.

27. Cohen, Schapire, Singer. Learning to order things. In: Proceedings of the NIPS, Denver, CO, 1–6 December 1997. MIT Press.

28. Zhang, Zheng, Lyu. Exploring latent features for memory-based QoS prediction in cloud computing. In: Proceedings of the 2011 30th IEEE symposium on reliable distributed systems (SRDS), Madrid, 4–7 October 2011, pp.1–10. New York: IEEE.

29. Xue, Lin, Yang, et al. Scalable collaborative filtering using cluster-based smoothing. In: Proceedings of the international ACM SIGIR conference on research & development in information retrieval, Salvador, Brazil, 15–19 August 2005, pp.114–121. New York: ACM.

30. Jennings, Higuchi. A user model neural network for a personal news service. User Model User-Adap 1993; 3(1): 1–25.

31. Hofmann. Latent semantic models for collaborative filtering. ACM T Inform Syst 2004; 22(1): 89–115.

32. Hofmann. Collaborative filtering via Gaussian probabilistic latent semantic analysis. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, Toronto, ON, Canada, 28 July–1 August 2003, vol. 13, pp.259–266. New York: ACM.

33. Hofmann. Probabilistic latent semantic analysis. In: Proceedings of the 15th conference on uncertainty in artificial intelligence, Stockholm, 30 July–1 August 1999, pp.289–296. ACM.

34. Ahmed, Yuan, Liu, et al. Personalised service recommendation based on correlation and semantic content features. Future Gener Comp Sy 2016; 20(3): 283–293.

35. Brin, Page. The anatomy of a large-scale hypertextual Web search engine. Comput Networks ISDN 1998; 30: 107–117.

36. Linden, Smith, York. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 2003; 7: 76–80.

37. Liu, et al. Personalized QoS prediction for Web Services using latent factor models. In: Proceedings of the IEEE international conference on services computing, Anchorage, AK, 27 June–2 July 2014, pp.107–114. New York: IEEE.