A novel QoS-aware mechanism for provisioning of virtual machine resource in cloud

Abstract

Efficient low-level resource provisioning and QoS guarantees are key challenges for cloud computing. Solving these problems can reduce under- or over-utilization of resources, increase user satisfaction by serving more users during peak hours, and cut implementation cost for providers and service cost for users. Existing research focuses on accurate estimation of capacity needs and on static or dynamic virtual machine (VM) creation and scheduling. However, creating and destroying virtual machines takes a significant amount of time that could otherwise be used to serve more requests. In this paper, a novel adaptive QoS-aware virtual machine provisioning mechanism is developed that ensures efficient utilization of system resources. VMs are recycled for requests of a similar type, so that VM creation time is minimized and the time saved is used to serve more user requests. In the proposed model, QoS is linked to low-level infrastructure resources and is ensured by serving all tasks within the requirements described in the SLA. Tasks are separated using queues, and different tasks are assigned to different queues according to their priority. CloudSim-based simulation results show that a greater number of tasks can be served than with existing approaches, which helps to satisfy customers during peak hours.

Keywords

Virtual machine, QoS, provisioning, CloudSim

Introduction

Owing to its elasticity, guaranteed quality of service, and on-demand resource provisioning model, cloud computing^1–3 has received increasing attention and has been adopted in both academia and industry. Cloud providers offer users different kinds of services, including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Users can access cloud services over the Internet using a variety of devices, including PCs, smartphones, and laptops. Cloud computing can provide pay-as-you-go resources at a relatively low cost and with reduced computational expense. The services rendered by cloud platforms are reliable, maintainable, and scalable. However, a number of challenges remain, including an efficient model of virtual machine provisioning that ensures the quality of service (QoS) of cloud services. QoS usually includes a number of parameters and properties to be fulfilled, encompassing subjective ones (packet loss, transmission rate, delay variance, cost and reputation, etc.) as well as objective ones (data security, trust, privacy concern and user experience as well as degree of satisfaction, etc.). To enhance user satisfaction and to justify the investment in cloud-based deployments, meeting the target QoS is necessary.

Some existing works on QoS^4–9 have tried to provide assurance in meeting the service level agreement (SLA). Other works, including Calheiros et al.,¹⁰ tried to control VM provisioning in a proactive or reactive manner. Efficient resource management through VM multiplexing has been examined in Meng et al.¹¹ However, fulfilling the SLA remains a great challenge because of the uncertain and dynamic characteristics of network and IT resources in the cloud environment.

In this paper, a QoS-aware adaptive VM recycling and provisioning approach is presented that provides automated, flexible, and efficient management of low-level cloud resources. The model ensures that the target QoS is met by controlling the admission of requests so that the system does not become overloaded. In the model, QoS is linked to low-level infrastructure resources and is ensured by serving all tasks within the requirements described in the SLA. Multiple input queues are designed for requests with similar resource requirement metrics, and a VM created for serving a request can be recycled or reused by other jobs of the same queue. Therefore, the time spent creating and destroying VMs can be reduced. The most urgent request is selected using a priority metric, and a new VM is created depending on the priority of the requests and the availability of resources.

The rest of the paper is organized as follows. The section "Related works" describes some of the works related to our topics of interest. The model is proposed in the section "The proposed model," which includes the QoS model, the low-level resource model, the mapping mechanism between QoS and low-level resource metrics, the architecture, and so on. In the section "VM and QoS provisioning algorithm," the proposed QoS-aware VM provisioning algorithm is described. The section "Simulation and evaluation" presents the results of the simulation and performance evaluation. Finally, the conclusion and directions for future research are provided.

Related works

Efficient resource sharing and scaling, which are the key advantages of cloud platforms, are mainly obtained by using virtualization, which is mainly employed for fault isolation and improved manageability.¹² In Beloglazov and Buyya,¹³ the authors proposed an energy management system for virtualized data centers, where resource sharing is categorized into local and global policies. Virtual machine provisioning based on analytical results has been proposed by Calheiros et al.¹⁰ Automation, adaptation, and performance assurance were the key factors in that paper. The authors proposed a dynamic scenario for virtual machines using the SaaS, PaaS, and IaaS layers of the cloud. However, they did not differentiate the type and requirements of each request, and they destroyed each virtual machine after it finished its task. They chose to create a new VM for every new task, although the new task might have the same resource requirements as the previous one. This approach is time consuming and is not suitable in every situation. In our approach, we try to solve this problem by reusing old VMs rather than creating new ones.

In Zhang and Yan,¹⁴ the researchers proposed a framework for an adaptive QoS management process and a QoS framework for mobile cloud computing, and they modeled a QoS management system based on a fuzzy cognitive map (FCM). In that paper, how many requests are accepted by the system, how a request is handled, what the system does if it gets congested, and so on were not clearly defined. In our approach, we describe these scenarios clearly.

Based on queuing networks, Bi et al.¹⁵ proposed an architecture for provisioning multitier applications in cloud data centers. However, such a model does not recalculate the number of required VMs based on the expected load and monitored performance, as our approach does. A reactive algorithm for dynamic VM provisioning of PaaS and SaaS applications is proposed by Chieu et al.,¹⁶ whereas our approach works in a proactive manner.

A queuing infrastructure for SaaS mashup applications is proposed by Lee et al.,¹⁷ aiming to optimize the benefit and reduce the costs of the SaaS provider by finding an optimal number of instances for the application. Claudia, developed by Rodero-Merino et al.,¹⁸ lets the user define cloud provisioning based on performance indicators and elasticity rules. In that model, the authors also used a reactive approach, whereas we apply a proactive model for obtaining QoS.

A method that works at the data center host level to manage the power consumption of resources and the performance of applications in the cloud has been proposed in Jung et al.¹⁹ However, this method requires access to the physical infrastructure, which typical IaaS providers do not provide to consumers. Our approach can be applied in both cases, whether the services are provided by the same provider or the IaaS and PaaS/SaaS providers are different organizations. An architecture of an energy management system for virtualized data centers, where resource management is divided into local and global policies, has been proposed in Nathuji and Schwan.²⁰ Although the local policies are described there in such a way that the system leverages the power management strategies of the guest operating systems, the global policies are not discussed in detail with respect to QoS requirements. We focus on VM allocation policies over the cloud, considering a strict service level agreement (SLA).

The proposed model

QoS Model and resource model

QoS model

The response time, availability, pay-per-use, cost and reputation are critical elements in cloud computing; therefore, the QoS model could be defined as follows.²¹

Response time: The guaranteed average time required to complete a cloud service request, denoted q_res. It can be measured as: $q_{res} = Avg (T_{r 1}, T_{r 2}, T_{r 3}, \dots, T_{rn})$ , where Avg() is the average function and $T_{r 1}$ is the response time of cloud service r1; the remaining terms are defined in the same manner.

Process time: It is a measure of the time that a cloud service takes between the moment it gets a request and the moment it sends back the corresponding response, denoted q_pro. It can be measured as: $q_{pro} = Avg (T_{p 1}, T_{p 2}, T_{p 3}, \dots, T_{pn})$ , where Avg() is the average function and $T_{p 1}$ is the processing time of cloud service p1; the remaining terms are defined in the same manner.

Availability: It represents the probability that cloud services can be accessed, denoted q_ava. It can be measured as: $q_{ava} = N_{succ} / N$ , in which $N_{succ}$ is the number of successful accesses and N is the number of times that cloud consumers request the cloud service during a certain period of time.

Reputation: This property is obtained from the feedback of service users, denoted q_rep. It is measured by: $q_{rep} = \sum_{i = 1}^{N} Rep_{i} / N$ , in which $Rep_{i}$ is the i-th reputation value given to the cloud service, its range of values is {1, 2, 3, 4, 5}, and N is the number of times the cloud service has been graded.

Cost: It is a measure of the cost involved in requesting the cloud services, denoted $q_{cost}$ .
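The averaged and ratio-based QoS parameters above can be illustrated with a minimal Python sketch. The function names and the sample values are assumptions introduced only for illustration; they are not taken from the paper's implementation.

```python
from statistics import mean


def q_res(response_times):
    """Average response time over the observed service requests."""
    return mean(response_times)


def q_pro(process_times):
    """Average process time over the observed service requests."""
    return mean(process_times)


def q_ava(n_succ, n_total):
    """Availability: successful accesses divided by requested accesses."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_succ / n_total


def q_rep(grades):
    """Reputation: mean of user grades, each taken from {1, 2, 3, 4, 5}."""
    if not grades or any(g not in (1, 2, 3, 4, 5) for g in grades):
        raise ValueError("grades must be a non-empty list of values 1..5")
    return sum(grades) / len(grades)


# Hypothetical measurements, for illustration only.
print(q_res([120.0, 80.0, 100.0]))
print(q_ava(95, 100))
print(q_rep([5, 4, 4, 3]))
```

Each function mirrors one definition, so a monitoring component could recompute the parameters whenever new measurements arrive.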

The types of physical resources are computing resources, network resources, storage resources, and so on. We can define the resource model as a 4-tuple {CR, NR, SR, OR}, where:

(1) Computing resource (CR): It refers to the memory size of VMs and the number of computing units.

(2) Network resource (NR): It refers to bandwidth for communicating with the service provided.

(3) Storage resource (SR): It refers to the hard drive size of virtual machine storage.

(4) Other resource (OR): Resource type other than the resources above.

For example, a consumer who wants to request an IaaS service may apply for a VM with a 2-core CPU, 2 GB of RAM, and 2 TB of hard-disk space, which represents a low-level resource set. Moreover, each resource type can be refined further; for example, NR may include bandwidth, packet size, and so on.
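The resource model can be sketched as a small Python data structure. This is only an illustration: the class name `ResourceSet`, the field names, the bandwidth value, and the host capacity are assumptions that do not appear in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSet:
    """Low-level resource set {CR, NR, SR, OR} for one VM request."""
    cpu_cores: int          # CR: number of computing units
    memory_gb: float        # CR: memory size
    bandwidth_mbps: float   # NR: bandwidth (values below are hypothetical)
    disk_gb: float          # SR: hard drive size
    other: tuple = ()       # OR: any other resource, left open here

    def fits_in(self, capacity: "ResourceSet") -> bool:
        """True when every numeric requirement is within the capacity."""
        return (self.cpu_cores <= capacity.cpu_cores
                and self.memory_gb <= capacity.memory_gb
                and self.bandwidth_mbps <= capacity.bandwidth_mbps
                and self.disk_gb <= capacity.disk_gb)


# The IaaS request from the example: 2-core CPU, 2 GB RAM, 2 TB disk.
request = ResourceSet(cpu_cores=2, memory_gb=2, bandwidth_mbps=100, disk_gb=2048)
# A hypothetical host capacity, for illustration only.
host = ResourceSet(cpu_cores=4, memory_gb=8, bandwidth_mbps=1000, disk_gb=4096)
print(request.fits_in(host))
```

Because the dataclass is frozen, two requests with identical requirements compare equal and hash identically, which makes the resource set usable as a key for grouping requests of the same type.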

Mapping of QoS and low-level resource metrics

Mapping low-level resources to service-level QoS has an impact on the cooperation between cloud clients and providers. For different types of services, we need to define a series of rules for mapping low-level resource metrics to QoS parameters. Cloud consumers usually pay more attention to service types, response time, availability, processing time, and so on. A list of mapping metrics is given in Table 1. In this table, the service types may be computing services or storage services. The downtime represents the mean time to repair (MTTR), which denotes the time it takes to bring a system back online after a failure, and the uptime represents the mean time between failures (MTBF), which denotes the time the system was operational between one failure and the next. The request.timein and request.timeout denote the arrival time of a service request and the sending time of the service request. So, the PT can be represented as: PT = request.timein − request.timeout. In the third row, Rin is the response time for a service request and is calculated as: (packetsize)/(avail.bandwidthin − inbytes) in milliseconds. Rout is the response time for a service response and is calculated as: (packetsize)/(avail.bandwidthout − outbytes) in milliseconds. The packet.timein and packet.timeout are similar to request.timein and request.timeout, and they are used to calculate the network delay. When considering the cost of a storage service, disk.size is a critical index.

Table 1.

The mapping of low-level resource and QoS parameters.

Service types	Resource metrics	QoS parameters	Mapping rules
Computing service	downtime, uptime	Availability	A = 1 − downtime/uptime
	request.timein, request.timeout	Process time	PT = request.timein − request.timeout
	inbytes, outbytes, packetsize, avail.bandwidthin, avail.bandwidthout	Response time	R_total = R_in + R_out (ms)
Storage service	downtime, uptime	Availability	A = 1 − downtime/uptime
	disk.size	Cost	Cost = k × disk.size
	/	Reputation	/
	packet.timein, packet.timeout	Network delay	$\frac{\sum_{i = 1}^{N} (packet.timein - packet.timeout)}{N}$

The mapping rules of QoS and resource metrics can be extended and stored in an XML or OWL document.
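As a rough illustration, the mapping rules of Table 1 can be written as plain Python functions. The function and parameter names are assumptions chosen to mirror the metric names in the table, and the sample values are hypothetical.

```python
def availability(downtime, uptime):
    """A = 1 - downtime/uptime."""
    return 1 - downtime / uptime


def process_time(request_timein, request_timeout):
    """PT = request.timein - request.timeout."""
    return request_timein - request_timeout


def response_time(packetsize, avail_bw_in, inbytes, avail_bw_out, outbytes):
    """R_total = R_in + R_out, each being packetsize over remaining bandwidth."""
    r_in = packetsize / (avail_bw_in - inbytes)
    r_out = packetsize / (avail_bw_out - outbytes)
    return r_in + r_out


def storage_cost(k, disk_size):
    """Cost = k * disk.size, where k is a provider-defined price factor."""
    return k * disk_size


def network_delay(packets):
    """Mean of (packet.timein - packet.timeout) over (timein, timeout) pairs."""
    return sum(t_in - t_out for t_in, t_out in packets) / len(packets)


# Hypothetical metric values, for illustration only.
print(availability(downtime=2, uptime=8))
print(response_time(1000, 600, 100, 300, 50))
print(network_delay([(5, 3), (9, 5)]))
```

Keeping each rule as a separate function makes it straightforward to load the rule set from an external description and to extend it with new service types.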

The architecture proposed

Figure 1 shows the architecture of the cloud load balancing mechanism under the premise of ensuring QoS. The architecture encompasses four layers: the low-level physical layer, the virtualization layer, the provisioning layer, and the cloud client interface. The physical layer includes various hardware equipment, such as servers and databases, which can also be abstracted into low-level infrastructure resource pools, such as a computing resource pool, a storage resource pool, a network resource pool, and so on.

Figure 1.

The architecture of the proposed model.

Above the physical layer, the virtualization layer is composed of the cloud management module, the VM cluster, which consists of a number of virtual machines, and the virtual machine manager (VMM). A monitoring agent is deployed on every virtual machine to acquire the load status of the VM periodically. More specifically, the cloud management module has the following functional components: (1) VM image library, (2) VM image management, (3) VM management module, (4) VM deployer, and (5) data collection and analysis unit. As the names suggest, the VM image library and VM image management are used for storing and managing VM images; the VM management module and VM deployer are responsible for creating the VM instances and managing their whole lifecycle. The data collection and analysis unit collects and analyzes the information submitted by the monitoring agents in order to estimate the load status of the VMs and the resource utilization ratio of the cluster, and it communicates with the task scheduler.

As the core of the architecture, the provisioning layer consists of the QoS-aware agent, task queues, scheduling policy repository, mapped metrics, and task scheduler. The QoS-aware agent continuously monitors the tasks based on the agreed QoS. After quantification and classification, the QoS-aware agent pushes the tasks into the corresponding task queues according to the types of tasks, their degree of urgency, or the definition of the quality properties. Upon receiving the measured metrics from the cloud management module, the task scheduler uses the mapped values and predefined scheduling policies to work out an optimal action. It then schedules the tasks to a suitable VM for execution with the help of the cloud management module.

The top of the architecture is cloud client interface, and cloud consumers can submit tasks through it.

The business flows between components can be described as follows. First, the users submit service requests through the user interface. The QoS-aware agent acquires and quantifies the submitted QoS requirements. According to the quantitative indicators, the QoS-aware agent classifies the requests and pushes them into the different task queues. At the same time, in the virtualization layer, the data collection and analysis unit collects load information through the monitoring agents. This load information includes various resource metrics as well as task execution information. The data collection and analysis unit then computes and analyzes the resource metrics in order to estimate the load status of the VMs and the resource utilization ratio of the cluster. Periodically, it also sends this information to the task scheduler. The task scheduler uses the task scheduling algorithm to forward tasks to the selected VM in the cluster. According to the resource utilization rate of the cluster, which is calculated by the elastic scaling algorithm, the task scheduler sends commands to the cloud management module to adjust the cluster resources.
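The monitoring part of this flow can be sketched in a few lines of Python. The paper does not specify the elastic scaling algorithm, so everything here is an assumption made for illustration: the `AgentReport` fields, the use of CPU cores as the only metric, the two thresholds, and the command names.

```python
from dataclasses import dataclass


@dataclass
class AgentReport:
    """One periodic report from a monitoring agent (hypothetical fields)."""
    vm_id: str
    cpu_used: float    # cores currently in use on the VM
    cpu_total: float   # cores allocated to the VM


def cluster_utilization(reports):
    """Resource utilization ratio of the cluster, from all agent reports."""
    total = sum(r.cpu_total for r in reports)
    if total == 0:
        return 0.0
    return sum(r.cpu_used for r in reports) / total


def scaling_command(utilization, low=0.3, high=0.8):
    """Command sent to the cloud management module (thresholds are assumed)."""
    if utilization > high:
        return "scale_out"
    if utilization < low:
        return "scale_in"
    return "hold"


reports = [AgentReport("vm-1", 1.8, 2.0), AgentReport("vm-2", 1.6, 2.0)]
ratio = cluster_utilization(reports)
print(ratio, scaling_command(ratio))
```

A threshold rule is used only because it is the simplest stand-in; any policy stored in the scheduling policy repository could replace `scaling_command`.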

VM and QoS provisioning algorithm

QoS is maintained by admitting requests into the system in a controlled way and by judicious provisioning of VMs. Before a job is allowed to be served, the sum of the negotiated times of all jobs in service and in the queues is calculated and denoted by T_negotiated. Then, the total time required to complete all jobs in service and in the queues is estimated from the monitored mean execution time of a job, T_mean. This total required time is denoted by T_total and is added to a reserved time that copes with the uncertain behavior of the network elements and the dynamic workload of the requests. Whenever a new job arrives, its actual working time T_{new_act} is predicted. If the negotiated time of the new job, T_{new_neg}, combined with T_negotiated is greater than the sum of T_total, T_{new_act}, and the threshold time, and the total affordable service time T_service is greater than the sum of T_negotiated and T_{new_act}, then the new job is allowed to enter the queues to get service. The request then enters the queue that corresponds to its resource requirements. VMs are created for each queue, and jobs with higher priority are executed earlier. Whenever the task on a VM completes and other jobs are waiting in the same queue, the VM is allocated to the highest-priority job in that queue. In this way, the time spent creating and destroying VMs is limited and the performance of the system improves. The requests are served according to their arrival time and their time urgency. All the queues are priority queues, and the priority factor is given by

$\text{Priority\_factor} = \text{arrival\_time} + \text{negotiated\_time\_limit} \quad (1)$

The VM server provides a new virtual machine for the queue that has the maximum total priority factor over all of its jobs.
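The priority factor of equation (1) and the choice of the queue that receives a new VM can be illustrated with a short Python sketch. The data layout (a dictionary mapping a queue type to a list of `(arrival_time, negotiated_time_limit)` pairs) and the function names are assumptions made for this example.

```python
def priority_factor(arrival_time, negotiated_time_limit):
    """Equation (1): arrival time plus negotiated time limit."""
    return arrival_time + negotiated_time_limit


def queue_for_new_vm(queues):
    """Return the queue type with the maximum total priority factor.

    `queues` maps a queue type to a list of
    (arrival_time, negotiated_time_limit) pairs; empty queues are skipped.
    """
    waiting = {name: jobs for name, jobs in queues.items() if jobs}
    if not waiting:
        return None
    return max(
        waiting,
        key=lambda name: sum(priority_factor(a, t) for a, t in waiting[name]),
    )


# Hypothetical queues: two "small" jobs and one "large" job (times in minutes).
queues = {"small": [(0, 30), (1, 40)], "large": [(2, 50)], "idle": []}
print(queue_for_new_vm(queues))
```

With these sample values the totals are 71 for the "small" queue and 52 for the "large" queue, so the new VM goes to the "small" queue.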

Adaptive VM Recycling and Provisioning algorithm

Input:

(1) T_mean: monitored mean execution time;

(2) T_service: total affordable service time of the service provider;

(3) Reserved_Time: time reserved for secured QoS provisioning;

(4) n: number of tasks;

(5) j: number of queues;

(6) i: number of tasks in queue j

Output: QoS-aware VM provisioning result

T_negotiated ← ΣT_req;

T_total ← n * T_mean;

T_estimate ← T_{new_act} + T_total + Reserved_Time;

If (T_{max_limit} ≥ T_estimate && T_service ≥ T_{max_limit}) then

Push task into queue;

Calculate priority_factor;

If (resources available for creating VM) then

{Repeat for all queues:

Calculate priority_j ← Σ priority_factor_i;

Find the maximum priority_j for creating a new VM of type j;

}

Else

{Wait for a VM of the same type to complete its current task}

Simulation and evaluation

The simulation results obtained from the experimental implementation of the proposed QoS-aware VM recycling and provisioning algorithm are given in this section. The CloudSim²¹ discrete-event cloud simulation tool was used to perform the experiments. The simulated environment consisted of a data center containing 100 hosts, each having a quad-core processor and 8 GB of RAM. The arrival rate of task requests is set to 500 requests per second. The time for creating and de-allocating a VM is taken to be 2–3 min, whereas the time for serving a request is taken to be 30–50 min. The results compare the system performance of our proposed approach for VM and QoS provisioning (named IVQ) with that of VM provisioning based on analytical performance and QoS (named AVM). The results show that a significant amount of time can be saved by not creating new VMs, and this time can be used to serve more requests.

In Figure 2, the relation between the time for VM creation and the number of requests is shown. In the best case scenario, all the tasks are of a single type; therefore, all the created VMs can be recycled and the time for VM creation decreases. In the worst case scenario, all the tasks are of different types and a new VM is required for every request served. Hence, the performance of the proposed model will be similar to that of the existing models. The simulation results show that if all the requests are of the same type, our proposed model spends time only on creating the initial VMs. The created VMs are then recycled, and therefore no further time is spent on creating VMs. In AVM, the time for the best case scenario and the worst case scenario is the same.
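The difference between the two scenarios can be made concrete with a back-of-envelope Python calculation. This is a simplified model and not the CloudSim experiment: it assumes a fixed creation time of 2.5 min (the midpoint of the 2–3 min range) and a hypothetical pool of 100 VMs, and the function names are illustrative.

```python
def creation_time_worst_case(num_requests, t_create):
    """Every request needs a freshly created VM."""
    return num_requests * t_create


def creation_time_best_case(num_requests, num_vms, t_create):
    """VMs are created once and then recycled for all later requests."""
    return min(num_requests, num_vms) * t_create


T_CREATE_MIN = 2.5   # assumed midpoint of the 2-3 min creation time
NUM_VMS = 100        # hypothetical number of VMs kept for recycling

for n in (100, 500, 1000):
    worst = creation_time_worst_case(n, T_CREATE_MIN)
    best = creation_time_best_case(n, NUM_VMS, T_CREATE_MIN)
    print(n, worst, best)
```

Under these assumptions, the worst-case creation time grows linearly with the number of requests, while the best-case creation time stays constant once the initial VMs exist.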

Figure 2.

Number of VMs and the related creation time.

In Figure 3, the number of requests served over time is compared between IVQ and AVM. Because the time for creating VMs can be eliminated in our proposed model, more requests can be served than with AVM, and hence user satisfaction and economic profit can be improved. As the service time increases, the difference in the number of requests served between IVQ and AVM also increases.

Figure 3.

Simulation time versus number of requests served.

Conclusion

In this paper, we have presented a novel approach to VM and QoS provisioning in the cloud environment. We have defined the problem of repeatedly creating VMs and stated a QoS model to address this problem. Moreover, an algorithm for minimizing the rejection rate is proposed. The goal of the model is to meet QoS targets by optimizing rejection time in clouds. CloudSim-based simulation results indicate that our model gives more reliable performance while meeting the QoS targets.

As future work, we plan to study resource allocation policies among VMs, including resource allocation in the case of VM multiplexing. We will work on a new model that acts as a resource allocation model. An efficient memory access mechanism will also be a priority in our future work.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research is supported by: (1) A project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions; (2) Water Science and Technology Project of Jiangsu Province (2013025); (3) Central University of Basic Scientific Research and Business Fee of Hohai University (2009B21614).

References

1. Rajkumar, James, Andrzej. Cloud computing principles and paradigms, Hoboken: John Wiley & Sons, 2011.

2. Lizhe, Rajiv, Jinjun. Cloud computing: methodology, systems and applications, Boca Raton: CRC Press, 2012.

3. Xiaoyu. Principles, methodologies, and service-oriented approaches for cloud computing, Hershey: Business Science Reference, 2013.

4. Lodi, Panzieri, Rossi. SLA-driven clustering of QoS-aware application servers. IEEE Trans Software Eng 2007; 33: 186–197.

5. Bheda HA and Lakhani J. QoS and performance optimization with VM provisioning approach in cloud computing environment. In: Proceedings of 3rd Nirma University international conference on engineering, Ahmedabad, 2012, pp.1–5.

6. Hassan MM, Song B, Shamim, et al. QoS-aware resource provisioning for big data processing in cloud computing environment. In: Proceedings of 2014 international conference on computational science and computational intelligence, Las Vegas, 2014, pp.107–112.

7. Yuliang S, Chao Y, Jie W, et al. A SLA-based cloud resource provisioning optimisation mechanism for multi-tenant applications. Int J Autonomous Adaptive Commun Syst 2015; 8: 374–391.

8. Hao, Xiao. Adaptive management of virtualized resources in cloud computing using feedback control. In: 2009 first international conference on information science and engineering, Nanjing, 26–28 December 2009, pp.99–102.

9. Xiao, Lin, Jiang. Reputation-based QoS provisioning in cloud computing via Dirichlet multinomial model. In: IEEE Int Conf Commun, Cape Town, 2010, pp.1–5.

10. Calheiros RN, Ranjan R and Buyya R. Virtual machine provisioning based on analytical performance and QoS in cloud computing environments. In: International conference on parallel processing (ICPP), Taipei City, 2011, pp.295–304.

11. Meng X, Isci C, Kephart J, et al. Efficient resource provisioning in compute clouds via VM multiplexing. In: Proceedings of the 7th international conference on autonomic computing, Washington DC, 2010, pp.11–20.

12. Nathuji R, Kansal A and Ghaffarkhah A. Q-clouds: managing performance interference effects for QoS-aware clouds. In: Proceedings of the EuroSys 2010 conference, Paris, 2010, pp.237–250.

13. Beloglazov A and Buyya R. Energy efficient resource management in virtualized cloud data centers. In: Proceedings of the 10th IEEE/ACM international conference on cluster, cloud and grid computing, Melbourne, 2010, pp.826–831.

14. Zhang, Yan. A QoS-aware system for mobile cloud computing. In: Proceedings of 2011 international conference on parallel processing, Beijing, 2011, pp.518–522.

15. Bi J, Zhu Z, Tian R, et al. Dynamic provisioning modeling for virtualized multi-tier applications in cloud data center. In: Proceedings of the 3rd international conference on cloud computing (CLOUD10), Miami, FL, 2010, pp.370–377.

16. Chieu TC, Mohindra A, Karve AA, et al. Dynamic scaling of web applications in a virtualized cloud computing environment. In: Proceedings of the 6th international conference on e-business engineering (ICEBE09), 2009, pp.281–286.

17. Lee YC, Wang C, Zomaya AY, et al. Profit-driven service request scheduling in clouds. In: Proceedings of the 10th IEEE/ACM international conference on cluster, cloud and grid computing (CCGrid10), 2010, pp.15–24.

18. Rodero-Merino, Vaquero, Gil. From infrastructure delivery to service management in clouds. Future Gener Comput Syst 2010; 26: 1226–1240.

19. Jung G, Hiltunen MA, Joshi KR, et al. Mistral: dynamically managing power, performance, and adaptation cost in cloud infrastructures. In: Proceedings of the 30th international conference on distributed computing systems (ICDCS10), Genova, 2010, pp.62–73.

20. Nathuji, Schwan. Virtualpower: coordinated power management in virtualized enterprise systems. ACM SIGOPS Operat Syst Rev 2007; 41: 265–278.

21. Calheiros, Ranjan, Beloglazov. CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software Pract Exp 2011; 41: 1–5.