Abstract
The integration of the Internet and Mobile networks produces huge amounts of data as well as new security threats. Because security protection in the integrated network is weak, worms can propagate through it and undermine the stability and integrity of data. Worm propagation is therefore a serious security risk to the massive amounts of data in the integrated network. We propose a kind of worm that propagates in the big data environment, named BD-Worm. BD-Worm consumes computing resources and obtains users' private information, which causes serious losses in work and daily life. This paper constructs an integrated network topology model and designs the BD-Worm that propagates in the big data environment. To analyze the propagation of BD-Worm, we conduct a simulation and, based on its results, provide recommendations for containing the spread of BD-Worm.
1. Introduction
The popularity of mobile intelligent terminals brings great convenience to people's lives. Mobile shopping, mobile banking, mobile social networks, mobile maps, and other applications provide users with a variety of services. However, this convenience also brings security risks. Mobile phones store a great deal of private information, including contacts, SMS messages, bank accounts, social network accounts, and geographic information. Network attackers steal this private information, perform correlation analysis on it, and use it for illegal activities, which violates users' privacy.
The integration of the Internet and Mobile networks has brought us great convenience. The increasing number of mobile devices causes explosive growth in the amount of data in the integrated network. While the rapid development of the integrated network brings people into the era of big data, it also brings data security problems, such as the theft and leakage of private and sensitive data [1].
As a kind of malicious program that can infect a large number of hosts in a short time, the worm is readily exploited by network attackers. We give the name BD-Worm to worms that undermine data security in the integrated network: such a worm takes advantage of weak security protection, propagates over a large area of the network, and destroys the stability and security of data. BD-Worm is thus one of the major network data security problems arising from the integration of the Internet and Mobile networks.
To make massive data safer, we should first analyze the propagation mechanism of BD-Worm and then provide protection strategies that target its propagation characteristics. This paper constructs an integrated network topology and simulates the propagation of BD-Worm. The worm propagates through files with malicious code attached. Because computers and mobile intelligent terminals run different operating systems, a worm that propagates across them must cross different protocols. The paper therefore chooses files supported by a variety of operating systems, such as txt and mp3 files, as the virus vector. Once a user opens a file with worm code attached, the worm is activated, copies itself, and attaches BD-Worm to other files.
The remaining sections of the paper are organized as follows. Section 2 introduces related work. Section 3 presents the modeling of BD-Worm. In Section 4, we simulate BD-Worm in the integrated network and study its spread under different network topologies and defense capabilities. Finally, Section 5 concludes this paper.
2. Related Work
In this section, we first introduce the effect of the integrated network on data. Then, we introduce the security risks of big data and several related improvements. Finally, we review work on worm theory and on new-generation worms in different scenarios.
Here, the integration of the Internet and Mobile network means the integration of fixed nodes and the Mobile network, which has greatly expanded the network's flexibility [2]. The popularity of smartphones and tablets has spawned a large number of network applications, such as social networks, online shopping, and games, which make people's lives much more convenient. The integrated network produces a variety of data formats. In addition, much of the data, such as communications and online transactions, needs real-time analysis and processing. This presents a great challenge to the data processing capability of the integrated network [3].
More and more privacy leaks have raised people's awareness of the importance of personal information. With the integration of the Mobile network and the Internet, the storage, management, and use of huge amounts of data face serious security challenges. Protecting the private information of mobile phone and Internet users has become a major research question in the integrated network.
Considering the security risks of distributed data storage in the big data environment, Zhao [4] takes data access patterns and queries into consideration and designs a distributed platform to ensure the integrity and security of data. Existing data encryption and privacy protection technologies and management modes cannot meet the requirements of big data in capacity, performance, storage, and security, so data security and user privacy protection face great challenges. Wang [5] provides a big data encryption algorithm based on data deduplication technology. That study reports that the algorithm's security is reliable and that it effectively improves the speed of encrypting large data.
Research on worms over the past few years has focused on future worms, which may propagate in specific complex environments or be designed with new functions. For example, Su [6] designs a new kind of network worm that propagates in IPv6 and IPv4-IPv6 transition environments, named NHIW (New Hybrid Internet Worm). Based on an analysis of network worm scanning strategies, Xu et al. [7] design a new kind of network worm, DNSWorm-V6, which can propagate rapidly in IPv6 networks by scanning the whole network with different scanning strategies in two layers. Wang [8] analyzes the propagation characteristics of worms in the Internet of Vehicles and proposes a benign worm that defends against malicious worms there.
The study of worms mainly focuses on function structure, scan strategy, and propagation models [9]. The function structure of a worm consists of two parts: the main function structure and the auxiliary function structure. The main function structure controls the basic characteristics of the worm, and the auxiliary function structure enhances its properties. Worms scan the network to find the next attack target. There are many kinds of scan strategies, and different strategies achieve different effects [10]. Research on worm propagation models is based on models of epidemic spread in biology [11]. The classic worm propagation models include the SIR/SIS model [12], the two-factor model [13], and the WOW model [14].
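As an illustration of the epidemic-style models mentioned above, the following minimal sketch integrates the classic SIR equations with a forward-Euler step. It is not the BD-Worm model of this paper; the function name `simulate_sir` and the parameter values (`beta`, `gamma`, `dt`) are assumptions chosen only for illustration.

```python
def simulate_sir(n, i0, beta, gamma, steps, dt=0.1):
    """Forward-Euler integration of the classic SIR epidemic model.

    n     -- total number of hosts (constant)
    i0    -- initially infected hosts
    beta  -- infection rate (illustrative value, not from the paper)
    gamma -- removal (patching/cleaning) rate (illustrative value)
    Returns a list of (susceptible, infected, removed) tuples.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt  # S -> I transitions
        new_removals = gamma * i * dt           # I -> R transitions
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


# Usage: one initially infected host among 1000.
history = simulate_sir(n=1000, i0=1, beta=0.5, gamma=0.1, steps=1000)
peak_infected = max(i for _, i, _ in history)
print(f"peak number of infected hosts: {peak_infected:.1f}")
```

The SIS variant differs only in that removed hosts return to the susceptible pool instead of staying removed.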
All of the studies mentioned above focus on traditional worms, whereas our paper focuses on constructing a propagation model of BD-Worm. The security of big data has attracted the attention of mobile phone and computer users. Once BD-Worm is released into the integrated network by an attacker, it will steal large amounts of private data. The attacker can then control all the data on an infected host through the backdoor left by the worm.
3. Modeling of BD-Worm
In this section, we describe the structure of the big data environment in the integrated network and model BD-Worm.
The integration of the Internet and Mobile network allows many data services to be shared between mobile terminals and computers. Users can access the Internet anytime and anywhere. Mobile office, remote office, and real-time office are hallmarks of the big data era. The data in the Internet and Mobile network are collected on a cloud platform for further storage and management. The structure of the big data environment is shown in Figure 1.

Figure 1: The big data environment structure.
BD-Worm can be modeled in five aspects: the infection process of BD-Worm, the connection probability between nodes, the defense capability of mobile nodes and fixed nodes, the probability that a suspicious file is opened after being received, and the control of computing resources.
3.1. Infection Process of BD-Worm
The integrated network produces a variety of data formats, such as .gif, .doc, .mp3, and .rmvb [15]. BD-Worm propagates in the integrated network by embedding itself in documents. When BD-Worm spreads on a large scale, it occupies a large amount of data storage space. Because BD-Worm must run on various operating systems, the malicious program attached to a document must support most of the major operating systems for both computers and smartphones, such as Windows, Mac, and Android.
The process of worm infection is shown in Figure 2. When a user receives a file, antivirus software first scans it for abnormalities. If the file is found to be abnormal, it is deleted. If no abnormality is found, the file is treated as benign and the host continues receiving files. If the user then opens a file that carries the worm, the worm copies itself and infects other files, which consumes a large amount of computing resources; the BD-Worm running with the infected file begins to control the computing resources. This abnormal resource consumption can draw the user's attention. The user judges the memory consumption, and if he or she finds the computing resource consumption abnormal, the worm process is killed directly. Otherwise, the worm keeps running and the host continues to receive files. This process is repeated throughout the network until all BD-Worms are removed.
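The decision sequence described above can be sketched as a small function that returns the fate of one received file. This is only a reading of the prose, not the authors' simulator code: the function name `handle_received_file`, the outcome labels, and the three probability parameters are hypothetical and introduced here for illustration.

```python
import random


def handle_received_file(carries_worm, detect_prob, open_prob, notice_prob,
                         rng=random):
    """Return the outcome for one received file (labels are hypothetical).

    carries_worm -- True if BD-Worm code is attached to the file
    detect_prob  -- probability that the antivirus scan flags the file
    open_prob    -- probability that the user opens the file
    notice_prob  -- probability that the user notices abnormal resource
                    consumption and kills the worm process
    """
    if not carries_worm:
        return "benign"        # normal file, host keeps receiving files
    if rng.random() < detect_prob:
        return "deleted"       # the scan found the abnormality
    if rng.random() >= open_prob:
        return "unopened"      # worm stays inactive until the file is opened
    if rng.random() < notice_prob:
        return "killed"        # user noticed the load and killed the process
    return "infected"          # worm copies itself and infects other files


# Usage: estimate how often an attached worm ends up running on a host.
rng = random.Random(1)
outcomes = [handle_received_file(True, 0.3, 0.8, 0.2, rng) for _ in range(10000)]
print("infected fraction:", outcomes.count("infected") / len(outcomes))
```

Passing a seeded `random.Random` instance keeps a run reproducible, which is convenient when comparing parameter settings.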

Figure 2: The process of worm infection.
3.2. Connection Probability of Nodes
In the big data environment, the topology of the integrated network plays a critical role in determining the propagation speed of BD-Worm. In this paper, the topology of the integrated network is determined by the connection probability of nodes. All notation used in this paper is listed in the Abbreviations section.
To analyse the topology of integrated network,
In the integration network, we define
When
Therefore, in order to generate the integrated network, we need to analyze the degree distribution
In the Internet, Faloutsos et al. [16] showed empirically that certain properties of the AS-level Internet topology are well described by power laws. The most interesting of these concerns the degree of a node. If we let
In the mobile network, Lambiotte et al. [18] analyzed statistical properties of a Mobile network constructed from the records of a mobile phone company. The network consists of 2.5 million customers that have placed 810 million communications (phone calls and text messages) over a period of 6 months. It is shown that the degree distribution in the Mobile network has a power-law degree distribution
According to the above analysis, the power-law exponent of the integrated network degree distribution can be written as
In the integrated network, infected nodes transfer files to other connected nodes. Which of its many connected nodes an infected node chooses is a significant question, so we next calculate the node connection probability.
If there is an edge between node i and node j, we note that
According to the matrix, we can find that the nodes directly connected with node i can be defined as
The total degree of all nodes connected to node i is
We assume that a node transfers files only to nodes it is connected to, which is more reasonable than transferring files to all nodes regardless of whether they are connected.
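The following sketch shows one way to turn an adjacency matrix into per-neighbour transfer probabilities. It assumes that an infected node picks a neighbour with probability proportional to that neighbour's degree, normalised by the total degree of all its neighbours; this weighting is an assumption made for illustration, since the exact formula is not reproduced in the text above. The function name `connection_probabilities` is hypothetical.

```python
def connection_probabilities(adjacency, i):
    """Map each neighbour j of node i to the probability that i sends to j.

    adjacency -- symmetric 0/1 matrix; adjacency[i][j] == 1 means an edge i-j
    Assumption: the probability is the degree of j divided by the total
    degree of all neighbours of i (degree-proportional choice).
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    neighbours = [j for j in range(n) if adjacency[i][j] == 1]
    total_degree = sum(degree[j] for j in neighbours)
    if total_degree == 0:
        return {}  # isolated node: nothing to send to
    return {j: degree[j] / total_degree for j in neighbours}


# Usage: four nodes with edges 0-1, 0-2, 1-2 and 2-3.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
print(connection_probabilities(adjacency, 0))  # {1: 0.4, 2: 0.6}
```

Non-neighbours never appear in the result, which matches the rule that files are transferred only along existing edges.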
3.3. Opening Probability
One of the most significant tasks in modeling worm propagation is quantifying user awareness. The user's security consciousness determines whether the worm can be activated successfully.
User awareness is too complex to be modeled completely, because it may be affected by everything around the user. Based on the malicious actions of BD-Worm on the system and the common characteristics of computers and smartphones, we study how computing resource consumption acts on user awareness. Because the worm copies itself and infects other files, it hogs the CPU and writes to the hard disk frequently, which sharply reduces system responsiveness. In particular, when computing resource consumption is very high, the obvious abnormal lag in opening files or software easily draws the user's attention, and the user stops normal work (such as opening files received by email) to check the system.
When computing resource consumption rises to a high level, the user can notice the abnormality. We can also conclude that the opening probability equals 100 percent when no computing resources are consumed and zero percent when computing resources are fully consumed. Therefore, we should simulate the opening probability with an equation like circle
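One curve that satisfies both boundary conditions and has a circular shape is the quarter circle p = sqrt(1 - c^2), where c is the fraction of computing resources consumed. The sketch below uses that curve purely as an assumption; the paper's own equation is not reproduced in this text and may differ. The function name `opening_probability` is hypothetical.

```python
import math


def opening_probability(load):
    """Quarter-circle opening probability (an assumed form, see lead-in).

    load -- fraction of computing resources consumed, between 0.0 and 1.0
    Returns 1.0 at load 0.0 and 0.0 at load 1.0.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    return math.sqrt(1.0 - load * load)


# Usage: the probability stays high at low load and drops quickly near 1.
for load in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"load={load:.2f}  opening probability={opening_probability(load):.3f}")
```

With this shape, a worm that keeps its resource consumption low is opened almost as often as a benign file, while heavy consumption quickly suppresses further activations.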
3.4. Computing Resource Controlling
In the big data environment, a host infected by the worm consumes many computing resources. High computing resource consumption arouses the user's security consciousness and leads the user to kill the worm.
Computing resource control is a complex factor affecting the worm propagation speed, for two reasons. First, higher computing resource consumption tends to increase user awareness, which reduces the opening probability. Second, higher computing resource consumption and a longer infected time increase the number of abnormal files among the transferred files, which increases the propagation speed. Let diagonal matrix
3.5. Defense Capability of Nodes
Because mobile nodes and fixed nodes differ in defense capability, the probability that a worm node is detected differs between them. In this paper, we define the defense capability of a node as the probability that a worm node is detected.
Defense strategies can generally be classified into two categories: active and passive. Active defense refers to strategies that actively enhance the defense capability of the system. For example, anomaly detection can reduce the probability that a worm attacks a system successfully. An active defense strategy is not deployed against a particular worm, whereas a passive defense strategy is deployed after worms have been detected on the Internet. There are many passive strategies, such as system patches and blacklists of malicious addresses [22]. Under either kind of strategy, the defense capability of mobile nodes is weaker than that of fixed nodes.
We introduce a parameter
4. BD-Worm Simulation
To study the characteristics of BD-Worm propagation in the integrated network, we simulate the propagation on OMNeT++. First, we generate several GLP topological networks with BRITE. Second, we simulate the spreading process of BD-Worm by sending messages. Lastly, we compare the BD-Worm spreading results under different parameters. The default parameters of the integrated network are listed in Table 1.
Table 1: Default simulation parameters.
4.1. The Influence of Network Topology
The network topology has a great impact on worm propagation.
We know that the numbers of mobile phones and computers in the integrated network clearly affect worm propagation. Therefore, we characterize worm propagation on the proposed topology by simulating it on the topologies of the Internet, the Mobile network, and the integrated network.
In the simulation, the integrated network consists of 10000 Internet nodes and 10000 Mobile network nodes, 20000 nodes in total, so the network proportion is 0.5. Unlike a traditional worm, BD-Worm can spread on both the Internet and the Mobile network. We generate the topologies with BRITE; the average degree is 4.0196 in the Internet, 3.9794 in the Mobile network, and 4.0166 in the integrated network, so the average degree of each topology is approximately 4. From studies of worm propagation in complex networks, we know that the degree of the first infected node makes a big difference to worm propagation. In this paper, we choose a high-degree node rather than an average-degree node as the first infected node, because spreading from a high-degree node is more stable.
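The setup described above can be approximated in a few lines, with two loud caveats: BRITE and its GLP model are not used here, and plain preferential attachment is swapped in as a stand-in that also yields a heavy-tailed degree distribution with an average degree close to 4. The function names, the seed value, and the reduced network size of 2000 nodes are assumptions for illustration only.

```python
import random


def preferential_attachment_graph(n, m, seed=0):
    """Build an undirected graph in which each new node links to m existing
    nodes chosen with probability proportional to their current degree.
    (Stand-in for a BRITE/GLP topology, not a reimplementation of it.)
    """
    rng = random.Random(seed)
    edges = set()
    endpoints = []  # every node appears once per incident edge
    for u in range(m + 1):          # start from a small complete graph
        for v in range(u + 1, m + 1):
            edges.add((u, v))
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:     # degree-proportional sampling
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.add((t, new))
            endpoints += [t, new]
    return edges


def node_degrees(n, edges):
    """Return a list where entry k is the degree of node k."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# Usage: m = 2 gives an average degree close to 4, as in the simulation.
n = 2000
edges = preferential_attachment_graph(n, m=2)
deg = node_degrees(n, edges)
average_degree = sum(deg) / n
first_infected = max(range(n), key=lambda k: deg[k])  # high-degree seed node
print(f"average degree: {average_degree:.4f}")
print(f"first infected node: {first_infected} (degree {deg[first_infected]})")
```

Choosing the seed by maximum degree mirrors the choice of a high-degree first infected node; picking a random node instead would make the early phase of the outbreak vary much more between runs.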
As the results in Figure 3 show, the worm propagates slightly faster in the Mobile network than in the Internet, and fastest in the integrated network. When

Figure 3: BD-Worm propagation in the integrated network compared with traditional worm spreading in a single network.
We can conclude from the simulation results that BD-Worm spreads fast at the beginning of the propagation and spreads faster in the integrated network than in the Internet or the Mobile network alone. Once a worm breaks out in the integrated network, its high propagation speed will render existing defenses useless, and the loss will be catastrophic.
Therefore, we need to improve anomaly detection and early warning capabilities in both the Internet and the Mobile network to contain the spread of worms.
4.2. The Influence of Defense Capability
Because of the limited computing resources of mobile intelligent terminals, the defense in the Mobile network is weaker than that in the Internet. In this simulation, we increase the defense capability of Mobile network

BD-Worm spreading with different defense.
Smartphones are replaced very quickly, and the security technology for smartphones is not keeping up with the development of the phones themselves, which leads to weak defense against viruses and worms. Because our personal information is stored on smartphones, this weak security capability is a serious problem.
Therefore, we can conclude that the weakness of host detection in the Mobile network sharply increases BD-Worm propagation and makes the defense in the Internet less useful than before. On the other hand, if we could reduce the undetected probability
5. Conclusion
In this paper, we first propose BD-Worm, a worm that propagates in the big data environment created by the integration of the Internet and Mobile network. We then model BD-Worm theoretically through its infection process, connection probability, opening probability, and computing resource consumption. Finally, we simulate the propagation of BD-Worm. From the simulation results, we draw two conclusions. First, worms in the big data environment, which integrates mobile nodes and fixed nodes, propagate faster than worms in the traditional Internet and cause more serious damage. Second, putting more resources into developing the defense of Mobile network nodes also protects the Internet nodes.
The BD-Worm presented in this paper is just one typical security problem in the big data field. Privacy protection is a serious problem in the big data environment. Enhancing security and defense capability requires improving the technology of both smartphones and computers.
Abbreviations
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is partially supported by 863 National Hi-Tech Research and Development Program (2011AA01A103).
