Abstract
The software architecture of the Internet of Things defines the component model and interconnection topology of Internet of Things systems. Refactoring is a systematic practice of improving a software structure without altering its external behavior. When Internet of Things software is refactored, its correctness must be verified to ensure its security. To this end, this article proposes a novel approach for detecting refactoring correctness. Control flow analysis and data flow analysis are used to detect code changes before and after refactoring, and synchronization dependency analysis is used to detect changes in synchronization dependency. Three detection algorithms are designed to detect refactoring correctness. Four real-world benchmark applications are used to evaluate our approach. The experimental results show that our proposed approach can ensure correctness of Internet of Things software refactoring.
Introduction
In recent years, the wide adoption of the Internet of Things (IoT) systems and immature IoT technologies pose multiple challenges to the development of IoT software.1,2 Despite the multitude of IoT software architectures proposed in previous studies, the optimal IoT software architecture has not been found on a global scale, which means that the IoT technology still needs to be optimized. 3 IoT products have provided much convenience to people’s lives. Juniper Research predicts that nearly 38 billion devices will be connected to the Internet by 2020. 4
With the increase in IoT applications, the type and quantity of IoT terminal devices increase as well. Therefore, the intelligence and correctness of IoT terminals draw wider attention than before.5,6 However, because the functions and structures of the IoT terminals are different, some terminal devices will not be able to meet the needs of users.7,8
Some developers refactor the architecture of IoT software to improve reusability and maintainability. However, the existing refactoring methods may incur a variety of concurrency bugs and lead to changes in behaviors. These problems can also cause the security of IoT software to be compromised.9,10 In order to avoid the problem of post-refactoring behavior inconsistency, it is necessary to study the consistency detection approaches.
We propose a novel approach for detecting security problems introduced by refactoring. Built on the WALA software analysis framework, the approach uses control flow analysis, synchronization dependency analysis, and data flow analysis to check whether a refactoring is safe, and it provides detection algorithms for three kinds of problems that are common in software development. In the experiment, we refactor the benchmark programs using the Eclipse refactoring tool and use the proposed detection approach to assess the refactored programs. The experimental results show that the proposed approach can effectively detect these security problems.
Related work
This section reviews previous studies on IoT software-based methods and refactoring consistency-based methods.
IoT software-based method
IoT security involves several abstraction layers and a number of dimensions. 11 Most security attacks happen at the software level because these attacks are currently the most popular and can affect a large number of devices and processes simultaneously. Most attacks are semantic attacks in data processing. 12 Rebuilding the IoT software is very likely to trigger security threats to the IoT. Therefore, IoT security detection is of great importance. 13
Xu et al. 14 proposed a trajectory privacy-protection scheme based on a trusted anonymous server. Zhang et al. 15 introduced some background knowledge of information security and ongoing challenges to IoT security. Conti et al. 16 introduced existing major security and forensics challenges in the IoT domain and briefly analyzed some papers targeting identified challenges. Xu 17 proposed a method to address the security issues and key technologies in IoT. He elaborated the basic concepts and principles of the IoT and combined its relevant characteristics with the main international research results to analyze the security issues and key technologies of the IoT.
IoT has become a popular term around the globe.18,19 Although IoT systems have brought convenience to users, they also cause huge security risks.20,21 The risks of IoT software are immeasurable. Problems that occur in IoT software refactoring may lead to changes in user requirements or security vulnerabilities in the software. Therefore, it is important to detect the refactoring of IoT software.22,23 Security problems related to IoT systems are drawing more and more attention from security experts and government departments.24,25 Both the business community and relevant governmental departments have put forward necessary security assessment requirements for information systems and IoT systems.
Refactoring consistency-based method
Many previous studies focused on the consistency of software refactoring. Changes in the behavior may cause security problems in the software. Therefore, some researchers proposed refactoring tools and methods. If the time spent using the refactoring tools and fixing the bugs is less than the time doing it manually, the tool is useful. 26
Schafer et al. 27 illustrated several types of behavior changes that may cause inconsistency by current refactoring engines and proposed techniques to make the concurrent programs behavior-preserving. They introduced synchronization dependencies that modeled the ordering constraints imposed by the Java memory model and proved that their techniques yielded a strong behavior-preservation guarantee.
Maruyama et al. 28 presented an approach that tames behavior preservation by introducing the concept of a frame. In order to accommodate individual problems in refactoring, a frame was used to represent the boundary of a stakeholder’s concern about the refactored codes. This frame-based refactoring approach preserved the observable behavior within a particular frame and helped programmers distinguish the behavioral changes.
Zhang et al. 29 presented an automated refactoring method among locks at the byte code level. With the promising features of StampedLock, Zhang et al. 30 presented an automated refactoring framework to convert a synchronized lock to a StampedLock. Although many methods are proposed to address software refactoring issues,31,32 there is still no static analysis method to validate the synchronization dependency of synchronized methods and blocks and to detect the consistency of the refactoring behavior. To this end, we use static analysis methods to create an automated detection tool that can detect the security problems of IoT software.
Motivation
Refactoring is an effective way to improve software efficiency. In this section, we use an example to illustrate the problem of software security, as shown in Figure 1. In Figure 1(a), method v1() first acquires the monitor object B.class and then calls A.m(), which in turn acquires the A.class lock. Similarly, method v2() first acquires the monitor object A.class and then calls A.n(), which acquires the monitor object lock A.class.

Figure 1. An example in which the Move Method refactoring changes program behavior: (a) code before refactoring and (b) code after refactoring.
In Figure 1(b), we apply the Move method refactoring to move method n() from class A to class B. Moving the synchronized method A.n() to class B leads the method to acquire the monitor object B.class. Method v2() first attempts to acquire A.class and then B.class. Method v1() acquires the monitor object B.class and then A.class. Hence, refactoring may end in a deadlock.
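The listing of Figure 1 is not reproduced in this text, so the following Java sketch reconstructs the after-refactoring state from the description above. The enclosing class `MoveMethodSketch`, the `trace` list, and the helper `orderOf` are hypothetical additions that only record the order in which monitors are acquired; the bodies of `m()` and `n()` are assumed to be empty otherwise.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction of the scenario in Figure 1(b); not the original listing.
class MoveMethodSketch {
    // Records monitors in the order in which a single call chain acquires them.
    static final List<String> trace = new ArrayList<>();

    static class A {
        // Static synchronized method: the monitor is A.class.
        static synchronized void m() { trace.add("A.class"); }
    }

    static class B {
        // n() was moved from A to B, so its monitor is now B.class.
        static synchronized void n() { trace.add("B.class"); }
    }

    static void v1() {
        synchronized (B.class) {
            trace.add("B.class");
            A.m();
        }
    }

    static void v2() {
        synchronized (A.class) {
            trace.add("A.class");
            B.n(); // before the refactoring this call was A.n()
        }
    }

    // Runs one method on the current thread and returns its acquisition order.
    static List<String> orderOf(Runnable r) {
        trace.clear();
        r.run();
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        System.out.println("v1: " + orderOf(MoveMethodSketch::v1));
        System.out.println("v2: " + orderOf(MoveMethodSketch::v2));
    }
}
```

Run on a single thread, the two methods acquire the same pair of monitors in opposite orders, which is the precondition for the deadlock described above when v1() and v2() run concurrently.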
To address concurrency problems in IoT software, we designed three detection algorithms based on static analysis: deadlock detection algorithm, object reusable detection algorithm, and shared static field detection algorithm.
Approach overview
In this section, we introduce our approach to detect code changes before and after refactoring. The framework of our approach is shown in Figure 2:

Figure 2. Approach overview.
Static analysis
Control flow analysis
Control flow analysis generates a directed control flow graph. Node D represents the basic code block, and D={
By comparing the changes of nodes before and after refactoring, we find that the software structure changes because of the refactoring. We define that
We assume a refactoring node changes when the node meets the following conditions:
Comparing
For example, we conduct control flow analysis for Figure 1. The code in line 14 before and after refactoring is the same, but in line 15, A.n() ≠ B.n(), that is, the node corresponding to the 15th row is changed.
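The node comparison in the example above can be sketched as a line-by-line diff of node contents. This is a simplified illustration and not the WALA-based analysis used in the article: nodes are represented as plain strings keyed by line number, the class and method names are hypothetical, and the content of line 14 is an assumed placeholder because the listing is not available here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: compares control flow nodes before and after refactoring.
class NodeDiffSketch {
    // Returns, in ascending order, the line numbers whose node content differs.
    // A node that exists in only one version also counts as changed.
    static List<Integer> changedLines(Map<Integer, String> before, Map<Integer, String> after) {
        Set<Integer> lines = new TreeSet<>(before.keySet());
        lines.addAll(after.keySet());
        List<Integer> changed = new ArrayList<>();
        for (Integer line : lines) {
            if (!Objects.equals(before.get(line), after.get(line))) {
                changed.add(line);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<Integer, String> before = new HashMap<>();
        before.put(14, "synchronized (A.class)"); // assumed content of line 14
        before.put(15, "A.n()");

        Map<Integer, String> after = new HashMap<>();
        after.put(14, "synchronized (A.class)");
        after.put(15, "B.n()");

        System.out.println(changedLines(before, after));
    }
}
```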
Synchronization dependency analysis
Synchronization dependency analysis examines methods that are synchronized or that contain synchronized blocks. Synchronization dependencies occur in the following situations:
There is a nested relationship between synchronized blocks;
There is a calling relationship between the synchronized methods;
Synchronized methods contain synchronized blocks;
Synchronized methods are called in the synchronized blocks.
A monitor-enter is the instruction that acquires a lock on entry to a synchronized block, and a monitor-exit is the instruction that releases the lock on exit. If the lock of the monitor is the class object of the current class, the method is a static synchronized method. If the lock of the monitor is an instance of the current class, the method is an instance synchronized method.
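Which monitor a synchronized method or block holds can be observed at run time with the standard `Thread.holdsLock` method. The sketch below is only an illustration of the distinction drawn above; the class `MonitorKindSketch` and its method names are hypothetical.

```java
// Hypothetical sketch: shows which monitor each kind of synchronization holds.
class MonitorKindSketch {
    // A static synchronized method locks the class object.
    static synchronized boolean staticHoldsClassLock() {
        return Thread.holdsLock(MonitorKindSketch.class);
    }

    // An instance synchronized method locks the receiver object.
    synchronized boolean instanceHoldsThisLock() {
        return Thread.holdsLock(this);
    }

    // An instance synchronized method does not lock the class object.
    synchronized boolean instanceHoldsClassLock() {
        return Thread.holdsLock(MonitorKindSketch.class);
    }

    // A synchronized block locks whatever object it names.
    static boolean blockHoldsLock(Object lock) {
        synchronized (lock) {
            return Thread.holdsLock(lock);
        }
    }

    public static void main(String[] args) {
        MonitorKindSketch s = new MonitorKindSketch();
        System.out.println("static method holds class lock:   " + staticHoldsClassLock());
        System.out.println("instance method holds this lock:  " + s.instanceHoldsThisLock());
        System.out.println("instance method holds class lock: " + s.instanceHoldsClassLock());
        System.out.println("block holds named lock:           " + blockHoldsLock(new Object()));
    }
}
```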
Synchronization dependence edges are defined on top of the control flow graph, whose nodes include the entry node and the exit node of each monitor:
A control flow graph node, Node b, has an acquire dependence on Node a if Node a corresponds to an acquire action and there is a path from a to b in the control flow graph. In this case, we consider there is an acquire edge between a and b, denoted as a.
A control flow graph node, Node a, has a release dependence on Node b if Node b corresponds to a release action and there is a path from a to b in the control flow graph. In this case, we consider there is a release edge between a and b, denoted as b.
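The two definitions above reduce to reachability queries on the control flow graph. The following sketch computes both kinds of edges for a small hand-built graph; the string-based graph encoding, the class `SyncEdgeSketch`, and the example node names are assumptions made for illustration and are not taken from the article's implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: derives acquire and release edges from CFG reachability.
class SyncEdgeSketch {
    // A straight-line CFG for a synchronized block that contains one call.
    static Map<String, List<String>> exampleCfg() {
        Map<String, List<String>> cfg = new HashMap<>();
        cfg.put("enter", Arrays.asList("call"));
        cfg.put("call", Arrays.asList("exit"));
        cfg.put("exit", Arrays.asList("return"));
        return cfg;
    }

    // Nodes reachable from start by following one or more CFG edges, sorted by name.
    static Set<String> reachable(Map<String, List<String>> cfg, String start) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>(cfg.getOrDefault(start, Collections.<String>emptyList()));
        while (!work.isEmpty()) {
            String n = work.pop();
            if (seen.add(n)) {
                work.addAll(cfg.getOrDefault(n, Collections.<String>emptyList()));
            }
        }
        return seen;
    }

    // Acquire edge "a -> b": a is an acquire node and b is reachable from a.
    static List<String> acquireEdges(Map<String, List<String>> cfg, Set<String> acquires) {
        List<String> edges = new ArrayList<>();
        for (String a : new TreeSet<>(acquires)) {
            for (String b : reachable(cfg, a)) {
                edges.add(a + " -> " + b);
            }
        }
        return edges;
    }

    // Release edge "a -> b": b is a release node and b is reachable from a.
    static List<String> releaseEdges(Map<String, List<String>> cfg, Set<String> releases) {
        List<String> edges = new ArrayList<>();
        for (String a : new TreeSet<>(cfg.keySet())) {
            for (String b : reachable(cfg, a)) {
                if (releases.contains(b)) {
                    edges.add(a + " -> " + b);
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        Map<String, List<String>> cfg = exampleCfg();
        System.out.println(acquireEdges(cfg, Collections.singleton("enter")));
        System.out.println(releaseEdges(cfg, Collections.singleton("exit")));
    }
}
```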
Synchronization dependency is then defined as follows. A synchronization dependency exists if any of the following four conditions holds between synchronized methods and synchronized blocks. Here, g() denotes a method that contains synchronized blocks, and f() denotes a synchronized method:
If g(m1) happens before g(m2), g(m2) synchronization depends on g(m1);
If f(m1) happens before f(m2), f(m2) synchronization depends on f(m1);
If g(m2) happens before f(m1), f(m1) synchronization depends on g(m2);
If g(m1) happens before f(m2), f(m2) synchronization depends on g(m1).
Table 1 describes the synchronization dependency relationships of Figure 1. In Figure 1, we first enter the synchronized block in method v1() and then call the static synchronized method m() in class A. Hence, synchronized method m() has a synchronization dependency relationship with the synchronized block, that is, synchronization of method m() depends on the synchronized block in method v1(). Similarly, before refactoring, synchronization of the synchronized method A.n() depends on the synchronized block in method v2(). After refactoring, however, the method called from that block is B.n(), which acquires the monitor B.class instead of A.class. Since the synchronization dependency relationship has changed, the behavior has changed.
Table 1. Synchronization dependency of Figure 1.
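The contents of Table 1 are not available in this text. As a rough sketch of how a change in synchronization dependencies can be detected, the code below encodes the dependencies described in the prose as strings and computes their set difference. The string encoding (callee with its monitor, then the enclosing block with its monitor) and the class name are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: diff of synchronization dependencies across a refactoring.
class SyncDependencyDiffSketch {
    // Dependencies of Figure 1(a), encoded from the prose description.
    static Set<String> before() {
        return new TreeSet<>(Arrays.asList(
            "A.m()[A.class] depends on block in v1()[B.class]",
            "A.n()[A.class] depends on block in v2()[A.class]"));
    }

    // Dependencies of Figure 1(b): n() now lives in B and locks B.class.
    static Set<String> after() {
        return new TreeSet<>(Arrays.asList(
            "A.m()[A.class] depends on block in v1()[B.class]",
            "B.n()[B.class] depends on block in v2()[A.class]"));
    }

    // Elements of x that are not in y.
    static Set<String> minus(Set<String> x, Set<String> y) {
        Set<String> result = new TreeSet<>(x);
        result.removeAll(y);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("removed: " + minus(before(), after()));
        System.out.println("added:   " + minus(after(), before()));
    }
}
```

A non-empty difference in either direction signals that the refactoring altered a synchronization dependency and therefore deserves closer inspection.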
Data flow analysis
Data flow analysis is based on control flow analysis. It analyzes the flow direction of data on the execution path of a program. The purpose of data flow analysis is to detect changes in the data flow. The set of nodes D={
We define
When a node
The algorithm
In this section, using three examples, we design three detection algorithms to accurately detect security problems.
Deadlock detection
A deadlock arises in the following situation: thread A requests lock L2 while holding lock L1, and thread B requests lock L1 while holding lock L2. The example program is shown in Figure 3.

Figure 3. Deadlock example.
The main idea of Algorithm 1 is first to acquire the monitor object of each synchronized block and then to acquire the address that the monitor object points to. Finally, if the monitor objects in two different synchronized blocks point to the same address, we report a deadlock.
Method doPerformAnalysis is the step to perform the algorithm.
Method getSynchronizedClassTypeNames accesses the monitor instruction
Method getAccessedField is the core part of the algorithm. We acquire the instruction pointed to instances
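Algorithm 1 itself is not reproduced in this text. As a complementary illustration of the deadlock situation described at the start of this subsection, the sketch below uses a generic lock-order graph and reports a potential deadlock when the graph contains a cycle. This is a well-known technique that is swapped in here for illustration; it is not the article's algorithm, and all names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: potential-deadlock check via a cycle in the lock-order graph.
class LockOrderCycleSketch {
    // Builds the graph from {held, requested} pairs. Pairs that name the same
    // lock twice are skipped because Java monitors are reentrant.
    static Map<String, Set<String>> graph(String[][] pairs) {
        Map<String, Set<String>> g = new HashMap<>();
        for (String[] p : pairs) {
            if (!p[0].equals(p[1])) {
                g.computeIfAbsent(p[0], k -> new HashSet<>()).add(p[1]);
            }
        }
        return g;
    }

    // Depth-first search; a node met again while still on the current path closes a cycle.
    static boolean hasCycle(Map<String, Set<String>> edges) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String n : edges.keySet()) {
            if (visit(n, edges, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String n, Map<String, Set<String>> edges,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(n)) {
            return true;
        }
        if (!done.add(n)) {
            return false; // already explored without finding a cycle through it
        }
        onPath.add(n);
        for (String next : edges.getOrDefault(n, Collections.<String>emptySet())) {
            if (visit(next, edges, done, onPath)) {
                return true;
            }
        }
        onPath.remove(n);
        return false;
    }

    public static void main(String[] args) {
        // Thread A holds L1 and requests L2; thread B holds L2 and requests L1.
        String[][] pairs = { {"L1", "L2"}, {"L2", "L1"} };
        System.out.println("potential deadlock: " + hasCycle(graph(pairs)));
    }
}
```

A cycle only indicates that a deadlock is possible under some thread interleaving; it does not prove that one occurs in every execution.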
Object reusability detection
The object reuse problem is very likely to occur in synchronized methods or blocks when the lock objects are Boolean, Integer, or String objects. For example, a Boolean object has only two values: true and false. If we use a Boolean object as the monitor object, unrelated monitors may point to the same address and cause problems. In Figure 4, the monitor object of a synchronized block is a Boolean object. Because Boolean.FALSE and the boxed value false refer to the same memory location, they are the same synchronized object, so the code sections that they guard exclude each other even when they are unrelated.

Figure 4. Object reuse example.
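Figure 4 is not reproduced here. The following sketch shows, under that caveat, why such objects are risky as monitors: boxed Boolean values, boxed small Integer values, and String literals are shared by the Java runtime, so two independently declared fields can refer to the same object. The class and method names are hypothetical.

```java
// Hypothetical sketch: independently declared "locks" that are the same object.
class SharedMonitorSketch {
    static boolean sameBooleanMonitor() {
        Boolean flagA = false;         // boxed through Boolean.valueOf(false)
        Boolean flagB = Boolean.FALSE;
        return flagA == flagB;         // reference comparison
    }

    static boolean sameSmallIntegerMonitor() {
        Integer x = 100;               // values from -128 to 127 are cached when boxed
        Integer y = 100;
        return x == y;
    }

    static boolean sameStringLiteralMonitor() {
        String s1 = "lock";            // equal string literals are interned
        String s2 = "lock";
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println("Boolean shared: " + sameBooleanMonitor());
        System.out.println("Integer shared: " + sameSmallIntegerMonitor());
        System.out.println("String shared:  " + sameStringLiteralMonitor());
    }
}
```

A dedicated `private final Object lock = new Object();` field avoids this sharing because each `new Object()` yields a distinct monitor.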
Algorithm 2 is the object reuse detection algorithm we designed. By detecting the type of a monitor object, we can determine whether the monitor object is a Boolean, an Integer, a String, or other types. If the type is a reusable type, we output the detection result.
Method doPerformAnalysis is the step to execute the algorithm. If the program method is detected to be not empty, we will call the method populateBugInstances to detect the monitor object and assign it to the instance
Method populateBugInstances determines whether the “acquire” instruction is a type of reused object. The instruction
Method getReusableLockObjectTypes analyzes the instruction to acquire the lock object type. We acquire the pointed address of the monitor instruction
Method createReusableChecker is the core part of the algorithm to detect the type of
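The type test at the heart of this detection can be sketched in a few lines. The version below inspects a runtime object rather than a WALA monitor instruction, so it is only an approximation of the idea behind Algorithm 2; the class `ReusableLockTypeSketch` and the method `isReusableLockType` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: flags monitor objects whose type is prone to reuse.
class ReusableLockTypeSketch {
    // The three reusable types named in the text.
    private static final Set<Class<?>> REUSABLE =
        new HashSet<>(Arrays.<Class<?>>asList(Boolean.class, Integer.class, String.class));

    static boolean isReusableLockType(Object monitor) {
        return monitor != null && REUSABLE.contains(monitor.getClass());
    }

    public static void main(String[] args) {
        System.out.println(isReusableLockType(Boolean.FALSE)); // a Boolean monitor
        System.out.println(isReusableLockType("lock"));        // a String monitor
        System.out.println(isReusableLockType(new Object()));  // a dedicated lock object
    }
}
```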
Static shared field detection
In multithreaded programs, shared resources are subject to conflicts caused by simultaneous access from multiple threads. As shown in Figure 5, two instances of the monitor object are created when two runnable tasks start. In this situation, each task locks its own instance.

Figure 5. Shared static field example.
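Figure 5 is not reproduced here. The sketch below is a hypothetical program of the same shape: two tasks update one static field, and each task can lock either its own monitor instance or a single shared one. All names (`SharedStaticFieldSketch`, `Task`, `SHARED_LOCK`, `runTwoTasks`) are assumptions made for illustration.

```java
// Hypothetical sketch: a static field guarded by per-instance or shared monitors.
class SharedStaticFieldSketch {
    static int counter = 0;                        // shared by all tasks
    static final Object SHARED_LOCK = new Object();

    static class Task implements Runnable {
        final Object lock = new Object();          // one monitor per Task instance
        final boolean useSharedLock;
        final int rounds;

        Task(boolean useSharedLock, int rounds) {
            this.useSharedLock = useSharedLock;
            this.rounds = rounds;
        }

        @Override
        public void run() {
            Object monitor = useSharedLock ? SHARED_LOCK : lock;
            for (int i = 0; i < rounds; i++) {
                synchronized (monitor) {
                    counter++;                     // not atomic without a common monitor
                }
            }
        }
    }

    // Runs two tasks to completion and returns the final counter value.
    static int runTwoTasks(boolean useSharedLock, int rounds) {
        counter = 0;
        Thread t1 = new Thread(new Task(useSharedLock, rounds));
        Thread t2 = new Thread(new Task(useSharedLock, rounds));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("shared lock:       " + runTwoTasks(true, 100_000));
        System.out.println("per-instance lock: " + runTwoTasks(false, 100_000));
    }
}
```

With the shared lock the result is always twice the number of rounds. With per-instance locks the two tasks do not exclude each other, so the result may be lower and can vary between runs.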
Algorithm 3 is the static shared field detection algorithm. The algorithm acquires all static shared fields and checks whether the field has been modified in the program. If it is modified, it acquires the pointed instance of the field and outputs the detection result.
Method doPerformAnalysis is the step to perform the algorithm. We call the method getAllStaticFields to acquire the static field and store the detected field in
Method populateAllInstancesPointedByStaticFields acquires the static field pointed to the instance
Method populateModifyingStaticInstancesMap acquires the modify static instance. If the modified instruction
Method getModifyingStaticFieldsInstructions is the core part of the algorithm. It determines the static field that needs to be modified by calling the method canModifyStaticField. If the field instruction
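The first step of this detection, collecting the static fields that could be modified, can be approximated with the standard Java reflection API. This is a runtime substitute for the static analysis described above, not the article's implementation: it only lists candidate fields (static and not final) and does not check whether the program actually writes to them. The class `StaticFieldScanSketch` and the nested `Sample` class are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: lists static, non-final fields of a class via reflection.
class StaticFieldScanSketch {
    // Example class under inspection.
    static class Sample {
        static int counter;                      // static and mutable: a candidate
        static final Object LOCK = new Object(); // static but final: skipped
        int perInstance;                         // not static: skipped
    }

    static List<String> mutableStaticFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (!f.isSynthetic() && Modifier.isStatic(mods) && !Modifier.isFinal(mods)) {
                names.add(f.getName());
            }
        }
        Collections.sort(names); // getDeclaredFields does not guarantee an order
        return names;
    }

    public static void main(String[] args) {
        System.out.println(mutableStaticFields(Sample.class));
    }
}
```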
Evaluation
Benchmarks
We select four benchmarks to evaluate our detection tool. Quark is an open-source tool for developing applications for networked devices based on IoT sensing data. JGroups is an open-source group communication tool. Apache Mina is a network communication application framework that mainly provides a programming model for event-driven and asynchronous operations based on the TCP/IP and UDP/IP protocol stacks. Mina-core is the core package of Apache Mina, and HSQLDB is a small database.
Table 2 shows the benchmarks and their respective attributes. The second column represents the total number of classes in the program; the “Method” column represents the number of methods in the benchmark; “Sync” represents the number of methods that may involve synchronization; and “No sync” represents the number of methods that do not involve synchronization.
Table 2. Benchmarks and their attributes.
In summary, the results show that our analysis can find synchronized methods in real-world programs and analyze their synchronization dependencies. All experiments were conducted on a 16-core 2.60 GHz Intel Xeon E5-2650 workstation with 128 GB RAM. The workstation ran the Windows 7 operating system with Eclipse 4.5.1 and JDK 1.8.0 installed.
Experimental results and analysis
Experimental results
The Eclipse refactoring tool was used to refactor the benchmarks. We evaluated every benchmark in full except Mina, for which only the core package Mina-core was analyzed.
We refactored the software in each benchmark. By executing the three detection algorithms, we detected the existing problems in each of the benchmarks. For example, we found deadlock problems in Quark, Mina-core, and HSQLDB. We detected the object reuse problem and the static shared field problem in JGroups.
By using the three algorithms, we detected the three kinds of problems (deadlock, object reuse, and static shared field). The experimental results are given in Table 3. We assessed the number of inconsistencies and the detection time for all benchmarks. A detected inconsistency indicates that a problem can occur in the refactored program. The detection times show that our tool is efficient.
Table 3. Experimental results.
Case study
The importance of IoT software is highlighted in the “Introduction” section. IoT software is subject to security problems. In many cases, refactoring does not preserve program behavior in the presence of concurrency. The new behavior can cause problems that did not exist before refactoring, such as security problems and deadlocks.
Figure 6 is the benchmark Mina-core, which classifies the synchronized blocks to parent classes

Figure 6. Case demonstration: (a) code before refactoring and (b) code after refactoring.
By using Algorithm 1 to acquire the pointed address of a monitor object of a synchronized block, we found that the pointed addresses
Conclusion
This article presents a detection approach which uses control flow analysis, synchronization dependency analysis, data flow analysis, and three detection algorithms to ensure consistency and security of IoT software. Static analysis identifies structural changes, and the three detection algorithms are used to detect software security problems. The three detection algorithms address three problems: deadlock, object reuse, and static shared field. In the experiment, we evaluated our approach on four benchmarks, that is, Quark, JGroups, Mina-core, and HSQLDB. Experimental results show that our approach is efficient in detecting existing problems.
One possible area of future work would be to explore more complex refactoring detection beyond the field of IoT software. For instance, some advanced refactorings incur new problems and lead to more challenges in software development. The approach proposed herein is not enough to solve all of the problems, but the concepts and techniques developed in this study are expected to serve as a basis for addressing new challenges.
Footnotes
Acknowledgements
The authors gratefully acknowledge the helpful comments and suggestions of the reviewers.
Handling Editor: Xiaojiang Du
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Key Research and Development Plan (grant no. 2018YFB0803504), National Natural Science Foundation of China (grant nos 61440012, 61871140, 61872100, and U1636215), Scientific Research Foundation of Hebei Educational Department (grant no. ZD2019093), Fundamental Research Foundation of Hebei Province (grant no. 18960106D), Guangdong Province Key Research and Development Plan (grant no. 2019B010137004), and Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2019).
