Abstract
With the rapid popularization and development of Internet, global researchers are increasingly paying more and more attention to anonymity and privacy security. Anonymous communication technology has gradually become people’s focus. Attacks against sensors may appear in different layers, so anonymity at network layer should therefore be envisioned. Tor, as one of the anonymous communication systems which implements multi-hop to protect user’s identity privacy on the Internet, is currently the most popular anonymous privacy solution, but the concept of Tor has not applied to wireless sensor networks. Therefore, in this article, we think the mix structure of Tor which is crucial to its anonymity can be applied to wireless sensor networks; what’s more, an improved path selecting algorithm according to round trip time we propose can boost some degree of performance. First, we make a summary of current typical path selection algorithms and analyze the performance and anonymity problems of these algorithms in network congestion. Then, in order to improve the performance of the algorithm in congestion, a routing improvement scheme based on round trip time is proposed.
Keywords
Introduction
Time information is very important for sensors to communicate with each other, because most sensor network applications rely on time synchronization. So, time information can be used by attackers to intrude. 1 In addition to time information, there is other sensitive information in wireless sensor networks (WSNs), such as IP address, the size of message, and the frequency of contact. Tor is an anonymous network system, and we think the introduction of Tor can settle these problems. Some researchers have integrated onion routing with sensor networks to preserve privacy 2 and anonymize sensor network IDs using Tor.3–5
Tor, the second-generation onion router, provides circuit-based low-latency anonymity service based on transmission control protocol (TCP). 6 We can treat Tor as a black box when data packets step into the anonymous system and reach application or service outside Tor system. We can apply Tor in the WSNs by means of equipping sensors with the technology of Tor to achieve anonymity, like mobile phones with Tor app. Therefore, path selection algorithm is crucial for performance and anonymity of this system which guarantees time requirement for sensor.
In Tor, before a communication starts, clients have chosen three nodes to construct one circuit, 7 and the information of these three nodes has downloaded from authority directory servers by agents. Once the circuit is constructed, the clients forward traffic encrypted in each layer with fixed-size block of data, the nodes in the circuit encrypt or decrypt layer by layer depending on the direction of forwarding. 8 It is designed to choose nodes randomly which presents equal probability for every node in the early systems of onion router. 7 It may increase the difficulty that an attacker monitors the traffic in a circuit, but with the user traffic increasing, in order to acquire performance improvement we prefer to apply bandwidth weighted routing algorithm to balance flow load, imperfectly this algorithm may introduce low-resource attacks. The performance of an anonymous path can be described using many network metrics—for example, bandwidth, latency, jitter, and loss. 9 How to achieve both performance and anonymity is one of the most concerned problems in the academic circles.
Our contributions: we describe some studies of different routing algorithms, which involve the metrics of choosing nodes and constructing circuits. Then, we put forward a new path algorithm which can reduce the RTT (round trip time) to improve performance and introduce the evaluation of this algorithm.
Related works
In this section, we will discuss several kinds of path selection algorithms and conclude their merit and demerit.
Tune-up for Tor
Tor routers periodically self-report their bandwidth capabilities to authority directory servers, but the authority directory servers cannot verify whether or not the routers misreport higher bandwidth to get a large fraction of tunnels. Besides, this design lacks flexibility, users can not compromise between performance and anonymity based on actual requirements.
Snader and Borisov 10 from University of Illinois at Urbana–Champaign propose an improved algorithm to solve these problems, which effectively blends the traffic of users with different preferences and makes attacks difficult. Every router records the bandwidths from routers it interact with, at the meantime reports the records to the authority directory servers. Users also have abilities to select a point on the anonymity performance scale either globally or depending on the task, so if users want nodes with wide bandwidth to be the part of circuit, all they need to do is to set a large number.
Scalable link-based relay selection
Micah and Matt from University of Pennsylvania analyze the tune-up algorithm, they think it has significant drawbacks to implement, 9 for example, if one of the routers falsely claims its bandwidth to collude malicious routing nodes, it would invite danger to the system. What is more, every node in the network reporting others’ bandwidth might reveal details about circuit constructing, which gives routing nodes more powerful global view. With these concerns, they propose an alternative one that offers strong security and flexibility, which is to utilize link-based path selection strategies. 9
In this design, the initiator chooses the most suitable path from randomly generated paths on the basis of predicted end-to-end performance which is measured by the costs of constituent circuits. At the meanwhile, these strategies allow the sender’s preference for anonymity or performance. If the performance of anonymous paths is determined by the network topology instead of local effects at end nodes, link-based path selection is more appropriate than node-based.
Shortest path algorithm
The design of Tor pursues low-latency anonymous communication, however, in the actual practice, interaction in Tor taking up over 90% of connection results in quintuple latencies than Internet direct path. Moreover, the invisibility of constructing a circuit in Tor may enable an autonomous system to correlate traffic across the entry and exit segments of a circuit which brings in danger.
To solve these problems, excogitation of a new Tor client LASTor came out, 11 it implements three improvements. First, according to the inferred geographic locations of relays, rather than real-time update latency information from Internet paths, it gains significant latency. It uses Weighted Shortest Path Algorithm to select a shorter path. Second, LASTor can predict Internet routing between relays and end-hosts by developing a computation lightweight technique to increase the likelihood of avoiding “snooping” autonomy systems. Finally, LASTor takes a tunable mechanism by setting a parameter between 0 which represent lowest latency and 1 which represent highest anonymity to present a trade-off between latency and anonymity.
Congestion-aware path selection for Tor
Wang et al. 12 from University of Waterloo propose an algorithm called Congestion-aware path selection to reduce congestion and improve Tor’s load balancing. The core concept of this algorithm is that the congestion target is congestion time on the congested nodes. The initiator first uses opportunistic and lightweight active measurements to calculate RTT for many times, and we let t represent RTT, then set the minimum as tmin, the congestion time tc = t – tmin. If we form a circuit, we just choose the path that has smallest tc. Measurements of a circuit can either be putting an active or opportunistic probe which has no overhead.
Predistributed Diffie-Hellman values
For the purpose of reducing overhead and delay of circuit establishment in the Tor network, Øverlier and Syverson 13 make some modifications of Tor, they describe four protocols for establishing circuits and accessing hidden service over Tor, in the first protocol it applies a half-certified Diffie-Hellman key exchange to initialize the keys along the circuit which can reduce the computational expense when a circuit sets up. The second is to send out control command through the first protocol for the purpose of reducing the communication expense in forward secrecy. Then they add Rivest–Shamir–Adleman (RSA) encryption to the communication between clients and entry nodes in the third protocol. Finally, it is a new agreement protocol which provides immediate forward secrecy, whereas the others provide eventual forward secrecy.
Pairing-based onion routing
The work of Kate et al. 14 present an improved single pass circuit construction scheme using pairing-based key exchange which extends the non-interactive key agreement scheme of Sakai et al. 15
It applies bilinear pairings in an identity-based infrastructure to achieve one-way anonymity, that is, clients can verify the next node without disclosing their identity information. An employment of trusted authority called Private Key Generator enables nodes to generate their own private key which means the clients may generate separate pseudonyms with private keys as more as possible. Two nodes can come to a session key agreement by exchanging pseudonyms. Furthermore, two nodes can also perform an authenticated key agreement by modifying any secure symmetric-key based on mutual authentication protocol and simply replace their identities by their pseudonyms. Because of the non-interactive specialty, the computational cost is reduced. However, on account of frequently replacing identity keys, the communication overhead is very high.
Certificateless onion routing
Catalano et al. 16 introduce a novel certificateless anonymous key-agreement protocol consisting of advantages of both identity-based. Each user has a string representing its identity with a secret key produced by Key Generation Center and a public/secret key pair in this protocol. In particular, routers can generate their own keys without interacting with key generation center; what’s more, these keys do not need to be certified. Another important improvement is that there is no expensive pairing computation.
We conclude a few relevant papers about these seven kinds of algorithms. Figure 1 shows the relationship among these papers. There are seven kinds of researches about improved path algorithms, we use different colors to present each kind, and the connecting lines represent the relations between these papers.

Relative researches on the path algorithm of different emphasis in Tor network. Different colors mean different kinds of algorithm’s citation, for example, the red lines represent citations of the researches about certificateless onion routing.
Implementation of path selection algorithm with minimal delay
Overview
Tor is an anonymous network consisting of routing nodes maintained by volunteers around the world; in its selection tactics, one of the most popular algorithms is weighting the estimated bandwidth. Nevertheless, it still has some blemish, one of them is that it never takes current bandwidth into consideration which makes performance decreases as network congestion increases. It not only degrades performance but also hinders the growth of user scale.
In this section, we propose an improved algorithm; its central idea is making use of minimal delay which is measured by RTT. RTT is an important performance index which denotes the total delay including transmission delay and congestion nodes delay from sending data to the receiving acknowledgment segment. 17 Because Tor network is a low-latency interactive anonymous communication system, network latency becomes the focus of the academic circles; in 2009, Sherr et al. 9 proposed routing selection criteria based on the attributes of the circuit, such as the waiting time, jitter, and packet dropout rate.
Design objective
Our goals are reducing congestion and improving load balancing, and we maintain a list of circuits on the client. When clients communicate with servers, five circuits are constructed virtually and we select the one with minimal RTT to send data. During the transmission, we measure transmission delay periodically, if the RTT of the circuit is too long, we can select a circuit from the maintained list with a relatively shorter RTT and then continue the transmission. Our targets are as follows:
Anonymity. For all the anonymous system, anonymity and performance cannot improve at the same time, we just compromise between them. Our scheme pursues performance at the expense of anonymity. But experimental results show that relative to the original system, the anonymity just degrades slightly which does not affect users’ anonymity.
Balance load. Our scheme is a path selection algorithm based on link’s RTT, when there exists transmission delay or node congestion delay, we prefer to choose a circuit with shorter RTT to communicate, so the load balancing problem solves effectively.
Low overhead. In RTT measurement, the adopted technique is using CREATE and SENDME data units from circuit establishment phase and forwarding phase, so it will not bring additional load on the Tor network.
The architecture design
Pre-built circuits
In order to select a circuit with smaller delay, Tor automatically maintains a few pre-built circuits. In communication, it prefers to choose a circuit with smallest delay from the pre-built circuit list as the communication circuit. So before the communication starts, the clients pre-build multiple circuits and maintain them. The initiative detection technology is used to calculate the RTT while constructing a circuit. We build five circuits in the simulation experiment for the reason that it is adequate to choose a circuit with relatively short RTT from five pre-built circuits. Its architecture is shown in Figure 2.

Client A pre-builds circuits and puts the circuits to the local cache for maintenance.
Change the circuit
The clients continue to measure the RTT periodically in the process of communication, using SENDME, BEGIN, or CONNECTED control unit. If real-time RTT is larger in the process of communication, the client should stop using the current circuit and choose another circuit with shorter delay from pre-built circuits to continue the communication. The architecture is shown in Figure 3.

Circuits chosen by client A and client B are overlapped in some parts, such as reuse in the communication circuit between intermediate node and exit node.
When client A communicates with a web server, client B requests service from the file server, and reuses the communication circuit between intermediate node and exit node with A. This means that the two circuits reuse the same TCP connection in the middle node and the exit node. Obviously, file sharing consumes more than web browsing. Then, the two circuits’ multiplexing output caching of application layer will soon be filled up with traffic of file sharing which leads to routing nodes’ congestion. At this point, if the RTT measured by client A is longer than the specified value, a circuit with shorter delay will be selected from local pre-built circuit buffer and continues to communicate with the Web server.
This design implements traffic load balancing in Tor network, reduces the user interaction latency, and improves the performance of the Tor network when the circuit transmission delay and node congestion are large.
Implementation
According to the architecture designed in the previous section, the workflow chart of our path selection improvement scheme based on RTT is shown in Figure 4.

Workflow for the algorithm.
Then, we will account for the implementation of the main functional modules in detail.
The measurement of RTT
Measurement standards
Measurement should be lightweight. Even if all the users in the Tor network use this scheme, there is no additional load to the Tor network at the same time.
Measurement should be indistinguishable with the measurement business. Otherwise, it will affect the Tor network security and introduce danger to nodes because of measurement of data information.
Measurement should not take the delay of the directory server into consideration. We only want to measure the Tor network latency, because whether or not the users use Tor to communicate, it could lead to a delay of the directory server.
Measurement model
Latency mainly includes circuit transmission delay between nodes and delay caused by each routing node’s congestion. As shown in Figure 5, tti (i = 1, 2, 3) represents the propagation delay between the routing node, tci (i = 1, 2, 3) is on behalf of each routing node’s congestion delay.

Measurement model for circuit’s RTT.
Due to the single tunnel model of Tor network, the request latency between clients and servers is almost equal to latency between the servers and the clients in the process of data transmission. We define the RTT equals 2 × (tt1+tt2+tt3+tc1+tc2+tc3).
Measurement technology
Panchenko et al. 18 put forward an initiative delay measurement technology.
The client sends a request command to the exit node that requesting exit node to connect to local host, then exit node will send a response to the client that it refuses to respond after receiving the request. We can get RTT by time difference between sending a request and receiving the response. The measurement technology has potential faults that malicious exit nodes may identify the measuring probe and try to affect the measurement result.
In order to meet the proposed delay measurement standard, we can use two circuit delay measurement methods. Our delay measurement technique is similar with Panchenko’s, when constructing a circuit, we use the control command unit from construction phase to measure the RTT. Clients forward an EXTEND data unit from introduction node to intermediate node in order to link to the final exit node when constructing a circuit in Tor. Intermediate node sends a CREATE data unit to exit node when receiving the EXTEND. Exit node sends response data unit CREATED to the client when receiving CREATE. The client calculates RTT by time difference between sending EXTEND and receiving CREATED. 19 The whole process is shown in Figure 6.

Sequence diagram of RTT during circuit construction stage.
Using control command unit, we can measure RTT during the period of data transmission. At the transport layer, the SENDME control unit is sent in an end-to-end way for each 50 and 100 onion data units to notify the sender to continue sending the data. In the source file, there is a function to send the control unit, as shown in Figure 7. In addition, the client sends the BEGIN control unit whenever a new TCP stream is created. For web browsing clients, whenever a new TCP stream is created, the request control unit BEGIN and the response control unit CONNECTED will deliver end-to-end requests, and such requests and responses occur in each access page, so we can get multiple delay measurements while browsing the web without causing any additional load on the Tor network.

A code module sent by SENDME control unit from the link layer.
Circuit construction
In the construction of a circuit, the circuit is pre-built by function, and part of the function is shown in Figure 8.

Code module of circuit pre-building, measurement of RTT during circuit construction, and pre-building five circuits.
The client first downloads locally the routing descriptor and the network consensus file from the directory server, then the client randomly selects three routing nodes from the routing list of the network consensus file according to the routing algorithm, and they are introduction node, intermediate node, and exit node, respectively. Then, client looks up the routing descriptor file, and finds out the corresponding IP address and key information of the node, and begins to build the circuit in a scalable and interactive way; in the meanwhile, the Diffie-Hellman Key Exchange (DH) handshake protocol is used to communicate with each of the routing nodes in the circuit to generate the shared session key. In the extended last hop to the exit node, we use the EXTEND data unit that builds the circuit to calculate the RTT and select five circuits with low delay to buffer locally. After the circuits are pre-built, one circuit is selected to transmit the data.
Circuit replacement
The circuit is replaced by the function when the circuit delay is long in the process of data transmission. Some codes of the function are shown in Figure 9.

Code module for circuit replacement in data transmission stage, if congestion happens, then change the circuit.
Variable circuits represent a circuit list constructed during circuit pre-building stage. During transmission, the SENDME units are sent in response to every 50 and 100 data units, the RTT of the circuit can be measured during this period, and determined whether the RTT exceeds the maximum delay of the defined value, if it has exceeded the maximum delay that can be sustained, we will select a circuit with shorter delay from the circuit buffer list to continue the communication.
Evaluation
In this section, we present the evaluation of our improved algorithm. We use the Shadow simulator, which is a discrete-event network simulator that runs the real Tor software as plug-in. First of all, we need consensus file and routing descriptors as input. There is the information of all routers in the consensus file, including alias of the router, ID, tag, version, online time, bandwidth, and so on. We can download these two files from Tor’s website https://metrics.torproject.org/collector.html; in our experiment, we select the files from June 2017 to December 2017.
Our experimental scheme is designed to verify the performance improvement of our path selection algorithm and whether it effectively improves congestion and load balance of Tor network, and the degree of influence on anonymity while improving its performance. In the experiment described below, the control group is the path selection algorithm currently used by Tor, at the same time, the experimental group is the improved path selection algorithm mentioned above.
Figures 10–13 present experimental results. From the results of these experiments, we can roughly find out, in the case of low congestion in Figure 10, that the performance of the improved path selection algorithm is very close to that of the current Tor path selection algorithm. However, as the degree of congestion becomes severe in Figures 11 and 12, the improved path selection algorithm has a significant performance improvement. It shows that our scheme can reduce congestion efficiently and improve the network load balance and performance of Tor.

Under the condition of low congestion, total time and throughput for client downloading the first byte.

Under the condition of moderate congestion, total time and throughput for client downloading the first byte.

Under the condition of high congestion, total time and throughput for client downloading the first byte.

First time to compromise of exit node and flow ratio through exit node.
Compared with the current Tor path selection algorithm, Figure 13 shows that our scheme has been slightly accelerated the time to compromise for the first time, which is what we expected from the beginning. But the flow rate increased, the cause may be the fact that the link with smaller RTT will be more likely to be selected by the user. This may provide a new direction of attacks with reducing RTT of the link which the attacker has already controlled. It is the corollary of balancing performance against anonymity, and we have improved its performance at the expense of a certain amount of anonymity. The anonymity of the algorithm will be further studied.
Conclusion and future work
In order to improve congestion control and network load balancing, and ultimately improve the performance of Tor, an improved routing algorithm based on circuit’s RTT is proposed; we identify the RTT as indicator of path selecting, in the case of no congestion on the network, its performance is very close to the current Tor, but with the increase of network congestion, its performance has been significantly improved. These techniques of onion routing in Tor can be applied to WSNs which ensure privacy-preserved and time requirements.
We can achieve improving traffic load balancing and Tor performance, although there is a slight reduction in anonymity, it does not affect the user’s normal anonymous communication.
Tor is the most widely used anonymous system recently; I believe, in the future, there are more and more researchers participating in the performance improvement research of Tor. In this article, the improvement of Tor’s performance is deeply analyzed and researched, and an improved path algorithm is proposed. In this section, we define a variety of avenues for future work.
In our scheme, it provides a new attack direction for attackers because of choosing a circuit with shortest RTT; the attackers could reduce the response time of the circuit which has been controlled, so that the circuits selected by the client are more likely to be insecure. It means that we sacrifice some degree of anonymity to win performance. Because of time constraint, we just study and improve Tor’s performance and leave the anonymity for further studies.
In the future, other performance improvement studies of Tor need to be analyzed, summarized, and improved, making it easier for future researchers in anonymous Tor communications system. The anonymity of the current Tor anonymous system is very mature; at the same time, we call for a large number of researchers to devote themselves to the performance improvement research of Tor, improving performance and user’s experience and keeping the size of Tor and the volume of users growing steadily.
Footnotes
Handling Editor: Wenbing Zhao
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China under Grant No. 61170273.
