Abstract
In the Internet of things, a large number of objects can be embedded over a region of interest where almost every device is connected to the Internet. This work scrutinizes the broadcast overhead problem in an Internet of things network, containing a very large number of objects. The work proposes a probabilistic structure (bloom filter)-based technique, which uses a new broadcast structure that attempts to reduce the number of duplicate copies of a packet at every node. This article utilizes a clustering concept to make the broadcast efficient in terms of memory space, broadcast overhead, and energy usage. The unique idea of a bloom-based network uses a filter to incorporate neighbor information when taking a forwarding decision to reduce broadcast overhead. The simulation results show that parallel broadcasting among different clusters and the use of a bloom filter can achieve a reduction in broadcast overhead from hundreds to ones and tens, when compared with a conventional non-bloom-based broadcast algorithm and a bloom-based algorithm. In addition, it helps to reduce energy usage evenly throughout the network, 1/100 times, when compared with conventional broadcast (non-bloom-based) and, 1/10 times, when compared with bloom-based broadcast. This increases the lifetime of a network by having control over network density usage and communications overhead as a result of broadcasting.
Introduction
The world today has entered a new era of pervasive communication where almost every device is connected to the Internet. The idea of an Internet of Things (IoT) was initiated by K Ashton 1 back in 1999. It identifies every object of a network as an “Internet-like” structure. These objects can be as large as an aeroplane and as small as part of the human body. With passing time, different authors have given different definitions of the IoT.2,3 In the IoT, a large number of objects can be embedded over a region of interest. The deployed objects can be set out to network and disseminate data using the web to offer services to different end users. The end-user system gathers data from the objects and processes these data in order to monitor the state of the environment under examination. 4
Typically, IoT devices interconnect embedded sensors using the Internet through the IP (Internet protocol) stack. 5 This approach demands complex processing, large amount of energy expenditure, and memory. In order to keep the system capabilities less complex, the devices are made to communicate locally through a non-IP network. The paradigm of IoT is maintained in the network using a smart gateway that connects it to the Internet. In a non-IP communications system, intra-transmission channels are made using ZigBee, Bluetooth, Near-Field Communication (NFC), and so on. These channels are generally range limited up to a few meters (1–100 m). The Internet connectivity is used to provide inter-transmission via a smart gateway. This resulted in one of the most popular sub branches of IoT, that is, wireless sensor networks (WSNs).
WSNs have generated great interest in industry and academia, and today sensors are used in many applications. Sensors can be used to measure properties about an object and reveal facts taking place in the vicinity of the sensor. The measured properties are usually communicated via a radio transmitter to a manipulator’s site either directly (using an IP network) or through a gateway node (using non-IP network). The unattended sensor node data can significantly influence the efficiency of many applications. Therefore, a proper data collection mechanism is important for the system to monitor events and wisely manage data processing and communications. A wide variety of IoT applications are envisioned as requiring proper networking of nodes where each node translates sensing activities and identifies itself for required services. 6 Predictive maintenance must be performed in many applications for enhanced safety and security using remote data acquisition and monitoring, and a WSN can facilitate this real-time data management and distribution to cater for timely alarm triggering for security sensitive applications.
In IoT, broadcasting is a very common and efficient solution for sharing data. This article is one of the few papers that has researched efficient broadcasting in IoT. Broadcasting serves as a robust and effective source for network discovery, facilitates network configuration, initializes query for data, and allows the discovery of multiple routes between node pairs in a network. 7 When a node broadcasts a packet in the network, it floods the packet to its neighbors. Flooding is a very simple way of broadcasting data. 8 Each node broadcasts the packet to its neighbors, and all receiving nodes will rebroadcast to their own neighbors; this leads to flooding of the packet on the network. This method of broadcasting will result in the storm problem and will inflict serious overhead in a network. 8
Controlling the amount of rebroadcasting can help to limit the overhead generated thereby ensuring that efficiency is introduced with regard to memory space and energy consumption. Energy consumed by IoT nodes in transmission is much greater than in reception as shown by Zhao et al. 9 and Healy et al. 10 Therefore, controlling rebroadcast transmissions to neighbors can significantly help in improving network performance in terms of increasing network lifetime and energy efficiency. This work is an extension of Talpur et al. 11 that proposed a design of an efficient broadcasting mechanism for data collection, which works to control broadcast overhead and subsequently results in a more energy-efficient network.
The objective of this work is to utilize a bloom filter, a space-efficient probabilistic data structure given by BH Bloom 12 in 1970. Bloom filter and its variants are finding use nowadays in various areas of distributed systems where broadcasting data are used. The broadcasting problem is formulated using a bloom filter that works to define a set of forwarding nodes to prevent flooding of packets to all neighbors. Using defined metrics, a set of nodes that have already participated in the transmission would be defined in the form of a bloom filter. This helps to prevent those nodes from being flooded with unwanted packets previously received. The proposed mechanism in the study of Talpur et al. 11 was implemented in MATLAB, and performance results were compared with existing common broadcasting mechanisms. The extended version of this work presents a distributed network broadcasting solution, and it is compared with the previously proposed version of a bloom-based broadcast presented in the study of Talpur et al. 11
In this extended article, a concept of distributed bloom–based broadcast is proposed for IoT networks. A memory-efficient broadcasting technique is the goal of the proposed work. This work has resulted in a more efficient solution for large networks with respect to the one proposed previously in the study of Talpur et al. 11 The concept in this work is to partition a large IoT network of nodes into a number of small networks, called clusters. A cluster head (CH) is elected 13 or selected to manage nodes locally within each cluster. 14 The CHs will also work as gateways to perform communication with other CHs. Local broadcasting is executed within each cluster to make the process efficient in terms of memory usage. The problem of broadcasting in the IoT where a very large network of nodes exists can consume network bandwidth and lead to failure of the network. This work allows broadcasting to be controlled/restricted to a cluster thereby maximizing network lifetime for large networks.
The remainder of this article is organized as follows. Section “Related work” explains the related work done in this area. Section “Bloom filter” discusses the concept and creation of a bloom filter. Sections “Architecture of broadcasting in the IoT,”“Bloom-based broadcast construction,” and “Distributed bloom–based broadcast construction” describe the architecture of traditional broadcasting, bloom-based broadcasting, and distributed bloom–based broadcasting, respectively. Section “Simulation results” evaluates the simulated results, and section “Conclusion” concludes the article.
Related work
This section reviews the published works which address broadcasting in WSNs as very little research work has been proposed for broadcasting in IoT networks to date. In the study of Lakhlef et al., 15 the author claims its work to be the first to present an efficient broadcast protocol for the IoT as most of the research published to date has targeted WSN-based IoT networks for broadcasting issues.16,17 The use of probabilistic schemes for broadcasting services is discussed in a number of previous works.17–20 The analyses and simulations performed in the study of Ni et al. 17 and Guo 18 help to reduce rebroadcasting of packets using differentiated timing perception and novel heuristic timing techniques. In the study of Kyasanur et al., 19 the smart gossip concept is presented where probabilistic structure–based protocols, which offer broadcast services, have been targeted in order to understand the basic topology and thereby propose a low overhead solution. The method keeps track of previous broadcasts to determine the probability of the node having to rebroadcast a packet. In another work, 21 authors have used concept of gossiping probability and ensure fully converged broadcasting for values between 0.6 and 0.8. For larger networks, gossiping proves that usage of messages is reduced to 35% when compared with flooding. Another work related to probability-based broadcasting in WSN is probability-based broadcast forwarding (PBBF) by Miller et al. 20 The PBBF works at the media access control (MAC) layer and explores the energy-latency-reliability trade-off. It determines the mathematical relationship between energy and latency for a given level of reliability.
Energy wastage during transmission has become a crucial problem with broadcasting protocols. There are a number of researchers focusing on minimizing the energy cost of broadcasting. In some studies,8,22–25 different techniques have been proposed to control energy flow while maintaining full network connectivity. The opportunistic large array (OLA) in the study of Hong and Scaglione 22 provides a cooperative form of broadcasting to save energy. It allows the optimal energy allocation scheme subject to signal-to-noise ratio (SNR) requirements to reduce energy expenditures. Whereas Kumar and Wong 23 utilize a variable permeable length-based broadcasting scheme to disseminate data in a network using small permeable lengths. It helps to conserve energy while allowing for significant throughput in a network. Zeng et al. 24 and Tang and Wang 25 are also working to cope with energy-deficiency problems while utilizing a broadcast tree concept. A tree constructing algorithm is used in the study of Zeng et al. 24 and Tang and Wang 25 to control duplicate packet reception and thereby save energy by reducing flooding. The work by Zeng et al. 24 and Tang and Wang 25 are compared with the proposed work in this article, and it is shown that the work presented here is better in terms of neighbor retransmission overhead. The proposed work provides a mechanism to control broadcast overhead while building a bloom filter containing a list of neighbor nodes. A well-defined probabilistic structure–based bloom filter is used to create a virtual tree of nodes. It helps to relieve broadcast storm by controlling packet duplications and provides energy efficiency using a bloom filter. The extended work uses clusters to enhance the efficiency of a broadcasting for IoT applications. The enhanced work is compared with previous work, 11 and it is shown to be more efficient in terms of memory space, overhead, and energy consumption.
Bloom filter
Bloom filters have been widely used in many fields. It is a space-efficient solution used as a filter for identifying input memberships. 26 There are a number of algorithms that successfully implemented bloom filters for enhanced network performance.27–29 The implementation of a bloom filter in WSNs is as diverse as routing, filtering, and security.30–33 Bloom filters are probabilistic data structures used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not—that is, a query returns either “possibly in set” or “definitely not in set.” 26
In this work, the bloom-based approach is used to control packet broadcast overhead, using an input membership set of a bloom filter.
Basic
The bloom filter is defined as an array of bits used for representing a set
False positive probability
False reporting can occur in the bloom filter where an element that is not a member of set S but is reported as its member. A bloom filter allows a false indication of an element’s presence with a small probability, known as false positive probability. This false positive probability is an accepted error as the advantage of space/storage efficiency outweighs the disadvantage of false positive probability as long as it is small. False positive errors can be controlled by adjusting the size of the filter. The size of the filter provides a clear trade-off with the probability of false positive. The probability that a bit is “0” in the filter can be given as
where k is the number of hash functions used to insert n elements into a filter of size m. The false positive probability, in which a bit is 1 when the element does not exist in the given set of a bloom filter is given by
In the simulation, the secure hash algorithm 2 (SHA-2) algorithm is used. The parameters of the filter are adjusted to keep the maximum false probability value at 0.01.
Architecture of broadcasting in the IoT
The simplest way to communicate data to an entire IoT network is where each node broadcasts a data frame to all known neighbors. Such broadcasting where a single node receives multiple copies of the same data frame from different neighbors will significantly increase overhead and energy consumption of a network. Broadcast communication is needed in a number of applications, including network-wide query and response packets. This is often necessary to provide network-wide reliable transmissions where individual nodes may fail. It is important to maintain successful frame forwarding while controlling overhead to save energy. In the proposed broadcast control algorithm, a node needs to transmit data to all the network nodes. The idea is to provide control over broadcast overhead where multiple copies of data frames are limited through utilizing a space-efficient bloom filter.
A densely deployed network is considered to monitor a wide area where all nodes are equipped with similar processing and transceiver properties. Broadcasting is an event-triggered only when certain conditions take place or a threshold level is detected; then, a node broadcasts its ID to request required services.
System model
The sensor nodes are a crucial part of an IoT network. An IoT network consists of a large number of sensor nodes and a gateway. In the proposed system model, a network is divided into different clusters, and a CH is used as an intermediate node to provide a communication channel between sensor nodes and the gateway (i.e. Base Station (BS)), as shown in Figure 1. The deployment process is made using different assumptions where the N number of nodes are randomly distributed over the area of 1000 m × 1000 m, that is,

A general network scenario of proposed work.
Radio model
A widely used radio model34–37 is simulated in proposed research. According to that the radio energy spent in transmission and reception can be given as
where k is the number of bits transmitted over distance d, with path loss coefficient n, depending on different environmental factors. In proposed research, a two-dimensional network is considered where 400 nodes are deployed in a random manner. The radio model assumes a error-free channel and exploits a free space model. The
Neighbor discovery
It is important to maintain neighborhood information, where each node keeps the list of nodes that are reachable to it by direct transmission (within the same cluster). These nodes are considered neighbor nodes. A periodic exchange of hello messages is used to discover neighboring nodes. In this work, a randomly distributed network is assumed where a node has a neighbor node that is reachable to it by a distance of d. The distance d between two nodes
where
Bloom-based broadcast construction
The algorithm proposed has two main goals: (1) energy-efficient transmission and (2) successful delivery of data to an end node (EN). This is to be facilitated by a bloom-based broadcast structure. To provide an energy-efficient solution, each node must consume as little energy as possible to provide a network-wide broadcast transmission. The idea is to limit the number of copies of a broadcast packet received by a sink node from its neighbors. It is neither possible nor likely to receive exactly one, so the goal is to minimize the total number of duplicate packets received, thereby minimizing energy usage in the network due to packet transmission.
The tree construction phase discussed here takes place, once all neighbors are discovered. After completion of the neighbor discovery phase, all nodes have a neighbor filter (NF) containing a list of their respective neighbors. Here, the algorithm uses a broadcast technique to issue urgent service requests from nodes and data transfer requests to the central station. The case of urgent service is considered as an example to examine the working of bloom-based broadcast. This algorithm can be used for all different kinds of broadcast informations. The tree construction begins with the initial source node (ISN) that begins the process by broadcasting two filters to its one-hop neighborhood, as shown in Figure 2. The first filter has a list of ISN neighbors and is called a NF, while the second filter is an empty filter called an urgent member filter (UMF). An UMF is used to update the list of nodes that require services from the central station as described. During the sensing process, when certain conditions take place or a threshold level is detected, the node considers itself as an urgent case member and adds itself to the UMF. The selection of the ISN is random.

Structure of neighborhood discovery in proposed scenarios.
Once the ISN has broadcast the filters, its neighbors will act as induced forwarders and rebroadcast the updated filters to their own neighbors. This cycle continues until the EN receives the data as shown in Figure 2. EN is a BS that helps to connect the sensor network to the central network for further processing and decision-making. During the rebroadcasting process, when a receiver accepts two filters, it marks the neighbors that it has common with the sender to block retransmission of these filters to these nodes. The selection of common neighbors is performed by querying the neighbor ID from the NF. The filter responds to the presence and absence of IDs with some probability. This helps to minimize broadcast overhead through limiting multiple reception/transmission of the same filters to common neighbors. Before forwarding the filter data received from its preceding node, the sender will add its ID to the UMF if the determined threshold level is exceeded. Otherwise, without any changing, the UMF data are forwarded to the succeeding nodes. The succeeding nodes are the neighbors of the sender node, excluding the neighbors it has in common with its predecessor. The selection of forwarders is chosen by the sender node from its neighbors and received NF, as shown below in Algorithm 3.
During the growing network, use of a NF does not limit the reception of a filter to a node. It is found that there is a number of situations during which a node receives multiple copies from different neighbors. For example, when two nodes that are not neighbors to one another but they share another node in their neighbor list. The analysis presented compares results to a normal packet broadcast and with two existing broadcast control algorithms (i.e. lightweight energy-efficient broadcast control protocol (LEBCP) 24 and lightweight energy-efficient reliable broadcast tree (LERBT) 25 ), and it shows that the proposed method controls the overhead generated by broadcasting better than previous methods.
Distributed bloom–based broadcast construction
This section discusses the newly proposed algorithm for efficient broadcasting in the IoT. Figure 1 gives a representation of the proposed algorithm that is cluster-based. Three kinds of nodes are present in the network; BS (represented as an Internet cloud in the figure), CH (performs aggregation of data from CNs), and CN (sensor nodes with limited energy). All nodes can be mobile or immobile, depending on the application used. CH use direct transmission to communicate to the BS/gateway. and CN used direct or multi-hop transmission to communicate to the CH in order to transmit any broadcast data (BD).
Algorithm 4 explains how two kinds of filters are used to perform efficient broadcasting in a large IoT network, as shown in Figure 3. The ISN is defined as the initial node that has some BD to transmit. The filter created by the ISN holds the IDs of its neighbors that are chosen based on the neighbor discovery procedure (explained in the neighbor discovery section). The broadcasting process is initiated by the ISN. The idea is to maintain a low broadcast overhead through avoiding retransmission of broadcast packets previously received. Before packet transmission, every node compares the received child neighbor filter (CNF) with its own CNF to avoid retransmission to common neighbors. The process of transmission continues until the CH of the cluster, that the ISN is a member of, receives an updated CNF. After reception, the CH will transmit the BD to the IDs contained in the head and gateway ID filter (HGIF). In this way, the broadcasting process is distributed among different clusters where each cluster will perform parallel broadcasting within its own cluster. In the next section, simulation results for this newly proposed algorithm are compared with a simple bloom-based broadcast algorithm. The results obtained show that the use of clusters is more efficient in terms of overhead, memory space, and energy usage.

Usage of bloom filters in distributed bloom–based broadcast.
Simulation results
In this section, an analysis of the proposed algorithm is performed. The simulation parameters and configuration are described below.
Simulation configuration
MATLAB is used to create a two-dimensional network of 400 nodes where nodes are randomly distributed over the area of 1000 m × 1000 m, as shown in Figure 4. In the proposed system model, a network is divided into four different clusters (each contains 100 CN) and each node in the network is identified by a unique ID. A CH is used as an intermediate node to provide a communication channel between sensor nodes and the gateway (i.e. BS), as shown in Figure 1. The cluster formation and CH selection are kept simple. The CHs are selected and located at the onset of the network. Every CH knows and keeps a record of the ID of its all children nodes using a bloom filter. The CHs are marked with a black cross in simulation scenario, as shown in Figure 4. Every new child will add itself to the nearest cluster and accordingly chooses the CH. Each CH maintains two filters, one for child information and another for a record of other CHs and the gateway (or BS). Each child maintains only one filter for neighbor information. The filter sizes are calculated by limiting the probability of a false positive to a maximum of 0.01, as shown in Table 1. The table also shows that the maximum number of filters (at any node) necessary for making forwarding decisions is different for both configurations. Each node has a neighbor separation of

A simulation scenario of distributed bloom–based broadcast with random deployment of 400 nodes in an area of 1000 m × 1000 m.
Filter parameters.
Simulation parameters.
The reduced memory requirement for filter storage, 50% less, for the distributed bloom algorithm demonstrates that it is more memory-efficient than the author’s previous bloom-based algorithm. The new algorithm is also evaluated in terms of network connectivity and optimal neighbor separation distance, overhead, and energy consumption. The proposed algorithm aims at reducing broadcast packet overhead while maintaining a reliable transfer of packets. The reduction in overhead leads to an energy-efficient transmission that helps battery operated sensor nodes to last longer. The proposed distributed bloom–based algorithm maintains a low average number of neighbors while maintaining 100% connectivity. This helps in less packet processing and therefore less energy consumption. For further evaluation, the distributed bloom–based algorithm is compared with LEBCP, 24 LERBT, 25 and the bloom-based algorithm proposed previously by Talpur et al. 11 The results are presented in Table 3.
Network density—average number of neighbors.
LEBCP: lightweight energy-efficient broadcast control protocol; LERBT: lightweight energy-efficient reliable broadcast tree.
Network connectivity and neighbor separation
Network connectivity is of key importance in a sensor network. It is important for a node to stay connected within a network to ensure that all nodes are receiving transmissions and are included in broadcast communications which are often used for network configuration purposes. The loss of network connectivity to any node can be hazardous and can result in total network failure. This issue is analyzed by varying the neighbor separation

Maximum number of duplications obtained at any single node and percent of network converged versus neighbor separation for unclustured network.

Percent of network converged versus neighbor separation for clustured network.
Overhead
To evaluate the network performance, the number of copies of duplicate packets is calculated that are received by N nodes. It tells about the number of times same data is received or duplicated at any particular node. In an ideal situation, the number of copies per node would be one; however, practically this is not possible. Figure 7 gives a comparison between the simple broadcast (non-bloom), bloom-based broadcast and the distributed bloom–based broadcast. It can be seen that the use of clusters helps to considerably reduce the number of duplicated data packets. It minimizes the overhead introduced by broadcasting and packet duplication from thousands to hundreds and from hundreds to tens in some cases.

Broadcast overhead or number of duplications of data packet (including filters) at a node.
Energy consumption
In order to calculate the total energy expenditure of the network, the energy required for packet transmission and reception is calculated and discussed in “Radio model” section. The energy used in processing and sensing is not included to simplify the evaluation process. Energy dissipated by the electronics is given in Table 2.
Figure 8 shows the total energy consumed by a sensor node in transmission and reception. This shows the energy expended by a normal (non-bloom) broadcast algorithm, bloom-based broadcast, and the distributed bloom–based broadcast. It shows that the energy consumed in the distributed bloom–based algorithm is 1/100 and 1/10 compared to the normal non-bloom broadcast and bloom-based broadcast techniques, respectively. Optimizing the decision for data forwarding keeps unwanted reception of the same data to a minimum. This results in an energy-efficient transmission and increases the lifetime of battery powered nodes.

Total energy consumed by a node in transmission and reception.
Conclusion
Broadcasting is very common in distributed IoT networks and serves as a robust and effective source for transmitting update and configuration data. However, broadcasting can result in serious overhead problems in an IoT network. Controlling the redundancies of rebroadcasting can help in limiting this overhead problem. This article presents a space-efficient data structure using a bloom filter to help maintain a set of node IDs in the form of a vector of bits to allow intelligent forwarding decision to be made for BD so that reliably communications can be maintained for every node of the network while reducing duplicate packet retransmission.
The distributed bloom–based network outperforms other broadcasting algorithms, both non-bloom-based and bloom-based, by minimizing the overhead and duplication of packets from hundreds to tens, and tens to units. Introducing the idea of clustering with a bloom filter has reduced memory storage to almost half when compared to non-cluster bloom filter algorithms. The reduction in the duplication of packets is an energy-efficient solution in minimizing data transmission cost and improving system lifetime. The proposed structure does not require any control information from a central system in order to operate. It can operate effectively in a distributed IoT network and is independent from network global parameters. This algorithm also facilitates parallel broadcasting in each cluster using a small number of filter bits. The distributed bloom filter algorithm is an extension of the bloom algorithm published in the study of Talpur et al., 11 and it has reduced energy consumption by a factor of 1/10. When compared to non-bloom broadcasting algorithms, like LEBCP, 24 LERBT, 25 it reduces energy consumption by a factor of 1/100. In addition to this, unwanted duplication of packets at each node is minimized thereby improving network performance and packet delivery times.
Footnotes
Acknowledgements
The authors thank the Science and Technology Unit at Umm Al-Qura University for their continued logistics support.
Handling Editor: Sang-Woon Jeon
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by grant number 10-INF1236-10 from the Long-Term National Plan for Science, Technology and Innovation (LTNPSTI) and the King AbdulAziz City for Science and Technology (KACST), Kingdom of Saudi Arabia.
