Node scheduling in heterogeneous IoT systems with cluster topology

IoT systems are heterogeneous network systems consisting of sensor terminals that monitor different objects. Terminal sensors have different sampling frequencies, determined by their minimum Nyquist sampling requirements. It is therefore difficult to ensure synchronization when the monitoring data of different sensors is transmitted to the data center. To meet the sampling requirements of different terminals, upload the sampling data in a timely manner and avoid data congestion, an applicable node scheduling scheme is necessary. In this paper, node scheduling algorithms are proposed for hierarchical IoT systems with cluster topologies. The IoT systems considered include IoT alarm systems and IoT monitoring systems. The node scheduling methods are tested in simulation, and their validity is demonstrated quantitatively.


Introduction
IoT (Internet of Things) refers to distributed collaborative networks which connect the Internet with a huge number of smart objects such as RFID (Radio Frequency Identification) devices, infrared sensors, global positioning systems, laser scanners, etc. [1]. The fundamental part of an IoT system is the sensing network, which consists of sensor nodes, and the end terminal is the data center. Sensor nodes in IoT systems can connect with each other through both wireless and wired networks. Most IoT systems are used to monitor different objects and have many subsystems. For example, an IoT system for health monitoring contains different numbers and types of sensors which monitor different health information, such as temperature, pulse, heart rate, blood pressure and posture [2].
Information sampling is a fundamental problem in IoT systems. Because there are multiple monitoring objects with different sampling frequency requirements, IoT systems involve a multi-rate sampling problem [3]. In most IoT systems, uploading the sampling data to the data center in real time is an important requirement, and applicable network scheduling schemes are necessary to meet this requirement under the sampling constraints of the different sensor terminals. Network congestion should also be avoided in scheduling.
In this paper, the node scheduling problem is discussed for hierarchical cluster IoT networks [4], in which there are heterogeneous sensor nodes with different sampling rates. Two kinds of system are studied: the IoT alarm network system, in which only the data of interest is reported, and the IoT monitoring network system, in which the monitoring data is reported in real time.
The rest of the paper is organized as follows: Section 2 introduces the related work, Section 3 gives the problem statement, Section 4 proposes the node scheduling algorithms, Section 5 presents the simulation, and the last section concludes the paper.

Related works
Some node scheduling problems have been discussed for networked control systems (NCS). Examples are the RM (Rate Monotonic) scheduling algorithm [5], the EDF (Earliest Deadline First) scheduling algorithm [6] and the MEF-TOD (Maximum Error First - Try Once Discard) scheduling algorithm [7]. In the RM algorithm, the priority of a task is assigned according to its duty cycle: the smaller the duty cycle, the higher the priority. The algorithm is static, and the priorities of different tasks are fixed. EDF is a dynamic task scheduling algorithm in which the priority is assigned according to the deadline of the task: the later the deadline, the lower the priority. The task with the highest priority is always executed first. In the MEF-TOD algorithm, the priority of a task is first calculated by the MEF algorithm. When multiple nodes compete for access to the network, the node with the highest priority sends its data and the other nodes discard their current data packets.
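The difference between the static RM ordering and the dynamic EDF ordering can be illustrated with a minimal sketch. The `Task` class and its field names `period` (standing in for the duty cycle) and `deadline` are hypothetical names chosen for this illustration and do not come from [5] or [6].

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    period: float    # duty cycle of the task (hypothetical field name)
    deadline: float  # absolute deadline of the current job (hypothetical)


def rm_order(tasks):
    """Rate Monotonic: the shorter the period, the higher the priority."""
    return sorted(tasks, key=lambda t: t.period)


def edf_order(tasks):
    """Earliest Deadline First: the earlier the deadline, the higher the priority."""
    return sorted(tasks, key=lambda t: t.deadline)


tasks = [Task("a", 10.0, 9.0), Task("b", 5.0, 12.0), Task("c", 20.0, 4.0)]
rm_names = [t.name for t in rm_order(tasks)]    # fixed for the task set
edf_names = [t.name for t in edf_order(tasks)]  # changes as deadlines move
```

With these illustrative values the two rules disagree: RM ranks task `b` first because it has the shortest period, while EDF ranks task `c` first because its deadline is nearest.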
Scheduling problems are also discussed in sensor networks. In wireless sensor networks, node energy is limited, so energy is the primary factor considered in scheduling algorithms. In most scheduling algorithms for sensor networks, nodes are divided into sleeping nodes and non-sleeping nodes, and a node is scheduled to switch between the two states. An overview of node scheduling in wireless sensor networks is given in [8].
Because the problem background is different, the above works cannot be adopted directly to solve the problems in this paper. The problems to be solved in this paper are the assignment of node priorities under real-time data and data integrity requirements, node task and layer task scheduling algorithms, and the scheduling coordination between the node layer and the cluster layer. In this paper, IoT systems are not divided into wired and wireless ones. Unlike scheduling problems in NCS, real-time data and continuous scheduling between layers are considered here. Unlike scheduling problems in wireless sensor networks, more attention is paid to network performance, and the energy problem is not considered.

Problem statements
The following symbols and assumptions are used:
1. Denote by $T_1$ the time a node takes to request communication with a cluster, by $T_2$ the time a cluster takes to respond to a node, and by $T_3$ the time a node takes to upload its sampling data to a cluster in an acknowledge cycle.
2. The network is a three-level system, including nodes, clusters and the data center.
3. IoT systems are centralized data acquisition systems, so the data must be uploaded to the data center.
4. In each sampling period, the data item is uploaded to the superior node, and there is no data fusion in any node.

Scheduling algorithms
Scheduling algorithms are discussed for the following two IoT systems:
1. Alarm network system: when the monitoring data is greater than a given threshold, the data is uploaded from the sensor terminal to the data center; otherwise it is discarded.
2. Monitoring network system: the information is sampled according to the minimum Nyquist sampling requirement and uploaded in real time.

Scheduling in an alarm IoT system
Layer 1: Scheduling between nodes and the cluster
A node has a fixed sampling frequency. It discards the monitoring data when the data value is smaller than a given threshold, and gives an alarm and uploads the data when the monitoring value is greater than the threshold. The communication between the node $N_1$ and the cluster $C_1$ is taken as an example to illustrate the scheduling procedure.
Step 1: When the sensing value of the node $N_1$ is larger than a given threshold, the importance of the request information is set to $P = 1$. Node $N_1$ gives a communication request to the cluster $C_1$.
Step 2: Cluster $C_1$ detects whether there are nodes which have given out communication requests. If there are multiple nodes, it chooses the node with the largest information importance to respond to. If the cluster chooses the node $N_1$, go to Step 7, otherwise go to Step 3.
Step 3: Delay $t = T_1$, and set the request information importance to $P = P + 0.5$. If the node $N_1$ does not receive the response information from the cluster, go to Step 4, otherwise go to Step 7.
Step 4: Delay, and set the request information importance to $P = P + 0.5$. Node $N_1$ gives a communication request to the cluster again. If the cluster gives a response, go to Step 7, otherwise go to Step 5.
Step 5: Delay, and set the request information importance to $P = P + 0.5$. Node $N_1$ gives a request to the cluster again. If the cluster gives a response, go to Step 7, otherwise go to Step 6.
Step 6: Delay, and set the request information importance to $P = P + 0.5$. Node $N_1$ gives a request to the cluster again. If the cluster gives a response, go to Step 7, otherwise go to Step 4.
Step 7: Cluster $C_1$ gives a response to the node $N_1$, and then the node $N_1$ uploads the monitoring data to the cluster. When the communication has ended, set the request information importance of the node $N_1$ to $P = 1$ again. Cluster $C_1$ and the node $N_1$ go to another communication cycle.
In the procedure above, the importance $P$ of the request information is introduced so that the communication request of a node does not go unanswered for a long time. The specific communication procedure between the node and the cluster is as follows. First, the node $N_1$ requests to communicate and upload the monitoring value to the cluster $C_1$, and this sub-procedure costs time $T_1$. Then the cluster $C_1$ responds with a reply to the node $N_1$, and this sub-procedure costs time $T_2$. Finally, if the cluster $C_1$ gives a positive response, the node $N_1$ uploads the monitoring data to the cluster, and this sub-procedure costs time $T_3$; otherwise the communication between the node and the cluster ends immediately.
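The effect of the request importance on the order of service can be sketched as follows. This is a simplified illustration rather than the procedure itself: it assumes that the cluster serves one node per round, that every node left waiting raises its importance by 0.5 per round, and that ties are broken by the smallest node name. The function name `simulate_alarm_cluster` and the `arrivals` mapping are hypothetical.

```python
def simulate_alarm_cluster(arrivals, rounds, step=0.5):
    """Return the order in which alarming nodes are served by one cluster.

    arrivals: dict mapping a round index to the node names that start
              requesting in that round (hypothetical input format).
    """
    pending = {}  # node name -> current request importance P
    served = []
    for r in range(rounds):
        for name in arrivals.get(r, []):
            pending.setdefault(name, 1.0)  # Step 1: a new request has P = 1
        if not pending:
            continue
        # Step 2: respond to the request with the largest importance.
        # Ties go to the smallest node name (assumption of this sketch).
        chosen = max(sorted(pending), key=pending.get)
        served.append(chosen)
        del pending[chosen]  # Step 7: data uploaded, P starts at 1 next time
        # Steps 3-6: nodes that were not chosen wait and raise P by `step`.
        for name in pending:
            pending[name] += step
    return served


order = simulate_alarm_cluster({0: ["N2", "N3"], 1: ["N1"]}, rounds=4)
```

In this run `N3` loses the first round to `N2`, but its importance has risen to 1.5 by the time `N1` starts requesting with importance 1, so `N3` is served before `N1`.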
The above communication procedure between $C_1$ and $N_1$ is an example, and can be extended to the communication between any node $N_j$ and any cluster $C_i$.
Layer 2: Scheduling between clusters and the data center $D$
By replacing the cluster $C_i$ with the data center $D$ and the node $N_j$ with the cluster $C_i$, the scheduling algorithm of Layer 1 can be adopted in Layer 2.

Scheduling in a monitoring IoT system
Sampling periods satisfy the Nyquist minimum sampling requirements. The purpose of the monitoring task is to upload real-time samples to the data center. The scheduling algorithm is again divided into two parts, Layer 1 and Layer 2.
Layer 1: Scheduling between nodes $N_j^i$ and the cluster $C_i$
Relevant information of the sensor nodes is shown in Table 1.
The pre-scheduling time is renewed according to [...], $j = 1, 2, \ldots, N$. Let the clock time be [...] and assume that nodes obtain data successfully at each sampling time. Clusters traverse each node according to priorities $y_j^i$, $j = 1, 2, 3, \ldots, N$, which are determined by formula (2): [...]. The higher the priority of the node is, the earlier the node is visited by the cluster.
Step 1: Calculate scheduling priorities according to (2). Sort the nodes by scheduling priority, and let the order be $y_1, y_2, y_3, \ldots, y_N$.
Step 2: [...] $T$ [...], and go to Step 3.
Step 3: Renew scheduling priorities according to (2) and go to Step 1.
The specific communication procedure between the node and the cluster is as follows. The cluster $C_i$ gives a communication request to node $N_1$ [...]. If node $N_1$ has data to be uploaded, it uploads the data and the renewed pre-scheduling time to the cluster $C_i$, and this sub-procedure costs time $T_3$; otherwise the visiting procedure ends immediately.
Remark 1. The node numbers 1, 2, 3, ..., N change with the priorities in the ordering procedure.
Layer 2: Scheduling between clusters and the data center.
Assign the initial priority of each cluster according to the minimum sampling interval of the nodes in the cluster: the cluster containing the node with the smallest sampling interval has the highest initial priority. The data center traverses the clusters according to their priorities. If there are $M$ clusters, let the initial visiting tag be $k = 1$ and the counting tag be $c = 0$. The scheduling procedure is as follows:
Step 1: The data center gives a communication request to the cluster $C_k$. When the cluster $C_k$ has data to be uploaded, set $c = c + 1$ and go to Step 2. When no data is to be uploaded, set the visiting tag to $k = k + 1$; if $k > M$, go to Step 3, otherwise repeat this step.
Step 2: The cluster $C_k$ sends the data to the data center. Renew the priority of the cluster $C_k$ as $M + c$; that is, mark the cluster $C_k$ as $C_{M+c}$. Set the visiting tag to $k = k + 1$. If $k > M$, go to Step 3, otherwise go to Step 1.
Step 3: Sort the clusters by priority again and relabel them from small priorities to large priorities, so that the clusters are again labeled $C_1$, $C_2$, $C_3$, ..., $C_M$. Go to Step 1.
The specific communication procedure between the data center and the cluster is as follows. First, the data center gives a communication request to a cluster, and this sub-procedure costs time $T_1$. Then the cluster gives feedback to the data center, and this sub-procedure costs time $T_2$. Finally, if the cluster has data to be uploaded, the data is transmitted to the center, and this sub-procedure costs time $T_3$; otherwise the visiting procedure ends immediately.
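One traversal of the Layer 2 procedure, together with its time cost, can be sketched as below. The sketch rests on two assumptions that should be checked against the original formulas: a cluster that uploads data is relabeled with priority $M + c$, and a cluster without data keeps its current label $k$. The function name `poll_clusters` and its arguments are hypothetical.

```python
def poll_clusters(order, has_data, t1, t2, t3):
    """Simulate one traversal of the data center over all clusters.

    order:    cluster names in visiting order (labels C_1 .. C_M).
    has_data: set of cluster names that have data to upload.
    Returns (new visiting order, elapsed time of the traversal).
    """
    m = len(order)
    c = 0              # counting tag
    elapsed = 0.0
    priority = {}
    for k, name in enumerate(order, start=1):  # k is the visiting tag
        elapsed += t1 + t2                     # request, then feedback
        if name in has_data:
            c += 1
            elapsed += t3                      # data upload
            priority[name] = m + c             # assumed renewal rule M + c
        else:
            priority[name] = k                 # assumed: label unchanged
    # Step 3: relabel clusters from small priorities to large priorities.
    new_order = sorted(order, key=lambda n: priority[n])
    return new_order, elapsed


new_order, elapsed = poll_clusters(
    ["A", "B", "C"], {"A", "C"}, t1=1.0, t2=1.0, t3=3.0
)
```

Under these assumptions, clusters that have just uploaded move to the back of the next traversal, so cluster `B` is visited first in the following round. The elapsed time is three request and feedback exchanges plus two uploads.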

Simulation test
The main performance measures of the scheduling algorithms are network delay and data acquisition ability. In this paper, the data acquisition ability is measured by the non-effective data sampling time, which is the total time lost in failed communication between sensor terminals and clusters, as well as between clusters and the data center. The larger the non-effective data sampling time is, the lower the data acquisition ability is.
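As a rough illustration of this metric, the sketch below sums the time spent on exchanges that end without an upload. It assumes that each such exchange costs the request time $T_1$ plus the response time $T_2$, which follows the communication procedures described earlier but is not a definition given in this paper; the function name `non_effective_time` is hypothetical.

```python
def non_effective_time(exchanges, t1, t2):
    """Total time lost in exchanges that did not end with a data upload.

    exchanges: iterable of booleans, True if the exchange ended with an upload.
    Assumes a failed exchange costs t1 (request) + t2 (response).
    """
    return sum(t1 + t2 for uploaded in exchanges if not uploaded)


lost = non_effective_time([True, False, False, True], t1=1.0, t2=0.5)
```

Here two of the four exchanges fail, so the time lost is twice the combined request and response cost.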