Reliable Dynamic Packet Scheduling over Lossy Real-Time Wireless Networks

LIPICS - Leibniz International Proceedings in Informatics, Jul 2019

Along with the rapid development and deployment of real-time wireless network (RTWN) technologies in a wide range of applications, effective packet scheduling algorithms have been playing a critical role in RTWNs for achieving desired Quality of Service (QoS) for real-time sensing and control, especially in the presence of unexpected disturbances. Most existing solutions in the literature focus either on static or dynamic schedule construction to meet the desired QoS requirements, but have a common assumption that all wireless links are reliable. Although this assumption simplifies the algorithm design and analysis, it is not realistic in real-life settings. To address this drawback, this paper introduces a novel reliable dynamic packet scheduling framework, called RD-PaS. RD-PaS can not only construct static schedules to meet both the timing and reliability requirements of end-to-end packet transmissions in RTWNs for a given periodic network traffic pattern, but also construct new schedules rapidly to handle abruptly increased network traffic induced by unexpected disturbances while minimizing the impact on existing network flows. The functional correctness of the RD-PaS framework has been validated through its implementation and deployment on a real-life RTWN testbed. Extensive simulation-based experiments have also been performed to evaluate the effectiveness of RD-PaS, especially in large-scale network settings.

http://drops.dagstuhl.de/opus/volltexte/2019/10748/pdf/LIPIcs-ECRTS-2019-11.pdf

Tao Gong (University of Connecticut, Storrs, USA)
Tianyu Zhang (University of Notre Dame, USA, and Qingdao University, China)
Xiaobo Sharon Hu (University of Notre Dame, USA)
Qingxu Deng (Northeastern University, Shenyang, China)
Michael Lemmon (University of Notre Dame, USA)
Song Han (University of Connecticut, Storrs, USA)

Footnote 1: The first two authors have equal contribution to this work.

2012 ACM Subject Classification: Networks → Network resources allocation; Networks → Network dynamics; Networks → Network reliability

Keywords and phrases: Real-time wireless networks; lossy links; dynamic packet scheduling; reliability

1 Introduction

In recent years, real-time wireless networks (RTWNs) have been making their way into a wide range of industrial applications [1, 5, 14, 19]. These applications commonly have stringent timing and reliability requirements to ensure timely data collection and control decision delivery. Thus packet scheduling in RTWNs plays an important role for achieving the desired Quality of Service (QoS) in such applications. QoS here is often measured by how well the network delivers the packets by their deadlines. Although packet scheduling in RTWNs has been studied for a long time, how to handle abruptly increased network traffic in the presence of unexpected disturbances (i.e., events causing more frequent sensing of the environment and processing of sensed data) remains a challenge. This challenge is further exacerbated by the lossy wireless links in typical industrial environments [7].

Most RTWNs adopt Time Division Multiple Access (TDMA) based data link layers to achieve deterministic real-time communication. Sensing and control tasks are abstracted as end-to-end (e2e) flows with specified timing and reliability requirements. Most earlier packet scheduling algorithm designs in RTWNs focus on schedulability analysis and employ centralized and static (or infrequently updated) management frameworks (e.g., [17, 18, 16, 8, 22]). Those solutions may fit well for small-scale static RTWNs. They however often lead to significantly degraded QoS when the system becomes large and/or when deployed for monitoring and controlling complex physical processes where disturbances are present.

To model and respond to disturbances in RTWNs, many dynamic scheduling approaches have been proposed. Both [4] and [27] support admission control in response to adding/removing tasks for handling disturbances in the network.
They, however, do not consider scenarios in which not all tasks can meet their deadlines. The protocol in [12] proposes to allocate reserved slots for occasionally occurring emergencies (i.e., disturbances), and allows regular tasks to steal slots from the emergency schedule when no emergency exists. However, how to satisfy the deadlines of regular tasks in the presence of emergencies is not considered. [21] proposes a MAC protocol with a centralized rescheduling scheme allowing online changes of active streams and network topology. However, the scheduler and the data format of the schedule distribution are not specified in [21].

Another thread of research significantly advances the state of the art by providing dynamic packet scheduling functions in RTWNs. Among these approaches, OLS in [9] relies on a centralized gateway to construct and disseminate a dynamic schedule to all the nodes in the network; D2-PaS in [23] offloads the schedule construction to individual nodes and only disseminates the minimum information needed for the nodes to construct a dynamic schedule locally; and FD-PaS in [26] further eliminates the need for a centralized gateway by performing disturbance notification and handling in a local and distributed manner. They, however, all assume perfect wireless network links, which is not realistic, especially in noisy and harsh industrial environments. To the best of our knowledge, none of the existing dynamic packet scheduling algorithms considers packet losses, and they can thus lead to poor QoS in real-life deployments.

On the other hand, a rich set of methods has been designed for RTWNs to improve the reliability of wireless packet transmission over lossy links. For instance, most RTWN solutions (e.g., WirelessHART [20], ISA 100.11a [10], and 6TiSCH [6]) employ multiple channels and some frequency hopping mechanisms to minimize potential interference.
Further, [8] proposed a set of reliable graph routing algorithms in WirelessHART networks to explore path diversity to improve reliability. These works are complementary to the approach introduced in this paper since we focus on a single channel with pre-defined routing. [3] proposed an algorithm to allocate a necessary number of retransmission links for individual nodes to guarantee a desired success ratio of packet delivery in a star network topology. [2] extended the network model in [3] to allow multi-hop flows and proposed both Link-Centric and Flow-Centric scheduling policies. However, the policies in [3, 2] tend to assign more retransmission slots than necessary, and thus require higher network bandwidth. Our approach in this work results in an optimal retransmission slot assignment. Furthermore, all the aforementioned studies focus only on packet scheduling in static RTWN settings over lossy links, and cannot be easily extended to handle abruptly increased network traffic caused by unexpected disturbances.

In a recently submitted work [25], we addressed disturbance handling in lossy RTWNs. However, the schedule used in the static setting is generated by directly applying the retransmission mechanism in [2], which can lead to higher network bandwidth usage than necessary. Further, [25] relies on the distributed framework FD-PaS to handle disturbances, which can cause high QoS degradation on other non-critical tasks according to the results in [26].

In this work, we introduce a reliable dynamic packet scheduling framework, called RD-PaS, for meeting both timing and reliability requirements in packet scheduling in the presence of disturbances. When no disturbance occurs (i.e., in the static scenario), RD-PaS determines the minimum number of retransmission slots needed for each task to guarantee reliable e2e packet delivery, and constructs a communication schedule locally in a hybrid manner at individual nodes.
The hybrid approach needs a centralized controller and a local schedule generator to keep a good tradeoff between bandwidth usage and QoS. When a disturbance occurs, RD-PaS generates a dynamic schedule to guarantee the desired reliability of critical task(s) while judiciously degrading the reliability of packet transmissions for other tasks. We formulate a reliable dynamic scheduling problem to minimize such degradation, prove that this problem is NP-hard, and present an effective heuristic to solve it. The functional correctness of the RD-PaS framework has been validated through its implementation and deployment on a real-life RTWN testbed. Extensive simulation-based experiments have also been performed to evaluate the effectiveness of RD-PaS, especially in large-scale network settings. Our results show that RD-PaS can reduce the e2e packet delivery ratio degradation in the dynamic schedule by 58% on average compared to the D2-PaS approach.

The remainder of this paper is organized as follows. Section 2 describes the system model and problem definition, and gives an overview of the RD-PaS framework. Section 3 presents the details of RD-PaS for the Transmission-based Scheduling (TBS) model, including both static schedule construction and dynamic schedule adjustment in the presence of disturbances. These efforts are further extended to the Packet-based Scheduling (PBS) model in Section 4. In Section 5, we present the implementation and functional validation of RD-PaS on a real-life RTWN testbed. Performance evaluation from extensive simulation-based experiments is reported in Section 6. Finally, we conclude the paper and discuss future work in Section 7.

2 Preliminaries

In this section, we first discuss the system model and then give an overview of the proposed RD-PaS framework.

2.1 System Model and Problem Definition

The system architecture of an RTWN studied in this work is modeled after RTWNs often found in industrial process control applications.
Such an RTWN consists of multiple sensor and actuator nodes wirelessly connected to a single controller node either directly or through relay nodes. The network is described by a directed graph $G = (V, E)$, where the node set $V = \{V_0, V_1, \ldots, V_c\}$. $V_c$ is the controller node and the rest are referred to as the device nodes. A direct link $e = (V_i, V_j) \in E$ represents a wireless link from node $V_i$ to $V_j$ with a Packet Delivery Ratio (PDR), $\lambda^L_e$, which represents the probabilistic transmission success rate on link $e$ (see footnote 2). $V_c$ connects to all the nodes via some routes and is responsible for executing relevant control algorithms. $V_c$ also contains a network manager which conducts network configuration and resource allocation. In this work, we focus on RTWNs with only one controller node. Networks with multiple controller nodes are left for future work.

Footnote 2: Link PDR $\lambda^L_e$ is usually measured during the site survey and is stable during normal network operations. In case the value of $\lambda^L_e$ changes significantly, the new value is assumed to be broadcast to all the nodes in the network.

[Figure 1: An example RTWN with sensor, actuator and controller nodes; the routing paths of the individual tasks are listed on the right side of the figure.]

We assume that the system executes a fixed set of control tasks $T = \{\tau_0, \tau_1, \ldots, \tau_n\}$ where $\tau_i$ ($0 \le i < n$) is a unicast task and $\tau_n$ is a broadcast task. Each task $\tau_i$ is associated with a period $P_i$ and deadline $D_i$, and follows a designated single routing path with $H_i$ hops. We use $\vec{L}_i = [L_i[0], L_i[1], \ldots, L_i[H_i - 1]]$ to represent the routing path of task $\tau_i$. For a unicast task, $L_i[h] \in E$ ($0 \le h < H_i$). Each unicast task periodically generates a packet originated at a sensor node, passing through the controller node and delivering a control message to the designated actuator node. For the broadcast task $\tau_n$, each hop involves multiple links, thus $L_n[h] = (L_n[h](0), L_n[h](1), \ldots)$, where $L_n[h](i) \in E$. The broadcast task runs periodically in $V_c$ and only generates packets when necessary.
These packets are broadcast to each node directly or through some intermediate nodes by the designed broadcast path $\vec{L}_n$. The $j$-th released instance of $\tau_i$ is referred to as packet $\chi_{i,j}$, with its release time, deadline, and finish time denoted as $r_{i,j}$, $d_{i,j}$ and $f_{i,j}$, respectively. We denote the transmission of packet $\chi_{i,j}$ at the $h$-th hop as transmission $\chi_{i,j}(h)$ ($0 \le h < H_i$).

Fig. 1 shows an example RTWN running 4 unicast tasks ($\tau_0$, $\tau_1$, $\tau_2$ and $\tau_3$) and 1 broadcast task ($\tau_5$) on 7 nodes ($V_0, V_1, \ldots, V_5$ and $V_c$) where $V_0$, $V_3$, $V_5$ are the sensor nodes, $V_1$, $V_4$ are the actuator nodes, and $V_2$ is a combined sensor and actuator node. The routing paths of individual tasks are summarized on the right side of Fig. 1.

In applications such as crude oil refining, a disturbance, e.g., a sudden change in temperature, may occur unexpectedly. When a disturbance occurs, the system usually requires the sensor nodes located within the range of the disturbance to monitor the environment more closely, and thus one or multiple tasks may demand more network bandwidth during the disturbance. To capture such an abrupt increase in network resource demand upon the detection of a disturbance, we adopt the rhythmic task model [11] in this work (see footnote 3). In the rhythmic model, each task has two states: nominal state and rhythmic state. In the nominal state, $\tau_i$ releases packets following the nominal period $P_i$ and each packet has a relative deadline $D_i \le P_i$. In the rhythmic state, the period and relative deadline of $\tau_i$ adopt a series of new values specified by pre-designed vectors $\vec{P}_i$ and $\vec{D}_i$. Once $\tau_i$ returns to the nominal state, it starts to use $P_i$ and $D_i$ again. When a disturbance occurs and the corresponding tasks (denoted as $T_{Rhy}$) enter their rhythmic states, we say the system switches to the rhythmic mode. The system returns to the nominal mode after the disturbance has been completely handled, i.e., all the corresponding tasks return to their nominal states. In Fig. 1, when the disturbance (in the yellow region) occurs, $\tau_0$ and $\tau_2$ (installed on nodes $V_3$ and $V_0$, respectively) will enter their rhythmic states and the system switches to the rhythmic mode. In the following, we first assume that at any time during the system operation, at most one disturbance can occur and needs to be detected and handled. We will then generalize the system model to discuss concurrent disturbances at the end of Section 3.2.

Footnote 3: RD-PaS is not limited to the rhythmic task model and can be applied to any task models capturing unexpected network resource demand changes.

Following the industrial practice for RTWNs, we consider a synchronized network adopting a time-slotted schedule. The length of a time slot is typically 10 ms. Within each time slot, at most one packet can be transmitted over the air from a sender to a receiver. The acknowledgement (ACK) is then sent back from the receiver to the sender in the same slot to notify the successful reception. Traditional RTWNs employ Link-based Scheduling (LBS) to allocate time slots. In LBS, each time slot is allocated to a link by specifying the sender and receiver. If packets from different tasks share a common link and are both buffered at the same sender, their transmission order is decided by a node-specified policy (e.g., FIFO). This approach introduces uncertainty in packet scheduling and may violate the e2e timing constraints on packet delivery. To tackle this problem, Transmission-based Scheduling (TBS) and Packet-based Scheduling (PBS) are proposed in [23] and [2], respectively, to construct deterministic schedules. Each of the two scheduling models has its own advantages and disadvantages and is preferred in different usage scenarios as discussed in [2]. Hence, we consider both models in our RD-PaS framework. Furthermore, we focus on single-channel RTWNs in this work since they form the basis for more advanced studies.
Multichannel networks are left for future work.

In the TBS model, each time slot is allocated to the transmission of a specific packet $\chi_{i,j}$ at a particular hop $h$ or kept idle. Once the network schedule is constructed, packet transmission in each time slot is unique and fixed. In the PBS model, each time slot is allocated to a specific packet $\chi_{i,j}$ or kept idle. Within each time slot assigned to $\chi_{i,j}$, every node along $\chi_{i,j}$'s routing path decides the action to take (e.g., transmit, receive or idle), depending on whether the node has received $\chi_{i,j}$ or not. Table 1 gives a time slot allocation example for task $\tau_2$ in Fig. 1. In the TBS model, each time slot is allocated to a dedicated hop. In the PBS model, slot 1 can be used for the transmission at either hop, depending on whether the first transmission succeeds in slot 0.

Since each link $e$ in the network may suffer packet losses, i.e., $\lambda^L_e < 1$, packet transmissions may fail, which can significantly affect the timely delivery of real-time packets. To handle such cases, a retransmission mechanism is commonly employed in RTWNs [20, 6]. Specifically, if a sender node does not receive the ACK from the receiver node of a packet, it automatically retransmits the packet in the next possible time slot. To quantify the reliability requirement of the e2e packet delivery for each task, a required e2e PDR for $\tau_i$, denoted as $\lambda^R_i$, is introduced. For example, if a control application can tolerate 0.01% packet loss, then $\lambda^R_i$ is 99.99%. Based on $\lambda^R_i$, the transmission of any packet of $\tau_i$ is reliable if and only if the achieved e2e PDR of $\tau_i$ is larger than or equal to $\lambda^R_i$, i.e., $\lambda_{i,j} \ge \lambda^R_i$. To simplify presentation, we assume that all tasks in the network share a common required e2e PDR value, denoted as $\lambda^R$. However, our proposed approach can be easily extended to support different $\lambda^R$'s for different tasks. Table 2 summarizes the frequently used symbols in this paper.

Table 2: Frequently used symbols.
Symbol                              Definition
$V_0, V_1, \ldots$                  Device nodes: sensor, actuator or relay node
$V_c$                               Controller node
$T$, $\tau_i$                       Task set and task $i$
$H_i$, $P_i$, $D_i$                 Number of hops, period and deadline of $\tau_i$
$L_i[h]$                            The $h$-th link on the routing path of $\tau_i$ ($0 \le h < H_i$)
$\chi_{i,j}$                        The $j$-th released packet of $\tau_i$
$r_{i,j}$, $d_{i,j}$, $f_{i,j}$     Absolute release time, deadline, finish time of $\chi_{i,j}$
$W_{i,j}$                           Total number of slots assigned for $\chi_{i,j}$
$\lambda^R$                         Required e2e packet delivery ratio (for all tasks)
$\lambda^L_{L_i[h]}$                Measured link packet delivery ratio of link $L_i[h]$

Based on the above system model, the two key problems that we aim to solve in this work are as follows.

P1: In the system nominal mode, construct a schedule such that both the e2e timing and reliability requirements of all tasks can be satisfied.

P2: When disturbances occur and are detected, adjust the schedule in a dynamic and hybrid manner to still guarantee the reliable and timely transmissions of the rhythmic packets while achieving the minimum reliability degradation on other packets.

More formally, we have the following.

P1: Given RTWN $G = (V, E)$ where each link $e \in E$ has an associated PDR, and task set $T$ in which each task $\tau_i$ has a single routing path $\vec{L}_i$, determine the nominal-mode schedule under which the following constraints are satisfied.

Constraint 1. $\forall i, j$: $\lambda_{i,j} \ge \lambda^R$. (e2e reliability requirements for all tasks)

Constraint 2. $\forall i, j$: $f_{i,j} \le d_{i,j}$. (e2e timing requirements for all tasks)

P2′: Given the packet set, $\Psi$, in the rhythmic mode under consideration, the PDR function of each task $\tau_i$, and other network related constraints, determine the rhythmic-mode schedule such that $\sum_{\chi_{i,j} \in \Psi} \max\{0, \lambda^R - \lambda_{i,j}\}$ is minimized, with the following constraints being satisfied.

Constraint 3. $\forall \tau_i \in T_{Rhy}$: $\lambda_{i,j} \ge \lambda^R$. (e2e reliability requirements for rhythmic tasks)

Constraint 4. $\forall \tau_i \in T_{Rhy}$: $f_{i,j} \le d_{i,j}$. (e2e timing requirements for rhythmic tasks)

Here we use P2′ instead of P2 as we have not discussed the network constraints.
These constraints are elaborated in Sections 3 and 4, where formal definitions of P2 are given.

[Figure 2: Overview of the execution model of RD-PaS. Power on and initialization (compute $\lambda^*_i(w)$, $\vec{R}^*_i(w)$, $w_i^+$); network starts in the system nominal mode (all packets are reliable); disturbance detected, followed by a broadcast of the rhythmic tasks information and the schedule update; system rhythmic mode during $[t_{sp}, t_{ep})$ (some packets are not reliable, but QoS degradation is minimized); return to the system nominal mode (all packets are reliable).]

We propose a reliable dynamic packet scheduling framework, referred to as RD-PaS, to address the questions raised above. An overview of the execution model of RD-PaS is shown in Fig. 2. Below we focus on a high-level discussion and leave the detailed explanation of the symbols to Section 3.

In the network initialization phase, each device node stores the necessary specification information of all tasks (i.e., $H_i$, $D_i$, $P_i$ and $\lambda^R$) locally after receiving it from the network manager through broadcast packets. Each device node then calculates the number of time slots to be allocated to each task (for both transmission and retransmission) in order to achieve the required e2e PDR value $\lambda^R$. After the network starts, each device node generates a static schedule locally, following which all tasks can meet their timing and reliability requirements. By locally generating a static schedule, no bandwidth is wasted on transmitting the schedule from the gateway.

When a disturbance occurs, several sensor nodes within the range may detect it and send a report to the controller node via the corresponding tasks. After the controller node receives the disturbance information from any of the sensor nodes, $V_c$ first determines a time duration, denoted as $[t_{sp}, t_{ep})$, during which the system runs in the rhythmic mode using a temporary dynamic schedule. As RD-PaS and D2-PaS in [23] both require each node to generate the schedule locally, RD-PaS adopts the same end point selection method as D2-PaS to determine the system rhythmic mode duration $[t_{sp}, t_{ep})$.
$V_c$ then checks whether all tasks can still be reliably delivered after the rhythmic tasks enter their rhythmic states. If so, $V_c$ only broadcasts the rhythmic tasks information (task IDs and the corresponding $\vec{P}_i$ and $\vec{D}_i$) to the network. Otherwise, $V_c$ needs to generate a dynamic schedule in which the number of time slots assigned to certain periodic packets is updated in order to accommodate the increased workload from the rhythmic tasks. $V_c$ then piggybacks the information of the updated packet set as well as the rhythmic tasks information onto a broadcast packet and disseminates it to all nodes in the network. After all the nodes receive the updates, the system switches to the rhythmic mode to handle the disturbance. In the rhythmic mode, individual device nodes generate their own dynamic schedules locally and these local schedules collaboratively guarantee the timing and reliability requirements of the rhythmic packets while minimizing the total reliability degradation suffered by other periodic tasks. After executing the dynamic schedules, all the device nodes return to the nominal mode and re-employ the static schedule.

In the following, we first present the details of the RD-PaS framework for the TBS model in Section 3. We then introduce the modifications required to support the RD-PaS framework for the PBS model in Section 4.

3 Reliable Scheduling for TBS

This section focuses on reliable scheduling for the Transmission-based Scheduling (TBS) model. We first describe how RD-PaS constructs a reliable static schedule in the system nominal mode. We then introduce how RD-PaS handles disturbances in the rhythmic mode.

An RTWN starts running in the nominal mode, in which all tasks need to 1) be reliably scheduled to achieve the required e2e PDRs; and 2) meet the e2e timing constraints for all the packet transmissions. That is, we need to solve P1 defined in Section 2.1. In the TBS model, each specific time slot is assigned to an individual packet transmission.
Considering the lossy nature of wireless links, when a transmission is not successful, retransmissions are needed, which require extra time slots. To reduce the demand on network resources, we aim to minimize the number of extra slots for each task while satisfying the reliability requirement (i.e., Constraint 1 in P1). On the other hand, we observe that Constraint 2 can be handled separately from Constraint 1 since satisfying Constraint 2 can be treated as a standard transmission scheduling problem once the number of extra time slots is determined for each task. Thus, we intend to first tackle the following sub-problem.

P1.1: Given RTWN $G = (V, E)$ where each link $e \in E$ has an associated PDR, and task set $T$ in which each task $\tau_i$ has a single routing path $\vec{L}_i$, determine the minimum number of extra slots needed by each task $\tau_i$ for satisfying Constraint 1.

To solve P1.1, we propose to first determine whether a given number of extra time slots for each task can satisfy Constraint 1 and then search for the optimal number of extra time slots for every task. We will prove later that this approach indeed leads to an exact solution for P1.1. We discuss our approach in detail below.

Let $\vec{R}_{i,j} = [R_{i,j}[0], R_{i,j}[1], \ldots, R_{i,j}[H_i - 1]]$ be the retry vector of packet $\chi_{i,j}$, where $R_{i,j}[h]$ denotes the number of time slots assigned to hop $h$ of $\chi_{i,j}$. We use $W_{i,j}$ to denote the total number of time slots assigned to $\chi_{i,j}$, i.e., $W_{i,j} = \sum_{h=0}^{H_i-1} R_{i,j}[h]$. Given the PDRs of all the links along the routing path of $\tau_i$ and the retry vector of $\chi_{i,j}$, the e2e PDR of $\chi_{i,j}$, $\lambda_{i,j}$, can be derived as given in Eq. (1).

According to Constraints 1 and 2 in P1, all the packets released by $\tau_i$ must meet the same timing and reliability requirements in the system nominal mode. Thus, in the following discussion, we only consider parameter settings (including both the assigned number of slots and the retry vector) for each individual task $\tau_i$ instead of each packet $\chi_{i,j}$.
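
As a minimal sketch of how the e2e PDR of Eq. (1) can be evaluated, assume the per-hop link PDRs and the retry vector are available as plain Python lists. The e2e PDR is then the product over hops of the probability that at least one of the slots assigned to that hop succeeds, $1 - (1 - \lambda^L_{L_i[h]})^{R_{i,j}[h]}$. The function name `e2e_pdr` and its argument names are illustrative assumptions and do not come from the paper.

```python
from math import prod


def e2e_pdr(link_pdrs, retry_vector):
    """Return the end-to-end PDR of one packet under TBS.

    link_pdrs[h]    -- PDR of the link used at hop h (assumed in (0, 1])
    retry_vector[h] -- number of slots assigned to hop h (assumed >= 1)
    """
    if len(link_pdrs) != len(retry_vector):
        raise ValueError("one retry count is needed per hop")
    # A hop fails only if every one of its assigned slots fails.
    return prod(
        1.0 - (1.0 - pdr) ** slots
        for pdr, slots in zip(link_pdrs, retry_vector)
    )


# Two-hop path with link PDRs 0.9 and 0.8, two slots per hop.
example = e2e_pdr([0.9, 0.8], [2, 2])
```

With one slot per hop the function reduces to the product of the link PDRs; each extra slot on a hop only raises that hop's factor, so the e2e PDR never decreases as slots are added.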
For a given number of slots, say $w$, assigned to $\tau_i$, the number of possible slot allocations, i.e., retry vectors, equals $\binom{w-1}{H_i-1}$. We further introduce the following definitions.

Definition 1. Optimal Retry Vector $\vec{R}^*_i(w)$: An optimal retry vector of task $\tau_i$ for a given number of slots $w$ is the retry vector that leads to the largest PDR value for the given $w$, denoted as $\lambda^*_i(w)$, among all the possible allocations.

Definition 2. Optimal Retry Vector Function $\vec{R}^*_i(\cdot)$: The optimal retry vector function of task $\tau_i$ is the set of pairs $(w, \vec{R}^*_i(w))$ such that each $\vec{R}^*_i(w)$ is the optimal retry vector for the given number of slots $w$.

Definition 3. Optimal PDR Function $\lambda^*_i(\cdot)$: The optimal PDR function of task $\tau_i$ is the set of pairs $(w, \lambda^*_i(w))$ such that each PDR value $\lambda^*_i(w)$ corresponds to the optimal retry vector with the given number of slots $w$.

Eq. (1), the e2e PDR of packet $\chi_{i,j}$:

$$\lambda_{i,j} = \prod_{h=0}^{H_i-1} \left(1 - \left(1 - \lambda^L_{L_i[h]}\right)^{R_{i,j}[h]}\right) \qquad (1)$$

Algorithm 1: PDR Table Computation under TBS for Task $\tau_i$.
Input: $G = (V, E)$, $\tau_i$, $\lambda^R$
Output: PDR table of $\tau_i$ and $w_i^+$

As the first step towards satisfying Constraint 1, we present our solution to calculate the optimal retry vector function $\vec{R}^*_i(\cdot)$ and the optimal PDR function $\lambda^*_i(\cdot)$ for each task $\tau_i$. As both functions are only related to task $\tau_i$ itself, the computation for each task is independent. For the sake of clarity, we create a PDR table for each task $\tau_i$ to store both $\vec{R}^*_i(\cdot)$ and $\lambda^*_i(\cdot)$ for all (needed) values of $w$ in each node; the resulting overhead in our implementation is given in Section 5. (An example PDR table can be found in Table 4 in Section 5.) Below, we describe our optimal PDR table generation algorithm, Alg. 1, and prove its optimality.

Alg. 1 iteratively constructs the PDR table. At each iteration, we add one time slot to $\tau_i$ at the $h$-th hop that yields the maximum PDR value $\lambda^*_i$ and store the resulting retry vector $\vec{R}^*_i$ into the PDR table (Lines 5-7). The retry vector is initially set to $[1, 1, 1, \ldots]$ and the corresponding PDR value equals $\prod_{h=0}^{H_i-1} \lambda^L_{L_i[h]}$ (Lines 1-3). Since the required PDR value is $\lambda^R$, the iterative process stops when $\lambda^*_i(w) \ge \lambda^R$. We use $w_i^+$ to denote the minimum number of slots that guarantees the reliable delivery of $\tau_i$. Lemma 4 and Theorem 5 below affirm that Alg. 1 indeed results in the optimal retry vector function $\vec{R}^*_i(\cdot)$ and optimal PDR function $\lambda^*_i(\cdot)$.

Lemma 4. Let $G(R^*(w)[h], \lambda^L_{L[h]}) = \frac{\lambda^*(w+1)}{\lambda^*(w)}$ be a function of $R^*(w)[h]$ and $\lambda^L_{L[h]}$. When $\lambda^L_{L[h]}$ is set to an arbitrary value $\lambda_0$, $G_{\lambda_0} = G(R^*(w)[h], \lambda_0)$ is a monotonically decreasing function of $R^*(w)[h]$.

Proof of Lemma 4. If we update $\vec{R}^*(w)$ by allocating one slot at an arbitrary hop $h$, then according to Eq. (1) we only need to update $\lambda^*(w)$ by replacing the term $1 - (1 - \lambda^L_{L[h]})^{R^*(w)[h]}$ by $1 - (1 - \lambda^L_{L[h]})^{R^*(w)[h]+1}$ to get $\lambda^*(w+1)$. That is,

$$G(R^*(w)[h], \lambda^L_{L[h]}) = \frac{\lambda^*(w+1)}{\lambda^*(w)} = \frac{1 - (1 - \lambda^L_{L[h]})^{R^*(w)[h]+1}}{1 - (1 - \lambda^L_{L[h]})^{R^*(w)[h]}}$$

Thus, if $\lambda^L_{L[h]}$ is fixed to $\lambda_0$, we have:

$$G'_{\lambda_0} = \frac{\partial G(R^*(w)[h], \lambda_0)}{\partial R^*(w)[h]} = \frac{\lambda_0 \, (1 - \lambda_0)^{R^*(w)[h]} \log(1 - \lambda_0)}{\left((1 - \lambda_0)^{R^*(w)[h]} - 1\right)^2}$$

Since $0 < \lambda^L_{L[h]} < 1$ and $(1 - \lambda^L_{L[h]})^{R^*(w)[h]} > 0$, we have $G'_{\lambda_0} < 0$. Hence, $G_{\lambda_0}$ decreases monotonically as $R^*(w)[h]$ increases. ∎

Theorem 5. For any given number of time slots $w$, no other retry vector can yield a larger PDR value than $\vec{R}^*_i(w)$ as computed by Alg. 1.

Proof of Theorem 5. We prove the theorem by mathematical induction, i.e., for any $w = H, H + 1, \ldots, w^+$, the retry vector $\vec{R}^*(w)$ determined by Alg. 1 can achieve the largest PDR value $\lambda^*(w)$. (Here we omit the task index $i$ since only one task is considered.)

Base case: When $w = H$, the statement holds as only one possible retry vector exists, i.e., $\vec{R}^*(H) = [1, 1, \ldots, 1]$.

Inductive step: Suppose the PDR value of $\vec{R}^*(w)$ is the largest among those of all possible retry vectors when $w = k$. We should prove that the PDR value of $\vec{R}^*(k + 1)$ obtained by Alg. 1, i.e., $\lambda^*(k + 1)$, is also the largest. We prove this by contradiction.
Suppose there exists another retry vector (denoted as $\vec{R}^o(k + 1)$) that leads to a larger PDR value, i.e., $\lambda^*(k + 1) < \lambda^o(k + 1)$. Since the total numbers of slots assigned to the task (i.e., the sums of all elements in the retry vectors) both equal $k + 1$ and $\vec{R}^*(k + 1) \ne \vec{R}^o(k + 1)$, we can always find one hop at which the number of assigned slots in $\vec{R}^o(k + 1)$ is larger than that in $\vec{R}^*(k + 1)$. We use $q$ to denote this hop index and $R^o(k)[q]$ to denote the number of slots assigned at the $q$-th hop in $\vec{R}^o(k)$. Then, $R^o(k + 1)[q] > R^*(k + 1)[q]$. Suppose $\vec{R}^*(k + 1)$ is achieved by adding one slot at the $p$-th hop in $\vec{R}^*(k)$.

Case 1: $p = q$. In this case, $\vec{R}^*(k + 1)$ and $\vec{R}^o(k + 1)$ are both achieved by adding one slot at the $p$-th hop in $\vec{R}^*(k)$ and $\vec{R}^o(k)$, respectively. Then, according to Lemma 4, $\lambda^*(k + 1)$ and $\lambda^o(k + 1)$ can be rewritten with the function $G(R^*(w)[h], \lambda^L_{L[h]})$ as follows:

$$\lambda^*(k + 1) = \lambda^*(k) \cdot G(R^*(k)[p], \lambda^L_{L[p]}), \qquad \lambda^o(k + 1) = \lambda^o(k) \cdot G(R^o(k)[p], \lambda^L_{L[p]}).$$

According to the assumption that the PDR value of $\vec{R}^*(k)$ is the largest, we have $\lambda^*(k) \ge \lambda^o(k)$. Since $R^*(k)[p] < R^o(k)[p]$, according to Lemma 4, we have $G(R^*(k)[p], \lambda^L_{L[p]}) > G(R^o(k)[p], \lambda^L_{L[p]})$. Then, $\lambda^*(k + 1) > \lambda^o(k + 1)$. This contradicts our assumption.

Case 2: $p \ne q$. $\lambda^*(k + 1)$ and $\lambda^o(k + 1)$ can be rewritten as:

$$\lambda^*(k + 1) = \lambda^*(k) \cdot G(R^*(k)[p], \lambda^L_{L[p]}), \qquad \lambda^o(k + 1) = \lambda^o(k) \cdot G(R^o(k)[q], \lambda^L_{L[q]}).$$

As $\lambda^*(k + 1) < \lambda^o(k + 1)$ and $\lambda^*(k) \ge \lambda^o(k)$, it must hold that

$$G(R^*(k)[p], \lambda^L_{L[p]}) < G(R^o(k)[q], \lambda^L_{L[q]}). \qquad (2)$$

Since $R^*(k)[q] < R^o(k)[q]$ according to the assumption, the following inequality holds:

$$G(R^*(k)[q], \lambda^L_{L[q]}) > G(R^o(k)[q], \lambda^L_{L[q]}). \qquad (3)$$

Combining Eq. (2) and Eq. (3), we have $G(R^*(k)[p], \lambda^L_{L[p]}) < G(R^*(k)[q], \lambda^L_{L[q]})$. Further, $\lambda^*(k) \cdot G(R^*(k)[p], \lambda^L_{L[p]}) < \lambda^*(k) \cdot G(R^*(k)[q], \lambda^L_{L[q]})$. This means that if we allocate one slot at the $q$-th hop in $\vec{R}^*(k)$ instead of at the $p$-th hop, we can have a larger PDR value. This contradicts Alg. 1, which allocates one slot at the hop that yields the largest PDR value at each iteration.

Since both cases lead to a contradiction, the inductive step is proved. Thus, Theorem 5 holds for all values of $w$. ∎

Now with the functions $\vec{R}^*_i(\cdot)$ and $\lambda^*_i(\cdot)$ being determined, we have successfully solved P1.1. To satisfy Constraint 2 in P1, we need to create a static schedule, i.e., specify when a packet uses a slot, to ensure that real-time constraints are met. We introduce an observation that helps map the reliable static schedule generation problem, i.e., P1, to a conventional real-time scheduling problem.

Observation 1. Given task set $T$ to be reliably scheduled, if we set the number of slots for $\tau_i$ to $w_i^+$ according to $\lambda^*_i(\cdot)$ (see footnote 4), $w_i^+$ is then equivalent to the execution time of $\tau_i$. Then, each task $\tau_i \in T$ with $P_i$, $D_i$ and $w_i^+$ can be mapped to a task in a conventional real-time task set with the same period, deadline and execution time. Thus, a feasible schedule for the corresponding conventional real-time task set is also a feasible schedule under which all tasks in $T$ can be reliably delivered.

Given the schedule specifying the slot assignment for each task, each node can further allocate specific slots to the transmission at each hop according to the retry vector function $\vec{R}^*_i(\cdot)$. Thus, given a task set to be reliably scheduled in an RTWN, the network can adopt any conventional real-time scheduling algorithm to generate a static schedule that is guaranteed to meet all the constraints in P1. Since we allow at most one transmission within each time slot, determining the nominal-mode schedule (i.e., P1) can be mapped to a uniprocessor scheduling problem. Here, we adopt Earliest-Deadline-First (EDF) [13] to generate an optimal schedule for the tasks and assign time slots to transmissions according to the retry vector, consistently at each node.

Note that regarding the broadcast task, two more issues need to be considered.
First, the transmission of a broadcast packet at each hop involves one sender node but multiple receiver nodes. Second, no acknowledgement is sent back from the receiver nodes in a broadcast slot. The first issue mainly affects the number of slots assigned at each hop since multiple links with different link PDRs are involved. To tackle this, we directly adopt the lowest link PDR to determine the number of retries assigned at the hop. Due to the second issue, the sender node does not have any knowledge about whether the current transmission succeeds. Thus, we just let the sender node to keep transmitting at all the slots assigned to the current hop to maximize the success probability. 3.2 Reliable Dynamic Scheduling Our proposed solution for P1 ensures that both timing and reliability requirements are met in the system nominal mode. However, upon the detection of any disturbance, the corresponding tasks enter their rhythmic states and follow new release patterns and deadlines as shown in Fig. 2. The static schedule may no longer be able to meet both requirements especially for all the critical rhythmic packets. Therefore, a well-designed reliable dynamic packet scheduling mechanism is needed to enable the system to be adaptive to any workload change after the detection of a disturbance. In our RD-PaS framework, the network generates the static schedule by assigning wi+ slots to each task ?i according to the retry vector function. When a disturbance is detected and reported to the control node, the system follows the execution model outlined in Section 2.2 4 All the retry vectors for other w values stored in ??Ri?(?) are used in the dynamic schedule generation, which will be discussed in Section 3.2. to switch to the rhythmic mode. The main challenge here is to generate a temporary dynamic schedule when tasks cannot be reliably delivered after the rhythmic tasks (in TRhy) enter their rhythmic states. That is, problem P2? needs to be solved. 
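Both the static schedule and the dynamic adjustment rely on a per-task PDR table that stores a retry vector and an e2e PDR value for each slot count $w$. The following Python sketch illustrates how such a table could be built with the greedy rule that the text attributes to Alg. 1, using the product form of the e2e PDR and the gain function $G$ from Lemma 4. It is an illustration under assumptions, not the authors' implementation: the function names, the `max_slots` safety cap, and the example link PDRs are hypothetical, and every link PDR is assumed to lie in (0, 1].

```python
from math import prod


def e2e_pdr(retries, link_pdrs):
    """End-to-end PDR when hop h may be tried retries[h] times (product form)."""
    return prod(1.0 - (1.0 - p) ** r for p, r in zip(link_pdrs, retries))


def slot_gain(r, p):
    """G from Lemma 4: factor by which the e2e PDR grows when a hop with
    link PDR p goes from r to r + 1 assigned slots."""
    return (1.0 - (1.0 - p) ** (r + 1)) / (1.0 - (1.0 - p) ** r)


def build_pdr_table(link_pdrs, required_pdr, max_slots=1000):
    """Greedy table: start with one slot per hop, then repeatedly give one
    more slot to the hop with the largest gain until required_pdr is met.

    Returns (table, w_plus) where table[w] = (retry_vector, e2e_pdr).
    max_slots is a hypothetical safety cap, not part of the paper.
    """
    hops = len(link_pdrs)
    retries = [1] * hops
    w = hops
    table = {w: (tuple(retries), e2e_pdr(retries, link_pdrs))}
    while table[w][1] < required_pdr and w < max_slots:
        best = max(range(hops), key=lambda h: slot_gain(retries[h], link_pdrs[h]))
        retries[best] += 1
        w += 1
        table[w] = (tuple(retries), e2e_pdr(retries, link_pdrs))
    return table, w


# Hypothetical two-hop task with link PDRs 0.9 and 0.8 and a 99% requirement.
table, w_plus = build_pdr_table([0.9, 0.8], required_pdr=0.99)
```

For these two hypothetical links the loop stops at six slots with retry vector (3, 3). For a broadcast task, the same function could be called with the lowest link PDR among the receivers at each hop, as described in the text.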
The dynamic schedule must be able to accommodate the increased rhythmic workload and minimize the degradation of both timing and reliability of the other periodic tasks. Specifically, all the rhythmic packets must meet their timing and reliability requirements. That is, Constraints 3 and 4 must be satisfied. To ensure this, we may have to sacrifice the reliability requirements, i.e., lower the e2e PDR values of some periodic packets, or even sacrifice their timing requirements, i.e., drop some periodic packets. That is, the number of slots assigned to each packet may need to be updated. Since the PDR table for each task, containing both the retry vector function $\vec{R}_i^*(\cdot)$ and the PDR function $\lambda_i^*(\cdot)$, is pre-calculated and stored at each node, $V_c$ only needs to piggyback on a broadcast packet the updated total number of slots ($W_{i,j}$) assigned to each periodic packet. Each node can then decode the updated retry vector once it receives this information (see footnote 5).

Footnote 5: In the system rhythmic mode, we adjust the assigned number of slots for each packet instead of each task for more flexibility.

To formally define the dynamic schedule generation problem, we introduce some concepts and notation. Let $\Psi$ denote the active packet set containing all the packets to be scheduled within the rhythmic mode duration $[t_{sp}, t_{ep})$. Since the payload size of a broadcast packet is bounded, we set an upper bound on the number of periodic packets whose $W_{i,j}$ can be changed, and denote it as $\Delta$. To capture the reliability degradation for periodic packet $\chi_{i,j}$, let $\delta_{i,j}$ represent the difference between the required PDR $\lambda^R$ and the updated PDR value $\lambda_{i,j} = \lambda_i^*(W_{i,j})$ in the dynamic schedule, i.e., $\delta_{i,j} = \max\{0, \lambda^R - \lambda_{i,j}\}$. Note that the timing degradation of each packet can also be captured by $\delta_{i,j}$, where $\delta_{i,j} = \lambda^R$ if $\chi_{i,j}$ is dropped. The dynamic schedule generation problem, which is defined formally below, then becomes finding $W_{i,j}$ for each periodic packet in $\Psi$ to satisfy Constraints 3 and 4.

P2: Given the active packet set $\Psi$, the PDR function $\lambda_i^*(\cdot)$ of each task $\tau_i$, and the maximum allowed number of updated packets $\Delta$, determine the updated packet set $\rho = \{W_{i,j} \mid \chi_{i,j} \in \Psi\}$ such that i) the size of $\rho$ is not larger than $\Delta$, i.e., $|\rho| \le \Delta$, and ii) the total reliability degradation is minimized, i.e., $\min \sum_{\chi_{i,j} \in \Psi} \delta_{i,j}$.

Algorithm 2 Updated Packet Set Generation.

The theorem below states that determining the updated packet set, i.e., solving P2, is non-trivial.

Theorem 6. The updated packet set generation problem P2, i.e., the dynamic schedule generation problem, is NP-hard.

Proof of Theorem 6. We prove the theorem by reducing the 0-1 knapsack problem [15] to a special case of the updated packet set generation problem. The 0-1 knapsack problem is defined as follows: Given a set of $n$ items numbered from 1 to $n$, each with a weight $w_i$ and a value $v_i$, along with a maximum weight capacity $W$, each item can either be included in the knapsack, denoted as $x_i = 1$, or not, denoted as $x_i = 0$. The 0-1 knapsack problem is to maximize the sum of the values of the items in the knapsack, i.e., $\max \sum_{i=1}^{n} v_i x_i$, so that the sum of the weights is less than or equal to the knapsack's capacity $W$, i.e., $\sum_{i=1}^{n} w_i x_i \le W$ and $x_i \in \{0, 1\}$.

Given a knapsack problem, we construct a special case of the updated packet set generation problem in polynomial time: Suppose the active packet set is $\Psi = \{\chi_1, \chi_2, \ldots, \chi_n\}$ such that $\forall \chi_i \in \Psi$, $r_i = 0$, $D_i = W$, $H_i = w_i$. Each packet $\chi_i$ can either be scheduled, i.e., $\lambda_i = v_i$, or dropped, i.e., $\lambda_i = 0$. Let the required PDR value $\lambda^R$ for all packets equal $\max\{v_i\}$. Then, the PDR degradation is $\delta_i = \lambda^R - v_i$ if $\chi_i$ is scheduled. Otherwise, $\delta_i = \lambda^R$.
As minimizing the total PDR degradation for all packets is equivalent to maximizing the total PDR value, the updated packet set with the minimum total PDR degradation can be determined if and only if a knapsack with the maximum value can be identified.

Next we propose a heuristic to solve P2. The high-level idea is as follows. Since dropping any packet $\chi_{i,j}$ leads to a significant decrease in the PDR value of $\chi_{i,j}$, i.e., $\delta_{i,j} = \lambda^R$, we always prefer to allocate at least the basic number of slots (i.e., $H_i$) to each packet. If the network bandwidth is sufficient, we assign extra slots to periodic packets in a greedy manner according to their PDR degradation. Alg. 2 summarizes the updated packet set generation algorithm, which uses the greedy extra slots assignment heuristic described in Alg. 3. Specifically, at each iteration, Alg. 3 adds one slot to the packet that results in the minimum PDR degradation after the extra slot has been assigned. Using Alg. 2 and Alg. 3, the updated packet set can be determined in $O(\Delta \cdot W_{max})$ time, where $W_{max}$ is the maximum $w_i^+$ among all the tasks.

Note that the proposed RD-PaS framework can be readily extended to handle concurrent disturbances in RTWNs, in a way similar to that elaborated in [24]. Specifically, we need to handle two cases depending on the relative positions of any two consecutive disturbances [24]. The first case is when both disturbances occur before an upcoming broadcast slot. Then, $V_c$ simply generates a dynamic schedule that considers all rhythmic tasks triggered by the two disturbances and handles them together. The second case is when a subsequent disturbance arrives at $V_c$ after the dynamic schedule information for handling the first disturbance has been broadcast. In this case, $V_c$ must update the dynamic schedule starting from the next broadcast slot. Readers are referred to [24] for the details.

Algorithm 3 Extra Slots Assignment.
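The greedy idea behind the extra slots assignment can be illustrated with a small Python sketch. It is not a transcription of Alg. 2 or Alg. 3: it assumes that every packet already holds its basic $H_i$ slots, it replaces the real schedulability check with a single hypothetical budget `free_slots`, it ignores the bound on the number of updated packets, and it interprets "minimum PDR degradation after an extra slot" as choosing the packet whose next slot removes the most degradation. All names and numbers are illustrative.

```python
def degradation(pdr, required_pdr):
    """Reliability degradation of one packet: max{0, required - achieved}."""
    return max(0.0, required_pdr - pdr)


def assign_extra_slots(pdr_tables, min_slots, free_slots, required_pdr):
    """Greedily hand out up to free_slots extra slots.

    pdr_tables[k][W] is the e2e PDR of packet k when it owns W slots;
    min_slots[k] is the basic allocation (the hop count of packet k).
    Returns the list of slot counts W after the greedy pass.
    """
    alloc = list(min_slots)
    for _ in range(free_slots):
        best_gain, best_k = 0.0, None
        for k, table in enumerate(pdr_tables):
            nxt = alloc[k] + 1
            if nxt not in table:          # packet k already holds its maximum
                continue
            gain = (degradation(table[alloc[k]], required_pdr)
                    - degradation(table[nxt], required_pdr))
            if gain > best_gain:          # strictly better, ties keep the first
                best_gain, best_k = gain, k
        if best_k is None:                # no slot would reduce degradation
            break
        alloc[best_k] += 1
    return alloc


def total_degradation(pdr_tables, alloc, required_pdr):
    """Objective of P2 restricted to the packets in pdr_tables."""
    return sum(degradation(t[w], required_pdr) for t, w in zip(pdr_tables, alloc))


# Two hypothetical single-hop packets with illustrative PDR tables.
tables = [{1: 0.5, 2: 0.75, 3: 0.875}, {1: 0.9, 2: 0.99, 3: 0.999}]
alloc = assign_extra_slots(tables, min_slots=[1, 1], free_slots=3, required_pdr=0.99)
```

In this toy instance the first two extra slots go to the packet on the weaker link and the third to the other packet, after which no further slot would reduce the total degradation.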
In this section, we discuss how to support the RD-PaS framework for the packet-based scheduling (PBS) model. At the highest level, reliable scheduling for PBS differs from that for TBS in three main ways. First, since each time slot is assigned to a specific packet instead of a dedicated hop, the retry vector $\vec{R}_{i,j}$ and its function $\vec{R}_i^*(\cdot)$ are no longer needed. Second, the computation of the PDR function $\lambda_i^*(\cdot)$ is different because the time slot allocation mechanism has changed. Third, the retransmission mechanism of the broadcast task for TBS, i.e., keep transmitting using all assigned slots at each hop, does not work for PBS since each slot is dedicated not to a hop but to a packet.

Since the PDR function is a key parameter in checking reliability, we first describe how to compute the PDR value for a task with a given number of slots in PBS. Let $Pr_i(0, w)$ denote the probability that a packet of $\tau_i$ stays in the source node within $w$ slots, and let $Pr_i(h, w)$ denote the probability that a packet of $\tau_i$ has been transmitted to the receiver of the $h$-th hop along the routing path ($1 \le h \le H_i$) and has not been successfully forwarded, within $w$ slots. $Pr_i(h, w)$ can be computed by:

$$Pr_i(h, w) = \begin{cases} 1 & h = 0,\ w = 0 \\ \lambda^L_{L_i[h-1]} Pr_i(h-1, w-1) & h \ne 0,\ w = h \\ (1 - \lambda^L_{L_i[h]}) Pr_i(h, w-1) & h = 0,\ w \ne 0 \\ Pr_i(h, w-1) + \lambda^L_{L_i[h-1]} Pr_i(h-1, w-1) & h = H_i,\ w \ne h \\ (1 - \lambda^L_{L_i[h]}) Pr_i(h, w-1) + \lambda^L_{L_i[h-1]} Pr_i(h-1, w-1) & \text{otherwise.} \end{cases} \quad (4)$$

In Fig. 3, we use an example task with 2 hops (links $a$ and $b$ with PDR $\lambda^L_a$ and $\lambda^L_b$, respectively) and 4 slots to describe the computation of $Pr_i(h, w)$. As shown in the figure, $Pr_i(h, w)$ can be reached either from $Pr_i(h-1, w-1)$, followed by a successful transmission ($\lambda^L_{L_i[h-1]}$), or from $Pr_i(h, w-1)$, followed by a failed transmission ($1 - \lambda^L_{L_i[h]}$), except for boundary conditions. These boundary conditions include the following:

Case 1: When $h = 0$, $w = 0$, the source node generates a packet ($Pr_i(0, 0) = 1$).

Case 2: When $h \ne 0$, $w = h$, it is not possible for $Pr_i(h, w)$ to be reached from $Pr_i(h, w-1)$ ($Pr_i(1, 1)$, $Pr_i(2, 2)$ in the figure). Thus only $Pr_i(h-1, w-1)$ is considered.

Case 3: When $h = 0$, $w \ne 0$, it is not possible for $Pr_i(h, w)$ to be reached from $Pr_i(h-1, w-1)$ ($Pr_i(0, 1)$, $Pr_i(0, 2)$ in the figure). Thus only $Pr_i(h, w-1)$ is considered.

Case 4: When $h = H_i$, $w \ne h$, $Pr_i(h, w-1)$ always reaches $Pr_i(h, w)$ ($Pr_i(2, 3)$, $Pr_i(2, 4)$ in the figure).

Algorithm 4 PDR Table Computation under PBS for Task $\tau_i$.
Input: $G = (V, E)$, $\tau_i$, $\lambda^R$
Output: The PDR function of $\tau_i$ and $w_i^+$
    $w \leftarrow 0$;
    while $\lambda_i^*(w) < \lambda^R$ or $w < H_i$ do
        $w \leftarrow w + 1$;
        for $h = 0$ to $H_i$ do
            Compute $Pr_i(h, w)$ following Eq. (4);
        end for
        if $w \ge H_i$ then
            $\lambda_i^*(w) \leftarrow Pr_i(H_i, w)$;
        end if
    end while
    $w_i^+ \leftarrow w$

Different from TBS, which finds the optimal PDR values by using retry vectors for a given $w$, the PDR values in PBS are solely determined by $w$, i.e., $\lambda_i^*(w) = Pr_i(H_i, w)$. Based on Eq. (4), we propose a dynamic programming algorithm (Alg. 4) to compute $Pr_i(h, w)$ and finally $\lambda_i^*(w)$. In Alg. 4, the iteration starts from $w = 1$ and stops when $\lambda^R$ is reached. In each iteration, it computes all $Pr_i(h, w)$ for $0 \le h \le H_i$, and stores $Pr_i(H_i, w)$ in $\lambda_i^*(\cdot)$ if $w \ge H_i$.

After the PDR function is computed, we can apply the same methods proposed in Section 2.2 and 3 to generate the reliable static and dynamic schedules, respectively. More specifically, we use Observation 1 with the computed PDR function to generate a reliable static schedule, and use Alg. 2 and Alg. 3 to determine the updated $W$ in the rhythmic mode.

Now let us consider the broadcast task. Because link-layer multicast does not have ACKs and in PBS each slot is allocated to a packet instead of a hop, it is not possible for the broadcast task to track its progress.
Thus the broadcast task still needs to follow the TBS model. That is, for the broadcast task, we adopt the lowest link PDR for each hop among all the receivers, and use Alg. 1 to compute $\vec{R}_i^*(\cdot)$ and $\lambda_i^*(\cdot)$.

5 Testbed Implementation and Validation

To validate the functionality of the proposed RD-PaS framework in real-life RTWNs, we implemented RD-PaS on a 7-node RTWN testbed (see Fig. 4) running the 6TiSCH protocol. The testbed consists of seven CC2538 evaluation boards. One of these boards is configured as the controller node, while the others are configured as device nodes. A 16-channel 802.15.4 sniffer and an 8-channel logic analyzer are used to capture and analyze the activities of each device node. Our modified 6TiSCH stack uses 5KB more ROM and 2KB more RAM for implementing RD-PaS (in TBS and PBS). These are relatively small compared to the original 6TiSCH stack, which needs 69KB ROM and 6KB RAM. Due to the page limit, the implementation details of the RD-PaS framework are omitted. Below, we focus on the functional validation of RD-PaS on the testbed.

The testing topology is shown on the right side of Fig. 4. To attain the link PDRs specified in the topology, we implemented a random packet dropper at the MAC layer of each device node. Six tasks are installed in the testbed and the task specifications are summarized in Table 3. The desired e2e PDR for all the tasks, $\lambda^R$, is set to 99%. $\tau_0$, $\tau_1$, $\tau_2$ and $\tau_3$ are unicast tasks, $\tau_5$ is a broadcast task, and $\tau_4$ is a task that handles all network management packets. Since we always allocate two shared slots at the beginning of $\tau_4$'s period, we set $D_4 = 2$. For simplicity, only $\tau_0$ enters the rhythmic state when a rhythmic event occurs.

5.1 Validation of reliable static scheduling

To validate the static schedule construction in RD-PaS, we run the specified task set on the testing topology in the nominal mode under both TBS and PBS models.
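As a reference point for the PBS PDR tables used in this validation, the recurrence of Eq. (4) and the loop of Alg. 4 can be sketched in Python as follows. This is a minimal illustration rather than the testbed code: the function name, the `max_slots` safety cap and the example link PDRs are hypothetical. The sketch relies on the fact that unreachable states (fewer slots than hops) have probability zero, so the boundary cases of Eq. (4) collapse into a single update.

```python
def pbs_pdr_table(link_pdrs, required_pdr, max_slots=1000):
    """Dynamic program over Pr(h, w) for one task under PBS.

    link_pdrs[h] is the PDR of the link that forwards the packet from
    position h to position h + 1 on the routing path.
    Returns (table, w_plus) where table[w] = Pr(H, w) for w >= H.
    max_slots is a hypothetical cap so the loop always terminates.
    """
    hops = len(link_pdrs)
    prev = [1.0] + [0.0] * hops        # Pr(h, 0): packet sits at the source
    table = {}
    w = 0
    while w < max_slots:
        w += 1
        cur = [0.0] * (hops + 1)
        for h in range(hops + 1):
            # Packet was already at position h and its transmission failed;
            # at the destination (h == hops) it simply stays delivered.
            stay = prev[h] if h == hops else (1.0 - link_pdrs[h]) * prev[h]
            # Packet was at position h - 1 and the transmission succeeded.
            arrive = link_pdrs[h - 1] * prev[h - 1] if h > 0 else 0.0
            cur[h] = stay + arrive
        prev = cur
        if w >= hops:
            table[w] = cur[hops]
            if cur[hops] >= required_pdr:
                break
    return table, w


# Hypothetical two-hop task with link PDRs 0.9 and 0.8 and a 99% requirement.
pbs_table, pbs_w_plus = pbs_pdr_table([0.9, 0.8], required_pdr=0.99)
```

Only the previous column of the table is kept in memory, since Eq. (4) refers to slot $w - 1$ alone.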
The PDR tables computed by the testbed are exactly the same as those obtained from simulation. The PDR table for task $\tau_1$ is given in Table 4 (the others are not shown due to the page limit). The highlighted rows indicate the corresponding $w_i^+$ values for TBS ($w_i^+ = 13$) and PBS ($w_i^+ = 7$) when $\lambda^R$ is reached. We further test 5000 packets for each unicast and broadcast task under both models, and compare the actual e2e PDR values collected from the testbed with the simulated values from Alg. 1 and Alg. 4. These results are summarized in Table 5. $\tau_4$ is omitted from the table since it is a task dedicated to network management packets. It can be concluded from the table that the reliable static scheduling function in RD-PaS executes correctly, as the actual e2e PDRs are improved to the desired values ($\ge 99\%$) in both models in the presence of the specified packet loss. The slight differences between the measured and predicted e2e PDR values are expected due to the limited sample size.

5.2 Validation of reliable dynamic scheduling

To validate the functional correctness of reliable dynamic scheduling in RD-PaS on our testbed, we let the network trigger rhythmic events, use the logic analyzer to capture the radio activities through a physical pin on each device node, and plot the waveforms. We configure the network to enter the rhythmic mode at slot 720. The hyperperiod of the task set is 360 according to Table 3. (Rhythmic events can happen at any time. We chose this integer multiple of the hyperperiod to simplify the waveform demonstration.) Fig. 5 illustrates a sample waveform for 240 consecutive slots (slots 600-840) in the TBS model. (Both TBS and PBS models are validated. We present the results in the TBS model here for ease of explanation.) The network runs in the nominal mode for the first 120 time slots (Fig. 5b) and then switches to the rhythmic mode in the next 120 slots (Fig. 5c).
Seven waveforms represent the radio activities (transmitting, receiving, or listening) of the 7 nodes, as labeled on the left side of the figures. Each rising and falling edge in the Slot row (lower part of the figures) marks the start of a new time slot. In the Schedule row (lower part of the figures), slot assignments are indicated using different colors. From Fig. 5b, we observe that each task $\tau_i$ releases its packets according to $P_i$, and $w_i^+$ slots are allocated to each packet before its deadline (shown in the Schedule row). In each scheduled slot, the sender attempts to transmit the packet and may succeed (marked by the arrows). Although some attempts fail, all the packets are still delivered to the destination node because of the right amount of retransmission slots as determined by the reliable static scheduling function.

In Fig. 5c, $\tau_0$ enters the rhythmic state, and its period is reduced according to $\vec{P}_0$ given in Table 3. Also, as shown in the Schedule row, the $W_{i,j}$ values for $\tau_0$ do not change, while those for $\tau_1$, $\tau_2$, $\tau_3$, $\tau_5$ are reduced to [9, 9, 9], [4, 5, 5], [4, 4], [7], respectively. The $\vec{R}_{i,j}$ vectors are also selected correctly according to the updated $W_{i,j}$ values in the rhythmic mode, and all the packets from the rhythmic task ($\tau_0$) are successfully delivered to the destination. The captured results match the results from the simulation, which validates the correctness of the reliable dynamic scheduling function in RD-PaS.

6 Simulation-based Performance Evaluation

In this section, we evaluate the performance of RD-PaS through extensive simulations and compare RD-PaS with a state-of-the-art dynamic approach, D2-PaS (see footnote 6). The first three sets of simulations compare the packet delivery ratio, network bandwidth usage and number of extra slots produced by RD-PaS with those of D2-PaS. The last set of simulations studies the behavior of the rhythmic mode. We evaluate the reliability degradation by comparing RD-PaS with D2-PaS on handling disturbances in RTWNs.

Footnote 6: [26] shows that D2-PaS has a clear advantage in packet dropping performance compared to the fully distributed scheduling framework FD-PaS, so we omit the comparison between RD-PaS and FD-PaS. Also, since we have proved the optimality of our retransmission slot assignment in Sec. 3.1, we omit the comparison with the retransmission mechanism in [2] in the static setting.

Figure 6 PDR in D2-PaS framework.
Figure 8 Comparison of $w_i^+$ in TBS and PBS.
Figure 9 Comparison of the PDR degradation rate.

6.1 Comparison of Packet Delivery Ratio

As RD-PaS uses retransmission slots to guarantee the required e2e PDR value for each task, the system reliability is improved compared with a traditional scheduling framework that does not consider reliability. To quantify this improvement, we calculate the e2e PDR that results from applying D2-PaS over lossy links with randomly generated link PDRs. Since the e2e PDR for each task is independent, we use different settings to randomly generate tasks and compute the PDR value for each task. The number of hops for a task, $H$, is drawn from the uniform distribution over {1, 2, ..., 10}, and the PDR value of each link on the routing path is randomly generated by controlling the average value of the link PDR, $\lambda^L$, following a uniform distribution in {0.5, 0.55, ..., 1}. As periods and deadlines do not affect the packet delivery ratio, we only study the dependency of the PDR on $H$ and $\lambda^L$. Fig. 6 shows the e2e PDR of a task as a function of $\lambda^L$ and $H$.
Because RD-PaS can always guarantee the required PDR value, its results are always at the ceiling (above 99%) of the figure and are thus omitted. From Fig. 6, we can observe the large gap between RD-PaS and D2-PaS (60.6% on average) in guaranteeing the e2e PDR of the task.

6.2 Comparison of Network Bandwidth Usage

Allocating extra retransmission slots can significantly improve the reliability of packet delivery. However, it requires more network bandwidth, which may affect system schedulability. In this set of experiments, we study how efficiently the different scheduling frameworks use time slots to deliver packets, measured by throughput. Throughput is defined as the number of packets delivered per slot (PPS) and is the ratio between the e2e PDR value and the number of slots allocated to the task, i.e., $\lambda_i^*(w)/w$. The parameter settings of this set of experiments are the same as those in Section 6.1.

Fig. 7 summarizes the throughput of the different scheduling frameworks with varied average link PDR $\lambda^L$ and number of hops $H$ for the generated task. From the results, we can observe that D2-PaS has a higher throughput when $H$ is small and when $\lambda^L$ is close to 1. However, when the link PDR drops and $H$ increases, RD-PaS (in both TBS and PBS models) achieves better throughput. This is mainly because using a time slot for retransmission gains more throughput than transmitting a new packet in these cases. The simulation results also show that RD-PaS in the PBS model always achieves a better throughput than in the TBS model. The reason is that the PBS model can always achieve the same PDR with fewer slots than the TBS model, owing to its ability to share slots among the transmissions of a packet.

6.3 Comparison of Required Numbers of Slots

In this set of experiments, we further evaluate RD-PaS in the TBS and PBS models.
As discussed in Section 4, the PBS model provides more flexibility in the retransmission slot assignment, and a smaller number of slots, $w_i^+$, is required to achieve the same $\lambda^R$ as compared to the TBS model. Fig. 8 compares the required number of slots under different settings of average $\lambda^L$ and $H$, with the required end-to-end PDR value $\lambda^R$ set to 99%. As can be observed, tasks in the PBS model require fewer slots than in the TBS model when $H > 1$. The required number of slots in the PBS model is 55.0% less on average than that in the TBS model. This is consistent with the observation that one packet requires fewer slots to achieve the same $\lambda^R$ in the PBS model.

6.4 Effectiveness in Handling Rhythmic Events

To evaluate the performance of RD-PaS in handling rhythmic events, we compare the degradation rate (DR) of RD-PaS and D2-PaS. DR is defined as the ratio between the sum of the reliability degradation (i.e., $\delta_{i,j}$) of all periodic packets and the total number of generated periodic packets in the rhythmic mode. As D2-PaS does not consider unreliable wireless links, we first extend D2-PaS to support reliable transmission, denoted as eD2-PaS. Specifically, all packets in eD2-PaS are reliably transmitted using $w_i^+$ slots in the static schedule. In the dynamic schedule, the transmission and retransmission slots assigned to each packet are not differentiated, i.e., each packet can either be reliably scheduled or dropped.

To better control the system workload, we vary the nominal utilization of the task set. Specifically, we use a random periodic task set generated according to a target nominal utilization $U^*$. The generation of each random task $\tau_i$ is controlled by the following parameter settings: i) the number of hops $H_i$ is drawn from the uniform distribution over {2, 3, ..., 16}, and ii) the nominal period $P_i$ is equal to the deadline $D_i$ and follows a uniform distribution in {50, 51, ..., 100}.
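The degradation rate defined in this subsection can be computed as in the short Python sketch below. It is only an illustration of the definition: the function name and the sample values are hypothetical, a dropped packet is encoded as an achieved PDR of 0 (so that its degradation equals the required PDR), and the result is returned as a fraction rather than a percentage.

```python
def degradation_rate(updated_pdrs, required_pdr):
    """DR: summed reliability degradation of all periodic packets divided
    by the number of periodic packets generated in the rhythmic mode.

    updated_pdrs holds one achieved e2e PDR per periodic packet,
    with 0.0 standing for a dropped packet.
    """
    if not updated_pdrs:
        return 0.0
    total = sum(max(0.0, required_pdr - pdr) for pdr in updated_pdrs)
    return total / len(updated_pdrs)


# Four hypothetical periodic packets: two meet the 99% requirement,
# one is degraded to 95%, and one is dropped.
dr = degradation_rate([0.99, 0.95, 0.0, 0.999], required_pdr=0.99)
```

With this encoding, a policy that can only schedule a packet reliably or drop it contributes either 0 or the full required PDR per packet, whereas a policy that can shrink the slot count contributes intermediate values.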
As the simulation results in the last sub-section have shown, the PBS model requires fewer slots in total to achieve the same transmission reliability. Thus, here we use the PBS model to generate the PDR function $\lambda_i^*(\cdot)$ for each task $\tau_i$. After a task set is generated, we randomly select two tasks to be the rhythmic tasks. To better control the workload of the rhythmic event, we assume that all the rhythmic periods (deadlines) in $\vec{P}_i$ ($\vec{D}_i$) are the same and that the number of elements in $\vec{P}_i$ equals 10. The value of each element $P_{i,R}$ is thus controlled by the rhythmic period ratio, $\gamma = P_{i,R}/P_i$.

Fig. 9 shows the results of DR as a function of both the nominal task set utilization $U^*$ and the rhythmic period ratio $\gamma$. Each data point is the average value of 1,000 trials. From Fig. 9, we can observe that RD-PaS has a lower PDR degradation rate (58.4% on average) than eD2-PaS. The main reason is that eD2-PaS either schedules or drops any packet $\chi_{i,j}$, i.e., $W_{i,j} \in \{0, w_i^+\}$. In contrast, RD-PaS has more flexibility in tuning the number of slots assigned to $\chi_{i,j}$, i.e., $W_{i,j} \in \{0, H_i, \ldots, w_i^+\}$.

7 Conclusion and Future Work

In this paper, we present RD-PaS, a reliable dynamic packet scheduling framework for RTWNs. RD-PaS provides guaranteed reliability of packet delivery in RTWNs for both the transmission-based scheduling model and the packet-based scheduling model in a hybrid manner. In the presence of unexpected disturbances, RD-PaS judiciously adjusts the schedule dynamically to guarantee timely and reliable delivery of the critical rhythmic packets while minimizing the reliability degradation for noncritical packets. A provably optimal algorithm (for the static case) as well as a heuristic (for the dynamic case) are introduced for realizing RD-PaS. Extensive testbed and simulation-based experiments are conducted to validate the correctness and effectiveness of RD-PaS.
Our experimental results show that RD-PaS can significantly improve the QoS (in terms of reliability) compared with the state-of-the-art approaches. As future work, we will extend RD-PaS to further support RTWNs with multi-channel scheduling and multi-path routing capabilities, and evaluate its performance in large-scale RTWN testbeds.

References

[1] Johan Åkerberg, Mikael Gidlund, and Mats Björkman. Future research challenges in wireless sensor and actuator networks targeting industrial automation. In 2011 9th IEEE International Conference on Industrial Informatics, pages 410-415, July 2011. doi:10.1109/INDIN.2011.
[2] Ryan Brummet, Dolvara Gunatilaka, Dhruv Vyas, Octav Chipara, and Chenyang Lu. A Flexible Retransmission Policy for Industrial Wireless Sensor Actuator Networks. In 2018 IEEE International Conference on Industrial Internet (ICII), pages 79-88, October 2018. doi:10.1109/ICII.2018.00017.
[3] Yu Chen, Hongwei Zhang, Nathan Fisher, Le Yi Wang, and George Yin. Probabilistic Per-Packet Real-Time Guarantees for Wireless Networked Sensing and Control. IEEE Transactions on Industrial Informatics, 14(5):2133-2145, May 2018. doi:10.1109/TII.2018.2795567.
[4] Octav Chipara, Chengjie Wu, Chenyang Lu, and William Griswold. Interference-Aware Real-Time Flow Scheduling for Wireless Sensor Networks. In 2011 23rd Euromicro Conference on Real-Time Systems, pages 67-77, July 2011. doi:10.1109/ECRTS.2011.15.
[5] Li Da Xu, Wu He, and Shancang Li. Internet of Things in Industries: A Survey. IEEE Transactions on Industrial Informatics, 10(4):2233-2243, November 2014. doi:10.1109/TII.
[6] Diego Dujovne, Thomas Watteyne, Xavier Vilajosana, and Pascal Thubert. 6TiSCH: deterministic IP-enabled industrial internet (of things). IEEE Communications Magazine, 52(12):36-41, December 2014. doi:10.1109/MCOM.2014.6979984.
[7] Vehbi C Gungor, Gerhard P Hancke, et al. Industrial Wireless Sensor Networks: Challenges, Design Principles, and Technical Approaches. IEEE Transactions on Industrial Electronics, 56(10):4258-4265, October 2009. doi:10.1109/TIE.2009.2015754.
[8] Song Han, Xiuming Zhu, Aloysius K Mok, Deji Chen, and Mark Nixon. Reliable and Real-Time Communication in Industrial Wireless Mesh Networks. In 2011 17th IEEE Real-Time and Embedded Technology and Applications Symposium, pages 3-12, April 2011. doi:10.1109/RTAS.2011.9.
[9] Shengyan Hong, Xiaobo Sharon Hu, Tao Gong, and Song Han. On-Line Data Link Layer Scheduling in Wireless Networked Control Systems. In 2015 27th Euromicro Conference on Real-Time Systems, pages 57-66, July 2015. doi:10.1109/ECRTS.2015.13.
[10] ISA Standard. Wireless systems for industrial automation: process control and related applications. ISA-100.11a-2009, 2009.
[11] Junsung Kim, Karthik Lakshmanan, and Ragunathan Raj Rajkumar. Rhythmic Tasks: A New Task Model with Continually Varying Periods for Cyber-Physical Systems. In 2012 IEEE/ACM Third International Conference on Cyber-Physical Systems, pages 55-64, April 2012. doi:10.1109/ICCPS.2012.14.
[12] Bo Li, Lanshun Nie, Chengjie Wu, Humberto Gonzalez, and Chenyang Lu. Incorporating Emergency Alarms in Reliable Wireless Process Control. In Proceedings of the ACM/IEEE Sixth International Conference on Cyber-Physical Systems, ICCPS '15, pages 218-227, New York, NY, USA, 2015. ACM. doi:10.1145/2735960.2735983.
[13] Chung Laung Liu and James W Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. Journal of the ACM, 1973.
[14] Chenyang Lu, Abusayeed Saifullah, Bo Li, Mo Sha, Humberto Gonzalez, Dolvara Gunatilaka, Chengjie Wu, Lanshun Nie, and Yixin Chen. Real-Time Wireless Sensor-Actuator Networks for Industrial Cyber-Physical Systems. Proceedings of the IEEE, 104(5):1013-1024, May 2016. doi:10.1109/JPROC.2015.2497161.
[15] Silvano Martello, David Pisinger, and Paolo Toth. New trends in exact algorithms for the 0-1 knapsack problem. European Journal of Operational Research, 123(2):325-332, 2000. doi:10.1016/S0377-2217(99)00260-X.
[16] Abusayeed Saifullah, Dolvara Gunatilaka, Paras Tiwari, Mo Sha, Chenyang Lu, Bo Li, Chengjie Wu, and Yixin Chen. Schedulability Analysis under Graph Routing in WirelessHART Networks. In 2015 IEEE Real-Time Systems Symposium, pages 165-174, December 2015. doi:10.1109/RTSS.2015.23.
[17] Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen. Real-Time Scheduling for WirelessHART Networks. In 2010 31st IEEE Real-Time Systems Symposium, pages 150-159, November 2010. doi:10.1109/RTSS.2010.41.
[18] Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen. End-to-End Communication Delay Analysis in Industrial Wireless Networks. IEEE Transactions on Computers, 64(5):1361-1374, May 2015. doi:10.1109/TC.2014.2322609.
[19] Emiliano Sisinni, Abusayeed Saifullah, Song Han, Ulf Jennehag, and Mikael Gidlund. Industrial Internet of Things: Challenges, Opportunities, and Directions. IEEE Transactions on Industrial Informatics, 14(11):4724-4734, November 2018. doi:10.1109/TII.2018.2852491.
[20] WirelessHART: Applying wireless technology in real-time industrial process control. In 2008 IEEE Real-Time and Embedded Technology and Applications Symposium, pages 377-386, April 2008. doi:10.1109/RTAS.2008.15.
[21] Federico Terraneo, Paolo Polidori, Alberto Leva, and William Fornaciari. TDMH-MAC: Real-time and multi-hop in the same wireless MAC. In 2018 IEEE Real-Time Systems Symposium (RTSS), pages 277-287, December 2018. doi:10.1109/RTSS.2018.00044.
[22] Haibo Zhang, Pablo Soldati, and Mikael Johansson. Performance Bounds and Latency-Optimal Scheduling for Convergecast in WirelessHART Networks. IEEE Transactions on Wireless Communications, 12(6):2688-2696, June 2013. doi:10.1109/TWC.2013.050313.120543.
[23] Tianyu Zhang, Tao Gong, Chuancai Gu, Huayi Ji, Song Han, Qingxu Deng, and Xiaobo Sharon Hu. Distributed Dynamic Packet Scheduling for Handling Disturbances in Real-Time Wireless Networks. In 2017 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 261-272, April 2017. doi:10.1109/RTAS.2017.11.
[24] IEEE Transactions on Mobile Computing, pages 1-1, 2018. doi:10.1109/TMC.2018.2877681.
[25] Tianyu Zhang, Tao Gong, Song Han, Qingxu Deng, and Xiaobo Sharon Hu. Fully Distributed Packet Scheduling Framework for Handling Disturbances in Lossy Real-Time Wireless Networks, 2019. arXiv:1902.02023.
[26] Tianyu Zhang, Tao Gong, Zelin Yun, Song Han, Qingxu Deng, and Xiaobo Sharon Hu. FD-PaS: A fully distributed packet scheduling framework for handling disturbances in real-time wireless networks. In 2018 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 1-12, April 2018. doi:10.1109/RTAS.2018.00007.
[27] Adaptive Real-Time Communication for Wireless Cyber-Physical Systems. ACM Transactions on Cyber-Physical Systems, 1(2):8:1-8:29, February 2017. doi:10.1145/3012005.



Tao Gong, Tianyu Zhang, Xiaobo Sharon Hu, Qingxu Deng, Michael Lemmon, Song Han. Reliable Dynamic Packet Scheduling over Lossy Real-Time Wireless Networks, LIPICS - Leibniz International Proceedings in Informatics, 2019, 11:1-11:23, DOI: 10.4230/LIPIcs.ECRTS.2019.11