

journals.tubitak.gov.tr/elektr

Research Article

# Designing and implementing a reliable thermal monitoring system based on the 1-wire protocol on FPGA for a LEO satellite

# Reza Omidi GOSHEBLAGH\*, Karim MOHAMMADI

Department of Electrical Engineering, Iran University of Science and Technology (IUST), Narmak, Tehran, Iran

Received: 03.01.2013 • Accepted: 26.03.2013 • Published Online: 12.01.2015 • Printed: 09.02.2015

Abstract: Thermal control and monitoring is one of the most important factors in the design of satellite systems. An appropriate thermal design should ensure that the satellite's sensitive components remain within their nominal range, even under the vacuum conditions of outer space. To achieve this purpose, a reliable and stable monitoring system is required. This paper proposes a monitoring system based on the 1-wire protocol that meets the reliability requirements in both the sensor networking and bus controller sections. In the networking section, we outline some practical topologies and discuss their complexity and reliability. Although the point-to-point topology is very robust for communication structures, the reliability analyses show that the loop-tree topology is the best structure for 1-wire networking. In addition, this paper proposes a robust bus controller based on combined time redundancy and triple modular redundancy on field-programmable gate arrays. The fault injection experiments reveal that the proposed time-based redundancy yields better outcomes than hardware redundancy. Furthermore, the experiments show that the capability of tolerating single-event upset effects in the proposed method increases up to 7.8-fold with respect to a regular design.

Key words: Telemetry subsystem, 1-wire protocol, FPGA, satellite

# 1. Introduction

There is growing interest in using commercial off-the-shelf (COTS) devices within space systems due to their lower cost and availability compared to space-qualified ones. These devices, however, are sensitive to the harsh space environment. The safe use of these devices requires careful design considerations and effective mitigation techniques.

One of the major groups of COTS devices is 1-wire–based devices, which are widely used for monitoring purposes in ground-level applications. The 1-wire protocol is a simple, low-cost field bus that networks together various devices and sensors through only a single wire [1,2]. A monitoring system based on this protocol, as shown in Figure 1, involves both the physical and data link layers of data communication. The network wiring plays the role of the physical layer and the bus controller acts as the data link master. A reliable monitoring system requires a dependable design in both layers.

In the wiring section, a failure in the nodes results in the loss of a portion of the monitoring system. This can occur for several reasons, such as launch stresses, the collision of space debris with the satellite, and aging of the nodes. To cope with this problem, one can use redundant wiring, but the efficiency of the redundancy and the wiring complexity require more investigation. Previous works have been limited to the implementation and use of the 1-wire protocol. The aspect of redundant wiring is still poorly understood for the 1-wire protocol.

<sup>\*</sup>Correspondence: rezaomidi@iust.ac.ir

#### GOSHEBLAGH and MOHAMMADI/Turk J Elec Eng & Comp Sci



Figure 1. 1-Wire monitoring topology.

In the controller or data layer, the 1-wire protocol is protected by a cyclic redundancy check (CRC) code, which efficiently detects data communication errors. However, this section is also affected by unwanted space radiation faults, like single-event upset (SEU) and single-event transient (SET) faults. The usual approaches to mitigating these faults in commercial field-programmable gate arrays (FPGAs) are all based on hardware redundancy. The time-based redundancy proposed in this paper efficiently overcomes single-event faults.

In comparison with point-to-point distributed data acquisition approaches, 1-wire–based systems reduce cost, volume, and weight due to the reduced number of wires. Unfortunately, multipoint-based topologies, like 1-wire, have limited robustness compared to point-to-point forms. This implies a compromise between the wiring complexity and the mean time to failure (MTTF) for 1-wire networking.

To the best of our knowledge, few researchers have addressed the reliability of the 1-wire network. Therefore, this paper outlines some practical topologies for the networking of 1-wire objects and discusses their reliability. Aging of the connections, whose failure rate is calculated based on the MIL-217 standard, is assumed to be the major threat in the 1-wire networking section. Although the point-to-point topology is very robust for communication structures [2], we found that the loop-tree topology is the best structure for 1-wire communication.

After a satellite is launched into Earth orbit, the thermal vacuum and space radiation are the most important threats to the satellite mission. Therefore, the satellite's thermal design should guarantee that each part of the satellite remains within its nominal thermal range. To achieve this goal, a reliable monitoring system is required. Various temperature sensors and monitoring systems are available, and each kind has specific characteristics and a corresponding application scope [3]. A typical spacecraft uses dozens of temperature sensors to monitor and control the health status of its various components. Thermal sensing ranges from coarse indications of the box temperature to high-precision measurements for instrument calibration [4].

Analog-based thermal monitoring often requires more power and wiring complexity in comparison to digital-based systems. Digital temperature sensors are more reliable than the analog kind, because the signal conditioning is performed at the monitoring location. However, common digital sensors are incapable of creating a shared network, and therefore the wiring complexity remains an issue. The 1-wire thermal sensors are digital sensors that, in addition, have the capability of being networked. Consequently, these sensors can be the best choice for aerospace applications. However, as a drawback, software complexity somewhat increases for 1-wire–based thermal monitoring systems, especially when FPGAs are used as the bus master.

The 1-wire protocol for thermal monitoring has received much attention in recent years. However, as mentioned before, previous works have been limited to developing a practical thermal monitoring system based on the 1-wire protocol. The dependability aspect of this protocol for critical applications, like satellite systems, is still poorly understood. The contribution of this paper is a reliability-oriented monitoring system based on the 1-wire protocol. For the physical layer, a comprehensive reliability analysis is provided to determine the most robust networking topology. For the data layer, this paper uses triple-modular redundancy (TMR) accompanied by time redundancy. The efficiency of the proposed combined redundancy is verified through the fault injection approach. However, as a drawback, the proposed hardening technique requires about 3-fold more time to perform the data acquisition.

The remainder of this paper is organized as follows. In the next section, we outline the key features of 1-wire–based devices and sensors. The practical topologies for the networking section of the 1-wire–based monitoring system and their reliabilities are demonstrated in Section 3. Next, to develop a reliable bus master on the FPGA, we demonstrate a regular design in Section 4, which is the base module to implement the proposed method. To evaluate the efficiency of the time redundancy-based approach, the SEU injection experimental results are presented in Section 5. Finally, Section 6 concludes the paper.

# 2. The 1-wire protocol

The 1-wire is a device communication protocol designed by Dallas Semiconductor Corporation to ensure signal integrity. This protocol is similar to $I^2C$, but it has lower speed and lower power consumption. Therefore, it can be appropriate for applications in which power reduction is preferred to the data sampling rate, such as temperature monitoring in satellites. This protocol is widely used in ground-level applications. A temperature and humidity instrument based on 1-wire sensors was presented in [5], and a wireless form of this work was investigated in [6]. In addition, the application of 1-wire bus technology in the temperature monitoring of a rolling mill was considered in [7].

Although 1-wire reduces the wiring complexity of multichannel measuring systems, it requires more effort to realize the bus master or controller [8]. Each 1-wire device has a unique 64-bit serial code, which allows multiple devices to share a network. The 1-wire network is implemented as an open-drain bus, and so a single pull-up resistor is shared among all devices.

Several signal types are defined by this protocol, such as the reset pulse, presence pulse, write 0, write 1, read 0, and read 1. The bus master initiates all of these signals, except for the presence pulse.

All communications in this protocol start with an initialization sequence that consists of the master reset pulse followed by the slave presence pulse(s). For the reset pulse, the master pulls the 1-wire bus low for a minimum of 480 $\mu$s. The bus master then releases the bus and enters the wait state, which lasts from 15 to 60 $\mu$s. During this time, the pull-up resistor returns the 1-wire bus to the high level. After that, any slave device present on the bus transmits the presence pulse by pulling the 1-wire bus low for 60 to 240 $\mu$s.
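The timing limits of the initialization sequence can be captured in a small checker. The following is a minimal sketch, not the authors' implementation; the `InitTiming` container and the `check_init_timing` function are hypothetical names introduced here, and only the numeric limits come from the text.

```python
from dataclasses import dataclass

# Limits quoted in the text, in microseconds.
RESET_LOW_MIN_US = 480.0
WAIT_RANGE_US = (15.0, 60.0)
PRESENCE_LOW_RANGE_US = (60.0, 240.0)


@dataclass
class InitTiming:
    """Measured durations of one initialization sequence (hypothetical container)."""
    reset_low_us: float      # how long the master held the bus low
    wait_us: float           # released-bus time before the slave responds
    presence_low_us: float   # how long the slave held the bus low


def check_init_timing(t: InitTiming) -> list:
    """Return the list of violated limits; an empty list means the sequence is valid."""
    problems = []
    if t.reset_low_us < RESET_LOW_MIN_US:
        problems.append("reset pulse shorter than 480 us")
    if not WAIT_RANGE_US[0] <= t.wait_us <= WAIT_RANGE_US[1]:
        problems.append("wait time outside 15-60 us")
    if not PRESENCE_LOW_RANGE_US[0] <= t.presence_low_us <= PRESENCE_LOW_RANGE_US[1]:
        problems.append("presence pulse outside 60-240 us")
    return problems
```

For example, `check_init_timing(InitTiming(560, 30, 115))` returns an empty list, since a 560 $\mu$s reset pulse, a 30 $\mu$s wait, and a 115 $\mu$s presence pulse all fall inside the limits.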

### 3. Arrangement and reliability of the 1-wire network

Traditional wiring for the 1-wire consists of only 1 wire, and all of the devices and sensors are connected to the bus controller through this wire. We call this topology the bus structure, which requires minimum wiring complexity. To establish communication between the devices and the controller, other structures can also be used, such as the star and loop topologies. The star topology creates point-to-point wiring between each device and the controller. The wiring complexity is large in this structure. For the loop topology, the end of the bus is connected back to the bus controller. This structure creates a redundant path for each device to connect to the controller and approximately doubles the wiring complexity. Based on these basic structures, other combined topologies can be derived, such as the star-bus and star-loop, which are illustrated in Figure 2.



Figure 2. FSM of the 1-wire module.

To evaluate the reliability of the 1-wire network, the failure rate is calculated at each node. Each node consists of 2 connections: the wire to connector and wire to board connections. According to Eq. (1), the failure rate of the wire to connector connections is calculated [9].

$$\lambda_p = \lambda_b \pi_T \pi_K \pi_Q \pi_E \quad \text{failures}/10^6\ \text{hours} \tag{1}$$

Here, $\lambda_b$ is the base failure rate, which is equal to 0.001 for materials in class A (aluminum and ceramic). In addition, $\pi_T$ is the temperature factor; for 40 °C (the estimated maximum temperature for a sample low Earth orbit satellite), it is equal to 1.5. $\pi_K$ is the mating–unmating factor. $\pi_Q$ is the quality factor, which is equal to 1 for the military specification. Finally, $\pi_E$ is the environment factor; we set this parameter for the mission flight condition, for which it is equal to 0.5 [9].

According to Eq. (2), the failure rate of the wire to board connection is calculated as [9]:

$$\lambda_{W2B} = \lambda_b \left[ N_1 \pi_C + N_2 (\pi_C + 13) \right] \pi_Q \pi_E \quad \text{failures}/10^6\ \text{hours} \tag{2}$$

in which $\lambda_b$ is the base failure rate, which is equal to 0.00026 for discrete wiring with electroless deposited PTH. $N_1$ is the number of wave-soldered connections, which is zero here because wave soldering is not used. $N_2$ is the number of hand-soldered connections, which is set to 10. $\pi_C$ is the complexity factor; according to the standard, this factor is equal to 1 for discrete wiring. $\pi_Q$ is the quality factor; for lower quality, it is equal to 2. $\pi_E$ is the environment factor; we set this parameter for the mission flight condition, for which it is equal to 0.5 [9].

Finally, the total failure rate is equal to the sum of the failure rates in these 2 connections, which is represented in Eq. (3). The numeric value of the total failure rate of the 1-wire connections is represented in Table 1.

| Failure mechanism | Failure rate (failures/$10^6$ hours) | $\lambda_b$ | $\pi_P$ | $N_1$ | $N_2$ | $\pi_C$ | $\pi_T$ | $\pi_K$ | $\pi_Q$ | $\pi_E$ | $\lambda$ |
|-------------------|--------------------------------------|-------------|---------|-------|-------|---------|---------|---------|---------|---------|-----------|
| Male to female connector<sup>1</sup> | $\lambda_{M2F} = \lambda_b \pi_T \pi_K \pi_Q \pi_E$ | 0.001 | - | - | - | - | 1.3 | 4 | 1 | 0.5 | 0.0026 |
| Wire to board<sup>2</sup> | $\lambda_{W2B} = \lambda_b [N_1 \pi_C + N_2(\pi_C+13)]\pi_Q\pi_E$ | 0.00026 | - | 0 | 10 | 1 | - | - | 2 | 0.5 | 0.0364 |
| Total | $\lambda_t = \lambda_{M2F} + \lambda_{W2B}$ | - | - | - | - | - | - | - | - | 0.039 | 0.00414 |

Table 1. Total failure rate of the 1-wire connections (according to [9]).

<sup>1</sup>Circular, T = 40 °C, MIL-SPEC, mating/unmating >50, environment =  $S_F$ 

<sup>2</sup> Discrete wiring, quantity of hand soldered = 10, number of circuit planes = discrete wiring, lower quality, environment =  $S_F$ .

$$\lambda_{total} = \lambda_{Wire.2Con.} + \lambda_{Wire2Board} \tag{3}$$
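As a quick arithmetic check of Eqs. (1)–(3), the two failure rates can be recomputed from the factor values listed in Table 1. This is only an illustrative sketch: the function names are hypothetical, and the inputs are the Table 1 entries (note that the table lists $\pi_T = 1.3$ and $\pi_K = 4$).

```python
def connector_failure_rate(lambda_b, pi_t, pi_k, pi_q, pi_e):
    """Eq. (1): connector failure rate in failures per 10^6 hours."""
    return lambda_b * pi_t * pi_k * pi_q * pi_e


def wire_to_board_failure_rate(lambda_b, n1, n2, pi_c, pi_q, pi_e):
    """Eq. (2): wire-to-board failure rate in failures per 10^6 hours."""
    return lambda_b * (n1 * pi_c + n2 * (pi_c + 13)) * pi_q * pi_e


# Factor values as listed in Table 1.
lambda_m2f = connector_failure_rate(0.001, 1.3, 4, 1, 0.5)
lambda_w2b = wire_to_board_failure_rate(0.00026, 0, 10, 1, 2, 0.5)

# Eq. (3): total failure rate of one node.
lambda_total = lambda_m2f + lambda_w2b
```

With these inputs, the sketch gives 0.0026 and 0.0364 failures/$10^6$ hours for the two connections, and their sum is 0.039 failures/$10^6$ hours.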

Assuming the 1-wire network contains $n$ sensors, the reliability of the network can be calculated according to Eq. (4). Here, $R_k(t)$ represents the reliability of $S_k$, which is the $k$th sensor in the network.

$$R(t) = \prod_{k=1}^{n} R_k(t) \tag{4}$$

Assuming that the failure rate of each node is equal to a constant rate  $\lambda$ , the  $e^{-\lambda t}$  expression represents its reliability [10]. In general, the reliability of each sensor in the network depends on its location in the wiring. In this paper, as mentioned previously, we calculate the network reliability for the bus, star, loop, bus-tree, and loop-tree topologies.

For the bus topology, by increasing the sensor distance from the master, the number of effective nodes increases. The reliability is decreased in each node by a factor of  $e^{-\lambda t}$ . For the first sensor in the bus, only 1 node affects the reliability and so the reliability of the first sensor is equal to  $e^{-\lambda t}$ . For the second sensor, we have 2 nodes and its reliability is equal to  $e^{-2\lambda t}$ . In this topology, the reliability of the S<sub>k</sub> is represented by  $e^{-k\lambda t}$ . Thus, the reliability of the single line (bus) topology is represented in Eq. (5).

$$R_{Bus}(t) = \prod_{k=1}^{n} e^{-k\lambda t} = e^{-\lambda t \sum_{k=1}^{n} k} = e^{-\frac{n(n+1)}{2}\lambda t} \tag{5}$$

As mentioned previously, in the star topology, each sensor is connected to the bus master through a single node, and therefore the reliability of each sensor is equal to $e^{-\lambda t}$. The reliability of the star topology can be calculated as in Eq. (6).

$$R(t)_{Star} = \prod_{k=1}^{n} e^{-\lambda t} = e^{-n\lambda t} \tag{6}$$

In the loop topology, each sensor on the network has 2 paths to the bus master, and according to the 1-wire protocol, one intact path between the master and the slave is sufficient. The reliability of a sensor therefore depends on both the left and right paths: at least one of them must hold for the sensor to stay connected to the master. The reliability of each path, taken independently, is equal to that of a bus of the same length, i.e. $e^{-k\lambda t}$ for the left path and $e^{-(n-k+1)\lambda t}$ for the right path of $S_k$. Therefore, for this topology, the reliability of $S_k$ is equal to $e^{-k\lambda t} + e^{-(n-k+1)\lambda t} - e^{-k\lambda t} e^{-(n-k+1)\lambda t}$. Finally, the reliability of the loop topology is represented by Eq. (7).

$$R(t)_{Loop} = \prod_{k=1}^{n} P(p_{left} \cup p_{right}) = \prod_{k=1}^{n} \left( e^{-k\lambda t} + e^{-(n-k+1)\lambda t} - e^{-(n+1)\lambda t} \right) \tag{7}$$
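The three basic reliability expressions can be evaluated numerically as follows. This is a minimal sketch with hypothetical function names; the loop case writes the per-sensor term as the probability that at least one of two independent paths survives, i.e. left + right − left·right, where the product equals $e^{-(n+1)\lambda t}$.

```python
import math


def r_bus(n, lam, t):
    """Bus reliability: sensor k depends on k nodes in series."""
    return math.exp(-n * (n + 1) / 2 * lam * t)


def r_star(n, lam, t):
    """Star reliability: every sensor depends on exactly one node."""
    return math.exp(-n * lam * t)


def r_loop(n, lam, t):
    """Loop reliability: sensor k survives if its left or right path survives."""
    prod = 1.0
    for k in range(1, n + 1):
        left = math.exp(-k * lam * t)             # k nodes on the left path
        right = math.exp(-(n - k + 1) * lam * t)  # n-k+1 nodes on the right path
        prod *= left + right - left * right       # P(left or right), independent paths
    return prod
```

Because each loop term is at least as large as the corresponding bus term $e^{-k\lambda t}$, `r_loop` never falls below `r_bus` for the same $n$, $\lambda$, and $t$.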

The bus-tree and loop-tree structures are based on the bus and loop topologies, respectively. These structures merge m branches of the bus or loop topologies with l sensors, in the way that  $m \times l$  is equal to the total number of sensors, which are assumed as n. With this in mind, the reliability of the bus-tree and loop-tree structures can be calculated by Eqs. (8) and (9).

$$R(t)_{Bus-Tree} = \prod_{b=1}^{m} \left[ R(t)_{Bus} \right]_{l\ \text{sensors}} \tag{8}$$

$$R(t)_{Loop-Tree} = \prod_{b=1}^{m} \left[ R(t)_{Loop} \right]_{l\ \text{sensors}} \tag{9}$$

To compare the dependability of the proposed topologies, we address the MTTF factor. This factor is determined by Eq. (10).

$$MTTF = \int_{t=0}^{\infty} R(t)dt \tag{10}$$

By substituting Eq. (5) into Eq. (10), we have the MTTF of the bus topology in Eq. (11).

$$MTTF_{bus} = \int_{t=0}^{\infty} R(t)\,dt = \int_{t=0}^{\infty} e^{-\frac{n(n+1)}{2}\lambda t}\,dt \overset{x=\lambda t}{=} \frac{1}{\lambda} \left( \int_{x=0}^{\infty} e^{-\frac{n(n+1)}{2}x}\,dx \right) = \frac{1}{\lambda} \left( \frac{2}{n(n+1)} \right) = \frac{1}{\lambda} \delta_b(x) \tag{11}$$

Similarly, the MTTFs of the star and loop topologies are, respectively, presented in Eqs. (12) and (13).

$$MTTF_{star} = \int_{t=0}^{\infty} R(t)\,dt = \int_{t=0}^{\infty} e^{-n\lambda t}\,dt \overset{x=\lambda t}{=} \frac{1}{\lambda} \left( \int_{x=0}^{\infty} e^{-nx}\,dx \right) = \frac{1}{\lambda} \left( \frac{1}{n} \right) = \frac{1}{\lambda} \delta_s(x) \tag{12}$$

$$\begin{aligned}
MTTF_{Loop} &= \int_{t=0}^{\infty} R(t)\,dt = \int_{t=0}^{\infty} \prod_{k=1}^{n} \left( e^{-k\lambda t} + e^{-(n-k+1)\lambda t} - e^{-(n+1)\lambda t} \right) dt \\
&\overset{x=\lambda t}{=} \frac{1}{\lambda} \left( \int_{x=0}^{\infty} \prod_{k=1}^{n} \left( e^{-kx} + e^{-(n-k+1)x} - e^{-(n+1)x} \right) dx \right) = \frac{1}{\lambda} \delta_l(x)
\end{aligned} \tag{13}$$
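The closed-form coefficients of Eqs. (11) and (12) are easy to evaluate directly. The sketch below uses hypothetical function names and assumes a year of 8760 hours; the failure rate is expressed in failures per $10^6$ hours, as in Table 1.

```python
HOURS_PER_YEAR = 8760.0  # assumption: 365-day year


def delta_bus(n):
    """MTTF coefficient of the bus topology, Eq. (11)."""
    return 2.0 / (n * (n + 1))


def delta_star(n):
    """MTTF coefficient of the star topology, Eq. (12)."""
    return 1.0 / n


def mttf_years(delta, lam_per_1e6_hours):
    """MTTF = delta / lambda, converted from hours to years."""
    lam_per_hour = lam_per_1e6_hours * 1e-6
    return delta / lam_per_hour / HOURS_PER_YEAR
```

For $n = 32$ sensors and a failure rate of 0.039 failures/$10^6$ hours, this gives about 5.54 years for the bus and 91.47 years for the star topology.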

The comparison of the MTTF coefficients for the loop, bus, and star structures, illustrated in Figure 3, shows that when the number of sensors on the network is less than 5, the loop topology offers the best reliability. As the number of sensors increases, the star topology becomes dominant. Moreover, independent of the network complexity, the bus topology has the worst MTTF.



Figure 3. Coefficient of the MTTF for the bus, loop, and star topologies vs. the 1-wire network complexity.

To calculate the MTTF for the loop-tree structure, in a similar way, we use Eqs. (9) and (10), which gives Eq. (14), where each of the $m$ branches is a loop of $l$ sensors.

$$\begin{aligned}
MTTF_{Loop-Tree} &= \int_{t=0}^{\infty} \prod_{b=1}^{m} R_l(t)\,dt = \int_{t=0}^{\infty} \prod_{b=1}^{m} \prod_{k=1}^{l} \left( e^{-k\lambda t} + e^{-(l-k+1)\lambda t} - e^{-(l+1)\lambda t} \right) dt \\
&\overset{x=\lambda t}{=} \frac{1}{\lambda} \left( \int_{x=0}^{\infty} \left( \prod_{k=1}^{l} \left( e^{-kx} + e^{-(l-k+1)x} - e^{-(l+1)x} \right) \right)^m dx \right) = \frac{1}{\lambda} \delta_{tl}(x)
\end{aligned} \tag{14}$$

It is necessary to mention that a closed-form solution cannot be obtained for Eqs. (13) and (14); therefore, an adaptive quadrature approximation is used to evaluate these integrals.
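One way to carry out such a numerical evaluation is adaptive Simpson quadrature, sketched below in pure Python. This is not the authors' code: the function names, the truncation of the upper limit at $x = 40$, and the tolerance are assumptions made here. The per-sensor term is written as left + right − left·right (two independent paths), and a loop-tree is assumed to consist of $m$ identical loops of $l$ sensors each.

```python
import math


def loop_integrand(x, l, m=1):
    """Product over the l sensors of one loop, raised to the power m (m branches)."""
    prod = 1.0
    for k in range(1, l + 1):
        left = math.exp(-k * x)
        right = math.exp(-(l - k + 1) * x)
        prod *= left + right - left * right
    return prod ** m


def adaptive_simpson(f, a, b, tol=1e-10, max_depth=40):
    """Adaptive Simpson rule with Richardson correction on each accepted panel."""
    def simpson(fa, fm, fb, lo, hi):
        return (hi - lo) / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(lo, hi, fa, fm, fb, whole, tol, depth):
        mid = 0.5 * (lo + hi)
        flm, frm = f(0.5 * (lo + mid)), f(0.5 * (mid + hi))
        left = simpson(fa, flm, fm, lo, mid)
        right = simpson(fm, frm, fb, mid, hi)
        diff = left + right - whole
        if depth <= 0 or abs(diff) <= 15.0 * tol:
            return left + right + diff / 15.0
        return (recurse(lo, mid, fa, flm, fm, left, tol / 2.0, depth - 1) +
                recurse(mid, hi, fm, frm, fb, right, tol / 2.0, depth - 1))

    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    return recurse(a, b, fa, fm, fb, simpson(fa, fm, fb, a, b), tol, max_depth)


def loop_tree_coeff(m, l, upper=40.0):
    """MTTF coefficient of m loops with l sensors each (integral truncated at `upper`)."""
    return adaptive_simpson(lambda x: loop_integrand(x, l, m), 0.0, upper)


def loop_coeff(n):
    """MTTF coefficient of a single loop with n sensors."""
    return loop_tree_coeff(1, n)
```

For very small networks, the integrand expands into a few exponentials, so the result can be checked by hand: a single-sensor loop gives $\int_0^\infty (2e^{-x} - e^{-2x})\,dx = 1.5$. Dividing a coefficient by $\lambda$ (in failures per hour) yields the MTTF in hours.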

Finally, based on Eqs. (8) and (10), the MTTF of the bus-tree topology can be calculated by Eq. (15), where each of the $m$ branches is a bus of $l$ sensors and $\delta_b(x)$ is evaluated for $l$ sensors.

$$\begin{aligned}
MTTF_{Bus-Tree} &= \int_{t=0}^{\infty} \left(\prod_{b=1}^{m} R_b(t)\right) dt = \int_{t=0}^{\infty} \left(\prod_{b=1}^{m} e^{-\frac{l(l+1)}{2}\lambda t}\right) dt \\
&\overset{x=\lambda t}{=} \frac{1}{\lambda} \left(\int_{x=0}^{\infty} e^{-\frac{m\,l(l+1)}{2}x}\,dx\right) = \frac{1}{\lambda} \left( \frac{1}{m} \cdot \frac{2}{l(l+1)} \right) = \frac{1}{\lambda} \delta_{tb}(x) = \frac{1}{\lambda} \left( \frac{1}{m} \delta_b(x) \right)
\end{aligned} \tag{15}$$
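The bus-tree coefficient can be evaluated in closed form. The sketch below is illustrative only; it assumes that the $n$ sensors are split evenly over $m$ branches, so that each branch is a bus with $l = n/m$ sensors, and it uses a year of 8760 hours. The function names are hypothetical.

```python
HOURS_PER_YEAR = 8760.0  # assumption: 365-day year


def delta_bus_tree(m, l):
    """MTTF coefficient of m bus branches with l sensors each: (1/m) * 2 / (l(l+1))."""
    return (1.0 / m) * 2.0 / (l * (l + 1))


def bus_tree_mttf_years(n, m, lam_per_1e6_hours):
    """MTTF in years of a bus-tree with n sensors split evenly over m branches."""
    if n % m != 0:
        raise ValueError("n must be divisible by m for equal branches")
    l = n // m
    lam_per_hour = lam_per_1e6_hours * 1e-6
    return delta_bus_tree(m, l) / lam_per_hour / HOURS_PER_YEAR
```

With $n = 32$ and a failure rate of 0.039 failures/$10^6$ hours, the sketch gives roughly 10.76, 20.33, and 36.59 years for 2, 4, and 8 branches, and it reduces to the plain bus value (about 5.54 years) for a single branch.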

Figure 4 illustrates the MTTF coefficient of the different topologies versus the network complexity. In addition to the basic bus, loop, and star structures, 3 cases are considered for the tree structures (i.e. the number of branches is assumed to be 2, 4, and 8). As the number of branches increases, the MTTF coefficient becomes greater.



Figure 4. Coefficient of the MTTF for different network topologies vs. the complexity of the 1-wire network.

For a clearer comparison, we present the ratios of the MTTF coefficients. For a given network complexity (i.e. with an equal number of sensors), the reliability improvement of the loop-tree topology in comparison with the tree structure is presented in Figure 5. Similar comparisons among other topologies are illustrated in Figures 6–9. As the number of sensors increases, the slope of the comparative MTTF curves decreases and the curves tend to a constant ratio.







Figure 7. Improvement of the MTTF of the bus-tree topology compared with the loop structure.



Figure 6. Improvement of the MTTF of the bus-tree topology compared with the bus structure.



Figure 8. Improvement of the MTTF of the loop-tree topology compared with the bus structure.

For instance, if the number of sensors is equal to 32 and the failure rate is assumed as calculated previously, the MTTF of each topology is as presented in Table 2.

These results, especially those in Figure 4, reveal that, despite the fact that the point-to-point topology is very robust for communication structures, the loop-tree topology is the best structure for 1-wire communication.

In the following sections, we focus on the design and implementation of the 1-wire protocol on the FPGA. Moreover, we discuss the basic structure of the proposed time-based redundancy to realize a fault-tolerant bus controller.



Figure 9. Improvement of the MTTF of the loop-tree topology compared with the bus-tree structure.

| Topology            | MTTF (year) | Topology           | MTTF (year) |
|---------------------|-------------|--------------------|-------------|
| Loop                | 38.8        | Bus                | 5.54        |
| Loop-tree $(t = 2)$ | 72.41       | Bus-tree $(t = 2)$ | 10.76       |
| Loop-tree $(t = 4)$ | 133.12      | Bus-tree $(t = 4)$ | 20.32       |
| Loop-tree $(t = 8)$ | 229.4       | Bus-tree $(t = 8)$ | 36.58       |
| Star                | 91.47       |                    |             |

Failure rate = 0.039 failures/$10^6$ hours, N = 32 sensors; $t$ is the number of branches.

Table 2. MTTF of different topologies of the 1-wire network.

# 4. Design and implementation of the 1-wire protocol on the FPGA

#### 4.1. Regular implementation

To realize the bus controller, FPGAs can be used as an alternative to processors. Static random access memory (SRAM)-based FPGAs are attractive for space systems due to their low nonrecurring engineering costs, computational efficiency benefits over general purpose processors, and reconfigurability [11]. However, the 1-wire thermal monitoring systems presented in [12–22] are all processor-based.

In our work, the 1-wire bus controller is designed as a module of the satellite telemetry and tele-command (TT&C) subsystem, which samples temperature data from all parts of the satellite. Our TT&C program has a hierarchical structure and all of the modules are controlled by the top module. Each submodule has the same signaling as the top module. This signaling is based on an interrupt request that includes address, data, and handshaking signals. Figure 10 illustrates the signaling and block diagram of the 1-wire module in the TT&C subsystem.

To access a sensor through the 1-wire port, 4 steps are required: initialization, ROM-function command, memory function command, and data transfer [23]. At the beginning of the communication session, the bus master needs to know which 1-wire devices are available and ready to operate. As mentioned in Section 2, this stage is accomplished through initialization signaling. Next, the ROM-function command allows a particular device present on the 1-wire bus to be selected on the basis of its unique 64-bit serial identification (ROM) number. Reading and writing into the device memory as well as performing specific functions on the 1-wire device are done in the memory function command stage. Finally, the actual data transaction progresses either from the bus master to the slave or vice versa.
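The 4 steps above can be sketched as a list of bus operations. This is only an illustration: the paper does not name the sensor part or its command codes, so the Match ROM (0x55), Convert T (0x44), and Read Scratchpad (0xBE) values used here are an assumption taken from the Maxim DS18x20 temperature sensor family, and `build_transaction` is a hypothetical helper.

```python
# Command codes of the DS18x20 family (assumed sensor type, not stated in the paper).
MATCH_ROM = 0x55
CONVERT_T = 0x44
READ_SCRATCHPAD = 0xBE


def build_transaction(rom_code: bytes, function_cmd: int, read_len: int = 0) -> list:
    """Return the 4 steps needed to address one sensor and issue one command."""
    if len(rom_code) != 8:
        raise ValueError("a 1-wire ROM code is 64 bits (8 bytes)")
    return [
        ("reset", b""),                               # 1. initialization (reset/presence)
        ("write", bytes([MATCH_ROM]) + rom_code),     # 2. ROM-function command
        ("write", bytes([function_cmd])),             # 3. memory function command
        ("read", read_len),                           # 4. data transfer (bytes to read)
    ]
```

For instance, `build_transaction(rom, CONVERT_T)` would start a conversion on the sensor whose stored ROM code is `rom`, and `build_transaction(rom, READ_SCRATCHPAD, 9)` would then read back its 9-byte scratchpad.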

To determine the ROM code, the search ROM procedure is normally used. However, we do not implement the search function in the very high-speed integrated circuits hardware description language (VHDL) module, for 2 reasons: the location of each sensor in the satellite has to be known, and the search ROM procedure requires more resources. Consequently, the ROM codes are determined separately and stored in the VHDL program as constant data. For a detailed description of the search ROM procedure, one can refer to the iButton Book of Standards at www.maxim-ic.com/ibuttonbook [24].



Figure 10. Block diagram of the 1-wire module.

The temperature data gathered from the sensors are stored in the local RAM of the VHDL module. The CRC check block performs the CRC on the communicated data. The states of the finite-state machine (FSM) of the 1-wire module are represented in Figure 11. The state machine follows the 1-wire protocol routine. After initialization of the external signals and internal variables, the number of sampled sensors is checked. If all of the sensors have been sampled, the ready signal is activated. The local RAM can be controlled externally through the top-level VHDL program. Table 3 shows the desired ranges and our implementation results for the 1-wire timing.
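The CRC check can be illustrated in software. The paper does not state the polynomial, so the sketch below assumes the standard Dallas/Maxim 1-wire CRC-8 ($x^8 + x^5 + x^4 + 1$, processed LSB first, reflected constant 0x8C, zero initial value); the function names are hypothetical and this is not the authors' VHDL block.

```python
def crc8_1wire(data: bytes) -> int:
    """Bitwise Dallas/Maxim 1-wire CRC-8 (assumed polynomial x^8 + x^5 + x^4 + 1)."""
    crc = 0
    for byte in data:
        for _ in range(8):
            mix = (crc ^ byte) & 0x01   # compare LSB of CRC with next data bit
            crc >>= 1
            if mix:
                crc ^= 0x8C             # reflected form of the polynomial
            byte >>= 1
    return crc


def frame_is_valid(frame: bytes) -> bool:
    """A frame whose last byte is its CRC yields a zero remainder over the whole frame."""
    return len(frame) > 1 and crc8_1wire(frame) == 0
```

A receiver can therefore run the same routine over the payload followed by the received CRC byte and accept the frame only when the result is zero.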

| Parameter         | Protocol range  | Implemented |
|-------------------|-----------------|-------------|
| Reset pulse       | > 480 $\mu$s    | 560 $\mu$s  |
| Wait for presence | 15–60 $\mu$s    | 30 $\mu$s   |
| Presence pulse    | 60–240 $\mu$s   | 115 $\mu$s  |
| High in logic '0' | > 1 $\mu$s      | 24 $\mu$s   |
| Low in logic '0'  | 60–120 $\mu$s   | 73 $\mu$s   |
| High in logic '1' | Unlimited       | 84 $\mu$s   |
| Low in logic '1'  | > 1 $\mu$s      | 12 $\mu$s   |

Table 3. Implementation results for the timing of the 1-wire protocol.

# 4.2. Reliable implementation (the proposed time-based redundancy)

A single 1-wire bus controller provides all of the functionality for temperature measurement purposes. However, its susceptibility to space radiation effects, especially SEUs, suggests that additional considerations may be required. Space applications must consider the effect that energetic particles (radiation) can have on electronic components. In particular, SEUs may alter the logic state of any static memory element (latch, flip-flop, or RAM cell) or cause transient pulses in combinatorial logic paths. Since the user-programmed functionality of an FPGA depends on the data stored in millions of configuration latches within the device, an SEU in the configuration memory array might have adverse effects on the expected functionality of the user-implemented design. Similarly, SETs have a high probability of being captured at flip-flop inputs, where, if registered, they cause a soft error in the user data [25]. On the other hand, in an SRAM-based FPGA, both the user's combinational and sequential logic are implemented by customizable logic memory cells, in other words, SRAM cells. When an upset occurs in the combinational logic synthesized in the FPGA, it corresponds to a bit flip in one of the look-up table (LUT) cells or in the cells that control the routing [26].



Figure 11. FSM of the 1-wire module.

To achieve a reliable controller, one can use traditional approaches including a duplex with comparison, TMR, and so on. However, these approaches are based on hardware redundancy. This paper proposes a novel architecture to realize a reliable monitoring system based on time redundancy. In the following sections, we discuss this architecture in more detail. The fault injection experiments reveal that the capability of tolerating SEU effects in the time redundancy technique increases up to 7.8-fold with respect to a regular hardware redundancy.

In time-based redundancy, as shown in Figure 12, we sequentially triplicate the 1-wire modules to mitigate the single event effects (SEEs) of radiation. In other words, the temperature data are gathered and stored in the local RAM of each module at different, sequential times. The timing requirements among the 1-wire modules are illustrated in Figure 13. As shown in Table 4, this approach requires more resources than the single regular method. However, time-based redundancy can overcome the SET effects of space radiation.



Figure 12. Block diagram of the proposed reliable 1-wire module.

Figure 13. Timing of the internal signaling in the TMR structure.

| Parameters  | Single module             | TMR & compare | Overhead (ratio) |
|-------------|---------------------------|---------------|--------------|
| Flip-flops  | 164                       | 546           | 2.43         |
| LUTs        | 167                       | 560           | 2.35         |
| FPGA I/O    | 1                         | 3             | 2            |
| Max freq.   | 114 MHz                   | 98 MHz        | -0.14        |
| Sample rate | $35 \mathrm{\ ms/sensor}$ | 110 ms/sensor | 2.14         |
| Power       | 54 mW                     | 65  mW        | 0.2037       |

Table 4. Device (Xilinx-XCV300) utilization summary.

The delay time between 2 consecutive samplings is equal to the response time of the previous module (i.e. the sampling time $t_s$). The sampling time of the modules depends on the complexity of the 1-wire network. The implementation results show that the average time for sampling 1 sensor is 35 ms. Consequently, sampling a network of 32 sensors takes 32 × 35 ms = 1120 ms. This interval between successive samplings is long enough for transient faults caused by space radiation to disappear.

Moreover, to improve SEU immunity, sensor duplication can be performed in the critical sections of the satellite. The unique 64-bit codes of the duplicate sensors are set in the second module. The truth table of the compare & vote block in a single network (with and without sensor duplication) is given in Table 5. It is worth mentioning that the comparison of the data is performed on the high-order bits of the data register. In other words, in the temperature register, the comparison is done on BIT8 $\sim$ BIT3, which means that temperature variations of up to 3.5 °C are acceptable.

| $D_2 = D_1$ | $D_2 = D_0$ | $D_1 = D_0$ | CE (without duplication) | S (without duplication) | Fault nature (without duplication) | CE (with duplication) | S (with duplication) | Fault nature (with duplication) |
|-------------|-------------|-------------|--------------------------|-------------------------|------------------------------------|-----------------------|----------------------|---------------------------------|
| 0           | 0           | 0           | 1                        | X                       | SET                                | 1                     | X                    | SET                             |
| 0           | 0           | 1           | 0                        | 0                       | -                                  | 0                     | 0                    | -                               |
| 0           | 1           | 0           | 0                        | 0                       | -                                  | 1                     | X                    | SEU                             |
| 0           | 1           | 1           | 1                        | X                       | FPGA or SENSOR                     | 1                     | X                    | FPGA or SENSOR                  |
| 1           | 0           | 0           | 0                        | 1                       | -                                  | 0                     | 1                    | -                               |
| 1           | 0           | 1           | 1                        | X                       | FPGA or SENSOR                     | 1                     | X                    | FPGA or SENSOR                  |
| 1           | 1           | 0           | 1                        | X                       | FPGA or SENSOR                     | 1                     | X                    | FPGA or SENSOR                  |
| 1           | 1           | 1           | 0                        | 0                       | -                                  | 0                     | 0                    | -                               |

Table 5. Truth table of the compare and vote block.
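The behavior in Table 5 can be written as a lookup table. The following is a software sketch of the truth table only, not the FPGA implementation; the names `Vote`, `vote`, and `vote_from_data` are hypothetical, and the don't-care entries of the S column are represented by `None`.

```python
from typing import NamedTuple, Optional


class Vote(NamedTuple):
    ce: int            # critical error flag
    s: Optional[int]   # select signal; None where Table 5 lists a don't care
    fault: str         # fault nature as named in Table 5


# Keys are the comparison results (D2 == D1, D2 == D0, D1 == D0).
WITHOUT_DUPLICATION = {
    (0, 0, 0): Vote(1, None, "SET"),
    (0, 0, 1): Vote(0, 0, "-"),
    (0, 1, 0): Vote(0, 0, "-"),
    (0, 1, 1): Vote(1, None, "FPGA or SENSOR"),
    (1, 0, 0): Vote(0, 1, "-"),
    (1, 0, 1): Vote(1, None, "FPGA or SENSOR"),
    (1, 1, 0): Vote(1, None, "FPGA or SENSOR"),
    (1, 1, 1): Vote(0, 0, "-"),
}

# With sensor duplication, only the row (0, 1, 0) differs.
WITH_DUPLICATION = dict(WITHOUT_DUPLICATION)
WITH_DUPLICATION[(0, 1, 0)] = Vote(1, None, "SEU")


def vote(eq21: int, eq20: int, eq10: int, sensor_duplication: bool = False) -> Vote:
    """Look up CE, S, and the fault nature for three comparison flags."""
    table = WITH_DUPLICATION if sensor_duplication else WITHOUT_DUPLICATION
    return table[(eq21, eq20, eq10)]


def vote_from_data(d2: int, d1: int, d0: int, sensor_duplication: bool = False) -> Vote:
    """Derive the flags from three (already masked) data words, then vote."""
    return vote(int(d2 == d1), int(d2 == d0), int(d1 == d0), sensor_duplication)


if __name__ == "__main__":
    print(vote_from_data(7, 7, 3))
    print(vote(0, 1, 0, sensor_duplication=True))
```

The rows (0, 1, 1), (1, 0, 1), and (1, 1, 0) cannot be produced by `vote_from_data`, because two equalities among three words imply the third. They can only be reached through `vote` with inconsistent flags, which is presumably why Table 5 attributes them to an FPGA or sensor fault.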

The proposed time-based redundancy can handle all of the SEE faults in the bus master and slaves. However, the handling of the critical error (CE) signal depends on system-level design considerations. For example, turning the 1-wire network off on the slave side and partially reconfiguring the FPGA on the master side could be a reasonable response to the CE. In the following section, we compare the SEU immunity of the proposed time-based redundancy with that of traditional hardware redundancy.

# 5. Experimental results

For FPGA-based designs, SEU fault injection methods fall into 3 broad groups: radiation accelerator tests, emulation methods, and simulation or software approaches [27].

To validate the proposed design, we use the emulation technique. Our hardware set-up for emulating faults in FPGAs consists of 3 parts, as shown in Figure 14. The first part is a personal computer that provides a complete graphical user interface for emulating different models of SEE faults. The second part is a fault controller based on the LPC2368 microcontroller. It receives the fault injection mode from the PC and controls the configuration memory of the FPGA. All of the required signaling and timing is managed by this external microcontroller. The third part is an FPGA platform, a Xilinx XC4VFX12 device, which hosts the design under test (DUT) and other required modules.



Figure 14. SEU fault injection set-up.


The fault injection area is constrained to the design under test section. This section contains 5 configurable logic block columns of the FPGA, which means that the total number of configuration bits is equal to $5 \times 22 \times 1312 = 144,320$. As mentioned previously, an SEU fault means an upset in the FPGA configuration bits. Hence, the available sample space for SEU injection is 144,320 bits. Figure 15 shows the flow diagram of the test operations to emulate the SEU faults [28]. The number of injected faults was selected to guarantee that the gathered results are statistically meaningful. For this purpose, we repeat the experiments with 1,000,000 randomly selected SEUs. Table 6 shows the fault injection results.
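The size of the injection space and the random choice of upset locations can be sketched as follows. This is not the fault controller firmware. The text does not name the factors 22 and 1312, so reading them as frames per column and bits per frame is an assumption, as are uniform sampling with replacement and all identifiers used here.

```python
import random
from typing import List, Optional, Tuple

CLB_COLUMNS = 5           # stated in the text
FRAMES_PER_COLUMN = 22    # assumed meaning of the factor 22
BITS_PER_FRAME = 1312     # assumed meaning of the factor 1312
TOTAL_BITS = CLB_COLUMNS * FRAMES_PER_COLUMN * BITS_PER_FRAME


def locate(flat_index: int) -> Tuple[int, int, int]:
    """Map a flat bit index to (column, frame, bit) inside the injection area."""
    if not 0 <= flat_index < TOTAL_BITS:
        raise ValueError("index outside the fault injection area")
    column, rest = divmod(flat_index, FRAMES_PER_COLUMN * BITS_PER_FRAME)
    frame, bit = divmod(rest, BITS_PER_FRAME)
    return column, frame, bit


def random_seu_targets(count: int, seed: Optional[int] = None) -> List[Tuple[int, int, int]]:
    """Draw `count` upset locations uniformly, with replacement (an assumption)."""
    rng = random.Random(seed)
    return [locate(rng.randrange(TOTAL_BITS)) for _ in range(count)]


if __name__ == "__main__":
    print(TOTAL_BITS)  # 144320
    for target in random_seu_targets(3, seed=1):
        print(target)
```

If the 1,000,000 injections are drawn uniformly with replacement, each of the 144,320 bits is targeted about 6.9 times on average, so the whole injection area is covered repeatedly.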



Figure 15. Flow diagram of the test operations for the SEU emulation environment [28].

| Circuit         | Flip-flops (#) | LUTs (#) | Wrong answers (#) |
|-----------------|----------------|----------|-------------------|
| Single module   | 64             | 203      | 211.6             |
| Traditional TMR | 195            | 612      | 1185.4            |
| Proposed TMR    | 197            | 600      | 26.8              |

Table 6. Fault injection results.

The fault injection area is fixed; therefore, a TMR version, which uses more resources, has a higher failure probability than the single module. Moreover, the redundant modules in the TMR version are disturbed by adjacent modules, so this version may suffer from error propagation among the replicated modules. Furthermore, SEU faults in the voter increase the failure probability of the TMR version in comparison with the single module.

The fault injection experiments reveal that the proposed time-based redundancy yields better results than traditional triple hardware redundancy. Furthermore, the experiments show that the capability of tolerating SEU effects in the proposed method increases up to 7.8-fold with respect to a regular design.
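The relative figures follow from the wrong-answer counts in Table 6. The sketch below only divides those counts; the dictionary and function names are illustrative.

```python
# Wrong-answer counts taken from Table 6.
WRONG_ANSWERS = {
    "Single module": 211.6,
    "Traditional TMR": 1185.4,
    "Proposed TMR": 26.8,
}


def wrong_answer_ratio(baseline: str, candidate: str) -> float:
    """How many times more wrong answers the baseline produces than the candidate."""
    return WRONG_ANSWERS[baseline] / WRONG_ANSWERS[candidate]


if __name__ == "__main__":
    print(round(wrong_answer_ratio("Single module", "Proposed TMR"), 2))
    print(round(wrong_answer_ratio("Traditional TMR", "Single module"), 2))
    print(round(wrong_answer_ratio("Traditional TMR", "Proposed TMR"), 2))
```

The first ratio is slightly below 8, in line with the roughly 7.8-fold improvement quoted in the text. The second shows that the traditional TMR produces between 5 and 6 times as many wrong answers as the single module within the fixed injection area.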

# 6. Conclusion

The use of the 1-wire protocol for distributed system monitoring, especially for satellite applications, provides several benefits. However, for aerospace applications, the reliability of this system should be at an acceptable level. To achieve this, we present a reliability analysis of different 1-wire network topologies and a reliable implementation of this network on an FPGA. Although the point-to-point topology is very robust for communication structures, the results show that the loop-tree topology is the best for 1-wire communication. In addition, we propose and validate a time-based redundancy approach that gives better results than the traditional triple redundancy.

# References

- [1] E.S. Lee, D.L. Dibartolomeo, F.M. Rubinstein, S.E. Selkowitz, "Low-cost networking for dynamic window systems", Energy and Buildings, Vol. 36, 2004.
- [2] S. Speretta, D. Roascio, L.M. Reyneri, C. Sansoé, M. Tranchero, C. Passerone, D. Del Corso, "Optical-based COTS data communication bus for satellites", Acta Astronautica, Vol. 66, pp. 674–681, 2010.
- [3] X. Hongmei, "Research and development of an intelligent temperature-measuring system based on 1-wire bus", International Conference on Intelligent Computation Technology and Automation, pp. 30–33, 2008.
- [4] R. Downs, "Using 1-wire I/O for distributed system monitoring", Wescon/98, pp. 161–168, 1998.
- [5] J. Wang, C. Gong, "Research on 1-wire bus temperature monitoring system", 8th International Conference on Electronic Measurement and Instruments, pp. 3-722–3-726, 2007.
- [6] C. Lun, J. Ma, H. Fan, C. Liu, L. Sun, J. Yu, "Wireless monitoring system for granary based on 1-wire", International Conference on Computer Design and Applications, Vol. 4, pp. V4-496–V4-499, 2010.
- [7] J. Sun, Z. Huo, "The application of 1-wire bus technology in the temperature monitoring on the of rolling mill", International Conference on E-Learning, E-Business, Enterprise Information Systems, and E-Government, pp. 233–236, 2009.
- [8] Z. Wu, F. Gao, Y. Lu, "Binary tree for 1-wire technology in the ROM search", International Forum on Information Technology and Applications, Vol. 3, pp. 545–547, 2009.
- [9] Military Handbook, Reliability prediction of electronic equipment, MIL-HDBK-217F, 1991.
- [10] I. Koren, C.M. Krishna, "Fault tolerant systems", Morgan Kaufmann Publishers, San Francisco, 2007.
- [11] A. Hossein, B.T. Mehdi, "Analytical techniques for soft error rate modeling and mitigation of FPGA-based designs", IEEE Transactions on Very Large Scale Integration Systems, Vol. 15, pp. 1320–1331, 2007.
- [12] C. Moi-Tin, T. Tatt-Huong, K. Ye-Chow, "Electrical power monitoring system using thermochron sensor and 1-wire communication protocol", 4th IEEE International Symposium on Electronic Design, Test and Applications, pp. 549–554, 2008.
- [13] C. Yu, Z. Haijun, W. Na, "Body temperature monitor and alarm system used in hospital based on 1-wire and wireless communication technology", International Workshop on Geoscience and Remote Sensing and Education Technology and Training, pp. 401–404, 2008.
- [14] W. Shuxiao, K.W.E. Cheng, K. Ding, "Design of the temperature and humidity instrument based on 1-wire sensor for electric vehicle motors", 3rd International Conference on Power Electronics Systems and Applications, pp. 1–5, 2009.
- [15] W. Xi, L. Shuqing, "Multipoint temperature measurement system of hot pack based on DS18B20", International Conference on Information Engineering, pp. 26–29, 2010.
- [16] Z. Chengliang, F. Xianying, L. Lei, "The key technologies of a distributed temperature monitoring system based on 1-wire bus", 8th World Congress on Intelligent Control and Automation, pp. 7041–7045, 2010.

- [17] K. Marinko, M. Vražic, I. Gašparac, "Bluetooth wireless communication and 1-wire digital temperature sensors in synchronous machine rotor temperature measurement", 14th International Power Electronics and Motion Control Conference, pp. T7-25–T7-28, 2010.
- [18] R. Zhou, H. Xu, G. Ren, "Design of temperature measurement system consisted of FPGA and DS18B20", International Symposium on Computer Science and Society, pp. 90–93, 2011.
- [19] M. Li, G. Huang, "Design of multipath temperature system based on 8051F020", International Conference on Electrical and Control Engineering, pp. 4435–4437, 2011.
- [20] B. Peiffer, A. Kruger, "Physical layer architecture for 1-wire sensor communication bus: binary channel code division multiple access", Sensors Applications Symposium, pp. 100–105, 2011.
- [21] I.A. Zualkernan, "InfoCoral: open-source hardware for low-cost, high-density concurrent simple response ubiquitous systems", 11th IEEE International Conference on Advanced Learning Technologies, pp. 638–639, 2011.
- [22] J.H. Zhu, J. Wang, D. Liu, "Design of a wireless sensor network node based on nRF2401", IEEE International Conference on Computer Science and Automation Engineering, pp. 203–206, 2011.
- [23] T. Kai-Xin, C. Moi-Tin, S. Demidenko, "An intelligent warehouse stock management and tracking system based on silicon identification technology and 1-wire network communication", 6th IEEE International Symposium on Electronic Design, Test and Application, pp. 110–115, 2011.
- [24] Maxim Corporation Integrated Products, "High-precision 1-wire digital thermometer", Maxim Integrated Products Inc., 2008.
- [25] C. Carmichael, "Triple module redundancy design techniques for Virtex FPGAs", Xilinx Application Notes, XAPP197 (v1.0.1) July 6, 2006.
- [26] F.L. Kastensmidt, L. Carro, R. Reis, Fault-tolerance techniques for SRAM-based FPGAs, Springer, 2006.
- [27] L. Sterpone, M. Violante, "A new partial reconfiguration-based fault-injection system to evaluate SEU effects in SRAM-based FPGAs", IEEE Transactions on Nuclear Science, Vol. 54, pp. 965–970, 2007.
- [28] P. Schumacher, "SEU Emulation Environment", Xilinx white paper, WP414 (v1.0) April 9, 2012.