Continuous Semi markov Processes and Their Applications

Semi-Markov Process

Introduction

Shun-Zheng Yu , in Hidden Semi-Markov Models, 2016

1.1.2 Semi-Markov Process

A semi-Markov process is equivalent to a Markov renewal process in many aspects, except that a state is defined for every given time in the semi-Markov process, not just at the jump times. Therefore, the semi-Markov process is an actual stochastic process that evolves over time. Semi-Markov processes were introduced by Levy (1954) and Smith (1955) in 1950s and are applied in queuing theory and reliability theory.

For an actual stochastic process that evolves over time, a state must be defined for every given time. Therefore, the state $S_{t}$ at time $t$ is defined by $S_{t} = X_{n}$ for $t \in [T_{n}, T_{n + 1})$ . The process ${(S_{t})}_{t \geq 0}$ is thus called a semi-Markov process. In this process, the times 0= $T_{0} < T_{1} < \dots < T_{n} < \dots$ are the jump times of ${(S_{t})}_{t \geq 0}$ , and $τ_{n} = T_{n} - T_{n - 1}$ are the sojourn times in the states. Every transition from a state to the next state is instantaneously made at the jump times.

For a time-homogeneous semi-Markov process, the transition density functions are

$h_{i j} (τ) d τ \equiv P [τ \leq τ_{n + 1} < τ + d τ, X_{n + 1} = j | X_{n} = i],$

where $h_{i j} (τ)$ is independent of the jumping time $T_{n}$ . It is the probability density function that after having entered state i at time zero the process transits to state j in between time $τ$ and $τ + d τ$ . They must satisfy

$\sum_{j \in S} \int_{0}^{\infty} h_{i j} (τ) d τ = 1,$

for all $i \in S$ . That is, state i must transit to another state in the time $[0, \infty)$ .

If the number of jumps in the time interval $[0, T]$ is $N (T) = n$ , then the sample path $(s_{t}, t \in [0, T])$ is equivalent to the sample path $(x_{0}$ , $τ_{1}$ , $x_{1}$ , …, $τ_{n}$ , $x_{n}$ , $T - \sum_{k = 1}^{n} τ_{k})$ with probability 1. Then the joint distribution of the process ${(s_{t})}_{0 \leq t \leq T}$ is

$\begin{array}{l} P [x_{0}, {τ'}_{1} \leq τ_{1}, x_{1}, \dots, {τ'}_{n} \leq τ_{n}, x_{n}, T - \sum_{k = 1}^{n} {τ'}_{k} | N (T) = n] \\ = P [x_{0}] \cdot \int_{0}^{τ_{1}} \dots \int_{0}^{τ_{n}} (1 - W_{x_{n}} (T - \sum_{k = 1}^{n} {τ'}_{k})) \cdot \prod_{k = 1}^{n} h_{x_{k - 1} x_{k}} ({τ'}_{k}) \cdot d {τ'}_{k}, \end{array}$

where $W_{i} (τ) = \int_{0}^{τ} \sum_{j \in S} h_{i j} (τ') d τ'$ is the probability that the process stays in state $i$ for at most time $τ$ before transiting to another state, and $1 - W_{i} (τ)$ is the probability that the process will not make transition from state i to any other state within time $τ$ . The likelihood function corresponding to the sample path ( $x_{0}$ , $τ_{1}$ , $x_{1}$ , …, $τ_{n}$ , $x_{n}$ , $T - \sum_{k = 1}^{n} τ_{k}$ ) is thus

$L [x_{0}, τ_{1}, x_{1}, \dots, τ_{n}, x_{n}, T - \sum_{k = 1}^{n} τ_{k}] = P [x_{0}] \cdot (1 - W_{x_{n}} (T - \sum_{k = 1}^{n} τ_{k})) \cdot \prod_{k = 1}^{n} h_{x_{k - 1} x_{k}} (τ_{k}) .$

Suppose the current time is t. The time that has been passed since last jump is defined by $R_{t} = t - T_{N (t)}$ . Then the process $(S_{t}, R_{t})$ is a continuous time homogeneous Markov process.

The semi-Markov process can be generated by different types of random mechanisms (Nunn and Desiderio, 1977), for instances:

1.

Usually, it is thought as such a stochastic process that after having entered state $i$ , it randomly determines the successor state $j$ based on the state transition probabilities $a_{i j}$ , and then randomly determines the amount of time $τ$ staying in state $i$ before going to state $j$ based on the holding time density function $f_{i j} (τ)$ , where $a_{i j} \equiv P [X_{n + 1} = j | X_{n} = i] = \int_{0}^{\infty} h_{i j} (τ) d τ$ is the transition probability from state $i$ to state $j$ , s.t. $\sum_{j \in S} a_{i j} = 1$ , and

$f_{i j} (τ) d τ \equiv P [τ \leq τ_{n + 1} < τ + d τ | X_{n} = i, X_{n + 1} = j] = h_{i j} (τ) d τ / a_{i j}$

is the probability that the transition to the next state will occur in the time between $τ$ and $τ + d τ$ given that the current state is $i$ and the next state is $j$ . In this model,

$h_{i j} (τ) = a_{i j} f_{i j} (τ) .$

2.

The semi-Markov process can be thought as a stochastic process that after having entered state $i$ , it randomly determines the waiting time $τ$ for transition out of state $i$ based on the waiting time density function $w_{i} (τ)$ , and then randomly determines the successor state j based on the state transition probabilities $a_{(i, τ) j}$ , where $w_{i} (τ)$ is the density function of the waiting time for transition out of state i defined by

$w_{i} (τ) d τ = P [τ \leq τ_{n + 1} < τ + d τ | X_{n} = i] = \sum_{j \in S} h_{i j} (τ) d τ,$

and

$a_{(i, τ) j} \equiv P [X_{n + 1} = j | X_{n} = i, τ_{n + 1} = τ]$

is the probability that the system will make the next transition to state $j$ , given time $τ$ and current state i. In this model,

$h_{i j} (τ) = w_{i} (τ) a_{(i, τ) j} .$

3.

The semi-Markov process can also be thought as such a process that after having entered state $i$ , it randomly draws the pair $(k, d_{i k})$ for all $k \in S$ , based on $f_{i k} (τ)$ , and then determines the successor state and length of time in state $i$ from the smallest draw. That is, if $d_{i j} = \min_{k \in S} {d_{i k}}$ , then the next transition is to state $j$ and the length of time the process holds in state $i$ before going to state $j$ is $d_{i j}$ . In this model,

$h_{i j} (τ) = f_{i j} (τ) \prod_{k \neq j} (1 - F_{i k} (τ)),$

where $F_{i k} (τ) = \int_{0}^{τ} f_{i k} (τ') d τ'$ , and $\prod_{k \neq j} (1 - F_{i k} (τ))$ is the probability that the process will not transit to another state except $j$ by time $τ$ . This type of semi-Markov process is applied to such as reliability analysis (Veeramany and Pandey, 2011). An example of this type of semi-Markov process is as follows.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128027677000012

General Hidden Semi-Markov Model

Shun-Zheng Yu , in Hidden Semi-Markov Models, 2016

2.1 A General Definition of HSMM

An HSMM allows the underlying process to be a semi-Markov chain with a variable duration or sojourn time for each state. State duration d is a random variable and assumes an integer value in the set $D = {1, 2, \dots, D}$ , where D is the maximum duration of a state and can be infinite in some applications. Each state can emit a series of observations, and the number of observations produced while in state i is determined by the length of time spent in the state, that is, the duration d. Now we provide a unified description of HSMMs.

Assume a discrete-time semi-Markov process with a set of (hidden) states $S = {1, \dots, M}$ . The state sequence $(S_{1}, \dots, S_{T})$ is denoted by $S_{1 : T}$ , where $S_{t} \in S$ is the state at time $t$ . A realization of $S_{1 : T}$ is denoted as $s_{1 : T}$ . For simplicity of notation in the following sections, we denote:

•: $S_{t_{1} : t_{2}} = i$ —state i that the system stays in during the period from $t_{1}$ to $t_{2}$ . In other words, it means $S_{t_{1}} = i, S_{t_{1} + 1} = i, \dots,$ and $S_{t_{2}} = i$ . Note that the previous state $S_{t_{1} - 1}$ and the next state $S_{t_{2} + 1}$ may or may not be i.
•: $S_{[t_{1} : t_{2}]} = i$ —state $i$ that starts at time $t_{1}$ and ends at $t_{2}$ with duration $d = t_{2} - t_{1} + 1$ . This implies that the previous state $S_{t_{1} - 1}$ and the next state $S_{t_{2} + 1}$ must not be $i$ .
•: $S_{[t_{1} : t_{2}} = i$ —state $i$ that starts at time $t_{1}$ and lasts till $t_{2}$ , with $S_{[t_{1}} = i, S_{t_{1} + 1} = i, \dots, S_{t_{2}} = i$ , where $S_{[t_{1}} = i$ means that at $t_{1}$ the system entered state $i$ from some other state, that is, the previous state $S_{t_{1} - 1}$ must not be $i$ . The next state $S_{t_{2} + 1}$ may or may not be $i$ .
•: $S_{t_{1} : t_{2}]} = i$ —state $i$ that lasts from $t_{1}$ to $t_{2}$ and ends at $t_{2}$ with $S_{t_{1}} = i, S_{t_{1} + 1} = i, \dots, S_{t_{2}]} = i$ , where $S_{t_{2}]} = i$ means that at time $t_{2}$ the state will end and transit to some other state at time $t_{2} + d t_{2}$ , that is, the next state $S_{t_{2} + 1}$ must not be $i$ . The previous state $S_{t_{1} - 1}$ may or may not be $i$ .

Based on these definitions, $S_{[t]} = i$ means state $i$ starting and ending at $t$ with duration 1, $S_{[t} = i$ means state $i$ starting at $t$ , $S_{t]} = i$ means state $i$ ending at $t$ , and $S_{t} = i$ means the state at $t$ being state $i$ .

Denote the observation sequence $(O_{1}, \dots, O_{T})$ by $O_{1 : T}$ , where $O_{t} \in V$ is the observation at time $t$ and $V = {v_{1}, v_{2}, \dots, v_{K}}$ is the set of observable values. For observation sequence $O_{1 : T}$ , the underlying state sequence is $S_{1 : d_{1}]} = i_{1}$ , $S_{[d_{1} + 1 : d_{1} + d_{2}]} = i_{2}, \dots, S_{[d_{1} + \dots + d_{N - 1} + 1 : d_{1} + \dots + d_{N}} = i_{N}$ , and the state transitions are $(i_{n}, d_{n}) \to (i_{n + 1}, d_{n + 1})$ , for $n = 1, \dots, N - 1$ , where $\sum_{n = 1}^{N} d_{n} = T$ , $i_{1}, \dots, i_{N} \in S$ , and $d_{1}, \dots, d_{N} \in D$ . Note that the first state $i_{1}$ is not necessarily starting at time 1 associated with the first observation $O_{1}$ and the last state $i_{N}$ is not necessarily ending at time $T$ associated with the last observation $O_{T}$ . As the states are hidden, the number N of hidden states in the underlying state sequence is also hidden/unknown.

We note that the observable values can be discrete, continuous, or have infinite support, and the observation $O_{t} \in V$ can be a value, a vector, a symbol, or an event. The length T of the observation sequence can be very large, but is usually assumed to be finite except in the case of online learning. There are usually multiple observation sequences in practice, but we do not always explicitly mention this fact unless it is required. The formulas derived for the single observation sequence usually cannot be directly applied for the multiple observation sequences because the sequence lengths are different with different likelihood functions. Therefore, while applying the formulas derived for the single observation sequence into the case of multiple observation sequences, the formulas must be divided by the likelihood functions $P [o_{1 : T_{l}}^{(l)}]$ if they have not yet appeared in the formulas, where $o_{1 : T_{l}}^{(l)}$ is the lth observation sequence of length T _l.

Suppose the current time is $t$ , the process has made $n - 1$ jumps, and the time spent since the previous jump is $X_{t} = t - \sum_{l = 1}^{n - 1} d_{l}$ . As explained in Section 1.1.2, the process ${(S_{t}, X_{t})}_{t \geq 1}$ is a discrete-time homogeneous Markov process. Its subsequence ${(i_{n}, d_{n})}_{n \geq 1}$ is also a Markov process based on the Markov property. Then we can define the state transition probability from state i having duration h to state $j \neq i$ having duration d by

$a_{(i, h) (j, d)} \equiv P [S_{[t + 1 : t + d]} = j | S_{[t - h + 1 : t]} = i],$

which is assumed independent of time $t$ , for $i, j \in S$ , $h, d \in D$ . The transition probabilities must satisfy $\sum_{j \in S \ {i}} \sum_{d \in D} a_{(i, h) (j, d)} = 1$ , for all given $i \in S$ and $h \in D$ , with zero self-transition probabilities $a_{(i, h) (i, d)} = 0$ , for all $i \in S$ and $h, d \in D$ . In other words, when a state ends at time t, it cannot transit to the same state at the next time t+1 because the state durations are explicitly specified by some distributions other than geometric or exponential distributions. From the definition we can see that the previous state $i$ started at $t - h + 1$ and ended at $t$ , with duration $h$ . Then it transits to state $j$ having duration $d$ , according to the state transition probability $a_{(i, h) (j, d)}$ . State $j$ will start at $t + 1$ and end at $t + d$ . This means both the state and the duration are dependent on both the previous state and its duration. While in state $j$ , there will be $d$ observations $O_{t + 1 : t + d}$ being emitted. Denote this emission/observation probability by

$b_{j, d} (o_{t + 1 : t + d}) \equiv P [o_{t + 1 : t + d} | S_{[t + 1 : t + d]} = j]$

which is assumed to be independent of time $t$ , where $o_{t + 1 : t + d}$ is the observed values of $O_{t + 1 : t + d}$ . Let the distribution of the first state be

$Π_{j, d} \equiv P [S_{[1 : d]} = j]$

$Π_{j, d} \equiv P [S_{1 : d]} = j]$

depending on the model assumption that the first state is starting at $t = 1$ or before. We can equivalently let the initial distribution of the state be

$π_{j, d} \equiv P [S_{[t - d + 1 : t]} = j], t \leq 0 .$

It represents the probability of the initial state and its duration before time $t = 1$ or before the first observation $o_{1}$ obtained. The relationship between the two definitions of initial state distribution is $Π_{j, d} = \sum_{τ = d - D + 1}^{1} \sum_{i, h} π_{(i, h)} a_{(i, h) (j, d - τ + 1)}$ , where if the starting time of the first state must be $t = 1$ then $τ = 1$ ; otherwise, if the first state can start at or before $t = 1$ , then $1 \geq τ \geq - (D - d - 1)$ . Usually, the second definition of the initial state distribution, ${π_{j, d}}$ , makes the computation of the forward variables in the HSMM algorithms simpler. Then the set of model parameters for the HSMM is defined by

$λ \equiv {a_{(i, h) (j, d)}, b_{j, d} (v_{k_{1} : k_{d}}), π_{j, d} : i, j \in S; h, d \in D; v_{k_{d}} \in V},$

$λ \equiv {a_{(i, h) (j, d)}, b_{j, d} (v_{k_{1} : k_{d}}), Π_{j, d} : i, j \in S; h, d \in D; v_{k_{d}} \in V},$

where $v_{k_{1} : k_{d}}$ represents an observable substring of length $d$ for $v_{k_{1}} \dots v_{k_{d}} \in V^{d} = V \times \dots \times V$ . This general HSMM is shown in Figure 1.6.

The general HSMM is reduced to specific models of HSMM depending on the assumptions made. For instances,

1.

If the state duration is assumed to be independent of the previous state, then the state transition probability can be further specified as $a_{(i, h) (j, d)} = a_{(i, h) j} p_{j} (d)$ , where

(2.1) $a_{(i, h) j} \equiv P [S_{[t + 1} = j | S_{[t - h + 1 : t]} = i]$

is the transition probability from state $i$ that has stayed for duration $h$ to state $j$ that will start at $t + 1$ , and

(2.2) $p_{j} (d) \equiv P [S_{t + 1 : t + d]} = j | S_{[t + 1} = j]$

is the probability of duration $d$ that state $j$ will take. This is the model proposed by Marhasev et al. (2006). Compared with the general HSMM, the number of model parameters is reduced from $M^{2} D^{2}$ to $M^{2} D + M D$ , and the state duration distributions $p_{j} (d)$ can be explicitly expressed using probability density functions (e.g., Gaussian distributions) or a probability mass function.

2.

If a state transition is assumed to be independent of the duration of the previous state, then the state transition probability from (i,h) to (j,d) becomes $a_{(i, h) (j, d)} = a_{i (j, d)}$ , where

(2.3) $a_{i (j, d)} \equiv P [S_{[t + 1 : t + d]} = j | S_{t]} = i]$

is the transition probability that state $i$ ended at $t$ and transits to state $j$ having duration $d$ . If it is assumed that a state transition for $i \neq j$ is $(i, 1) \to (j, d)$ and a self-transition is $(i, d + 1) \to (i, d)$ , for $d \in D$ , then the model becomes the residual time HMM (Yu and Kobayashi, 2003a). In this model, the starting time of the state is not of concern, but the ending time is of interest. Therefore, d represents the remaining sojourn (or residual life) time of state $j$ . This model is obviously appropriate to applications for which the residual life is of the most concern. The number of model parameters is reduced to $M^{2} D$ . More importantly, if the state duration is further assumed to be independent of the previous state, then the state transition probability can be specified as $a_{i (j, d)} = a_{i, j} p_{j} (d)$ . In this case, the computational complexity will be the lowest among all HSMMs. The number of model parameters is further reduced to $M^{2} + M D$ .

3.

If self-transition $(i, d) \to (i, d + 1)$ is allowed and the state duration is assumed to be independent of the previous state, then the state transition probability becomes

$a_{(i, h) (j, d)} = a_{(i, h) j} (\underset{τ = 1}{\prod^{d - 1}} a_{j j} (τ)) [1 - a_{j j} (d)],$

where $a_{(i, h) j} \equiv P [S_{[t + 1} = j | S_{[t - h + 1 : t]} = i]$ ; $a_{j j} (d)$ is the self-transition probability when state $j$ has stayed for $d$ time units, that is,

$a_{j j} (d) \equiv P [S_{t + d + 1} = j | S_{[t + 1 : t + d} = j],$

and $1 - a_{j j} (d) = P [S_{t + d]} = j | S_{[t + 1 : t + d} = j]$ is the probability that state $j$ ends with duration $d$ . This is the variable transition HMM (Krishnamurthy et al., 1991; Vaseghi, 1991). In this model, a state transition is either $(i, d) \to (j, 1)$ for $i \neq j$ or $(i, d) \to (i, d + 1)$ for a self-transition. This process is similar to the standard discrete-time semi-Markov process. The concept of the discrete-time semi-Markov process can thus be used in modeling an application. This model has $M^{2} D + M D$ model parameters. The computational complexity is relatively high compared with other conventional HSMMs.

4.

If a transition to the current state is independent of the duration of the previous state and the duration of the current state is only conditioned on the current state itself, then

$a_{(i, h) (j, d)} = a_{i j} p_{j} (d),$

where $a_{i j} \equiv P [S_{[t + 1} = j | S_{t]} = i]$ is the transition probability from state $i$ to state $j$ , with the self-transition probability $a_{i i} = 0$ . This is the explicit duration HMM (Ferguson, 1980), with $M^{2} + M D$ model parameters and lower computational complexity. This is the simplest and the most popular model among all HSMMs, with easily understandable formulas and modeling concepts.

Besides, the general form $b_{j, d} (v_{k_{1} : k_{d}})$ of observation distributions can be simplified and dedicated to applications. They can be parametric (e.g., a mixture of Gaussian distributions) or nonparametric (e.g., a probability mass function), discrete or continuous, and dependent on or independent of the state durations. The observations can be assumed dependent or conditionally independent for given states, that is, $b_{j, d} (v_{k_{1} : k_{d}}) = \prod_{τ = 1}^{d} b_{j} (v_{k_{τ}})$ . The conditional independence makes HSMMs simpler and so is often assumed in the literature.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128027677000024

Stochastic Modeling Techniques for Secure and Survivable Systems

Kishor S. Trivedi , ... Selvamuthu Dharmaraja , in Information Assurance, 2008

Computations of Steady-State Probabilities

The embedded DTMC for the above discussed SMP is shown in Figure 7.2. As mentioned earlier in subsection 7.2.2, we need the information of sojourn times in each state and the transition probabilities. The corresponding parameters are listed as follows:

h_G : Mean time for a system to resist becoming vulnerable to attacks.

h_V : Mean time for a system to resist attacks when vulnerable.

h_A : Mean time taken by a system to detect an attack and initiate triage actions.

h_MC : Mean time a system can keep the effects of an attack masked.

h_UC : Mean time that an attack remains undetected while doing damage.

h_TR : Mean time a system takes to evaluate how best to handle an attack.

h_FS : Mean time a system operates in a fail-secure mode in the presence of an attack.

h_GD : Mean time a system is in the degraded state in the presence of an attack.

h_F : Mean time a system is in the failed state despite detecting an attack.

p_a : Probability of injecting a successful attack, given that a system is vulnerable.

p_u : Probability that a successful attack has remain undetected.

p_m : Probability that a system successfully masks an attack.

p_g : Probability that a system resists an attack by graceful degradation.

p_s : Probability that a system responds to an attack in a fail-secure manner.

For computing the security attributes in terms of availability, confidentiality, and integrity, we need to determine the steady-state probabilities {π _i , i ∈ S} of the SMP states. These probabilities are to be determined in terms of the steady-state probabilities $π_{i}^{d}$ and the mean sojourn time h_i of the states of the DTMC.

The matrix P that describes the state transition probabilities for this DTMC is written as:

$P = [\begin{matrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ {\tilde{p}}_{a} & 0 & p_{a} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & p_{m} & p_{u} & {\tilde{p}}_{m u} & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & p_{s} & p_{g} & {\tilde{P}}_{s g} \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{matrix}]$

where ${\tilde{p}}_{a} = 1 - p_{a}$ , ${\tilde{p}}_{m v} = 1 - p_{m} - p_{u}$ , and ${\tilde{p}}_{s g} = 1 - p_{s} - p_{g}$ On solving the equation

(7.4) $π^{d} = π^{d} \times P, \sum_{i} π_{i}^{d} = 1, i \in S$

where $π^{d} = [π_{G}^{d}, π_{V}^{d}, π_{A}^{d}, π_{M C}^{d}, π_{U C}^{d}, π_{T R}^{d}, π_{F S}^{d}, π_{G D}^{d}, π_{F}^{d}]$ the steady-state probabilities $π_{i}^{d}$ are obtained.

Next, we compute the mean sojourn time h_i in each state i. The quantity h_i is determined by a random time the process spends in state i. The attacker behavior is described by the transitions G → V and V → A. To model the wide range of attacks (from amateur mischief to cyber attacks), it is necessary to consider a variety of probability distributions. On the other hand, system response to an attack is algorithmic and automated. Based on the sojourn time distribution of state i, the mean sojourn time h_i for this state is calculated. For example, if sojourn time distribution for state G is hypoexponential with parameters λg ₁ and λ g ₂, then its mean sojourn time h_G is given as $h_{G} = (\frac{1}{λ_{g 1}} + \frac{1}{λ_{g 2}})$ . Similarly, the mean sojourn times for the other states h_A, h_V, h_MC, h_GD, h_TR, h_FS, h_GD, h_F are also computed.

Steady-state probability for SMP states is expressed in terms of the steady-state probabilities $π_{i}^{d}$ of the DTMC and their sojourn times h_i using Eq. 7.3. Substituting for $π_{i}^{d}$ and h_i in Eq. 7.3, we get the steady-state probabilities for the SMP as:

(7.5) $\begin{matrix} π_{G} = \frac{h_{G}}{h_{G} + h_{V} + p_{a} [h_{A} + p_{m} h_{M C} + p_{u} h_{G D} + (1 - p_{m} - p_{u})]}, \\ π_{V} = h_{V} \frac{π_{G}}{h_{G}}, π_{A} = h_{A} p_{a} \frac{π_{G}}{h_{G}}, π_{M C} = h_{M C} p_{m} p_{u} \frac{π_{G}}{h_{G}}, \\ π_{F S} = h_{F S} p_{a} p_{s} (1 - p_{m} - p_{u}) \frac{π_{G}}{h_{G}}, π_{T R} = h_{T R} p_{a} (1 - p_{m} - p_{u}) \frac{π_{G}}{h_{G}}, \\ π_{G D} = h_{G D} p_{a} p_{g} (1 - p_{m} - p_{u}) \frac{π_{G}}{h_{G}}, \\ π_{F} = h_{F} p_{a} (1 - p_{a} - p_{g}) (1 - p_{m} - p_{u}) \frac{π_{G}}{h_{G}} . \end{matrix}$

Once the steady-state probabilities are known, we now compute the security attributes such as availability, confidentiality, and integrity.

For calculating availability, we observe that a system is not available in states FS, F, and UC and is available in all the other states. Then, the availability is given as:

(7.6) $A = 1 - (π_{F S} + π_{F} + π_{U C})$

In case of a DoS attack, a system is made to stop functioning, that is, bringing it to the FS state will accomplish the goal of a DoS attack. Therefore, the states FS and MC will not be part of the state transition diagram. For this attack, system availability is given as:

(7.7) $A_{D o S} = 1 - (π_{F} + π_{U C})$

Similarly, confidentiality and integrity measures can be computed in the context of specific security attacks. For example, Microsoft IIS 4.0 suffered from the ASP vulnerability as documented in the Bugtraq ID 1002 [20]. Exploitation of this vulnerability allows an attacker to traverse the entire web server file system, thus compromising confidentiality. Therefore, in the context of this attack, states UC and F are identified with the loss of confidentiality. Similarly, if Code-Red worm is modified to inject a piece of code into a vulnerable IIS server to browse unauthorized files, states UC and F will imply loss of confidentiality. Therefore, the steady-state confidentiality measure is computed as:

(7.8) $C_{A S P} = 1 - (π_{F} + π_{U C})$

Consider another example, where a Common Gateway Interface (CGI) vulnerability present in the Samber server as reported in Bugtraq ID 1002 was reported [20]. Exploitation of this vulnerability permits an attacker to execute any MS-DOS command including deletion and modification of files in an unauthorized manner, thus compromising the integrity of a system. Here also, states UC and F indicate the loss of integrity, and thus the steady-state measure of integrity is given as:

(7.9) $I_{C G I} = 1 - (π_{F} + π_{U C})$

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780123735669500094

Dependable and Secure Systems Engineering

Kishor Trivedi , ... Fumio Machida , in Advances in Computers, 2012

3.2.3 Job Completion Time

Yet another example of combined performance and reliability analysis is computation of the job completion time on a system subject to component failure and repair. The distribution of the job completion time on a computer system considering CPU failure and repair was originally studied in Ref. [51]. The CPU or the server model used in the study was a three-state SMP, and the job completion time was analyzed in a general manner [29,52]. Figure 35 shows the SMP for the CPU model where state 1 represents the server up state; state 2 is the state where the server is recovering from a nonfatal failure; and state 3 is the state where the server is recovering from a fatal failure.

The state 1 and the state 2 are categorized as pre-emptive resume (prs) states in which the job execution is resumed from the interrupted point. On the other hand, the state 3 is categorized as pre-emptive repeat identical (pri) state in which the job execution is restarted from the beginning. A job that started execution when the server is in state 1 may encounter a nonfatal error that leads to the server state change from 1 to 2. The job execution also faces a fatal error that causes the server state change from 1 to 3. Both of nonfatal error and fatal error are repairable and their times to recovery follow general distribution G ₂(t) and G ₃(t), respectively. Assuming that failures are exponentially distributed with rate λ and each failure is either nonfatal with probability p _nf or fatal with probability p _f = 1 − p _nf, the SMP kernel distributions are given by following expressions.

$Q_{12} (t) = p_{nf} \cdot (1 - e^{- λ t})$

$Q_{13} (t) = p_{f} \cdot (1 - e^{- λ t})$

$Q_{21} (t) = \int_{0}^{t} e^{- λ p_{f} τ} \frac{d}{d τ} G_{2} (τ) d τ$

$Q_{23} (t) = (1 - e^{- λ p_{f} τ}) - \int_{0}^{t} λ \cdot p_{f} \cdot e^{- λ p_{f} τ} \frac{d}{d τ} G_{2} (τ) d τ$

$Q_{31} (t) = G_{3} (t)$

Using the analysis method developed in Ref. [52] to the SMP model, we can obtain the Laplace–Stieltjes transforms (LSTs) of the job completion time distribution F ₁ ^∼(s, x) for fixed work amount x:

$F_{1}^{\sim} (s, x) = \frac{e^{- τ (s) x}}{1 - \frac{λ p_{f}}{s + λ p_{f}} \cdot [1 - e^{- τ (s) x}] \cdot G_{3}^{\sim} (s)}$

$τ (s) = s + λ (1 - p_{nf} \cdot Q_{21}^{\sim} (s))$

where Q ₂₁ ^∼(s) and G ₃ ^∼(s) are the LSTs of Q ₂₁(t) and G ₃(t).

Then LST can be numerically inverted, or by taking derivatives expected completion time determined.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780123965257000010

Non-Markovian Queueing Systems

J. MEDHI , in Stochastic Models in Queueing Theory (Second Edition), 2003

6.3.4 Semi-Markov process approach

The system-size {N(t)} is not semi-Markovian. Consider that a transition occurs with a service completion (departure of a unit)— that is, t_n, n = 0,1,2,… are the nth departure epochs. Using the notation of Section 1.9,

$\begin{matrix} X_{n} = N (t_{n} + 0) = system size at the n th departure \\ (i .e ., number left behind by the n th departure) \\ Y (t) = \begin{matrix} X_{n}, & t_{n} \leq t_{n + 1} \end{matrix}, \end{matrix}$

we see that {Y(t), t ≥ 0} is a semi-Markov process that has { X_n, n ≥ 0} for its embedded Markov chain. We get

$\begin{matrix} Q_{i, j} (t) = \begin{matrix} 0, & i \geq 1, & j < 1, & j < i - 1 \end{matrix} \\ = \begin{matrix} \int_{0}^{t} \frac{e^{- λ y} {(λ y)}^{j - i + 1}}{(j - i + 1)!} d B (y), & i \geq 1, & j \geq i - 1 \end{matrix} \\ = \begin{matrix} \int_{0}^{t} {1 - e^{- λ (t - y)}} \frac{e^{- λ y} {(λ y)}^{j}}{j!} d B (y), & i = 0, & j \geq 0. \end{matrix} \end{matrix}$

One can proceed, as in Section 1.9, to find

$ν_{j} = \lim_{n \to \infty} p_{i j}^{(n)}$

and

$p_{j} = \lim_{t \to \infty} P r {Y (t) = j} .$

See Fabens (1961) for a relationship between ν_j and π_j , where

$π_{j} = \lim_{t \to \infty} P r {N (t) = j} .$

See also Neuts (1967) for semi-Markov analysis of the more general model M/G(a,b)/1.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780124874626500060

Spatial Choice Models

H. Timmermans , in International Encyclopedia of the Social & Behavioral Sciences, 2001

5 Complex Spatial Choice Behavior

The previous models are all typically based on single-purpose, single-stop behavior. Over the years, however, it became clear that an increasing proportion of trips involved multistop behavior. Moreover, Hägerstrand's time geography had convincingly argued that behavior does not reflect preferences only, but also constraints. Thus, various attempts have been made to develop models of trip chaining and activity-travel patterns.

Originally, most models relied on semi-Markov process models or Monte Carlo simulations. More recently, utility-maximizing models have dominated the field. An important contribution in this regard was made by Kitamura ( 1984), who introduced the concept of prospective utility. It states that the utility of a destination is not only a function of its inherent attributes and the distance to that destination, but also of the utility of continuing the trip from that destination. Dellaert et al. (1998) generalized Kitamura's approach to account for multipurpose aspects of the trip chain.

Trip chaining is only one aspect of multiday activity/travel patterns. Several models have recently been suggested to predict more comprehensive activity patterns. These models can be differentiated into the older constraints-based models (e.g., Lenntorp 1976), utility-maximizing models (e.g., Recker et al. 1986), rule-based or computational process models (e.g., Golledge et al. 1994, ALBATROSS – Arentze and Timmermans 2000).

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B0080430767025201

Markov chain models and applications

Kishor S. Trivedi , ... Dharmaraja Selvamuthu , in Modeling and Simulation of Computer Networks and Systems, 2015

2 Strengths of Markov models

The essential necessity for a stochastic process to be a homogeneous continuous time Markov chain (CTMC) is that the sojourn time in each state must be exponentially distributed. However, the sojourn times in each state may not follow exponential distribution while modeling practical or real-time situations. The existence of the non-exponentially distributed event time gives rise to non-Markovian models. A non-Markovian model can be modeled using phase-type approximation. However, phase-type expansion increases the already large state-space of a real system model. The problem becomes really severe when mixing deterministic times with exponential ones. The strict Markovian constraints are relaxed by using Markov regenerative processes (MRGP). A generalization of CTMC where the time spent by the process in a given state is allowed to follow non-exponential (general) distribution is a semi-Markov process (SMP). Further generalization is provided by MRGP. This concept is used in extending the CTMC model by allowing general distribution for all the event times other than failure times in the given examples. As a result the stochastic process under consideration becomes MRGP.

A Markov renewal process becomes a Markov process when the transition times are independent exponential and are independent of the next state visited. It becomes a Markov chain when the transition times are all identically equal to 1. It reduces to a renewal process if there is only one state and then only transition time becomes relevant. Renewal theory is used to analyze stochastic processes that regenerate themselves from time-to-time. The long-run behavior of a regenerative stochastic process can be studied in terms of its behavior during a single regeneration cycle. Semi-Markov processes are used in the study of certain queuing systems.

Numerous studies have described and reported the occurrence of "software aging" [2–4] in which the state of software degrades with time. This degradation is caused primarily by the exhaustion of operating system resources, data corruption and numerical error accumulation. If untreated, this may lead to performance degradation of the software or crash/hang failure, or both in the long run. Examples of software aging are memory bloating and leaking, unreleased file-locks, data corruption, storage space fragmentation and accumulation of round-off errors [3,4]. Aging has not only been observed in software used on a mass scale but also in specialized software used in high-availability and safety-critical applications [2,5]. To counteract software aging, a preventive maintenance technique called "software rejuvenation" has been proposed [2,6,7], which involves periodically stopping the system, cleaning up, and restarting it from a clean internal state. This "renewal" of software prevents (or at least postpones) a crash failure. The internal state of the software can be cleaned by techniques like garbage collection, flushing operating system kernel tables and reinitializing internal data structures.

Rejuvenation has been implemented in various types of systems, from telecommunication systems [2,8,9], operating systems [10], transaction processing systems [11], web servers [12–14], cluster servers [15–17], cable modem systems [18], spacecraft systems [19], safety-critical systems [5,20], to biomedical applications [21]. Preventive maintenance, however, incurs an overhead (lost transactions, downtime, additional resources, etc.) which should be balanced against the cost incurred due to unexpected outage caused by failure. This in turn demands a quantitative analysis, which in the context of software systems has only recently started to receive attention.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128008874000134

Implementation of HSMM Algorithms

Shun-Zheng Yu , in Hidden Semi-Markov Models, 2016

4.4.4 Unknown Observation Distribution

Usually, based on the empirical knowledge on the stochastic process, the observation distribution $b_{j, d} (v_{k_{1} : k_{d}})$ can be determined whether they are parametric or nonparametric. If they are assumed to be parametric, their probability density distribution functions can be correspondingly determined. When the parametric distribution is unknown, the most popular ones that are often used in practice are a mixture of Gaussian distributions.

Example 4.4

Parametric Distribution of Observations

Use Example 1.4 . Assume the observation distributions are parametric, and the request arrivals is characterized as a Poisson process modulated by an underlying (hidden state) semi-Markov process. The finite number of discrete states are defined by the discrete mean arrival rates. Let $μ_{j}$ be the mean arrival rate for given state $j \in S$ . Then the number of arrivals in a time interval and the Markov state are related through the conditional probability distribution

$b_{j} (k) = \frac{μ_{j}^{k}}{k!} e^{- μ_{j}},$

where $b_{j, d} (v_{k_{1} : k_{d}})$ is assumed conditionally independent.

Note that when the observation distributions are parametric, the new parameters ${\hat{θ}}_{j}$ for state j can be found by maximizing $f (θ_{j}) \equiv \sum_{v_{k_{1}}, \dots, v_{k_{d}}} {\hat{b}}_{j, d} (v_{k_{1} : k_{d}}) \log b_{j, d} (v_{k_{1} : k_{d}}; θ_{j})$ subject to the constraint $\sum_{v_{k_{1}}, \dots, v_{k_{d}}} b_{j, d} (v_{k_{1} : k_{d}}; θ_{j}) = 1$ . For instance, if the probability density function $b_{j} (v_{k})$ , for $v_{k} = 0, 1, \dots, \infty$ , is Poisson with mean $μ_{j}$ , then the parameter $μ_{j}$ can be estimated by ${\hat{μ}}_{j} = \sum_{k} {\hat{b}}_{j} (k) k$ or, equivalently, ${\hat{μ}}_{j} = \sum_{t = 1}^{T} γ_{t} (j) o_{t} / \sum_{j} \sum_{t = 1}^{T} γ_{t} (j) .$

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128027677000048

Mitosis detection in biomedical images

Yao Lu , ... Yu-Ting Su , in Computer Vision for Microscopy Image Analysis, 2021

3.3.7 Hidden conditional random field and semi-Markov model (HCRF and SMM)

The previous methods discussed in this chapter mainly focus on the identification of mitotic sequences. The graph model can also be used for cell state segmentation. A semi-Markov model (SMM) has been applied to model the mitosis process with different stages [6] . SMM is a random field model for sequence segmentation by modeling the state transition as a semi-Markov process. In the sequence segmentation framework, mitotic sequences are first identified by the HCRF method, as defined in Section 3.3.4. Based on the detection results from the HCRF model, SMM further segments the sequence into different stages, which are defined by the transition of cell appearance during the mitosis process. Generally, we can define four stages of the cell appearance transition: (1) interphase, (2) the start of mitosis, (3) formation of daughter cells, and (4) separation of daughter cells. The four stages of mitosis are illustrated in Fig. 6.14.

As shown in Fig. 6.15, given the input sequence X = {x _t}_{t = 1} ^T, SMM divides the sequence into segments S = {s _i}_{i = 1} ^s. Each segment s _i is represented by a pair of integers (u _i, l _i), where u _i indicates the position of the last frame in s_i and l _i represents the state of s _i.

The SMM also generates an overall prediction of the whole sequence as y ′ ∈ {0, 1}:

(6.33) $p (y^{'}| X| γ) = p (y^{'}| S| γ) \cdot p (S| X| γ)$

where γ is the parameter vector for the SMM learned during training.

The sequence is considered to be a valid mitosis process only if the sequence contains the complete state transition process from stage 1 to stage 4. Thus, we define y′ = 1 when S contains one (and only one) complete stage transition process, and y′ = 0 in other cases.

With this definition, Eq. (6.33) can be simplified as

(6.34) $p (y^{'}| X| γ) = p (S| X| γ) = \frac{exp (γ^{T} \cdot ψ (X, S))}{\sum_{S^{'}} exp (γ^{T} \cdot ψ (X, S^{'}))}$

The potential function γ ^T · ψ(X, S) can be defined as

(6.35) $\begin{matrix} γ^{T} \cdot ψ (X, S) = \sum γ^{T} \cdot ψ (l_{i - 1}, l_{i}, X_{u_{i - 1} : u_{j}}) \\ = \sum \sum_{a, b \in L} γ_{a, b}^{T} \cdot [\begin{array}{l} ψ_{1} (X_{u_{i - 1} . u_{i}}) \\ ψ_{2} (X_{u_{i - 1} . u_{i}}) \end{array}] \cdot 1_{a} (l_{i - 1}) \cdot 1_{b} (l_{i}) \end{matrix}$

where ψ ₁(·) indicates the averaged feature vectors, ψ ₂(·) indicates the standard deviation of feature vectors, and [·] represents vector concatenation operation.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128149720000060

Applications of HSMMs

Shun-Zheng Yu , in Hidden Semi-Markov Models, 2016

9.3 Network Traffic Characterization and Anomaly Detection

In this application, HSMMs are applied to characterize the network traffic. Measurements of real traffic often indicate that a significant amount of variability is presented in the traffic observed over a wide range of time scales, exhibiting self-similar or long-range dependent characteristics (Leland et al., 1994). Such characteristics can have a significant impact on the performance of networks and systems (Tuan and Park, 1999; Park et al., 1997). Therefore, better understanding the nature of network traffic is critical for network design, planning, management, and security. A major advantage of using an HSMM is the capability of capturing various statistical properties of the traffic, including the long-range dependence (Yu et al., 2002). It can also be used together with, for example, matrix-analytic methods to obtain analytically tractable solutions to queueing-theoretic models of server performance (Riska et al., 2002).

In this application, an observation in the observation sequence represents the number of user requests/clicks, packets, bytes, connections, etc., arriving in a time unit. It can also be the inter-arrival time between requests, packets, URLs, or protocol keywords. The observation sequence is characterized as a discrete-time random process modulated by an underlying (hidden state) semi-Markov process. The hidden state represents the density of traffic, mass of active users, or a web page that is hyperlinked with others.

Using the HSMM trained by the normal behavior, one can detect anomaly embedded in the network behavior according to its likelihood or entropy against the model (Yu, 2005; Li and Yu, 2006; Lu and Yu, 2006a; Xie and Yu, 2006a,b Xie and Yu, 2006a Xie and Yu, 2006b ; Xie and Zhang, 2012; Xie and Tang, 2012; Xie et al., 2013a,b Xie et al., 2013a Xie et al., 2013b ), recognize user click patterns (Xu et al., 2013), extract users' behavior features (Ju and Xu, 2013) for SaaS (Software as a Service), or estimate the packet loss ratios and their confidence intervals (Nguyen and Roughan, 2013).

For example, a web workload (requests/s) recorded in the peak hour is shown in Figure 1.7 (gray line). The arrival process for a given state j can be assumed as, for instance, Poisson process $b_{j} (k) = μ_{j}^{k} e^{- μ_{j}} / k!$ with one parameter $μ_{j}$ . The initial value of $μ_{j}$ is assumed to be proportional to its state index j, that is,

$μ_{j} = \max (o_{t}) \cdot j / M,$

so that higher state corresponds to higher arrival rate, where M is the total number of hidden states. In considering the range of the observable values (requests/s), the total number M of hidden states is initially assumed to be 30. During the re-estimation procedure, the states that are never visited will be deleted from the state space. To characterize the second order self-similar property of the workload, the duration of state j can be assumed as, for instance, a heavy-tailed Pareto distribution $p_{j} (d) = λ_{j} d^{- (λ_{j} + 1)}$ with one parameter $λ_{j}$ . The initial values of $λ_{j}$ can be assumed equal for all states. To reduce the computational amount, the maximum duration D of the states can be assumed to be finite with sufficiently large value to cover the maximum duration of any state in the given observation sequence, where D=500 s is assumed. As a reasonable choice, the initial values of the probabilities a _ij and $π_{j}$ are assumed uniform.

Given these initial assumptions for the explicit duration HSMM, the ML model parameters can be estimated using the re-estimation algorithm of Algorithm 3.1. The MAP states S _t, for t=1, …, T, can be estimated using Eqn (2.15). The results showed that there were 20 hidden states modulating the arrival rate of requests, and only 41 state transitions occurring during 3600 s. The maximum duration D went up to 405 and the process stayed in the same state for a mean duration of 87.8 s. There were two classes of states among the 20 states: 5 states in the middle played a major role in modulating the arrival streams in the sense that the process spent most time in these 5 states; and the remaining 15 states having the higher and lower indices represented the rare situations that had ultra high or low arrival rates lasting very short time.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128027677000097

walkerondur1958.blogspot.com

Source: https://www.sciencedirect.com/topics/computer-science/semi-markov-process

Continuous Semi markov Processes and Their Applications

Semi-Markov Process

Introduction

1.1.2 Semi-Markov Process

General Hidden Semi-Markov Model

2.1 A General Definition of HSMM

Stochastic Modeling Techniques for Secure and Survivable Systems

Computations of Steady-State Probabilities

Dependable and Secure Systems Engineering

3.2.3 Job Completion Time

Non-Markovian Queueing Systems

6.3.4 Semi-Markov process approach

Spatial Choice Models

5 Complex Spatial Choice Behavior

Markov chain models and applications

2 Strengths of Markov models

Implementation of HSMM Algorithms

4.4.4 Unknown Observation Distribution

Mitosis detection in biomedical images

3.3.7 Hidden conditional random field and semi-Markov model (HCRF and SMM)

Applications of HSMMs

9.3 Network Traffic Characterization and Anomaly Detection

0 Response to "Continuous Semi markov Processes and Their Applications"

Post a Comment

Iklan Atas Artikel

Iklan Tengah Artikel 1

Iklan Tengah Artikel 2

Iklan Bawah Artikel