7.3. Network-Malware Dynamic Game

7.3.1. Formulation

Following the setup described in Section  7.2, consider a system with two players Nimage (network) and Mimage (malware), specified by a system of nimage differential equations [146, P.83]: ẋ(t)=f(t,x(t),uN(t),uM(t))t[t0,T]image, where uN(t)UNRmimage, and uM(t)UMRsimage, and the initial condition x(t0)=x0image, and a damage function (the functional) J(uN,uM)=g(x(T))+t0Th(x,uN,uM,t)dtimage, where x(t)image is the nimage-dimensional state vector. Player Nimage seeks to minimize Jimage by choosing the mimage dimensional control function uN()image, and player Mimage seeks to maximize Jimage by choosing the simage-dimensional control function uM()image. The game is therefore referred to as a dynamic two-player zero-sum game. The players’ payoffs and the set of strategies available to them are called rules of the game. Both players know the rules of the game and each player knows that its opponent knows the rule and ad infinitum.3 As we mentioned in the Introduction to this chapter, we consider open-loop strategies, that is, the controls only depend on time (and not on the state, or the previous history). This is appropriate in the context of the security in networks, as the instantaneous state of the network (exact fraction of the nodes of each type) is impossible (or very costly) to follow, for both of the players.
In our context, (7.1) provides the f()image functions, the initial conditions are provided by (7.1), (7.3) provides the g(),h()image functions, (7.4a), (7.4b) provide UN,UMimage. Also, we have n=3,m=2image, s=1image. Note that the f(),h()image functions in our context depend on time timage only implicitly, that is through the state and control functions. Also, the formulation does not capture any other constraints on the state functions, and in our context it does not need to either, owing to Lemma 7.1.
We now consider the values of the game. The lower value denoted by Vimage is the overall damage when the minimizing player (Nimage) is given the upper-hand, i.e. selects its strategy after learning its opponent’s strategy. Mathematically, V=maxuMminuNJ(uN,uM)image. Conversely, the upper value of the game Vimage is defined as V=minuNmaxuMJ(uN,uM)image Thus, Vimage (Vimage, respectively) is the maximum (minimum, respectively) damage that the malware (network, respectively) can inflict (incur, respectively) if the other player has the upper-hand. Hence, VVimage. A pair of strategies (uN,uM)image is called a saddle-point if J(uN,uM)J(uN,uM)=VJ(uN,uM)image for any strategy uNimage of the network and uMimage of the malware, and then Vimage is the value of the game, and V=V=Vimage.
Thus, if the network selects its saddle-point strategy uNimage, irrespective of the strategy of the malware, the damage it incurs is at most Vimage, which is also the minimum damage that the malware can inflict if it has the upper-hand. Thus, the network’s saddle-point strategy is also its robust strategy, in the sense, that it minimizes the maximum possible damage it can incur. Conversely, the malware’s saddle-point strategy is also its robust strategy, since it maximizes the minimum possible damage it can inflict. Also, the network’s and the malware’s saddle-point strategies are their respective best responses to the other’s robust strategy. We prove the existence of saddle-point pair in the next theorem.

Theorem 7.1

The dynamic game defined above has a saddle-point pair of strategies.

Proof

This theorem directly follows from Theorem 2 in page 91 of [146]. The necessary conditions of the theorem are readily satisfied in our game. Namely,
(i) the system function f(t,x,uN,uM)image in this game is continuous in states and controls and is moreover bounded. Note that this is sufficient for the condition in page 83 of [146] to hold;
(ii) the instantaneous pay-off function h(t,x,uN,uM)image and the terminal pay-off function g(x(T))image are continuous in the states and controls (page 84 of [146]);
(iii) the system function f(t,x,uN,uM)image is linear in controls. The set defining controls are convex sets, and the instantaneous pay-off function h(t,x,uN,uM)image is linear in controls (page 91 of [146]).  

7.3.2. A Framework for Computation of the Saddle-point Strategies

Since the set of deterministic strategies of each player is uncountably infinite, the saddle-point strategies and the value of the game cannot be computed using convex or linear programming. We now present a framework for numerical computation of the saddle-point strategies.
Define the Hamiltonian for a given policy pair (uN,uM)image in an arbitrary two-person dynamic game as H(uN,uM)=λ,f(x,uN,uM,t)+h(x,uN,uM,t)image, where the state functions x()image are those that correspond to the strategy pair (uN,uM)image, and λimage, the costate (or adjoint) functions, are continuous and piecewise differentiable functions of time that satisfy the following system of differential equations wherever the controls (uN,uM)image are continuous: λ̇=xH(x,λ,t)image, and the final value (transversality) condition λ(T)=(g(x))(x)x=x(T)image. In our context,

H(uM,uN)=κII+κDD+uNiR0κruNr+(λIλS)β0uNrISλSβ1R0uNiSλIβ2R0uNiI+(λDλI)uMI,

image

where again the state functions (S(),I(),D())image are obtained from (7.1) with (uN(),uM())image as the control functions, and the costate functions (λS(),λI(),λD())image are obtained from the following system of differential equations (with uN(),uM()image as the control functions) with the final conditions,

λ̇S=HS=(λIλS)β0uNrI+λSβ1R0uNi,λS(T)=0,

image (7.5a)

λ̇I=HI=κI(λIλS)β0uNrS+λIβ2uNiR0(λDλI)uM,λI(T)=0,

image (7.5b)

λ̇D=HD=κD,λD(T)=KD.

image (7.5c)

Then, following [146, P.31], a necessary condition for the pair (uN,uM)image to be a saddle-point strategy pair is that for all t[0,T]image,

(uN,uM)argminũNmaxũMH(ũN,ũM)and

image (7.6a)

(uN,uM)argmaxũMminũNH(ũN,ũM).

image (7.6b)

Henceforth, we denote the saddle-point strategy pair as (uN(),uM())image, and (S(),I(),D())image, (λS(),λI(),λD())image as the corresponding state and costate functions and Himage as the corresponding Hamiltonian. We now express (uN(),uM())image in terms of (S(),I(),D()),(λS(),λI(),λD())image using necessary conditions (7.6). Define ψNr:=(λIλS)β0ISκrimage, and ψNi:=R0λSβ1R0SλIβ2R0Iimage, and ψM:=(λDλI)Iimage. Now, the Hamiltonian can be written as

H=κII+κDD+ψNruNr+ψNiuNi+ψMuM.

image (7.7)

Thus, the Hamiltonian is a separable function of different components of the defense controls (uNr(),uNi())image and the attack control uM()image, that is, each of these appears in different terms in the right-hand side of the above characterization. Now, from the necessary conditions in (7.6) subject to the control constraints in (7.4), the saddle-point strategies are derived as

uNr=uminNrif ψNr>0anduNr=unormNrif ψNr<0,

image (7.8)

uNi=0if ψNi>0anduNi=1if ψNi<0,

image (7.9)

uM=umaxMif ψM>0anduM=0if ψM<0.

image (7.10)

Since ψNr,ψNi,ψMimage are uniquely specified once the state and the costate functions are known, the above relations express the saddle-point strategies in terms of the state and costate functions. The strategies uNr(),uNi(),uM()image can be substituted by the above characterizations in (7.1) and (7.5), resulting in a system of siximage differential equations involving only the state and the costate functions. Using standard numerical methods for solving differential equations, this system can be solved (very fast) using initial and final conditions (7.1) and (7.5). The state and costate functions obtained as solutions will now provide the ψNr,ψNi,ψMimage functions, and thereby the saddle-point strategies via (7.8)(7.10). The resulting set of differential equations is nonlinear and a closed-form solution is unknown. However, as we will show in the next section, using novel techniques, even without access to the closed-form solution, we can establish the type of behavior that the saddle-point strategies exhibit.

7.3.3. Structural Properties of Saddle-point Defense Strategy

We establish that the saddle-point defense strategy has a simple threshold-based structure that ought to facilitate its implementation in a localized manner in resource constrained wireless devices. Specifically, we prove the following.

Theorem 7.2

For the saddle-point defense strategy uN()=(uNr(),uNi())image , there exists times t1image,t2image,0t1<Timage,0t2<Timage such that
uNr(t)=uminNrimage for 0<t<t1image , and uNr(t)=unormNrimage for t1<t<Timage ;
uNi(t)=1image for 0<t<t2image, and uNi(t)=0image for t2<t<Timage.
The overall strategy therefore has the following three phases: In the initial aggressive defense phase, i.e. during (0,min(t1,t2))image, the susceptibles select the minimum possible reception rate, and the dispatchers transmit the patches whenever they are in contact with any other node. Thus, the quarantining is the most stringent, and the recovery most rapid during this phase. Then, in the interim watchful phase, i.e. during (min(t1,t2),max(t1,t2))image, one of the defense controls subsides while the other continues as before. Finally, in the terminal relaxed phase, i.e. in (max(t1,t2),T)image, both defense controls subside, that is, the susceptibles select their normal reception rate and the dispatchers do not transmit the patches. Thus, the QoS in data traffic is back to its normal value and the resource consumption overhead due to patch transmission ends.
Note that the defense strategy always chooses either the maximum or the minimum values of the parameters except possibly at t1,t2image. Such strategies are referred to as bang-bang in the control literature. The durations of the phases (i.e. the values of the threshold times t1,t2image) and which defense subsides in the interim watchful period depend on the damage coefficients κI,κD,KD,κr,κiimage. We will shed more light on the latter in Theorem 7.3 later. But first, we conclude this subsection by proving Theorem 7.2.

Proof

The continuity and piecewise differentiability of ψNr(),ψNi()image follow from those of the costate functions. From the final conditions on the costate functions, i.e. (7.5), ψNr(T)=κr<0image, ψNi(T)=R0>0image. We show that ψNr()image (ψNi()image, respectively) is a strictly decreasing (increasing, respectively) function of time. Thus, each has at most one zero-crossing point in (0,T)image; denote these as t1,t2image. If ψNrimage (ψNiimage, respectively) has no zero-crossing point in (0,T)image, t1=0image (t2=0image, respectively). Thus, from the continuity of the ψ()image functions, and from their terminal values, (i) ψNr()image is negative in (t1,T)image and positive in (0,t1)image and (ii) ψNi()image is positive in (t2,T)image and negative in (0,t2)image. The theorem follows from (7.8) and (7.9). We prove the strict monotonicity of ψNr()image, ψNi()image, using the following.

Lemma 7.2

λS>0image and λI>λSimage,λD0t,0<t<Timage.
The lemma is intuitive since the shadow prices (i.e. costate variables) associated with the susceptibles and dead nodes ought to be positive, and also the shadow price associated with the infectives ought to be at least as high as that associated with susceptibles. However, as we will see next, the proof requires detailed analysis of the state and costate differential equations (7.1) and (7.5), respectively, and is less direct.

Proof

Since λD(T)=KD0image (from (7.5)), and ddtλD0image, λD0image. Now, for the rest, we argue in two steps.
Step 1:λS(T)=0image and λI(T)=KI=0image, also λ̇I(T)λ̇S(T)=λ̇I(T)=κIKDuM(T)<0image. Therefore, ϵ>0image such that on (TϵT)image we have λS>0image and (λIλS)>0image.
Step 2: Proof by contradiction. Let τimage be such that λS>0,(λIλS)>0image on (τT)image and λS(τ)=0image OR λI(τ)=λS(τ)image. From the continuity of the costate functions, (λI(τ)λS(τ))0image and λS(τ)0image.
We first prove that (λI(τ)λS(τ))>0image. Suppose not. Then, λI(τ)=λS(τ)image. Thus, λ̇I(τ)λ̇S(τ)=imageκI+λIβ2uNi(λDλI)uMλSβ1uNi=imageκIλSuNi(β1β2)(λDλI)uMimage. Here, (i) the first term is strictly negative,4 (ii) the second term is negative because λS(τ)0image and β2β1image, and (iii) the third term is negative because of (7.10). Thus, λ̇I(τ)λ̇S(τ)>0image. But, then both λI(τ)=λS(τ)image, and (λIλS)>0image on (τT)image cannot happen. Thus, (λI(τ)λS(τ))>0image.
Now, suppose λS(τ)=0image. λ̇S(τ)=(λIλS)β0uNrIτ<0image. The last inequality follows since (λI(τ)λS(τ))>0image, β0>0image, uNruminNr>0image and I(τ)>0image (Lemma 7.1). This again contradicts the assumptions that λS(τ)=0image and λS>0image on (τT)image. Thus, λS(τ)0image, and hence λS(τ)>0image. 

Strict monotonicity of ψNr()image

We show that ψ̇Nr(t)image is strictly negative at all t(0,T)image5

ψ̇Nr=tψNr=(λ̇Iλ̇S)β0IS+(λIλS)β0İS+(λIλS)β0IṠ

image

which after replacement and simplification yields

ψ̇Nrβ0IS=κI(λDλS)uMβ1λIuNi+β2λSuNi

image

=κI(λDλI)uM(λIλS)uM(β1β2)λIuNi(λIλS)β2uNi.

image

From (7.10), Lemma 7.2 and since κI>0image, β1β2image, uM(t)0,uNi(t)0image at all timage, the right-hand side is negative. The result follows since β0>0image and S(t)>0,I(t)>0image at all timage (Lemma 7.1).

Strict monotonicity of ψNi()image

ψ̇Ni=tψNi=(λ̇Iλ̇S)β0IS+(λIλS)β0İS+(λIλS)β0IṠ

image

ψ̇Niβ0I=κIβ2+β2uMλD+β0β1SuNr(λIλS).

image

The R.H.S is positive from Lemma 7.2 and since κI>0,β0>0image. Thus, ψ̇Ni>0image since β0>0image and I(t)>0image at all timage (Lemma 7.1).  
In the next theorem, we show that under a sufficient condition, the quarantining period (by reduction of communication rates) ends before the immunization/healing effort is stopped. This is in accordance with our intuition that the primary use of quarantining is buying time for the recovery process in the network.

Theorem 7.3

Let t1image and t2image be as defined inTheorem  7.2. If κrβ0S0β2image , then either t2=0image or t1<t2image .
Note that the condition of the theorem is quite intuitive. For instance, if β2=β0image, this condition is satisfied when κr1image. Recall that the coefficients of the costs were rescaled so that the coefficient of the cost of patching is normalized. Thus, this means when the instantaneous cost of per unit reduction of commutation rates (quarantining) is no less than per unit patching. The other parameters in κrβ2S0β0image refer to the cases of relatively small initial susceptible pool, and a relatively fast healing rate.

Proof

Recall from the proof of Theorem 7.2 that ψNr(t)image and ψNi(t)image each have at most one zero-crossing point and ψNr(t)image terminates in a negative, and ψNi(t)image terminates in a positive value at Timage. Now we show that at a potential zero-crossing point of ψNr(t)image, ψNi(t)image is strictly negative.

ψNr=(λIλS)β0ISκr=0λI=κrβ0IS+λS,then,

image

ψNi=R0λSβ1R0SλIβ2R0I=R0λSβ1R0Sβ2R0I(κrβ0IS+λS)=R0(1β2β0κrS)λSβ1R0SλSβ2R0I.

image

According to Lemmas 7.1 and 7.2, the last two terms are strictly negative. The first term is negative following the condition of the theorem (i.e. κrβ2S0β0image) and the fact that Simage is a nonincreasing function of time. Similarly, one can show that given κrβ2S0β0image, then at a potential zero-crossing point of ψNi(t)image, ψNr(t)>0image. The theorem follows from the continuity of ψNi(t)image and ψNr(t)image and by referring to (7.8) and (7.9).  
Note that the case of t2=0image occurs when ψNiimage does not have a zero-crossing point and is hence non-negative throughout [0,T]image. In this case, the immunization/healing effort is never launched.

7.3.4. Structure of the Saddle-point Attack Strategy

The saddle-point attack has a simple first-amass, then slaughter structure in the special case that the worm benefits from killing only through the final tally of the dead (i.e. κD=0image), and the patches can only immunize the susceptibles, but cannot heal the infectives (i.e. β2=0image). Specifically, we have the following.

Theorem 7.4

When β2=κD=0image , for the saddle-point attack strategy uM()image , there exists a time t3image , 0t3<Timage such that uM(t)=0image for 0<t<t3image , and uM(t)=umaxMimage for t3<t<Timage.
Thus, the worm does not kill any infective during the initial amass period of (0,t1)image when it uses them to spread the infection; it slaughters them at the maximum rate subsequently. The intuition behind this structure is as follows. Once the worm infects a host, it never loses it to the recovery process, and thus, since it benefits from killing a host only because this enhances the final tally of the dead, it ought to kill hosts toward the end and utilize them before. The proof follows.

Proof

Note that ψM(T)=KII(T)>0image (because of Lemma 7.1). Thus, as in the proof of Theorem 7.2, the result follows if we can show that ψM(t)image crosses zero at most once. We establish this slightly differently: we show that ψ̇Mimage is strictly positive at its zero-crossing point (as opposed to showing it for all timage). But this is also sufficient to conclude ψMimage has at most one zero-crossing point,

ψ̇M=I(λ̇Dλ̇I)+İψM=κIκD+uM(λDλI)β2λIuNi+Sβ0uNr(λIλS)+İψM.

image

At a zero-crossing point of ψMimage, the last term vanishes. Also, κD=β2=0image. The third and fifth terms are non-negative because of (7.10) and Lemma 7.2, respectively. The result now follows since κI>0image.  
The saddle-point attack strategy may however be more involved when either β2>0image or κD>0image. For example, Fig. 7.2 depicts the saddle-point strategies and the state evolution in an example scenario where κD=13,β2=4.47image. The malware starts killing the nodes from the beginning, but then it stops the killing and infects the newly accessible susceptible nodes, boosting the fraction of the infective and toward the end, starts to kill them all again.
Fig. 7.2
FIGURE 7.2 State evolution and saddle-point strategies. The parameters of the game are as follows: κI=10image,κD=13image,κu=10image,κr=5image,KI=KD=0image,β2=β1=β0=4.47image,π=1image, and initial fractions I0=0.15image,R0=0.1image,D0=0image, and T=4imageFrom©  2012 IEEE. Reprinted, with permission, from Khouzani MHR, Sarkar S, Altman E. Saddle-point strategies in malware attack. IEEE J Sel Areas Commun 2012;30(1):31–43.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.217.105.174